Usage Tracking

techrevati.runtime.usage_tracking

Usage Tracking — Per-model cost estimation and budget tracking.

Pricing data is loaded from data/pricing.json. Callers may override or extend it at runtime via register_pricing() / load_pricing_from_file().

UsageBoundExceededError

Bases: Exception

Base class for cost/budget/limit overrun errors.

Catching this class handles both BudgetExceededError (cost overrun) and UsageLimitExceededError (token or tool-call overrun), which makes it a gentler migration target than catching either subclass directly.

BudgetExceededError

BudgetExceededError(budget_usd, current_cost_usd)

Bases: UsageBoundExceededError

Raised when cumulative usage cost exceeds a configured budget.

Carries the offending budget and current cost so callers can decide how to recover (escalate to human, switch to cheaper model, abort).

UsageLimitExceededError

UsageLimitExceededError(limit_name, observed, ceiling)

Bases: UsageBoundExceededError

Raised when a non-cost usage dimension exceeds its configured cap.

Distinct from BudgetExceededError because the failure mode is different (a per-session token or tool-call cap was hit, not a dollar cap) and the recovery options diverge: hitting a token cap usually means "abort this loop", not "switch to a cheaper model".
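The hierarchy above can be sketched with local stand-in classes, so the snippet runs without the package installed. The constructor signatures follow this page; the attribute names on the instances and the recover() policy are assumptions made for illustration, not the library's implementation.

```python
class UsageBoundExceededError(Exception):
    """Stand-in for the documented base class."""


class BudgetExceededError(UsageBoundExceededError):
    def __init__(self, budget_usd, current_cost_usd):
        super().__init__(
            f"cost ${current_cost_usd:.2f} exceeds budget ${budget_usd:.2f}"
        )
        # Attribute names are assumed to mirror the constructor parameters.
        self.budget_usd = budget_usd
        self.current_cost_usd = current_cost_usd


class UsageLimitExceededError(UsageBoundExceededError):
    def __init__(self, limit_name, observed, ceiling):
        super().__init__(f"{limit_name}: {observed} exceeds {ceiling}")
        self.limit_name = limit_name
        self.observed = observed
        self.ceiling = ceiling


def recover(error):
    """Hypothetical recovery policy: return a label for the chosen action."""
    if isinstance(error, BudgetExceededError):
        return "switch_to_cheaper_model"
    if isinstance(error, UsageLimitExceededError):
        return "abort_loop"
    return "reraise"


def run_turn(error_to_raise):
    """Simulate a turn that fails, handled by one base-class except clause."""
    try:
        raise error_to_raise
    except UsageBoundExceededError as exc:  # catches both subclasses
        return recover(exc)
```

A single except clause on the base class is enough to route both failure modes, while the isinstance checks keep the two recovery paths separate.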

UsageLimits dataclass

UsageLimits(request_tokens_max=None, response_tokens_max=None, total_tokens_max=None, tool_calls_max=None, cost_usd_max=None)

Per-session caps on each usage dimension.

Set any subset of the fields; None means no limit for that dimension. UsageTracker.check_limits evaluates the limits against cumulative usage after each turn and raises UsageLimitExceededError on the first overrun.

Mirrors Pydantic AI's UsageLimits shape so callers porting between the two get a familiar surface; the names match exactly on purpose.

ModelPricing dataclass

ModelPricing(input_per_million, output_per_million, cache_write_per_million=0.0, cache_read_per_million=0.0, cache_write_5min_per_million=0.0, cache_write_1h_per_million=0.0)

Per-million-token pricing for a model.

Cache pricing tiers reflect the 2026 ephemeral-cache shape used by major providers: 5-minute ephemeral cache writes cost ~1.25x the input price, 1-hour writes ~2x, and reads ~0.1x. cache_write_per_million is the historical single-tier rate; it is used when the caller's UsageSnapshot.cache_ttl is None.

write_rate_for_ttl

write_rate_for_ttl(ttl)

Return the per-million write rate for the given TTL hint.

ttl=None falls back to cache_write_per_million (the legacy single-tier rate). "5m" and "1h" resolve to the 2026 ephemeral tiers. Unknown values also fall back to cache_write_per_million, so a misconfigured UsageSnapshot.cache_ttl doesn't crash cost calculation.
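The fallback rule can be illustrated with a minimal stand-in dataclass that copies the documented field names. The method body is a guess at the described behavior, not the library's source; in particular, how the real method treats a tier whose rate was left at 0.0 is not stated on this page and is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPricing:
    """Stand-in mirroring the documented fields (USD per million tokens)."""

    input_per_million: float
    output_per_million: float
    cache_write_per_million: float = 0.0
    cache_read_per_million: float = 0.0
    cache_write_5min_per_million: float = 0.0
    cache_write_1h_per_million: float = 0.0

    def write_rate_for_ttl(self, ttl):
        if ttl == "5m":
            return self.cache_write_5min_per_million
        if ttl == "1h":
            return self.cache_write_1h_per_million
        # None and unrecognized hints both use the legacy single-tier rate.
        return self.cache_write_per_million


# Hypothetical rates chosen so each tier is distinguishable.
example = ModelPricing(
    input_per_million=3.0,
    output_per_million=15.0,
    cache_write_per_million=4.0,
    cache_read_per_million=0.3,
    cache_write_5min_per_million=3.75,
    cache_write_1h_per_million=6.0,
)
```

With these example rates, an unknown hint such as "10m" resolves to the same rate as None instead of raising.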

PricingAlreadyRegisteredError

PricingAlreadyRegisteredError(model)

Bases: ValueError

Raised on re-registration when on_conflict='error'.

UsageSnapshot dataclass

UsageSnapshot(input_tokens=0, output_tokens=0, cache_write_tokens=0, cache_read_tokens=0, cache_ttl=None, tool_calls=0)

Token usage for a single turn.

cache_ttl is the optional ephemeral-cache hint ("5m" / "1h" / None) used to select between the cache-write pricing tiers. Leave it None for the legacy single-tier behavior.
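As a rough illustration of how a single turn might be priced, the sketch below multiplies each token count by a per-million rate. The UsageSnapshot stand-in copies the documented fields. The turn_cost_usd helper, its parameter names, and the assumption that the four token counts are disjoint are hypothetical; the cache-write rate passed in is assumed to have been chosen already from the snapshot's cache_ttl.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UsageSnapshot:
    """Stand-in mirroring the documented fields."""

    input_tokens: int = 0
    output_tokens: int = 0
    cache_write_tokens: int = 0
    cache_read_tokens: int = 0
    cache_ttl: Optional[str] = None
    tool_calls: int = 0


def turn_cost_usd(snapshot, *, input_rate, output_rate,
                  cache_write_rate, cache_read_rate):
    """Cost of one turn; every rate is in USD per million tokens."""
    weighted = (
        snapshot.input_tokens * input_rate
        + snapshot.output_tokens * output_rate
        + snapshot.cache_write_tokens * cache_write_rate
        + snapshot.cache_read_tokens * cache_read_rate
    )
    return weighted / 1_000_000


# Hypothetical turn and rates, for illustration only.
turn = UsageSnapshot(
    input_tokens=1_000_000,
    output_tokens=200_000,
    cache_write_tokens=100_000,
    cache_read_tokens=500_000,
    cache_ttl="5m",
)
cost = turn_cost_usd(
    turn,
    input_rate=3.0,
    output_rate=15.0,
    cache_write_rate=3.75,  # the rate selected for the "5m" tier
    cache_read_rate=0.30,
)
```

Here the turn comes to 3.00 + 3.00 + 0.375 + 0.15 = 6.525 USD. tool_calls does not enter the cost; it only matters for tool-call caps.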

UsageTracker dataclass

UsageTracker(turns=list())

Cumulative usage tracking with cost estimation.

check_limits

check_limits(limits)

Raise UsageLimitExceededError on the first cap overrun.

Limits are checked in dataclass declaration order, so the reported failure is deterministic even when several dimensions are over at once. cost_usd_max is also handled here, so callers using UsageLimits can rely on a single check call. If both budget_usd (on the session) and cost_usd_max are configured, UsageLimits takes precedence because it is the newer API.
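The declaration-order check can be sketched as a free function over stand-in types. The real method lives on UsageTracker and derives totals from its recorded turns; the totals mapping, the strict "greater than" comparison, and raising UsageLimitExceededError for cost_usd_max as well are assumptions made for this example.

```python
from dataclasses import dataclass, fields
from typing import Optional


class UsageLimitExceededError(Exception):
    def __init__(self, limit_name, observed, ceiling):
        super().__init__(f"{limit_name}: {observed} exceeds {ceiling}")
        self.limit_name = limit_name
        self.observed = observed
        self.ceiling = ceiling


@dataclass
class UsageLimits:
    """Stand-in mirroring the documented fields; None means no limit."""

    request_tokens_max: Optional[int] = None
    response_tokens_max: Optional[int] = None
    total_tokens_max: Optional[int] = None
    tool_calls_max: Optional[int] = None
    cost_usd_max: Optional[float] = None


def check_limits(limits, totals):
    """Raise on the first exceeded cap, scanning in declaration order.

    `totals` is a hypothetical mapping from each limit field name to the
    cumulative observed value for that dimension.
    """
    for field in fields(limits):  # fields() preserves declaration order
        ceiling = getattr(limits, field.name)
        if ceiling is None:
            continue  # no cap configured for this dimension
        observed = totals.get(field.name, 0)
        if observed > ceiling:
            raise UsageLimitExceededError(field.name, observed, ceiling)


def first_overrun(limits, totals):
    """Helper for the example: name of the failing limit, or None."""
    try:
        check_limits(limits, totals)
    except UsageLimitExceededError as exc:
        return exc.limit_name
    return None
```

When both total_tokens_max and tool_calls_max are over, the token cap is reported because it is declared first.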

has_pricing

has_pricing(model)

Check whether pricing is registered for a model.

Matches by exact name (case-insensitive) or longest-prefix, mirroring the resolution behavior of cost calculations. Returns False if the lookup falls back to the zero-cost default.
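One way to picture the lookup is the sketch below. It assumes the table keys are already lower-case and that "prefix" means a registered key is a prefix of the queried model name; the table contents, the model names, and the resolve_pricing_key helper are hypothetical.

```python
def resolve_pricing_key(model, table):
    """Return the matching table key, or None when nothing matches."""
    name = model.lower()
    if name in table:  # exact, case-insensitive match wins
        return name
    # Otherwise take the longest registered key that prefixes the name.
    prefixes = [key for key in table if name.startswith(key)]
    return max(prefixes, key=len) if prefixes else None


def has_pricing(model, table):
    """True only when a real entry matches, not a zero-cost default."""
    return resolve_pricing_key(model, table) is not None


# Hypothetical table: model names and rates are made up for the example.
PRICING = {
    "acme-large": {"input_per_million": 3.0, "output_per_million": 15.0},
    "acme-large-turbo": {"input_per_million": 1.0, "output_per_million": 5.0},
}
```

A dated variant such as "ACME-Large-Turbo-0612" resolves to the longer "acme-large-turbo" entry, not to "acme-large".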

register_pricing

register_pricing(model, pricing, *, on_conflict='overwrite')

Register or override pricing for a model. Thread-safe.

Matches are case-insensitive; the key is normalized to lower-case.

on_conflict controls behavior when model is already in the pricing table:

  • "overwrite" (default, preserves 0.2.0 behavior) — replace the existing entry.
  • "error" — raise PricingAlreadyRegisteredError. Use this in startup wiring where double-registration signals a configuration bug.
  • "keep" — leave the existing entry; the new pricing is dropped silently. Useful for "register defaults if not present" patterns.

load_pricing_from_file

load_pricing_from_file(path)

Load and merge pricing entries from a JSON file.

Same schema as the bundled data/pricing.json. Existing entries are overwritten on conflict.
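The merge semantics can be sketched as follows. The schema of data/pricing.json is not shown on this page, so the layout assumed here (a JSON object mapping each model name to an object of ModelPricing field values) is a guess, and the helper names and explicit table argument are hypothetical.

```python
import json
from pathlib import Path


def merge_pricing_entries(entries, table):
    """Merge entries into table; existing keys are overwritten on conflict."""
    for model, rates in entries.items():
        table[model.lower()] = dict(rates)
    return table


def load_pricing_from_file(path, table):
    """Read a JSON file in the assumed schema and merge it into table."""
    entries = json.loads(Path(path).read_text(encoding="utf-8"))
    return merge_pricing_entries(entries, table)


# Hypothetical existing table and incoming entries.
table = {"acme-large": {"input_per_million": 3.0, "output_per_million": 15.0}}
incoming = {
    "Acme-Large": {"input_per_million": 2.5, "output_per_million": 12.0},
    "acme-small": {"input_per_million": 0.5, "output_per_million": 2.0},
}
merge_pricing_entries(incoming, table)
```

After the merge, the existing "acme-large" entry carries the new rates and "acme-small" is added alongside it.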