Usage Tracking
techrevati.runtime.usage_tracking
Usage Tracking — Per-model cost estimation and budget tracking.
Pricing data is loaded from data/pricing.json. Callers may override or extend it at runtime via register_pricing() / load_pricing_from_file().
UsageBoundExceededError
Bases: Exception
Base class for cost/budget/limit overrun errors.
Catching this class catches both BudgetExceededError (cost overrun)
and UsageLimitExceededError (token / tool-call overrun), which makes it a
softer migration target than catching either subclass directly.
BudgetExceededError
Bases: UsageBoundExceededError
Raised when cumulative usage cost exceeds a configured budget.
Carries the offending budget and current cost so callers can decide how to recover (escalate to human, switch to cheaper model, abort).
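The hierarchy can be illustrated with local stand-in classes. This is a minimal sketch, not the library's code: the attribute names budget_usd and current_cost_usd and the helpers recover and run_turn are hypothetical, chosen only to show how one handler on the base class can branch into the different recovery paths.

```python
class UsageBoundExceededError(Exception):
    """Stand-in for the documented base class."""


class BudgetExceededError(UsageBoundExceededError):
    """Stand-in for the cost overrun error. Attribute names are assumptions."""

    def __init__(self, budget_usd: float, current_cost_usd: float) -> None:
        super().__init__(
            f"cost {current_cost_usd:.2f} USD exceeds budget {budget_usd:.2f} USD"
        )
        self.budget_usd = budget_usd
        self.current_cost_usd = current_cost_usd


class UsageLimitExceededError(UsageBoundExceededError):
    """Stand-in for the token / tool-call overrun error."""


def recover(exc: UsageBoundExceededError) -> str:
    """Hypothetical helper: pick a recovery strategy per subclass."""
    if isinstance(exc, BudgetExceededError):
        return "switch-to-cheaper-model"
    if isinstance(exc, UsageLimitExceededError):
        return "abort-loop"
    return "escalate-to-human"


def run_turn(error: UsageBoundExceededError) -> str:
    """Raise the given error and handle it through the shared base class."""
    try:
        raise error
    except UsageBoundExceededError as exc:  # one handler covers both subclasses
        return recover(exc)
```

Code that currently catches only one of the two subclasses could move to the base class first and add per-subclass branches later.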
UsageLimitExceededError
Bases: UsageBoundExceededError
Raised when a non-cost usage dimension exceeds its configured cap.
Distinct from BudgetExceededError because the failure mode is
different (a per-session token / tool-call cap was hit, not a dollar
cap), and recovery options diverge: token caps usually mean
"abort this loop", not "switch to a cheaper model".
UsageLimits (dataclass)
UsageLimits(request_tokens_max=None, response_tokens_max=None, total_tokens_max=None, tool_calls_max=None, cost_usd_max=None)
Per-session caps on each usage dimension.
Set any subset of the fields; None means no limit for that
dimension. UsageTracker.check_limits evaluates the limits
against cumulative usage after each turn and raises
UsageLimitExceededError on the first overrun.
Mirrors Pydantic AI's UsageLimits shape so callers porting
between the two get a familiar surface; the names match exactly
on purpose.
ModelPricing (dataclass)
ModelPricing(input_per_million, output_per_million, cache_write_per_million=0.0, cache_read_per_million=0.0, cache_write_5min_per_million=0.0, cache_write_1h_per_million=0.0)
Per-million-token pricing for a model.
Cache pricing tiers reflect the 2026 ephemeral-cache shape used
by major providers: 5-minute ephemeral cache writes are ~1.25x
the input price, 1-hour writes are ~2x, and reads are ~0.1x.
cache_write_per_million is the legacy single-tier rate; it is
used when the caller's UsageSnapshot.cache_ttl is None.
write_rate_for_ttl
Return the per-million write rate for the given TTL hint.
ttl=None → fall back to cache_write_per_million (legacy
single-tier). "5m" and "1h" resolve to the 2026
ephemeral tiers; unknown values also fall back so a
misconfigured UsageSnapshot.cache_ttl doesn't crash cost
calculation.
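The TTL fallback described above can be sketched with a stand-in dataclass that copies the documented field names. The method body is an assumption about how the lookup might be written, and the rates are illustrative numbers, not real prices.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelPricing:
    """Stand-in mirroring the documented fields (USD per million tokens)."""

    input_per_million: float
    output_per_million: float
    cache_write_per_million: float = 0.0
    cache_read_per_million: float = 0.0
    cache_write_5min_per_million: float = 0.0
    cache_write_1h_per_million: float = 0.0

    def write_rate_for_ttl(self, ttl: Optional[str]) -> float:
        """Return the tier rate for "5m" / "1h", else the legacy rate."""
        if ttl == "5m":
            return self.cache_write_5min_per_million
        if ttl == "1h":
            return self.cache_write_1h_per_million
        # None and unrecognized hints both use the single-tier rate.
        return self.cache_write_per_million


# Illustrative rates only: 5m is 1.25x input, 1h is 2x input, reads are 0.1x.
pricing = ModelPricing(
    input_per_million=3.0,
    output_per_million=15.0,
    cache_write_per_million=3.5,
    cache_read_per_million=0.3,
    cache_write_5min_per_million=3.75,
    cache_write_1h_per_million=6.0,
)
```

With these values, a hint such as "30m" resolves to the same rate as None, so a misspelled TTL changes the estimate but does not raise.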
PricingAlreadyRegisteredError
Bases: ValueError
Raised on re-registration when on_conflict='error'.
UsageSnapshot (dataclass)
UsageSnapshot(input_tokens=0, output_tokens=0, cache_write_tokens=0, cache_read_tokens=0, cache_ttl=None, tool_calls=0)
Token usage for a single turn.
cache_ttl is the optional ephemeral-cache hint
("5m" / "1h" / None) used to select between the
cache-write pricing tiers. Leave it None for the
legacy single-tier behavior.
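As a worked example of how a snapshot could translate into a cost, the sketch below multiplies each token count by a per-million rate and sums the results. The formula, the RATES table and estimate_cost_usd are assumptions for illustration; in particular, whether the library counts cached tokens separately from input_tokens is not stated here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UsageSnapshot:
    """Stand-in mirroring the documented fields."""

    input_tokens: int = 0
    output_tokens: int = 0
    cache_write_tokens: int = 0
    cache_read_tokens: int = 0
    cache_ttl: Optional[str] = None
    tool_calls: int = 0


# Illustrative USD-per-million-token rates, not real prices.
RATES = {
    "input": 3.0,
    "output": 15.0,
    "cache_read": 0.3,
    "cache_write": {None: 3.75, "5m": 3.75, "1h": 6.0},
}


def estimate_cost_usd(snap: UsageSnapshot) -> float:
    """Hypothetical estimator: sum of tokens times rate, per dimension."""
    write_rates = RATES["cache_write"]
    # Unknown TTL hints fall back to the single-tier (None) rate.
    write_rate = write_rates.get(snap.cache_ttl, write_rates[None])
    total = (
        snap.input_tokens * RATES["input"]
        + snap.output_tokens * RATES["output"]
        + snap.cache_write_tokens * write_rate
        + snap.cache_read_tokens * RATES["cache_read"]
    )
    return total / 1_000_000


snap = UsageSnapshot(
    input_tokens=1_000_000,
    output_tokens=100_000,
    cache_write_tokens=200_000,
    cache_read_tokens=500_000,
    cache_ttl="1h",
)
# Per dimension: 3.00 + 1.50 + 1.20 + 0.15 = 5.85 USD
cost = estimate_cost_usd(snap)
```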
UsageTracker (dataclass)
Cumulative usage tracking with cost estimation.
check_limits
Raise UsageLimitExceededError on the first cap overrun.
Order matches the dataclass declaration so a deterministic
failure is reported even when multiple dimensions are over at
once. cost_usd_max is handled here too so callers using
UsageLimits can rely on a single check call; if both
budget_usd (on the session) and cost_usd_max are
configured, UsageLimits wins because it's the newer API.
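The declaration-order check can be sketched with dataclasses.fields, which yields fields in the order they were declared. This is a stand-in, not the tracker's code: the usage dictionary keyed by dimension name and the strict greater-than comparison are assumptions.

```python
from dataclasses import dataclass, fields
from typing import Optional


class UsageLimitExceededError(Exception):
    """Stand-in for the documented exception."""


@dataclass(frozen=True)
class UsageLimits:
    """Stand-in mirroring the documented fields; None means no limit."""

    request_tokens_max: Optional[int] = None
    response_tokens_max: Optional[int] = None
    total_tokens_max: Optional[int] = None
    tool_calls_max: Optional[int] = None
    cost_usd_max: Optional[float] = None


def check_limits(limits: UsageLimits, usage: dict) -> None:
    """Raise on the first overrun, scanning caps in declaration order."""
    for field in fields(limits):
        cap = getattr(limits, field.name)
        if cap is None:
            continue  # no limit configured for this dimension
        dimension = field.name[: -len("_max")]
        used = usage.get(dimension, 0)
        if used > cap:  # assumption: reaching the cap exactly is allowed
            raise UsageLimitExceededError(f"{dimension}: {used} > {cap}")


limits = UsageLimits(total_tokens_max=1_000, tool_calls_max=3)
usage = {"total_tokens": 1_200, "tool_calls": 5}  # both dimensions are over

try:
    check_limits(limits, usage)
    message = ""
except UsageLimitExceededError as exc:
    message = str(exc)  # total_tokens is reported, since it is declared first
```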
has_pricing
Check whether pricing is registered for a model.
Matches by exact name (case-insensitive) or longest-prefix, mirroring the resolution behavior of cost calculations. Returns False if the lookup falls back to the zero-cost default.
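One way to read the exact-then-longest-prefix rule is sketched below. It assumes that a registered key matches when the model name starts with it; the model names, PRICING_KEYS and resolve_key are hypothetical.

```python
from typing import Iterable, Optional

# Hypothetical registered keys, already normalized to lower-case.
PRICING_KEYS = frozenset({"acme-large", "acme-large-fast", "acme-small"})


def resolve_key(model: str, keys: Iterable[str] = PRICING_KEYS) -> Optional[str]:
    """Return the matching key, or None when only the default would apply."""
    name = model.lower()
    known = set(keys)
    if name in known:
        return name  # exact, case-insensitive match
    prefixes = [key for key in known if name.startswith(key)]
    # Prefer the most specific (longest) registered prefix.
    return max(prefixes, key=len) if prefixes else None


def has_pricing(model: str) -> bool:
    """Stand-in: True only when a real entry resolves for the model."""
    return resolve_key(model) is not None
```

Under this reading, "acme-large-fast-2026" resolves to "acme-large-fast" rather than "acme-large", and a name with no registered prefix reports False.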
register_pricing
Register or override pricing for a model. Thread-safe.
Matches are case-insensitive; the key is normalized to lower-case.
on_conflict controls behavior when model is already in the
pricing table:
- "overwrite" (default, preserves 0.2.0 behavior): replace the existing entry.
- "error": raise PricingAlreadyRegisteredError. Use this in startup wiring where double-registration signals a configuration bug.
- "keep": leave the existing entry; the new pricing is dropped silently. Useful for "register defaults if not present" patterns.
load_pricing_from_file
Load and merge pricing entries from a JSON file.
Same schema as the bundled data/pricing.json. Existing entries are overwritten on conflict.