Overview

Upsonic records every model call into a single centralized usage registry. Tokens, cost, requests, tool calls, and timing are written exactly once per call, keyed by a unique entry_id, and tagged with the scopes the call belongs to. Reading metrics anywhere in the framework is a derived view over those rows, so there are no manual rollups and no double-counting on retry.

You never interact with the registry directly. Instead, every surface that has metrics exposes a single read-only .usage property:
```python
agent.usage       # rolled up across every model call this agent participated in
task.usage        # filtered to this task's scope
chat.usage        # filtered to this chat session
team.usage        # filtered across a team and its members
output.usage      # the per-run snapshot returned alongside an agent run
```
All five return the same shape — an AggregatedUsage view — so once you learn the fields, you can read them from anywhere.
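
To make the "derived view" idea concrete, here is a minimal, self-contained sketch of how a usage view could be computed by filtering rows on scope tags. It does not use Upsonic's internals: `Entry`, `REGISTRY`, and `view` are hypothetical names invented for illustration, and only a handful of fields are modeled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for one recorded model call."""
    entry_id: str
    input_tokens: int
    output_tokens: int
    agent_usage_id: Optional[str] = None
    task_usage_id: Optional[str] = None


# Hypothetical registry: exactly one row per model call.
REGISTRY = [
    Entry("e1", 100, 20, agent_usage_id="a1", task_usage_id="t1"),
    Entry("e2", 50, 10, agent_usage_id="a1", task_usage_id="t2"),
    Entry("e3", 30, 5, agent_usage_id="a2", task_usage_id="t3"),
]


def view(**scope: str) -> Dict[str, int]:
    """Sum token counts over the rows whose tags match every given scope."""
    rows = [
        e for e in REGISTRY
        if all(getattr(e, tag) == value for tag, value in scope.items())
    ]
    input_tokens = sum(e.input_tokens for e in rows)
    output_tokens = sum(e.output_tokens for e in rows)
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
        "entry_count": len(rows),
    }


agent_view = view(agent_usage_id="a1")  # both calls made by agent a1
task_view = view(task_usage_id="t1")    # only the call made for task t1
```

Because every view is recomputed from the same rows, the agent-level and task-level numbers cannot drift apart in this model: the task total is a subset of the agent total by construction.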

The AggregatedUsage shape

AggregatedUsage is a read-only dataclass derived from the registry on each access.

| Field | Type | Description |
| --- | --- | --- |
| input_tokens | int | Prompt/input tokens |
| output_tokens | int | Completion/output tokens |
| total_tokens | int | input_tokens + output_tokens |
| cache_read_tokens | int | Tokens read from prompt cache |
| cache_write_tokens | int | Tokens written to prompt cache |
| reasoning_tokens | int | Chain-of-thought / reasoning tokens |
| requests | int | Number of model requests |
| tool_calls | int | Number of tool calls |
| cost | float \| None | Sum of cost_usd across contributing entries; None if nothing priced |
| duration | float | Sum of per-call durations recorded on entries |
| model_execution_time | float | Time spent inside model calls |
| tool_execution_time | float | Time spent inside tool calls |
| upsonic_execution_time | float | Framework overhead = duration − model − tool |
| time_to_first_token | float \| None | Earliest TTFT across contributing entries |
| entry_count | int | Number of contributing UsageEntry rows |
| models | list[str] | Distinct models that contributed, first-seen order |

```python
u = agent.usage
print(u.input_tokens, u.output_tokens, u.cost, u.models)
print(u.to_dict())  # JSON-friendly flat dict for logs/dashboards
```
cost is None (rather than 0.0) when no contributing entry was priced. 0.0 means at least one entry was priced and the total came out free.
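
The None-versus-0.0 rule and the overhead formula from the table can be sketched in a few lines. This is an illustration of the described semantics, not Upsonic's code; `sum_cost` and `framework_overhead` are hypothetical helper names.

```python
from typing import Iterable, Optional


def sum_cost(costs: Iterable[Optional[float]]) -> Optional[float]:
    """Sum priced entries; return None when no entry carried a price."""
    priced = [c for c in costs if c is not None]
    return sum(priced) if priced else None


def framework_overhead(duration: float, model_time: float, tool_time: float) -> float:
    """Overhead as described in the table: duration - model - tool."""
    return duration - model_time - tool_time


print(sum_cost([None, None]))              # None: nothing was priced
print(sum_cost([0.0, None]))               # 0.0: one entry was priced and it was free
print(sum_cost([0.25, 0.5, None]))         # 0.75
print(framework_overhead(2.0, 1.5, 0.25))  # 0.25
```

In practice this means callers should test `cost is not None` rather than relying on truthiness, since a legitimate total of 0.0 is also falsy.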

Scope tags

Every recorded UsageEntry carries scope tags so the same row can be filtered into multiple views:

| Tag | Set by | Visible as |
| --- | --- | --- |
| chat_usage_id | Chat session | chat.usage |
| agent_usage_id | Agent instance | agent.usage |
| task_usage_id | Task | task.usage |
| team_usage_id | Team | team.usage |
| workflow_usage_id | StateGraph / workflow | (registry queries) |
| system_usage_id | System-level groupings | (registry queries) |
| run_id | Per-run identifier | output.usage (one run) |
| user_id | Per-user identifier | (registry queries) |

Scope tags are propagated through Python contextvars. Sub-pipeline LLM calls — memory summarization, reliability validator/editor, culture checks, policy enforcement, sub-agents — automatically inherit the parent’s tags, so their spend rolls up into agent.usage, chat.usage, etc., without any explicit propagation step.
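
The inheritance described above is standard `contextvars` behavior and can be sketched with the standard library alone. The names here (`agent_usage_id` as a `ContextVar`, `agent_scope`, `record_model_call`, `summarize_memory`) are hypothetical and only mirror the idea; they are not Upsonic's API.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical scope tag; the real framework tracks several of these.
agent_usage_id = contextvars.ContextVar("agent_usage_id", default=None)

recorded = []  # stand-in for the usage registry


@contextmanager
def agent_scope(usage_id):
    """Set the tag for the duration of a block, then restore the old value."""
    token = agent_usage_id.set(usage_id)
    try:
        yield
    finally:
        agent_usage_id.reset(token)


def record_model_call(name):
    """Tag the call with whatever scope is active where it runs."""
    recorded.append({"call": name, "agent_usage_id": agent_usage_id.get()})


def summarize_memory():
    # A nested helper: it never receives the tag as an argument.
    record_model_call("memory_summary")


with agent_scope("agent-1"):
    record_model_call("main")
    summarize_memory()

record_model_call("outside")  # no scope active, so the tag is None
```

The nested helper picks up `agent-1` without any parameter being threaded through. The same holds for asyncio tasks, which run in a copy of the context that was current when they were created.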

Idempotency and retries

The registry is keyed by entry_id. Re-recording an entry with the same id replaces the prior row rather than adding a second one. Retried requests therefore never double-count, and there is no separate baseline/snapshot machinery to keep in sync.
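
A dictionary keyed by entry_id is enough to illustrate why replace-on-write prevents double-counting. This is a sketch of the idea under that assumption; `registry`, `record`, and `total_tokens` are hypothetical names, not the framework's own.

```python
registry = {}  # entry_id -> row; hypothetical stand-in for the registry


def record(entry_id, input_tokens, output_tokens):
    """Upsert: a second write with the same entry_id replaces the first."""
    registry[entry_id] = {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
    }


def total_tokens():
    """Derive the total from whatever rows currently exist."""
    return sum(r["input_tokens"] + r["output_tokens"] for r in registry.values())


record("call-1", 100, 20)
record("call-1", 100, 25)  # retry of the same call: replaces, does not add
record("call-2", 40, 10)

print(len(registry))   # 2
print(total_tokens())  # 175
```

Since totals are derived from the rows on read, there is no running counter that a retry could increment twice.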

Persistence

When you configure a storage backend on Chat, recorded entries are persisted alongside the conversation. Re-opening the same session_id re-hydrates the registry, so chat.usage continues from where it left off across processes and restarts. Supported backends: InMemory, JSON, SQLite, PostgreSQL, MongoDB, Redis.
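
The re-hydration behavior can be pictured with a toy JSON store. This sketch says nothing about how Upsonic's backends actually lay out data; `save_entries`, `load_entries`, and the file format are assumptions made purely for illustration.

```python
import json
import tempfile
from pathlib import Path


def save_entries(path, session_id, entries):
    """Write the session's rows to a JSON file keyed by session_id."""
    data = json.loads(path.read_text()) if path.exists() else {}
    data[session_id] = entries
    path.write_text(json.dumps(data))


def load_entries(path, session_id):
    """Return the stored rows for a session, or an empty dict if none exist."""
    if not path.exists():
        return {}
    return json.loads(path.read_text()).get(session_id, {})


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "usage.json"

    # First "process": record one call and persist it.
    save_entries(store, "s1", {"e1": {"total_tokens": 120}})

    # Second "process": reload, add another call, persist again.
    entries = load_entries(store, "s1")
    entries["e2"] = {"total_tokens": 80}
    save_entries(store, "s1", entries)

    restored = load_entries(store, "s1")
    session_total = sum(row["total_tokens"] for row in restored.values())

print(session_total)  # 200
```

The point of the sketch is that the session total is rebuilt from persisted rows rather than from an in-memory counter, which is what lets it survive a restart.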

Examples

Per-task vs. per-agent

```python
from upsonic import Agent, Task

agent = Agent("anthropic/claude-sonnet-4-5")
t1 = Task("Say hello.")
t2 = Task("Say goodbye.")
agent.do(t1)
agent.do(t2)

print(t1.usage.total_tokens, t2.usage.total_tokens)  # per-task
print(agent.usage.total_tokens)                       # both tasks + any sub-pipeline calls
```

Chat sessions

```python
import asyncio
from upsonic import Agent, Chat

async def main():
    chat = Chat(session_id="s1", user_id="u1", agent=Agent("anthropic/claude-sonnet-4-5"))
    await chat.invoke("Hello")
    await chat.invoke("How are you?")

    u = chat.usage
    if u.cost is not None:
        print(f"${u.cost:.4f} across {u.requests} requests using {u.models}")
    print(f"Wall-clock session length: {chat.duration:.1f}s")

asyncio.run(main())
```

Teams

```python
from upsonic import Agent, Team

team = Team(agents=[Agent("openai/gpt-4o-mini"), Agent("anthropic/claude-sonnet-4-5")])
team.do("Plan and review a small feature.")

print(team.usage.to_dict())  # spend across every member + sub-pipeline call
```
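
If the flat dict is headed for a log pipeline, a small formatter along these lines may be convenient. It is a sketch that assumes the keys of `to_dict()` match the field names in the AggregatedUsage table; `usage_log_line` is a hypothetical helper, and the sample values are made up so the block runs without Upsonic installed.

```python
import json


def usage_log_line(scope, usage):
    """Render a flat usage dict as one JSON log line."""
    cost = usage.get("cost")
    record = {
        "scope": scope,
        "total_tokens": usage.get("total_tokens", 0),
        "requests": usage.get("requests", 0),
        # Keep "unpriced" distinct from "free" instead of coercing None to 0.
        "cost": "unpriced" if cost is None else round(cost, 6),
    }
    return json.dumps(record, sort_keys=True)


# Sample dict standing in for team.usage.to_dict(); the values are made up.
sample = {"total_tokens": 1500, "requests": 3, "cost": None}
line = usage_log_line("team", sample)
print(line)
```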

Migration from the legacy surface

The following legacy surfaces have been removed in favour of .usage. If you have older code, the replacements are:

| Legacy | Replacement |
| --- | --- |
| task.price_id, task.get_total_cost(), task.total_input_token, task.total_output_token | task.task_usage_id, task.usage.X |
| task.duration, task.model_execution_time, task.tool_execution_time, task.upsonic_execution_time | task.usage.X (per-call sums); for wall-clock use task.end_time - task.start_time |
| agent.cost (dict) | agent.usage.to_dict() |
| chat.input_tokens, chat.output_tokens, chat.total_tokens, chat.total_cost, chat.total_requests, chat.total_tool_calls, chat.run_duration, chat.time_to_first_token | chat.usage.X |
| chat.get_usage(), chat.get_session_metrics(), chat.get_session_summary() | chat.usage; for message count len(chat.all_messages); for wall-clock chat.duration |
| SessionMetrics dataclass | chat.usage + chat.duration + len(chat.all_messages) |
| UPSONIC_LEGACY_USAGE env flag | (removed; the unified registry is always on) |
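
The table distinguishes per-call sums from wall-clock time, and the two can differ whenever calls overlap or idle time sits between them. The arithmetic below illustrates the gap with hypothetical timestamps; it does not measure anything in Upsonic.

```python
# Hypothetical (start, end) timestamps in seconds for two overlapping calls.
calls = [(0.0, 2.0), (1.0, 3.0)]

# What a per-call sum reports: each call's own duration, added together.
per_call_sum = sum(end - start for start, end in calls)

# What a wall-clock measurement reports: last end minus first start.
wall_clock = max(end for _, end in calls) - min(start for start, _ in calls)

print(per_call_sum)  # 4.0
print(wall_clock)    # 3.0
```

When migrating code that displayed a legacy duration to users, it is worth checking which of the two meanings that code intended.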