# Context Compression

> Automatic context window management for long-running agent conversations

The `Agent` class includes built-in context management that handles context window overflow during long conversations. When enabled (the default), a middleware monitors token usage and applies reduction strategies before the context exceeds the model's limit.

## How It Works

When the context window fills up, context management applies three strategies in order, moving to the next one only if the context is still too large:

1. **Prune old tool calls**: Removes old tool call/return pairs, keeping only the most recent ones (the count is set by `context_management_keep_recent`).
2. **LLM summarization**: Uses the LLM to condense older messages into structured summary messages, while the most recent messages (again set by `context_management_keep_recent`) are kept verbatim.
3. **Context full response**: If the context is still full after both strategies, returns a fixed message indicating that the context limit has been reached.
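
The cascade above can be sketched in plain Python. This is only an illustration of the ordering, not Upsonic's implementation: the `Message` class, the helper functions, the four-characters-per-token estimate, and the `CONTEXT_FULL` text are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Placeholder text; the fixed message Upsonic actually returns is not shown here.
CONTEXT_FULL = "Context limit reached."


@dataclass
class Message:
    role: str          # "user", "assistant", "system", "tool_call", or "tool_return"
    content: str
    call_id: str = ""  # ties a tool call to its return


def estimate_tokens(messages: List[Message]) -> int:
    # Rough assumption: about four characters per token.
    return sum(len(m.content) for m in messages) // 4


def prune_tool_calls(messages: List[Message], keep_recent: int) -> List[Message]:
    """Strategy 1: drop all but the last `keep_recent` tool call/return pairs."""
    call_ids = [m.call_id for m in messages if m.role == "tool_call"]
    kept = set(call_ids[-keep_recent:]) if keep_recent > 0 else set()
    tool_roles = ("tool_call", "tool_return")
    return [m for m in messages if m.role not in tool_roles or m.call_id in kept]


def summarize_older(
    messages: List[Message],
    keep_recent: int,
    summarize: Callable[[List[Message]], str],
) -> List[Message]:
    """Strategy 2: condense older messages, keep the last `keep_recent` verbatim."""
    split = max(len(messages) - keep_recent, 0)
    older, recent = messages[:split], messages[split:]
    if not older:
        return messages
    return [Message("system", summarize(older))] + recent


def manage_context(
    messages: List[Message],
    limit: int,
    keep_recent: int,
    summarize: Callable[[List[Message]], str],
) -> Tuple[List[Message], Optional[str]]:
    """Apply the strategies in order; return (messages, fixed_response_or_None)."""
    steps = (
        lambda msgs: prune_tool_calls(msgs, keep_recent),
        lambda msgs: summarize_older(msgs, keep_recent, summarize),
    )
    for step in steps:
        if estimate_tokens(messages) <= limit:
            return messages, None
        messages = step(messages)
    if estimate_tokens(messages) <= limit:
        return messages, None
    # Strategy 3: nothing helped, so answer with the fixed message.
    return messages, CONTEXT_FULL
```

In this sketch `summarize` is any callable that turns a list of messages into a summary string; in the real agent that role is played by an LLM call.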

## Usage

```python
from upsonic import Agent, Task

agent = Agent(
    model="openai/gpt-4o-mini",
    context_management=True,            # Enabled by default
    context_management_keep_recent=5,   # Number of recent messages to always preserve
    context_management_model="anthropic/claude-sonnet-4-5",  # Model used for context management
)

# Task with potentially long context
long_text = "..." * 100000

task = Task(f"Summarize this text: {long_text}")
result = agent.do(task)
print(result)
```

## Parameters

| Parameter                        | Type   | Default | Description                                                                                    |
| -------------------------------- | ------ | ------- | ---------------------------------------------------------------------------------------------- |
| `context_management`             | `bool` | `True`  | Enable or disable automatic context management.                                                |
| `context_management_keep_recent` | `int`  | `5`     | Number of recent messages (and tool call events) to preserve during pruning and summarization. |

<Note>
  Context management triggers at 90% of the model's maximum context window, leaving the remaining 10% as a safety margin. Token counts come from the actual `usage` data in model responses when available; otherwise they are estimated with a character-based heuristic.
</Note>
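
As a rough illustration of the note above, the threshold check and the fallback estimate might look like the following. This is a sketch under stated assumptions, not Upsonic's code: the function names are hypothetical, the four-characters-per-token ratio is a common rule of thumb rather than a documented constant, and reading the margin as "trigger above 90% of the window" is an interpretation.

```python
from typing import Optional

SAFETY_MARGIN_PERCENT = 90  # from the note: 90% of the maximum context window
CHARS_PER_TOKEN = 4         # assumption: rule-of-thumb ratio, not Upsonic's constant


def estimate_tokens(text: str, reported_usage: Optional[int] = None) -> int:
    """Prefer the token count reported by the model; else estimate from length."""
    if reported_usage is not None:
        return reported_usage
    # Ceiling division so that short, non-empty text never counts as zero tokens.
    return -(-len(text) // CHARS_PER_TOKEN)


def token_budget(max_context_tokens: int) -> int:
    """Number of tokens allowed before reduction strategies are applied."""
    return max_context_tokens * SAFETY_MARGIN_PERCENT // 100


def exceeds_budget(tokens: int, max_context_tokens: int) -> bool:
    """True once the token count goes past the budget (assumed strict comparison)."""
    return tokens > token_budget(max_context_tokens)
```

For a model with a 128,000-token window, this puts the budget at 115,200 tokens, so reduction would start well before the hard limit is reached.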
