agentix.context

context

Context management — keep the working transcript from growing unbounded.

A long agentic run adds a turn per step; without bounds, memory grows and the transcript eventually overflows the provider's context window. A ContextStrategy is applied to the message list before each model call and returns a (possibly smaller) list. Strategies are opt-in: with none configured, the full transcript is kept.

Pairing safety. Providers such as Anthropic require every tool result to follow the assistant tool_use that produced it. The shipped strategies never split that pair: TrimRounds drops whole rounds (an assistant tool-turn plus its tool results), and TruncateToolOutputs only shrinks content in place. Custom strategies should preserve the same invariant.

ContextStrategy

Base strategy. Override compact; the default is a no-op.

compact should return the same list object when nothing changed; this lets the loop skip the on_compact event.
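A minimal sketch of the "same object means unchanged" contract follows. It does not import the library: SketchStrategy, RStripToolOutputs, apply_strategy, the dict-based Message shape, and the assumption that compact takes and returns a list of messages are all hypothetical stand-ins chosen for illustration.

```python
from typing import Any

Message = dict[str, Any]  # hypothetical stand-in; the real Message type is not shown here


class SketchStrategy:
    """Stand-in base: compact() is a no-op that hands back the same list."""

    def compact(self, messages: list[Message]) -> list[Message]:
        return messages


class RStripToolOutputs(SketchStrategy):
    """Strip trailing whitespace from tool results; touches content only."""

    def compact(self, messages: list[Message]) -> list[Message]:
        out: list[Message] = []
        changed = False
        for m in messages:
            content = m.get("content")
            if m.get("role") == "tool" and isinstance(content, str):
                stripped = content.rstrip()
                if stripped != content:
                    m = {**m, "content": stripped}  # copy, leave the caller's dict alone
                    changed = True
            out.append(m)
        # Returning the original object signals "nothing changed".
        return out if changed else messages


def apply_strategy(strategy: SketchStrategy, messages: list[Message]):
    """Mimic a loop that uses identity to decide whether compaction happened."""
    compacted = strategy.compact(messages)
    return compacted, compacted is not messages
```

Because the example strategy only edits content and never removes or reorders messages, it cannot separate a tool result from the assistant turn that requested it.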

TrimRounds

TrimRounds(max_rounds: int)

Bases: ContextStrategy

Keep the system prompt, the user's task, and the most recent max_rounds tool rounds; drop older rounds.
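The round-based trimming described above could look roughly like this. The sketch assumes messages are plain dicts with a "role" key and that an assistant turn carrying a non-empty "tool_calls" entry opens a round; split_rounds and trim_rounds are hypothetical helpers, not the library's implementation.

```python
from typing import Any

Message = dict[str, Any]  # hypothetical message shape for this sketch


def split_rounds(messages: list[Message]) -> tuple[list[Message], list[list[Message]]]:
    """Split into a fixed prefix and a list of rounds.

    A round starts at an assistant message with tool calls and runs until the
    next such message, so tool results always stay with their assistant turn.
    """
    starts = [
        i for i, m in enumerate(messages)
        if m.get("role") == "assistant" and m.get("tool_calls")
    ]
    if not starts:
        return list(messages), []
    bounds = starts + [len(messages)]
    rounds = [messages[a:b] for a, b in zip(bounds, bounds[1:])]
    return messages[: starts[0]], rounds


def trim_rounds(messages: list[Message], max_rounds: int) -> list[Message]:
    """Keep the prefix plus the most recent max_rounds rounds."""
    prefix, rounds = split_rounds(messages)
    if len(rounds) <= max_rounds:
        return messages  # nothing to drop: hand back the same object
    kept = rounds[-max_rounds:] if max_rounds > 0 else []
    return prefix + [m for r in kept for m in r]
```

Slicing on round boundaries, rather than on individual messages, is what keeps each tool result next to the call that produced it.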

TruncateToolOutputs

TruncateToolOutputs(
    max_chars: int, *, marker: str = "...[truncated]"
)

Bases: ContextStrategy

Shrink any tool-result message longer than max_chars in place.

Preserves every message and all tool pairing — only the content of large tool outputs is clipped. Idempotent (won't re-clip already-clipped text).
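As an illustration of clipping that is safe to run repeatedly, here is a small sketch. It assumes dict messages whose tool results use role "tool" with string content, and it detects earlier clipping by checking for the marker suffix; truncate_tool_outputs is a hypothetical function, and the library may achieve idempotence differently.

```python
from typing import Any

Message = dict[str, Any]  # hypothetical message shape for this sketch


def truncate_tool_outputs(
    messages: list[Message],
    max_chars: int,
    *,
    marker: str = "...[truncated]",
) -> list[Message]:
    """Clip oversized tool results; every message keeps its position."""
    out: list[Message] = []
    changed = False
    for m in messages:
        content = m.get("content")
        if (
            m.get("role") == "tool"
            and isinstance(content, str)
            and len(content) > max_chars
            and not content.endswith(marker)  # already clipped: leave as is
        ):
            m = {**m, "content": content[:max_chars] + marker}
            changed = True
        out.append(m)
    # Same object back when nothing was clipped.
    return out if changed else messages
```

Running the function a second time on its own output returns that output unchanged, since every clipped message already ends with the marker.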

HeuristicTokenCounter

HeuristicTokenCounter(chars_per_token: float = 4.0)

A dependency-free token estimate: ceil(len(text) / chars_per_token).

The 4.0 default is a reasonable average for English prose and code. It is an estimate — pass a real tokenizer (e.g. tiktoken for OpenAI, or a provider counter) when you need exactness. Erring slightly high keeps context-window trimming on the safe side.
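The formula above is small enough to sketch directly. SketchTokenCounter is a hypothetical stand-in, and the assumption that a counter is a callable taking a string and returning an int is not confirmed by this page.

```python
import math


class SketchTokenCounter:
    """Dependency-free estimate: ceil(len(text) / chars_per_token)."""

    def __init__(self, chars_per_token: float = 4.0) -> None:
        if chars_per_token <= 0:
            raise ValueError("chars_per_token must be positive")
        self.chars_per_token = chars_per_token

    def __call__(self, text: str) -> int:
        # ceil rounds partial tokens up, so the estimate leans slightly high.
        return math.ceil(len(text) / self.chars_per_token)


counter = SketchTokenCounter()
print(counter("hello world"))  # 11 characters / 4.0 -> 2.75 -> 3
```

A lower chars_per_token yields a larger estimate for the same text, which is the conservative direction when the estimate drives trimming.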

FitContextWindow

FitContextWindow(
    max_tokens: int,
    counter: TokenCounter = approx_token_counter,
    *,
    reserve_tokens: int = 0,
    per_message_overhead: int = 4,
    tokens_per_media: int = 600,
)

Bases: ContextStrategy

Keep the transcript under a token budget, the unit in which the model's context window is actually measured (unlike TrimRounds, which counts rounds, or TruncateToolOutputs, which counts characters).

Always keeps the system prompt(s) and the user's task, then keeps as many of the most recent tool rounds as fit within max_tokens - reserve_tokens, dropping older whole rounds (never splitting a tool call from its result). Use reserve_tokens to leave room for the model's response. If even the fixed prefix plus the latest round overflows, that minimum is kept anyway (a required tool turn can't be dropped); layer TruncateToolOutputs before this strategy to shrink large tool outputs first.
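The budget logic described above can be sketched as a walk from the newest round backwards. Everything here is an assumption made for illustration: dict messages, a round opening at an assistant turn with "tool_calls", a simplified cost function that counts only string content plus a flat overhead, and the hypothetical names message_cost, split_rounds, and fit_context_window.

```python
import math
from typing import Any

Message = dict[str, Any]  # hypothetical message shape for this sketch


def message_cost(m: Message, overhead: int = 4) -> int:
    """Simplified estimate: ceil(chars / 4) for string content plus framing."""
    content = m.get("content")
    text = content if isinstance(content, str) else ""
    return math.ceil(len(text) / 4) + overhead


def split_rounds(messages: list[Message]):
    """Prefix, then rounds that each start at an assistant tool-call turn."""
    starts = [
        i for i, m in enumerate(messages)
        if m.get("role") == "assistant" and m.get("tool_calls")
    ]
    if not starts:
        return list(messages), []
    bounds = starts + [len(messages)]
    return messages[: starts[0]], [messages[a:b] for a, b in zip(bounds, bounds[1:])]


def fit_context_window(
    messages: list[Message], max_tokens: int, *, reserve_tokens: int = 0
) -> list[Message]:
    budget = max_tokens - reserve_tokens
    prefix, rounds = split_rounds(messages)
    if not rounds:
        return messages
    used = sum(message_cost(m) for m in prefix)
    # The latest round is always kept, even if it alone overflows the budget.
    kept = [rounds[-1]]
    used += sum(message_cost(m) for m in rounds[-1])
    for r in reversed(rounds[:-1]):  # newest to oldest
        cost = sum(message_cost(m) for m in r)
        if used + cost > budget:
            break  # this round and everything older is dropped
        kept.append(r)
        used += cost
    if len(kept) == len(rounds):
        return messages  # everything fits: same object back
    kept.reverse()
    return prefix + [m for r in kept for m in r]
```

Stopping at the first round that does not fit, instead of skipping it and trying older ones, keeps the surviving rounds contiguous with the end of the transcript.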

count_message_tokens

count_message_tokens(
    message: Message,
    counter: TokenCounter = approx_token_counter,
    *,
    per_message_overhead: int = 4,
    tokens_per_media: int = 600,
) -> int

Estimate the tokens a single message contributes: its text, any tool calls it carries, a flat estimate per media part, and a small per-message framing overhead (roles/delimiters the provider adds).

count_tokens

count_tokens(
    messages: list[Message],
    counter: TokenCounter = approx_token_counter,
    *,
    per_message_overhead: int = 4,
    tokens_per_media: int = 600,
) -> int

Estimate the total tokens of a transcript (sum over its messages).
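The two estimators above might be sketched as follows. The message layout is an assumption: content is either a string or a list of parts with a "type" key, tool calls live under "tool_calls" and are costed by their JSON text, and any non-text part is charged the flat tokens_per_media. The sketch_ names and approx_counter are hypothetical and only mirror the documented parameters.

```python
import json
import math
from typing import Any, Callable

Message = dict[str, Any]  # hypothetical message shape for this sketch


def approx_counter(text: str) -> int:
    """Stand-in counter: roughly four characters per token."""
    return math.ceil(len(text) / 4)


def sketch_count_message_tokens(
    message: Message,
    counter: Callable[[str], int] = approx_counter,
    *,
    per_message_overhead: int = 4,
    tokens_per_media: int = 600,
) -> int:
    total = per_message_overhead  # framing the provider adds around each message
    content = message.get("content")
    if isinstance(content, str):
        total += counter(content)
    else:
        for part in content or []:
            if part.get("type") == "text":
                total += counter(part.get("text", ""))
            else:
                total += tokens_per_media  # flat charge per image/audio/etc. part
    for call in message.get("tool_calls") or []:
        total += counter(json.dumps(call, sort_keys=True))
    return total


def sketch_count_tokens(
    messages: list[Message],
    counter: Callable[[str], int] = approx_counter,
    *,
    per_message_overhead: int = 4,
    tokens_per_media: int = 600,
) -> int:
    """Transcript total is the sum of the per-message estimates."""
    return sum(
        sketch_count_message_tokens(
            m,
            counter,
            per_message_overhead=per_message_overhead,
            tokens_per_media=tokens_per_media,
        )
        for m in messages
    )
```

For instance, under these assumptions a user message with the eight-character content "12345678" costs 2 tokens of text plus 4 of overhead, for a total of 6.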