agentix.context¶
Context management: keep the working transcript from growing without bound.
A long agentic run accumulates a turn per step; left unbounded, memory grows and
the provider's context window eventually overflows. A ContextStrategy is
applied to the message list before each model call and returns a (possibly
smaller) list. Strategies are opt-in: with none, the full transcript is kept.
Pairing safety. Providers like Anthropic require every tool result to follow
the assistant tool_use that produced it. The shipped strategies never split
that pair: TrimRounds drops whole rounds (an assistant tool-turn plus
its tool results), and TruncateToolOutputs only shrinks content in
place. Custom strategies must preserve the same invariant.
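The pairing invariant can be checked mechanically when testing a custom strategy. The sketch below is illustrative only: `Msg`, its `tool_call_ids` / `tool_call_id` fields, and `pairing_intact` are hypothetical stand-ins, not agentix's actual Message type or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Msg:
    """Hypothetical stand-in for a transcript message (not the agentix type)."""
    role: str                                   # "system", "user", "assistant" or "tool"
    content: str = ""
    tool_call_ids: List[str] = field(default_factory=list)  # on assistant tool turns
    tool_call_id: Optional[str] = None                       # on tool results


def pairing_intact(messages: List[Msg]) -> bool:
    """Return True if every tool result follows the assistant turn that requested it."""
    open_ids: set = set()
    for m in messages:
        if m.role == "tool":
            if m.tool_call_id not in open_ids:
                return False          # orphaned result: its tool_use was dropped
        elif m.role == "assistant":
            open_ids = set(m.tool_call_ids)
        else:
            open_ids = set()          # system/user messages close the round
    return True
```

Running such a check on a strategy's output in unit tests would catch a strategy that drops an assistant tool turn but leaves its results behind.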
ContextStrategy ¶
Base strategy. Override compact; the default is a no-op.
When nothing changed, return the same list object (not a copy) so the loop
can skip the on_compact event.
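A minimal sketch of that contract, under stated assumptions: the class name `ContextStrategySketch`, the toy `DropEmpty` subclass, the `apply_strategy` helper, and the shape of the `on_compact` callback are all hypothetical, and the transcript is reduced to plain strings (so tool pairing is not modeled here).

```python
from typing import Callable, List


class ContextStrategySketch:
    """Hypothetical base class: compact() is a no-op by default."""

    def compact(self, messages: List[str]) -> List[str]:
        return messages  # same list object signals "nothing changed"


class DropEmpty(ContextStrategySketch):
    """Toy strategy: drop empty entries, but only allocate a new list if needed."""

    def compact(self, messages: List[str]) -> List[str]:
        kept = [m for m in messages if m]
        return messages if len(kept) == len(messages) else kept


def apply_strategy(
    strategy: ContextStrategySketch,
    messages: List[str],
    on_compact: Callable[[int, int], None],
) -> List[str]:
    """Assumed loop behaviour: fire on_compact only when a new list came back."""
    result = strategy.compact(messages)
    if result is not messages:  # identity check, not equality
        on_compact(len(messages), len(result))
    return result
```

The identity check is cheap, which is presumably why returning the original object is preferred over returning an equal copy.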
TrimRounds ¶
Bases: ContextStrategy
Keep the system prompt, the user's task, and the most recent
max_rounds tool rounds; older rounds are dropped whole.
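One way round-based trimming could work is sketched below. This is not the library's implementation: `Msg`, `split_rounds`, and `trim_rounds` are hypothetical names, and the sketch assumes the fixed prefix is everything up to and including the first user message, with each later assistant message opening a new round.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Msg:
    """Hypothetical stand-in for a transcript message."""
    role: str          # "system", "user", "assistant" or "tool"
    content: str = ""


def split_rounds(messages: List[Msg]) -> Tuple[List[Msg], List[List[Msg]]]:
    """Split into a fixed prefix and a list of rounds (assistant turn + tool results)."""
    first_user = next((i for i, m in enumerate(messages) if m.role == "user"), -1)
    prefix, rest = messages[: first_user + 1], messages[first_user + 1:]
    rounds: List[List[Msg]] = []
    for m in rest:
        if m.role == "assistant" or not rounds:
            rounds.append([m])       # an assistant turn opens a new round
        else:
            rounds[-1].append(m)     # tool results stay with their assistant turn
    return prefix, rounds


def trim_rounds(messages: List[Msg], max_rounds: int) -> List[Msg]:
    prefix, rounds = split_rounds(messages)
    if len(rounds) <= max_rounds:
        return messages              # unchanged: hand back the same object
    kept = rounds[-max_rounds:] if max_rounds > 0 else []
    return prefix + [m for r in kept for m in r]
```

Because rounds are dropped as units, a tool result can never outlive the assistant turn that requested it.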
TruncateToolOutputs ¶
Bases: ContextStrategy
Shrink any tool-result message longer than max_chars characters in place.
Preserves every message and all tool pairing: only the content of large tool outputs is clipped. Idempotent (won't re-clip already-clipped text).
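The idempotence property is easiest to see in code. The sketch below is an assumption about how such clipping might be done, not the shipped implementation: `Msg`, `MARKER`, and `truncate_tool_outputs` are hypothetical, and the sketch swaps in a clipped copy at the same position rather than mutating the message object.

```python
from dataclasses import dataclass, replace
from typing import List

MARKER = "\n[...truncated]"  # hypothetical suffix marking already-clipped text


@dataclass
class Msg:
    """Hypothetical stand-in for a transcript message."""
    role: str
    content: str = ""


def truncate_tool_outputs(messages: List[Msg], max_chars: int) -> List[Msg]:
    changed = False
    out: List[Msg] = []
    for m in messages:
        too_long = m.role == "tool" and len(m.content) > max_chars
        if too_long and not m.content.endswith(MARKER):
            # Clip the content; the message keeps its role and its position.
            m = replace(m, content=m.content[:max_chars] + MARKER)
            changed = True
        out.append(m)
    return out if changed else messages
```

A second pass sees the marker and leaves the text alone, so applying the function before every model call does not keep shaving the same output.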
HeuristicTokenCounter ¶
A dependency-free token estimate: ceil(len(text) / chars_per_token).
The default chars_per_token of 4.0 is a reasonable average for English prose
and code. It is only an estimate: pass a real tokenizer (e.g. tiktoken for
OpenAI, or a provider counter) when you need exact counts. Erring slightly high
keeps context-window trimming on the safe side, since an overestimate trims a
little early while an underestimate risks overflowing the window.
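The formula is simple enough to reproduce directly. The function name `heuristic_token_count` below is hypothetical; only the formula and the 4.0 default come from the description above.

```python
import math


def heuristic_token_count(text: str, chars_per_token: float = 4.0) -> int:
    """Estimate tokens as ceil(len(text) / chars_per_token)."""
    return math.ceil(len(text) / chars_per_token)


print(heuristic_token_count("hello world"))        # 11 chars / 4.0 -> 3
print(heuristic_token_count("hello world", 3.0))   # 11 chars / 3.0 -> 4
```

Lowering chars_per_token makes the estimate more conservative, which may suit text that tokenizes poorly (dense code, non-English scripts).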
FitContextWindow ¶
FitContextWindow(
    max_tokens: int,
    counter: TokenCounter = approx_token_counter,
    *,
    reserve_tokens: int = 0,
    per_message_overhead: int = 4,
    tokens_per_media: int = 600,
)
Bases: ContextStrategy
Keep the transcript under a token budget, the unit the model's
context window is actually measured in (unlike TrimRounds, which
counts rounds, or TruncateToolOutputs, which counts characters).
Always keeps the system prompt(s) and the user's task, then keeps as many of
the most recent tool rounds as fit in max_tokens - reserve_tokens, dropping
older whole rounds (never splitting a tool call from its result). Use
reserve_tokens to leave room for the model's response. If even the fixed
prefix plus the latest round overflows, that minimum is kept anyway (a
required tool turn can't be dropped); apply TruncateToolOutputs
before this strategy to shrink large tool outputs first.
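The selection logic described above can be sketched as a newest-first walk over rounds. Everything named here (`Msg`, `estimate`, `split_rounds`, `fit_context_window`) is a hypothetical stand-in, and the sketch assumes text-only messages costed at ceil(chars / 4) plus 4 tokens of overhead; the real class may differ in detail.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Msg:
    """Hypothetical stand-in for a transcript message."""
    role: str
    content: str = ""


def estimate(m: Msg) -> int:
    # Assumed cost model: ceil(chars / 4) plus 4 tokens of per-message overhead.
    return math.ceil(len(m.content) / 4.0) + 4


def split_rounds(messages: List[Msg]) -> Tuple[List[Msg], List[List[Msg]]]:
    first_user = next((i for i, m in enumerate(messages) if m.role == "user"), -1)
    prefix, rest = messages[: first_user + 1], messages[first_user + 1:]
    rounds: List[List[Msg]] = []
    for m in rest:
        if m.role == "assistant" or not rounds:
            rounds.append([m])
        else:
            rounds[-1].append(m)
    return prefix, rounds


def fit_context_window(messages: List[Msg], max_tokens: int,
                       reserve_tokens: int = 0) -> List[Msg]:
    budget = max_tokens - reserve_tokens
    prefix, rounds = split_rounds(messages)
    if not rounds:
        return messages
    cost = lambda group: sum(estimate(m) for m in group)
    # The prefix and the latest round are kept even if they exceed the budget.
    used = cost(prefix) + cost(rounds[-1])
    kept = [rounds[-1]]
    for r in reversed(rounds[:-1]):       # walk older rounds, newest first
        if used + cost(r) > budget:
            break                         # this round and everything older is dropped
        kept.append(r)
        used += cost(r)
    if len(kept) == len(rounds):
        return messages                   # nothing dropped: same object
    kept.reverse()
    return prefix + [m for r in kept for m in r]
```

Stopping at the first round that does not fit keeps the retained history contiguous, rather than skipping a large round and keeping a smaller, older one.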
count_message_tokens ¶
count_message_tokens(
    message: Message,
    counter: TokenCounter = approx_token_counter,
    *,
    per_message_overhead: int = 4,
    tokens_per_media: int = 600,
) -> int
Estimate the tokens a single message contributes: its text, any tool calls it carries, a flat estimate per media part, and a small per-message framing overhead (roles/delimiters the provider adds).
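As an illustration of how those four components might add up, here is a sketch under assumptions: `Msg`, its `tool_calls` and `media_parts` fields, `approx`, and `count_message_tokens_sketch` are hypothetical, and serializing tool-call arguments with `json.dumps` is a guess at how a call could be costed, not the library's documented behaviour.

```python
import json
import math
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Msg:
    """Hypothetical stand-in for a transcript message."""
    role: str
    content: str = ""
    tool_calls: List[dict] = field(default_factory=list)  # {"name": ..., "arguments": {...}}
    media_parts: int = 0                                   # number of image/audio parts


def approx(text: str) -> int:
    return math.ceil(len(text) / 4.0)


def count_message_tokens_sketch(
    message: Msg,
    counter: Callable[[str], int] = approx,
    *,
    per_message_overhead: int = 4,
    tokens_per_media: int = 600,
) -> int:
    total = per_message_overhead + counter(message.content)   # framing + text
    for call in message.tool_calls:                           # tool calls carried
        total += counter(call["name"]) + counter(json.dumps(call["arguments"]))
    total += tokens_per_media * message.media_parts           # flat media estimate
    return total


print(count_message_tokens_sketch(Msg("user", "abcdefgh")))                 # 4 + 2 = 6
print(count_message_tokens_sketch(Msg("user", "abcdefgh", media_parts=1)))  # 6 + 600 = 606
```

With the defaults, a single media part dominates the estimate for most messages, so tokens_per_media is the parameter most worth tuning for image-heavy transcripts.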
count_tokens ¶
count_tokens(
    messages: list[Message],
    counter: TokenCounter = approx_token_counter,
    *,
    per_message_overhead: int = 4,
    tokens_per_media: int = 600,
) -> int
Estimate the total tokens of a transcript (sum over its messages).