Changelog
All notable changes to this project are documented here. The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased
0.5.0 - 2026-06-24
Added
- Serving helpers (P23): `agentix.serving` turns an `Agent` into a streaming HTTP endpoint. Dependency-free serializers map the `stream()` events to Server-Sent Events / NDJSON (`event_to_dict`, `sse_events`, `ndjson_events`, `outcome_to_payload`), and a thin lazy-imported FastAPI/Starlette adapter (`sse_response` / `ndjson_response`, extra `agentix[serving]`) wraps them in a `StreamingResponse`. `outcome_to_payload` serializes a run (including a `suspended` run's `pending` approvals) for a request/response + `/approve` flow. See `examples/30_serving_fastapi.py`.
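As a rough illustration of what mapping stream events to Server-Sent Events and NDJSON involves, here is a minimal, dependency-free sketch. It does not use agentix: the event fields (`text`, `answer`) and the helper names `to_dict`, `to_sse`, and `to_ndjson` are assumptions made for this example, not the library's API.

```python
import json
from dataclasses import asdict, dataclass


# Hypothetical stand-ins for streamed events; the field names are assumed.
@dataclass
class AnswerDelta:
    text: str


@dataclass
class Done:
    answer: str


def to_dict(event) -> dict:
    """Flatten a dataclass event into a JSON-safe dict tagged with its type."""
    return {"type": type(event).__name__, **asdict(event)}


def to_sse(event) -> str:
    """One SSE frame: an 'event:' line, a 'data:' line, then a blank line."""
    payload = to_dict(event)
    return f"event: {payload['type']}\ndata: {json.dumps(payload)}\n\n"


def to_ndjson(event) -> str:
    """One NDJSON record: a single JSON object followed by a newline."""
    return json.dumps(to_dict(event)) + "\n"


events = [AnswerDelta("Hel"), AnswerDelta("lo"), Done("Hello")]
sse_body = "".join(to_sse(e) for e in events)
ndjson_body = "".join(to_ndjson(e) for e in events)
```

Keeping the serializers free of any web framework, as the entry above describes, means the same strings can be fed to whatever response type the server uses.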
Changed
- Friendlier, plain-language package summary and README intro (consistent with the docs voice). No API changes.
0.4.1 - 2026-06-24
Added
- Discoverability: the docs site now publishes `llms.txt` and `llms-full.txt` (a curated and a full-text, LLM-friendly index of the docs) so AI coding assistants can read it cleanly; broader PyPI `keywords` and trove `classifiers` (AI / async / typed); PyPI project links (Documentation, Changelog, Issues) so the docs link shows on the project page; and GitHub repo topics.
- Documentation site (P17): a Material for MkDocs site with a beginner-friendly getting-started, ~12 task guides (each linking a runnable example), a plain-language security model writeup, and a full API reference generated from the docstrings (mkdocstrings). Builds are gated in CI (`mkdocs build --strict`) and auto-deploy to GitHub Pages. Adds a `docs` dependency group; no library code changes.
0.4.0 - 2026-06-24
Added
- Tier C polish (one PR). Eval dataset loaders: `load_cases(path)` reads `Case`s from `.jsonl`/`.json`/`.csv` (extra keys/columns fold into `metadata`). Record/replay cassettes: `CassetteModel` records a real model's responses to a JSON file and replays them deterministically (`mode="auto"`), for fast offline tests. Subagent cost roll-up: a tool may return a `ToolResult` carrying `cost_usd`/`tokens_used`, and `subagent_tool` now does, so a child run's spend is added to the parent's `AgentOutcome` totals. `instrument(agent)`: one call wraps the model in `TracingModel` and merges `tracing_events` into the agent's events (existing callbacks preserved). See `examples/29_cassettes.py` and `examples/13_subagents.py`.
- First-class structured output (P21): `Agent(response_model=…)` (a Pydantic model class or a raw JSON-Schema dict) wires the whole path in one knob. It adds an output validator so `outcome.parsed` is typed/validated (re-prompting on failure), a schema instruction injected into the system context (any model conforms), and native provider enforcement when the adapter supports it via a new `with_response_format(schema)` (Anthropic `output_config.format`, OpenAI/LiteLLM `response_format`, Gemini `response_schema`, Ollama `format`; composes through `RetryModel`/`FallbackModel`). See `examples/27_structured_output.py`.
- Rate-limit-aware retries (P22): `RetryModel` now honors a provider's `Retry-After` (via `retry_after=`, default `default_retry_after` reading the error attribute or response header) instead of blind exponential backoff, capped by `max_sleep`, with an `on_retry` hook to surface waits. Falls back to exponential backoff when no hint is present. See `examples/28_rate_limit.py`.
- Pluggable memory (P20): a `Memory` protocol for cross-session recall (`recall(query)` / `write(content)`); `MemoryRecord`; and a dependency-free `InMemoryMemory` default with keyword-overlap recall (+ `dump`/`load` for persistence). `Agent(memory=…, memory_limit=…)` recalls before each run/stream and injects the records as trusted system context; `remember_exchange=True` persists each completed exchange. agentix owns the interface; bring your own vector DB / search index as the backend. See `examples/26_memory.py`.
- Token-accurate context (P19): a `TokenCounter` abstraction (`Callable[[str], int]`) with a dependency-free `HeuristicTokenCounter` / `approx_token_counter` default, transcript counting (`count_tokens`, `count_message_tokens`; text + tool calls + a per-media estimate + per-message overhead), and `FitContextWindow`, a context strategy that trims to a real token budget (dropping oldest whole rounds, preserving tool pairing, with `reserve_tokens` headroom) instead of counting rounds/characters. Pass any tokenizer (e.g. `tiktoken`, a provider counter) as the `counter`. See `examples/25_token_context.py`.
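The rate-limit-aware retry behavior described under P22 above (use the provider's hint when there is one, otherwise back off exponentially, and cap the wait) can be sketched in a few lines. This is a synchronous illustration under stated assumptions, not `RetryModel` itself: the `RateLimited` exception, its `retry_after` attribute, and the helper names `next_sleep` and `call_with_retries` are all hypothetical.

```python
import time


class RateLimited(Exception):
    """Hypothetical provider error that may carry a retry hint in seconds."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def next_sleep(attempt, error, *, base=0.5, max_sleep=30.0):
    """Prefer the provider's hint; otherwise use base * 2**attempt. Always capped."""
    hint = getattr(error, "retry_after", None)
    delay = float(hint) if hint is not None else base * (2 ** attempt)
    return min(delay, max_sleep)


def call_with_retries(fn, *, max_attempts=4, sleep=time.sleep, on_retry=None):
    """Call fn(), sleeping between rate-limited attempts; re-raise on the last one."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise
            delay = next_sleep(attempt, err)
            if on_retry is not None:
                on_retry(attempt, delay)  # surface the wait to the caller
            sleep(delay)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a production wrapper would likely also add jitter to the exponential fallback.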
0.3.0 - 2026-06-23
Added
- Suspendable human-in-the-loop (P18): `Agent(suspend_on_confirm=True)` pauses a run when a tool needs confirmation instead of awaiting `confirm_fn` inline. It checkpoints (with the assistant tool-turn as the tail) and returns `AgentOutcome(status="suspended", pending=[PendingApproval(...)])`. A later `resume(run_id, decisions={call_id: bool})` (on the same or a brand-new `Agent`, since the state lives in the store) finishes that turn (approve/deny; undecided pending calls fail closed) and continues. Requires a store + `run_id`; adds the `on_suspend` event and the `PendingApproval` type. Built for web/serverless flows where the request coroutine can't block. See `examples/24_suspend_resume.py`.
- Sandboxed execution (P16): `SubprocessExecutor` (a `ToolExecutor`) runs each tool as a separate OS process and actually enforces the limits the loop passes: network egress is denied when `network_allowlist` is empty (Linux network namespace via `unshare`, auto-detected; fails closed if it can't isolate, unless `require_network_isolation=False`), plus POSIX CPU/memory/file-size/process rlimits, a fresh per-call temp working directory, a scrubbed environment (no parent secrets leak), an output cap, and a timeout that kills the process group. Ships `SandboxPolicy` and `Command`. This closes the gap where `LocalToolExecutor` ignored `network_allowlist`. See `examples/23_sandbox.py`.
- Multimodal input (P15): `Message.content` is now `str | list[ContentPart]`, with `TextPart`, `ImagePart`, `DocumentPart`, and `AudioPart` (build via `from_path`/`from_bytes`/`from_base64`/`from_url`). `Message.text` gives a string view. `Agent.run`/`run_sync`/`stream` accept a parts list anywhere a string request goes. Every adapter translates supported media to its provider format and raises a clear error otherwise (e.g. audio on Anthropic, URL images on Bedrock). Plain-string content is fully backward compatible. See `examples/22_multimodal.py`.
- Multi-provider adapters (P14): the toolkit now ships five more model backends alongside Anthropic, each behind its own extra and each a drop-in `ModelFn`: `OpenAIModel` (`agentix[openai]`; Chat Completions, also drives any OpenAI-compatible `base_url`, with streaming), `GeminiModel` (`agentix[gemini]`), `BedrockModel` (`agentix[bedrock]`; AWS Converse API, run off-thread), `OllamaModel` (`agentix[ollama]`; local models), and `LiteLLMModel` (`agentix[litellm]`; one bridge to 100+ providers). Best-effort pricing added for common OpenAI/Gemini models (override with `register_price`). See `examples/21_providers.py`.
- `AnthropicModel` typed reasoning/cost knobs: `thinking` (`True`/`"adaptive"`/`"summarized"`/`"disabled"`/dict), `effort` (`low`…`max`), and `task_budget` (int; adds the required beta header), previously only via opaque `extra`. Docstring documents refusal-fallback behavior.
- `PromptRegistry`: lightweight in-process prompt versioning with `register`/`get`/`rollback`/`render` and `to_dict`/`from_dict` persistence.
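The fail-closed rule in the suspend/resume entry (P18) above means a pending call runs only when it was explicitly approved. A minimal sketch of that rule follows; `resolve_decisions` is a hypothetical helper written for this illustration, not part of agentix.

```python
def resolve_decisions(pending_ids, decisions):
    """Map each pending call id to True only when it was explicitly approved.

    Ids missing from `decisions`, and any value other than True, count as a denial.
    """
    return {call_id: decisions.get(call_id) is True for call_id in pending_ids}


resolved = resolve_decisions(["a", "b", "c"], {"a": True, "b": False})
```

Here `"c"` has no decision, so it resolves to a denial rather than being left pending or silently approved.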
0.2.1 - 2026-06-23
Fixed
- `agentix.__version__` now reflects the installed distribution version (derived from package metadata) instead of a hardcoded string that could drift. (0.2.0 shipped reporting `0.1.0`.)
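Deriving a version string from package metadata, as the fix above describes, is commonly done with the standard library's `importlib.metadata`. The sketch below shows the general pattern; the helper name `installed_version` and the `"0.0.0"` fallback are assumptions for illustration, and the project's actual code may differ.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str, default: str = "0.0.0") -> str:
    """Return the installed distribution's version, or `default` if not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # e.g. running from a source checkout that was never pip-installed
        return default
```

A package would typically assign the result to its `__version__` at import time, so the reported version cannot drift from what the installer recorded.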
0.2.0 - 2026-06-23
Added
- Subagents: `subagent_tool(agent, ...)` exposes a child agent as a delegable tool (its own model/system prompt/tools/guards); composes with the loop and `bounded_gather`.
- Cost & control: USD cost tracking (`pricing` module, `cost_usd`, and `cost_usd` on `ModelResponse`/`AgentOutcome`; the Anthropic adapter fills `input_tokens`/`output_tokens`/`cost_usd`); `AgentPolicy.max_budget_usd`; and `Interrupt` to stop a run/stream at a safe boundary.
- Dynamic permissions: `CallbackGuard` (a `can_use_tool`-style per-call callback returning allow/deny/confirm) and `ToolAllowlistGuard` (scope a run to a subset of tools).
- Output validation + retry: `Agent(output_validator=, max_output_retries=)` re-prompts on a failed validation and exposes `AgentOutcome.parsed`. Ships `json_output`, `pydantic_output`, `regex_output`.
- Resilient model wrappers: `RetryModel` (backoff) and `FallbackModel` (try-next-on-error), composable and drop-in.
- Eval harness (`agentix.evals`): `evaluate(...)` runs an agent over `Case`s and returns an `EvalReport` with `pass_rate`/`format_success_rate`/`assert_pass_rate()` (gate CI on regressions). Scorers: `exact_match`, `contains`, `regex_match`, `predicate`, `llm_judge`.
- `SelfConsistencyModel`: sample a model N times per turn and return the majority vote (drop-in `ModelFn`).
- `JudgeGuard`: an LLM reviews the final answer against a rubric and replaces it on failure (an `on_answer` safety/on-brand/format gate).
- Anthropic adapter: structured-output passthrough documented (`output_config={"format": ...}`) and `strict` tool schemas forwarded.
- OpenTelemetry tracing (`agentix[otel]`): `TracingModel`, `tracing_events`, and `trace_run` produce a span tree (run → model/tool spans) for your observability stack.
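The majority-vote idea behind the `SelfConsistencyModel` entry above can be shown with a small synchronous sketch. It assumes a model is just a callable from prompt to answer string; `majority_vote` and `self_consistent` are hypothetical names for this illustration, and the library's wrapper may break ties or compare answers differently.

```python
from collections import Counter
from itertools import cycle


def majority_vote(samples):
    """Return the most frequent sample; ties go to the answer seen first."""
    return Counter(samples).most_common(1)[0][0]


def self_consistent(model, prompt, n=5):
    """Sample `model` n times on the same prompt and keep the majority answer."""
    return majority_vote([model(prompt) for _ in range(n)])


# A fake, deterministic "model" that answers "4" two times out of three.
_answers = cycle(["4", "4", "7"])
voted = self_consistent(lambda prompt: next(_answers), "What is 2 + 2?", n=5)
```

Voting on exact strings only helps when answers are short and normalized; free-form answers would need some canonicalization before counting.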
0.1.0 - 2026-06-22
Initial release.
Core
- Async agent loop: `Agent.run`/`run_sync`/`stream`/`resume`, with step and token budgets.
- Provider-agnostic `ModelFn`; tool schemas flow to the model.
- `@tool` decorator generating JSON Schema from type hints + docstrings; `Tool`/`ToolRegistry`.
- `LocalToolExecutor`: sync tools run off the event loop; real per-call timeouts.
Security (opt-in guard pipeline)
- Trust boundary between user instructions and tool data.
- Guards: `TierGuard`, `PiiUrlGuard`, `InjectionGuard`, `UntrustedDataGuard`, fail-closed `RecipientTrustGuard`, and `PiiRedactionGuard` (answer egress).
- Async-or-sync confirmation; `AgentEvents` audit hooks; `secure_defaults()`.
Providers & streaming
- Anthropic adapter (`claude-opus-4-8`) with tool use and streaming.
- Streaming events: `AnswerDelta`/`ToolStarted`/`ToolFinished`/`Done`.
Persistence & scale
- Pluggable `Store` (`MemoryStore`, atomic non-blocking `FileStore`) + JSON codec.
- `Limiter` and `bounded_gather` for fleet backpressure.
Integrations & context
- MCP client support (`MCPServer`, `agentix[mcp]`): discover an MCP server's tools and use them in an agent.
- Context management: `ContextStrategy`, `TrimRounds`, `TruncateToolOutputs`.
Delegation, cost & control
- Subagents: `subagent_tool` exposes a child agent as a delegable tool.
- Cost: `pricing` module + `cost_usd`; `ModelResponse`/`AgentOutcome` carry `cost_usd`; `AgentPolicy.max_budget_usd` aborts a run over budget.
- `Interrupt` stops a run or stream at the next safe boundary.