agentix.providers.ollama¶
ollama ¶
Ollama model adapter — local models via the native ollama client.
Run open models (Llama, Qwen, Mistral, …) on your own machine. Install with
pip install "agentix[ollama]" and have an Ollama server running
(ollama serve; pull a tool-capable model, e.g. ollama pull llama3.1).
Ollama's chat API is OpenAI-ish but not identical: tool-call arguments are
JSON objects (not strings) and usage fields are named differently, so this
adapter does its own translation rather than reusing the OpenAI-compat helper.
Local inference is free, so cost_usd is always 0.0.
Tip: Ollama also serves an OpenAI-compatible endpoint at /v1 — if you prefer
that surface (or the openai SDK you already use), point
:class:~agentix.providers.openai.OpenAIModel at
base_url="http://localhost:11434/v1" instead.
OllamaModel ¶
OllamaModel(
*,
model: str = DEFAULT_MODEL,
host: str | None = None,
client: Any = None,
**extra: Any,
)
A :class:~agentix.model.ModelFn backed by a local Ollama server.
extra is forwarded to chat (e.g. options={"temperature": 0},
keep_alive, format for JSON mode). For tests, inject client= —
any object exposing an async chat(**kwargs).
with_response_format ¶
Return a copy that constrains output to schema via Ollama's
format (used by Agent(response_model=…)).