Skip to content

agentix.providers.ollama

ollama

Ollama model adapter — local models via the native ollama client.

Run open models (Llama, Qwen, Mistral, …) on your own machine. Install with pip install "agentix[ollama]" and have an Ollama server running (ollama serve; pull a tool-capable model, e.g. ollama pull llama3.1).

Ollama's chat API is OpenAI-ish but not identical: tool-call arguments are JSON objects (not strings) and usage fields are named differently, so this adapter does its own translation rather than reusing the OpenAI-compat helper. Local inference is free, so cost_usd is always 0.0.

Tip: Ollama also serves an OpenAI-compatible endpoint at /v1 — if you prefer that surface (or the openai SDK you already use), point :class:~agentix.providers.openai.OpenAIModel at base_url="http://localhost:11434/v1" instead.

OllamaModel

OllamaModel(
    *,
    model: str = DEFAULT_MODEL,
    host: str | None = None,
    client: Any = None,
    **extra: Any,
)

A :class:~agentix.model.ModelFn backed by a local Ollama server.

extra is forwarded to chat (e.g. options={"temperature": 0}, keep_alive, format for JSON mode). For tests, inject client= — any object exposing an async chat(**kwargs).

with_response_format

with_response_format(schema: dict[str, Any]) -> OllamaModel

Return a copy that constrains output to schema via Ollama's format (used by Agent(response_model=…)).