Skip to content

agentix.guards.injection

injection

Prompt-injection defense and the untrusted-data boundary.

InjectionGuard scans tool output for text that appears to be directed at the agent (instructions, authority claims, exfiltration requests) and, on a match, prefixes a warning so the model treats it as quoted data.

UntrustedDataGuard wraps all tool output in <untrusted_tool_output> tags. The system prompt should explain this convention: anything inside the tags is data to reason about, never instructions to follow.

wrap_as_untrusted_data

wrap_as_untrusted_data(text: str) -> str

Mark tool output so the model treats it as content to reason ABOUT, not instructions to follow. The system prompt must explain this convention.