agentix.guards.injection¶
injection ¶
Prompt-injection defense and the untrusted-data boundary.
InjectionGuard scans tool output for text that appears to be directed at the
agent (instructions, authority claims, exfiltration requests) and, on a match,
prefixes a warning so the model treats it as quoted data.
UntrustedDataGuard wraps all tool output in <untrusted_tool_output> tags.
The system prompt should explain this convention: anything inside the tags is
data to reason about, never instructions to follow.
wrap_as_untrusted_data ¶
Mark tool output so the model treats it as content to reason ABOUT, not instructions to follow. The system prompt must explain this convention.