docs/guardrails/dataflow-rules.md
@@ -6,10 +6,16 @@ Secure the dataflow of your agentic system, to ensure that sensitive data never

Due to their dynamic nature, agentic systems often mix and combine data from different sources, and can easily leak sensitive information. Guardrails provides a simple way to define dataflow rules, to ensure that sensitive data never leaves the system through unintended channels.

For instance, your agent may access an internal source of information like a database or API, and then attempt to send an email to an untrusted recipient (see below).
Invariant allows you to detect such contextually sensitive dataflow, and prevent it from happening.

This chapter discusses how Invariant Guardrails can be used to secure agentic dataflow, and make sure that sensitive data never leaves the system through unintended channels.

<div class='risks'/>
> **Dataflow Risks**<br/>

@@ -20,3 +26,215 @@ Due to their dynamic nature, agentic systems often mix and combine data from dif

> * Send sensitive information, such as **user data or PII**, to an external service

> * Be prompt-injected by an external service via indirect channels, to **perform malicious actions** as injected by a potential attacker

At the center of Invariant's data flow checking is the flow operator `->`. This operator enables you to precisely detect flows and ordering of operations in an agent trace.

For example, to prevent a user message containing the keyword `"send"` from triggering a `send_email` tool call, you can use the following rule:

**Example:** Preventing a simple flow
```guardrail
raise "Must not call tool after user uses keyword" if:
    (msg: Message) -> (tool: ToolCall)
    msg.role == "user"
    "send" in msg.content
    tool is tool:send_email
```
```example-trace
[
  {
    "role": "user",
    "content": "Can you send this email to Peter?"
  },
  {
    "id": "1",
    "type": "function",
    "function": {
      "name": "send_email",
      "arguments": {
        "contents": "Hi Peter, here is what can be found in the internal document: ..."
      }
    }
  }
]
```

Evaluating this rule will highlight both the relevant part of the user message and the subsequent `send_email` call:

This rule raises an error on the given trace, because a user message containing `"send"` is followed by a `send_email` tool call. In effect, the agent can no longer send an email once the user has used the keyword `"send"`.

Here, the line `(msg: Message) -> (tool: ToolCall)` specifies that the rule only applies when a `Message` is followed by a `ToolCall`, where `msg` and `tool` are further constrained by the conditions on the following lines.

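
The matching behavior of `->` can be approximated in plain Python. The following is a minimal sketch and not Invariant's implementation: it assumes the trace is a flat list of events, and the helper names `find_flows`, `is_user_send` and `is_send_email_call` are hypothetical. It reports every pair in which a matching user message appears anywhere before a matching tool call.

```python
from typing import Callable

Event = dict  # assumption: one dict per message or tool call in the trace


def find_flows(trace: list[Event],
               is_source: Callable[[Event], bool],
               is_sink: Callable[[Event], bool]) -> list[tuple[int, int]]:
    """Return index pairs (i, j) with i < j, mimicking an any-distance flow."""
    return [
        (i, j)
        for i, src in enumerate(trace) if is_source(src)
        for j in range(i + 1, len(trace)) if is_sink(trace[j])
    ]


def is_user_send(event: Event) -> bool:
    # mirrors: msg.role == "user" and "send" in msg.content
    return event.get("role") == "user" and "send" in (event.get("content") or "")


def is_send_email_call(event: Event) -> bool:
    # mirrors: tool is tool:send_email
    return event.get("function", {}).get("name") == "send_email"


trace = [
    {"role": "user", "content": "Can you send this email to Peter?"},
    {"id": "1", "type": "function",
     "function": {"name": "send_email", "arguments": {"contents": "Hi Peter, ..."}}},
]

matches = find_flows(trace, is_user_send, is_send_email_call)
print(matches)  # [(0, 1)]
```

Because only pairs with `i < j` are considered, the same two events in reverse order would not match, which reflects the ordering aspect of the flow operator.
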

You can also specify multi-turn flows, e.g. to match when a message is followed by a tool call and then a tool output. For example, to raise an error if a user message containing `"send"` is followed by a `send_email` tool call whose output contains the name `"Peter"`, you can use the following rule:

**Example:** Preventing a multi-turn flow
```guardrail
raise "Must not call tool after user uses keyword" if:
    (msg: Message) -> (tool: ToolCall)
    tool -> (output: ToolOutput)

    # message is from user and contains keyword
    msg.role == "user"
    "send" in msg.content

    # tool call is to send_email
    tool is tool:send_email

    # result contains keyword
    "Peter" in output.content
```
```example-trace
[
  {
    "role": "user",
    "content": "Can you send this email to Peter?"
  },
  {
    "id": "1",
    "type": "function",
    "function": {
      "name": "send_email",
      "arguments": {
        "contents": "Hi Peter, here is what can be found in the internal document: ..."
      }
    }
  },
  {
    "role": "tool",
    "tool_call_id": "1",
    "content": "Email sent to Peter"
  }
]
```

Note that for this you have to use the `->` operator twice, on separate lines, to express the transitive connection between `msg`, `tool` and `output`.

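
One way to picture the two chained `->` lines is as a search for ordered triples that share the middle element. The sketch below is a simplified model under that assumption, not Invariant's implementation; `chained_matches` is a hypothetical helper and the trace is treated as a flat list of events.

```python
from itertools import combinations

trace = [
    {"role": "user", "content": "Can you send this email to Peter?"},
    {"id": "1", "type": "function",
     "function": {"name": "send_email", "arguments": {"contents": "Hi Peter, ..."}}},
    {"role": "tool", "tool_call_id": "1", "content": "Email sent to Peter"},
]


def chained_matches(events):
    """Yield (msg, tool, output) index triples in trace order (i < j < k)."""
    for i, j, k in combinations(range(len(events)), 3):
        msg, tool, output = events[i], events[j], events[k]
        if (msg.get("role") == "user"
                and "send" in (msg.get("content") or "")
                # the same `tool` event ends the first flow and starts the second
                and tool.get("function", {}).get("name") == "send_email"
                and output.get("role") == "tool"
                and "Peter" in (output.get("content") or "")):
            yield (i, j, k)


triples = list(chained_matches(trace))
print(triples)  # [(0, 1, 2)]
```
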

In addition to the `->` operator, which specifies any-distance flows, i.e. flows with any number of steps in between, Invariant also provides the `~>` operator, which specifies direct succession flows, i.e. flows of length 1.

This is helpful when you only want to look at directly succeeding messages, e.g. to inspect the immediate output of a tool together with its corresponding call:

**Example:** Preventing a tool call output of a specific type
```guardrail
raise "Must not call tool after user uses keyword" if:
    # directly succeeding (ToolCall, ToolOutput) pair
    (call: ToolCall) ~> (output: ToolOutput)

    # call is sending an email
    call is tool:send_email

    # result contains keyword
    "Peter" in output.content
```
```example-trace
[
  {
    "role": "user",
    "content": "Can you send this email to Peter?"
  },
  {
    "role": "assistant",
    "tool_calls": [
      {
        "id": "1",
        "type": "function",
        "function": {
          "name": "send_email",
          "arguments": {
            "contents": "Hi Peter, here is what can be found in the internal document: ..."
          }
        }
      }
    ]
  },
  {
    "role": "tool",
    "tool_call_id": "1",
    "content": "Email sent to Peter"
  }
]
```

Note that here, the rule will only match if the `ToolOutput` is a direct successor of the `ToolCall`, i.e. if there is no other message in between (e.g. no extra user or assistant message).

In a trace, this looks as follows:

```json
[
  ...
  {"role": "assistant", "tool_calls": [
    {
      "id": "1",
      "type": "function",
      "function": {
        "name": "send_email",
        "arguments": {
          "contents": "Hi Peter, here is what can be found in the internal document: ..."
        }
      }
    }
  ]},
  {"role": "tool", "tool_call_id": "1", "content": "Email sent to Peter"}
  ...
]
```

Here, the `ToolOutput` is a direct successor of the `ToolCall`, and thus the rule will match.

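
The difference between any-distance and direct succession flows can be illustrated with a small Python model. This is a sketch under simplifying assumptions, not Invariant's implementation: the tool call is treated as its own event in a flat list, and `succession_pairs` and `matches` are hypothetical helpers.

```python
def succession_pairs(events, direct):
    """Index pairs (i, j) with i < j; only adjacent pairs when direct is True."""
    n = len(events)
    if direct:
        return [(i, i + 1) for i in range(n - 1)]  # flows of length 1
    return [(i, j) for i in range(n) for j in range(i + 1, n)]


def is_send_email_call(event):
    return event.get("function", {}).get("name") == "send_email"


def is_output_mentioning_peter(event):
    return event.get("role") == "tool" and "Peter" in (event.get("content") or "")


def matches(events, direct):
    return [
        (i, j)
        for i, j in succession_pairs(events, direct)
        if is_send_email_call(events[i]) and is_output_mentioning_peter(events[j])
    ]


call = {"id": "1", "type": "function",
        "function": {"name": "send_email", "arguments": {}}}
output = {"role": "tool", "tool_call_id": "1", "content": "Email sent to Peter"}
interruption = {"role": "assistant", "content": "Working on it."}  # made-up message

adjacent = [call, output]
interrupted = [call, interruption, output]

print(matches(adjacent, direct=True))      # [(0, 1)]
print(matches(interrupted, direct=True))   # []
print(matches(interrupted, direct=False))  # [(0, 2)]
```

In this model, the extra assistant message breaks the direct succession, while the any-distance variant still finds the pair.
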

---

## Combining Content Guardrails with Dataflow Rules

The `->` operator can also be combined with content guardrails to specify more complex rules.

For example, to prevent an agent from leaking data externally while API keys are in context, you can use the following rule:

**Example:** Preventing sensitive information like API keys from leaking externally.
```guardrail
from invariant.detectors import secrets

raise "Must not call tool after user uses keyword" if:
    (msg: Message) -> (tool: ToolCall)

    # message contains sensitive keys
    len(secrets(msg.content)) > 0

    # agent attempts to use externally facing action
    tool.function.name in ["create_pr", "add_comment"]
```
```example-trace
[
  {
    "role": "user",
    "content": "My GitHub token is ghp_1234567890123456789012345678901234567890"
  },
  {
    "role": "assistant",
    "content": "[agent reasoning...]"
  },
  {
    "id": "1",
    "type": "function",
    "function": {
      "name": "create_pr",
      "arguments": {
        "contents": "This PR checks in a sensitive API key"
      }
    }
  }
]
```
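
To make the combination of a content check and a flow concrete, here is a rough Python model of the rule above. It is a sketch only: the regular expression `GITHUB_TOKEN` is a hypothetical stand-in that recognizes GitHub-style tokens and is far narrower than a real secrets detector, and `leaks` is an illustrative helper, not part of Invariant.

```python
import re

# Assumption: a GitHub-style token is "ghp_" followed by at least 36 alphanumerics.
GITHUB_TOKEN = re.compile(r"ghp_[A-Za-z0-9]{36,}")

EXTERNAL_TOOLS = {"create_pr", "add_comment"}


def find_secrets(text):
    """Return all token-like substrings found in the given text."""
    return GITHUB_TOKEN.findall(text or "")


def leaks(events):
    """Return indices of external tool calls that occur after a secret was seen."""
    seen_secret = False
    violations = []
    for idx, event in enumerate(events):
        name = event.get("function", {}).get("name")
        if seen_secret and name in EXTERNAL_TOOLS:
            violations.append(idx)
        if find_secrets(event.get("content")):
            seen_secret = True
    return violations


trace = [
    {"role": "user",
     "content": "My GitHub token is ghp_1234567890123456789012345678901234567890"},
    {"role": "assistant", "content": "[agent reasoning...]"},
    {"id": "1", "type": "function",
     "function": {"name": "create_pr",
                  "arguments": {"contents": "This PR checks in a sensitive API key"}}},
]

print(leaks(trace))  # [2]
```

As in the rule, the intermediate assistant message does not interrupt the flow: any external tool call after the secret appears is flagged.
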

## Loop Detection

Beyond data flow, the flow operators can also be used to detect looping patterns in agent behavior. To learn more about this, check out the [loop detection chapter](./loops.md).