@@ -4,20 +4,13 @@ Modern reasoning-capable LLMs expose their internal thought process as reasoning
4
4
5
5
NeMo Guardrails allows you to inspect and control these reasoning traces by extracting them and making them available throughout your guardrails configuration. This enables you to write guardrails that can block responses based on the model's reasoning process, enhance moderation decisions with reasoning context, or monitor reasoning patterns.
6
6
7
-
This guide shows you how to access and guardrail reasoning content using the three main mechanisms NeMo Guardrails provides.
8
-
9
7
```{important}
10
-
The examples in this guide range from minimal toy examples (for understanding concepts) to complete reference implementations. They are designed to teach you how to access and work with `bot_thinking` in different contexts, not as production-ready code to copy-paste.
11
-
12
-
- **Toy examples** demonstrate basic variable access and simple pattern matching for learning purposes
13
-
- **Reference implementations** show complete working configurations from the repository ([examples/configs/self_check_thinking/](https://github.com/NVIDIA/NeMo-Guardrails/tree/bc799fbb05e1f12f1ba79461f7f7378b3af50c22/examples/configs/self_check_thinking))
14
-
15
-
You should adapt these patterns to your specific use case with appropriate validation, error handling, and business logic for your application.
8
+
The examples in this guide range from minimal toy examples (for understanding concepts) to complete reference implementations. They are designed to teach you how to access and work with `bot_thinking` in different contexts, not as production-ready code to copy-paste. Adapt these patterns to your specific use case with appropriate validation, error handling, and business logic for your application.
16
9
```
17
10
18
-
## How Reasoning Content is Made Available
11
+
## Accessing Reasoning Content
19
12
20
-
When an LLM generates a response with reasoning traces, NeMo Guardrails automatically extracts the reasoning and makes it available through:
13
+
When an LLM generates a response with reasoning traces, NeMo Guardrails automatically extracts the reasoning and makes it available in three ways:
21
14
22
15
### In Colang Flows: `$bot_thinking` Variable
23
16
@@ -26,29 +19,22 @@ The reasoning content is available as a context variable in Colang output rails:
26
19
```colang
27
20
define flow check_reasoning
28
21
  if $bot_thinking
29
-
# Access reasoning content
30
22
    $captured_reasoning = $bot_thinking
31
23
```
32
24
33
-
```{note}
34
-
This is a minimal example to demonstrate variable access. See [Using Reasoning in Self-Check Output](#using-reasoning-in-self-check-output) for a complete reference implementation.
35
-
```
36
-
37
-
### In Actions: `context.get("bot_thinking")`
25
+
### In Custom Actions: `context.get("bot_thinking")`
38
26
39
-
When writing custom Python actions, you can access the reasoning via the context dictionary:
27
+
When writing Python actions, you can access the reasoning via the context dictionary:
This is the same pattern used in the built-in `self_check_output` action, which accesses `bot_thinking` from context and passes it to the prompt template.
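
As a rough illustration of the context-dictionary access described above, the sketch below uses a plain async function so that it runs without NeMo Guardrails installed. The function name `check_reasoning_length`, the `max_chars` threshold, and the boolean return convention are hypothetical. In a real configuration the function would be registered as an action, for example with the `action` decorator from `nemoguardrails.actions`.

```python
import asyncio
from typing import Optional


async def check_reasoning_length(
    context: Optional[dict] = None, max_chars: int = 2000
) -> bool:
    """Return True when the response may pass, False when it should be blocked."""
    # The context may be missing entirely, and non-reasoning models leave
    # "bot_thinking" unset, so both cases are treated as "nothing to check".
    bot_thinking = (context or {}).get("bot_thinking")
    if not bot_thinking:
        return True
    # Hypothetical rule: flag unusually long reasoning traces.
    return len(bot_thinking) <= max_chars


if __name__ == "__main__":
    print(asyncio.run(check_reasoning_length({"bot_thinking": "2 + 2 = 4"})))
    print(asyncio.run(check_reasoning_length({})))
```

Returning early when no reasoning is present keeps the check usable with both reasoning and non-reasoning models.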
51
-
52
38
### In Prompt Templates: `{{ bot_thinking }}`
53
39
54
40
When rendering prompts for LLM tasks (like self-check output), the reasoning is available as a Jinja2 template variable:
@@ -66,26 +52,13 @@ prompts:
66
52
Should this be blocked (Yes or No)?
67
53
```
68
54
55
+
**Important**: Always check if reasoning exists before using it, as not all models provide reasoning traces.
56
+
69
57
## Guardrailing with Output Rails
70
58
71
59
Output rails can use the `$bot_thinking` variable to inspect and control responses based on reasoning content.
72
60
73
-
### Basic Example: Checking for Reasoning
74
-
75
-
```colang
76
-
define flow log_reasoning_presence
77
-
if $bot_thinking
78
-
$has_reasoning = True
79
-
log "Reasoning detected: {$bot_thinking}"
80
-
else
81
-
$has_reasoning = False
82
-
```
83
-
84
-
```{note}
85
-
This is a toy example for learning purposes. For production use, see [Using Reasoning in Self-Check Output](#using-reasoning-in-self-check-output) below.
86
-
```
87
-
88
-
### Blocking Based on Reasoning Patterns
61
+
### Basic Pattern Matching
89
62
90
63
```colang
91
64
define bot refuse to respond
@@ -113,9 +86,9 @@ This demonstrates basic pattern matching for learning purposes. Real implementat
113
86
114
87
## Guardrailing with Custom Actions
115
88
116
-
You can write custom Python actions that access reasoning content through the context dictionary.
89
+
For complex validation logic or reusable checks across multiple flows, write custom Python actions:
Create a flow that uses this action in `config/rails/reasoning_check.co`:
117
+
**config/rails/reasoning_check.co**:
145
118
146
119
```colang
147
120
define bot refuse to respond
@@ -164,13 +137,9 @@ rails:
164
137
- quality_check_reasoning
165
138
```
166
139
167
-
```{note}
168
-
This example shows how to structure a custom action that accesses `bot_thinking`. Adapt the validation logic to your specific requirements.
169
-
```
170
-
171
140
## Using Reasoning in Self-Check Output
172
141
173
-
This is the **complete reference implementation** from [examples/configs/self_check_thinking/](https://github.com/NVIDIA/NeMo-Guardrails/tree/bc799fbb05e1f12f1ba79461f7f7378b3af50c22/examples/configs/self_check_thinking) showing how `bot_thinking` is actually used in the codebase. This pattern provides reasoning traces to your self-check output rail, allowing the moderation LLM to make more informed decisions by considering both the response and the reasoning process.
142
+
This is the **complete reference implementation** from [examples/configs/self_check_thinking/](https://github.com/NVIDIA/NeMo-Guardrails/tree/bc799fbb05e1f12f1ba79461f7f7378b3af50c22/examples/configs/self_check_thinking), showing how `bot_thinking` is used in practice. This pattern provides reasoning traces to your self-check output rail, allowing the moderation LLM to make more informed decisions.
174
143
175
144
### Configuration
176
145
@@ -218,214 +187,11 @@ prompts:
218
187
219
188
The `{% if bot_thinking %}` conditional ensures the prompt works with both reasoning and non-reasoning models. When reasoning is available, the self-check LLM can evaluate both the final response and the reasoning process.
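
To make the two cases concrete, the following sketch mirrors the template conditional in plain Python rather than Jinja2. The function name `build_self_check_prompt` and the `Bot message` line are hypothetical and are not taken from the reference configuration; only the behavior of the conditional is being illustrated.

```python
from typing import Optional


def build_self_check_prompt(bot_response: str, bot_thinking: Optional[str] = None) -> str:
    """Assemble a self-check prompt, adding reasoning only when it exists."""
    lines = [
        "Your task is to check if the bot message complies with company policy.",
        f'Bot message: "{bot_response}"',
    ]
    # Mirrors `{% if bot_thinking %}`: None and "" both skip the reasoning line.
    if bot_thinking:
        lines.append(f'Bot reasoning: "{bot_thinking}"')
    lines.append("Should this be blocked (Yes or No)?")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_self_check_prompt("The answer is 4.", "2 plus 2 equals 4."))
    print(build_self_check_prompt("The answer is 4."))
```

With a non-reasoning model the reasoning line is omitted, so the moderation prompt stays well-formed in both cases.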
220
189
221
-
**Explore the complete implementation**: You can find the full working configuration in [examples/configs/self_check_thinking/](https://github.com/NVIDIA/NeMo-Guardrails/tree/bc799fbb05e1f12f1ba79461f7f7378b3af50c22/examples/configs/self_check_thinking) with all files (config.yml, prompts.yml) ready to use as a reference for your own implementation.
222
-
223
-
## Managing Reasoning in API Responses
224
-
225
-
### Breaking Change in v0.18.0
226
-
227
-
Prior to v0.18.0, reasoning traces were prepended directly to the response content. Starting from v0.18.0, reasoning is handled separately:
228
-
229
-
### With `GenerationOptions` (Structured Access)
230
-
231
-
When using the Python API with `GenerationOptions`, reasoning is available in the separate `reasoning_content` field:
232
-
233
-
```python
234
-
from nemoguardrails import RailsConfig, LLMRails
235
-
236
-
config = RailsConfig.from_path("./config")
237
-
rails = LLMRails(config)
238
-
239
-
result = rails.generate_async(
240
-
    messages=[{"role": "user", "content": "What is 2+2?"}]
241
-
)
242
-
243
-
if result.reasoning_content:
244
-
    print("Reasoning:", result.reasoning_content)
245
-
246
-
print("Response:", result.response[0]["content"])
247
-
```
248
-
249
-
This is the recommended approach as it provides clean separation between reasoning and response content.
250
-
251
-
### Without `GenerationOptions` (Tagged String)
252
-
253
-
When calling without `GenerationOptions` (e.g., via dict/string response), reasoning is wrapped in `<think>` tags:
254
-
255
-
```python
256
-
response = rails.generate(
257
-
    messages=[{"role": "user", "content": "What is 2+2?"}]
258
-
)
259
-
260
-
print(response["content"])
261
-
```
262
-
263
-
Output:
264
-
265
-
```
266
-
<think>Let me calculate: 2 plus 2 equals 4.</think>
267
-
The answer is 4.
268
-
```
269
-
270
-
This maintains backward compatibility while still separating reasoning from the actual response.
271
-
272
-
## Complete Working Example
273
-
274
-
This example combines multiple patterns for demonstration purposes. Use it as a reference to understand how different access methods can work together, but adapt each component to your specific needs.
275
-
276
-
### Directory Structure
277
-
278
-
```
279
-
config/
280
-
├── config.yml
281
-
├── prompts.yml
282
-
├── actions.py
283
-
└── rails/
284
-
└── reasoning_checks.co
285
-
```
286
-
287
-
### Model and Rails Configuration
288
-
289
-
**config.yml**:
290
-
291
-
```yaml
292
-
models:
293
-
- type: main
294
-
engine: <your_engine>
295
-
model: <your_reasoning_model>
296
-
- type: self_check_output
297
-
model: <your_moderation_model>
298
-
engine: <your_engine>
299
-
300
-
rails:
301
-
output:
302
-
flows:
303
-
- self check output
304
-
- block_sensitive_reasoning
305
-
- custom_reasoning_check
306
-
```
307
-
308
-
### Self-Check Output Prompt
309
-
310
-
**prompts.yml**:
311
-
312
-
```yaml
313
-
prompts:
314
-
- task: self_check_output
315
-
content: |
316
-
Your task is to check if the bot message complies with company policy.
Always check if reasoning exists before using it, as not all models provide reasoning traces:
384
-
385
-
```colang
386
-
if $bot_thinking
387
-
# Your logic here
388
-
```
389
-
390
-
```python
391
-
bot_thinking = context.get("bot_thinking")
392
-
if bot_thinking:
393
-
# Your logic here
394
-
```
395
-
396
-
```yaml
397
-
{% if bot_thinking %}
398
-
Bot reasoning: "{{ bot_thinking }}"
399
-
{% endif %}
400
-
```
401
-
402
-
### When to Use Rails vs Custom Actions
403
-
404
-
- **Use Colang Rails** for simple pattern matching and logic directly in your flow definitions
405
-
- **Use Custom Actions** for complex logic, external API calls, or when you need reusable validation logic across multiple flows
406
-
407
-
### Privacy Considerations
408
-
409
-
Reasoning traces may contain sensitive information about the model's decision-making process. Consider:
410
-
411
-
- Filtering or redacting sensitive content before logging reasoning traces
412
-
- Applying guardrails to prevent reasoning from revealing proprietary information
413
-
- Being mindful of what reasoning content is included in API responses
414
-
415
-
### Model Configuration
416
-
417
-
Ensure you're using a reasoning-capable model as your main model:
418
-
419
-
```yaml
420
-
models:
421
-
- type: main
422
-
engine: <your_engine>
423
-
model: <your_reasoning_model>
424
-
```
190
+
**Explore the complete implementation**: You can find the full working configuration in [examples/configs/self_check_thinking/](https://github.com/NVIDIA/NeMo-Guardrails/tree/bc799fbb05e1f12f1ba79461f7f7378b3af50c22/examples/configs/self_check_thinking) with all files ready to use as a reference for your own implementation.
425
191
426
192
## See Also
427
193
194
+
- [LLM Configuration - Using LLMs with Reasoning Traces](../configuration-guide/llm-configuration.md#using-llms-with-reasoning-traces) - API response handling and breaking changes
428
195
- [Output Rails](../../getting-started/5-output-rails/README.md) - General guide on output rails
429
-
- [Generation Options](./generation-options.md) - Using generation options for structured API responses
430
196
- [Self-Check Output Example](https://github.com/NVIDIA/NeMo-Guardrails/tree/bc799fbb05e1f12f1ba79461f7f7378b3af50c22/examples/configs/self_check_thinking) - Complete working configuration
431
197
- [Custom Actions](../../colang-language-syntax-guide.md#actions) - Guide on writing custom actions