Config file: `examples/conditional-routing.json`

Route requests based on input characteristics using system variables and conditions.

                        ┌─────────────────────┐
                        │ Short Input         │──────────────────────┐
                        │ Handler             │                      │
    ┌────────────┐     ╱└─────────────────────┘                      ▼
    │   Router   │────<                                         ┌────────┐
    └────────────┘     ╲┌─────────────────────┐     ┌────────┐  │ Output │
                        │ Long Input          │────▶│Refiner │─▶│        │
                        │ Handler             │     └────────┘  └────────┘
                        └─────────────────────┘
       Layer 0                  Layer 1              Layer 2      Output

Conditions:
- Short: $WORD_COUNT < 10 → Direct to output
- Long: $WORD_COUNT >= 10 → Through refiner
Key Feature: The `if` field uses system variables to conditionally activate nodes.
llmnet automatically tracks these variables for each request:
| Variable | Description | Example |
|---|---|---|
| `$INITIAL_INPUT` | Original user prompt | "Hello world" |
| `$CURRENT_INPUT` | Current content (may change) | "Hello world" |
| `$PREV_NODE` | Name of previous node | "router" |
| `$PREV_LAYER` | Layer number of previous node | "0" |
| `$CURRENT_LAYER` | Current layer being evaluated | "1" |
| `$HOP_COUNT` | Number of hops so far | "2" |
| `$TIMESTAMP` | Unix timestamp | "1703612400" |
| `$REQUEST_ID` | Unique request UUID | "a1b2c3..." |
| `$ROUTE_DECISION` | Last routing decision | "technical-handler" |
| `$INPUT_LENGTH` | Character count | "150" |
| `$WORD_COUNT` | Word count | "25" |
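
To make the table concrete, here is a minimal Python sketch of how the input-derived variables could be computed for a fresh request. It is not llmnet's implementation: `initial_variables` is a hypothetical helper, and it assumes words are whitespace-delimited, length is counted in characters, and every value is stored as a string (as the quoted examples in the table suggest).

```python
import time
import uuid


def initial_variables(prompt: str) -> dict[str, str]:
    """Build per-request variables for a brand-new request (hypothetical helper)."""
    return {
        "INITIAL_INPUT": prompt,
        "CURRENT_INPUT": prompt,                 # same as the prompt until a node changes it
        "HOP_COUNT": "0",
        "TIMESTAMP": str(int(time.time())),      # Unix seconds
        "REQUEST_ID": str(uuid.uuid4()),
        "INPUT_LENGTH": str(len(prompt)),        # assumption: characters, not bytes
        "WORD_COUNT": str(len(prompt.split())),  # assumption: whitespace-delimited words
    }


variables = initial_variables("Hello world")
print(variables["WORD_COUNT"], variables["INPUT_LENGTH"])  # 2 11
```
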
| Operator | Example | Description |
|---|---|---|
| Existence | `$PREV_NODE` | True if variable exists and non-empty |
| `==` | `$PREV_NODE == "router"` | String equality |
| `!=` | `$PREV_NODE != "router"` | String inequality |
| `>` | `$WORD_COUNT > 10` | Numeric greater than |
| `<` | `$WORD_COUNT < 10` | Numeric less than |
| `>=` | `$HOP_COUNT >= 2` | Numeric greater or equal |
| `<=` | `$INPUT_LENGTH <= 100` | Numeric less or equal |
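
The operator semantics in the table can be illustrated with a small Python sketch. This is an assumption-laden model, not llmnet's parser: `evaluate` is a hypothetical function, it handles only single `$VAR op value` conditions, and it assumes a numeric comparison on a non-numeric value is simply false.

```python
import re

# ">=" and "<=" are listed before ">" and "<" so they are matched whole.
_CONDITION = re.compile(r'^\s*\$(\w+)\s*(?:(==|!=|>=|<=|>|<)\s*(.+?))?\s*$')


def evaluate(condition: str, variables: dict[str, str]) -> bool:
    """Evaluate one condition against a dict of variable values (names without '$')."""
    match = _CONDITION.match(condition)
    if match is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    name, op, raw = match.groups()
    value = variables.get(name, "")
    if op is None:                      # existence test: "$PREV_NODE"
        return value != ""
    operand = raw.strip('"')
    if op == "==":                      # string equality
        return value == operand
    if op == "!=":                      # string inequality
        return value != operand
    try:
        left, right = float(value), float(operand)
    except ValueError:
        return False                    # assumption: non-numeric never satisfies a numeric test
    return {">": left > right, "<": left < right,
            ">=": left >= right, "<=": left <= right}[op]


variables = {"PREV_NODE": "router", "WORD_COUNT": "25", "HOP_COUNT": "2"}
print(evaluate('$PREV_NODE == "router"', variables))  # True
print(evaluate("$WORD_COUNT < 10", variables))        # False
print(evaluate("$ROUTE_DECISION", variables))         # False (variable not set)
```
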
{
"architecture": [
{
"name": "router",
"layer": 0,
"output-to": [1]
},
{
"name": "short-input-handler",
"layer": 1,
"if": "$WORD_COUNT < 10",
"use-case": "Quick responses for short inputs",
"output-to": ["output"]
},
{
"name": "long-input-handler",
"layer": 1,
"if": "$WORD_COUNT >= 10",
"use-case": "Detailed responses for longer inputs",
"output-to": [2]
},
{
"name": "response-refiner",
"layer": 2,
"output-to": ["output"]
},
{ "name": "output", "adapter": "output" }
]
}

Scenario: Use simple model for simple queries, powerful model for complex ones.
{
"name": "simple-handler",
"if": "$WORD_COUNT < 20",
"extra-options": { "model_override": "gemma-7b" }
},
{
"name": "complex-handler",
"if": "$WORD_COUNT >= 20",
"extra-options": { "model_override": "llama-70b" }
}

Cost savings: Short queries (greetings, simple questions) use cheap, fast models. Complex queries get full power.
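
Because the two conditions split inputs at exactly 20 words, every request activates exactly one of the two nodes. A rough Python sketch of the resulting model choice follows; `model_for` is a hypothetical helper, and whitespace-delimited word counting is an assumption.

```python
def model_for(prompt: str) -> str:
    """Return the model_override implied by the two conditions in this scenario."""
    word_count = len(prompt.split())  # assumption: whitespace-delimited words
    if word_count < 20:               # simple-handler: "$WORD_COUNT < 20"
        return "gemma-7b"
    return "llama-70b"                # complex-handler: "$WORD_COUNT >= 20"


print(model_for("Hi there"))                # gemma-7b
print(model_for(" ".join(["word"] * 20)))   # llama-70b
```
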
Scenario: Different refiners based on which handler processed the request.
{
"name": "code-refiner",
"layer": 2,
"if": "$PREV_NODE == \"code-handler\"",
"use-case": "Format code blocks, add syntax highlighting"
},
{
"name": "prose-refiner",
"layer": 2,
"if": "$PREV_NODE == \"writing-handler\"",
"use-case": "Polish prose, check grammar"
}

Why it works: Code needs different post-processing than prose.
Scenario: Prevent infinite loops in complex pipelines.
{
"name": "recursive-processor",
"layer": 1,
"if": "$HOP_COUNT < 5",
"output-to": [1, 2] // Can loop back to layer 1
},
{
"name": "final-processor",
"layer": 1,
"if": "$HOP_COUNT >= 5",
"output-to": ["output"] // Forces exit after 5 hops
}

Scenario: Different handling during peak vs off-peak hours.
{
"name": "peak-hours-handler",
"if": "$TIMESTAMP > 1703602800", // After 2023-12-26 15:00 UTC (9 AM at UTC-6); a fixed instant, not a daily window
"extra-options": { "model_override": "fast-model" }
},
{
"name": "off-peak-handler",
"if": "$TIMESTAMP <= 1703602800",
"extra-options": { "model_override": "quality-model" }
}

Scenario: Only summarize long responses.
{
"name": "summarizer",
"layer": 2,
"if": "$INPUT_LENGTH > 1000",
"use-case": "Summarize the response to key points"
},
{
"name": "passthrough",
"layer": 2,
"if": "$INPUT_LENGTH <= 1000",
"output-to": ["output"]
}

Scenario: Different processing for initial response vs subsequent refinements.
{
"name": "initial-processor",
"layer": 1,
"if": "$HOP_COUNT == \"0\"",
"use-case": "Generate initial response"
},
{
"name": "refinement-processor",
"layer": 1,
"if": "$HOP_COUNT > 0",
"use-case": "Refine previous response"
}

Scenario: Only send WebSocket notifications for certain conditions.
{
"name": "ws-alert",
"if": "$WORD_COUNT > 50",
"adapter": "ws",
"url": "ws://alerts:3000"
}

Long responses trigger WebSocket alerts; short ones don't.
Scenario: Route a percentage of traffic to experimental handlers.
{
"name": "control-handler",
"if": "$REQUEST_ID < \"8\"", // ~50% of UUIDs start with 0-7
"use-case": "Control group: existing model"
},
{
"name": "experiment-handler",
"if": "$REQUEST_ID >= \"8\"", // ~50% start with 8-f
"use-case": "Experiment: new model"
}

If no conditions match, all nodes in the layer are considered valid targets. This prevents pipelines from getting stuck:
// If input has exactly 10 words, neither condition matches
{
"name": "handler-a",
"if": "$WORD_COUNT < 10" // False for 10 words
},
{
"name": "handler-b",
"if": "$WORD_COUNT > 10" // False for 10 words
}
// Result: Both handlers become available, router chooses

// Good: No overlap
"if": "$WORD_COUNT < 10"
"if": "$WORD_COUNT >= 10"
// Bad: Overlap at 10
"if": "$WORD_COUNT <= 10"
"if": "$WORD_COUNT >= 10"

{
"name": "specialized-handler",
"if": "$WORD_COUNT > 50"
},
{
"name": "default-handler" // No condition = always matches
}

// Good
"name": "short-query-handler"
"name": "complex-query-handler"
// Bad
"name": "handler-1"
"name": "handler-2"

# Short input (< 10 words) - should use short-input-handler
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"llmnet","messages":[{"role":"user","content":"Hello"}]}'
# Long input (>= 10 words) - should use long-input-handler + refiner
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"llmnet","messages":[{"role":"user","content":"Please explain in detail how machine learning algorithms work and provide examples of common use cases"}]}'

Conditions work alongside LLM-based routing:
- Conditions filter available targets based on variables
- Router chooses among filtered targets using LLM intelligence
{
"name": "premium-handler",
"if": "$INPUT_LENGTH > 500", // Only available for long inputs
"use-case": "Complex analysis requiring premium resources"
},
{
"name": "standard-handler",
"use-case": "Standard queries" // Always available
}

For a 600-character input:
- Both handlers available (premium passes condition, standard has no condition)
- Router uses LLM to choose based on use-case descriptions
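
The filtering step, including the fallback when nothing matches, can be sketched in a few lines of Python. This is an illustrative model rather than llmnet's code: `available_targets` and the stand-in `passes` predicate are hypothetical, and the LLM's final choice among the surviving targets is not modelled.

```python
from typing import Callable

Node = dict[str, str]


def available_targets(nodes: list[Node], passes: Callable[[str], bool]) -> list[Node]:
    """Keep nodes whose "if" passes, plus nodes that have no "if" at all.

    If no node survives, fall back to every node in the layer.
    """
    kept = [n for n in nodes if "if" not in n or passes(n["if"])]
    return kept or list(nodes)


layer = [
    {"name": "premium-handler", "if": "$INPUT_LENGTH > 500"},
    {"name": "standard-handler"},  # no condition: always available
]


def passes(condition: str) -> bool:
    # Stand-in evaluator for a 600-character input; it only knows this one condition.
    return condition == "$INPUT_LENGTH > 500" and 600 > 500


print([n["name"] for n in available_targets(layer, passes)])
# ['premium-handler', 'standard-handler']

# Fallback: at exactly 10 words neither condition holds, so both nodes stay available.
gap = [
    {"name": "handler-a", "if": "$WORD_COUNT < 10"},
    {"name": "handler-b", "if": "$WORD_COUNT > 10"},
]
print([n["name"] for n in available_targets(gap, lambda _cond: False)])
# ['handler-a', 'handler-b']
```
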
Next step: Nemotron Router for advanced LLM-based routing.