Description
Is your feature request related to a problem? Please describe.
Currently, vLLM Semantic Router provides intelligent routing capabilities, including BERT-based classification, semantic caching, PII detection, and tool selection, through OpenAI-compatible APIs. However, users must interact with these features through curl commands or direct API calls, which creates barriers for:
- Non-technical users who want to benefit from semantic routing
- Teams that need to visualize routing decisions and model selections
- Administrators who want to configure routing rules through a GUI
- Users who need to monitor cache hit rates and performance metrics
- Organizations that want to manage multiple model endpoints and their weights easily
The lack of a modern web interface limits adoption and makes it difficult to fully leverage the semantic router's capabilities.
Describe the solution you'd like
I would like to integrate vLLM Semantic Router with OpenWebUI to provide a comprehensive web-based interface for its semantic routing capabilities. The solution should include:
Core Integration Features:
- Configure OpenWebUI to use vLLM Semantic Router as the backend API endpoint
- Enable model discovery through the /v1/models endpoint, including the special "auto" model
- Support all existing OpenAI-compatible API functionality through the web interface
- Docker Compose setup for easy deployment of the integrated stack
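As a starting point, a minimal Docker Compose sketch along these lines could wire the two services together. The semantic-router image name and volume layout are placeholders, not the project's published artifacts; OPENAI_API_BASE_URL is OpenWebUI's standard setting for pointing at an OpenAI-compatible backend, and port 8801 matches the existing deployment:

```yaml
services:
  semantic-router:
    # Placeholder image name; substitute the actual vLLM Semantic Router
    # image or a local build context.
    image: vllm-semantic-router:latest
    ports:
      - "8801:8801"          # OpenAI-compatible API (existing port)
    volumes:
      - ./config:/app/config # routing rules, categories, model endpoints

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"          # OpenWebUI serves on 8080 inside the container
    environment:
      # Point OpenWebUI's OpenAI-compatible connection at the router.
      - OPENAI_API_BASE_URL=http://semantic-router:8801/v1
      - OPENAI_API_KEY=not-needed # the router may not enforce a real key
    depends_on:
      - semantic-router
```

With the stack up, OpenWebUI's model picker should populate from the router's /v1/models response, including the special "auto" model.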
Enhanced User Interface:
- Visual indicators showing which model was selected for each request (see the sketch after this list)
- Display of classification confidence scores and routing decisions
- Cache hit/miss indicators in the chat interface
- Model selection interface showing routing capabilities and weights
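Some of this visibility may be achievable before any custom UI work: responses follow the OpenAI schema, so the standard model field can expose which backend served a request routed via "auto". A minimal sketch with the official openai Python client follows; the base URL assumes the local deployment above, and whether the router rewrites the model field to the selected backend's name is an assumption worth verifying:

```python
from openai import OpenAI

# Point the standard OpenAI client at the router's local endpoint.
client = OpenAI(base_url="http://localhost:8801/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="auto",  # let the semantic router pick the backend
    messages=[{"role": "user", "content": "Write a binary search in Python."}],
)

# The OpenAI response schema carries a `model` field; if the router
# propagates the chosen backend's name here (an assumption to verify),
# the UI can surface it as the routing decision.
print("Routed to:", resp.model)
print(resp.choices[0].message.content)
```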
Configuration and Management:
- Web-based GUI for editing semantic routing rules and intent categories (see the config sketch after this list)
- Threshold adjustment controls for classification confidence
- Model weight and priority management interface
- Filter configuration for PII detection and prompt guard settings
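For illustration, the GUI could read and write a declarative config along these lines. This schema and the model names in it are hypothetical and do not reflect the router's actual configuration format:

```yaml
# Hypothetical schema for illustration only; field names do not
# reflect the router's real config format.
categories:
  - name: code-generation
    threshold: 0.75          # minimum classification confidence
    models:
      - name: coder-model-a  # placeholder model names
        weight: 0.7
      - name: coder-model-b
        weight: 0.3
filters:
  pii_detection:
    enabled: true
    action: block
  prompt_guard:
    enabled: true
```

A web editor over a file like this would cover routing rules, confidence thresholds, model weights, and filter toggles in one place.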
Monitoring and Analytics:
- Dashboard widgets showing cache hit rates and performance metrics (see the scrape config after this list)
- Model usage statistics and response time analytics
- Classification accuracy metrics and routing decision logs
- Real-time health monitoring of model endpoints
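If the router exposes Prometheus-style metrics (an assumption; the port and path below are placeholders), the dashboard could be fed by a standard scrape job:

```yaml
scrape_configs:
  - job_name: semantic-router
    # Placeholder target and path; confirm the router's actual
    # metrics endpoint before use.
    metrics_path: /metrics
    static_configs:
      - targets: ["semantic-router:9090"]
```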
Additional context
Existing Foundation:
- vLLM Semantic Router already provides OpenAI-compatible endpoints at port 8801
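This means any OpenAI client can already exercise the integration point. For example, with the official openai Python client, assuming a local deployment and that no real API key is enforced:

```python
from openai import OpenAI

# Port 8801 per the existing deployment.
client = OpenAI(base_url="http://localhost:8801/v1", api_key="not-needed")

# List available models; the response should include the special "auto" model.
for model in client.models.list():
    print(model.id)
```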
Expected Benefits:
- Lower barrier to entry for non-technical users
- Visual feedback on routing decisions and performance optimization
- Centralized management of multiple model endpoints
- Cost optimization through visible cache efficiency metrics
- Enhanced security through integrated PII detection and prompt guard features
Implementation Phases:
1. Basic Integration: OpenWebUI configuration, Docker setup, model discovery
2. Enhanced Features: Routing visualization, configuration interface, metrics dashboard
3. Advanced Integration: Custom UI components, admin panel, advanced monitoring