Generative API Router


A production-ready Go microservice that provides a unified OpenAI-compatible API for multiple LLM vendors (OpenAI, Gemini). This transparent proxy router simplifies AI integration by offering a single interface while intelligently distributing requests across multiple vendors and preserving your original model names in responses.

🏗️ Architecture Overview

Multi-Vendor OpenAI-Compatible Router

This service acts as a transparent proxy that provides a unified OpenAI-compatible API interface while routing requests to multiple LLM vendors behind the scenes:

  • OpenAI API Compatibility: All vendors accessed through OpenAI-compatible endpoints
  • Transparent Model Handling: Preserves your original model names in responses
  • Multi-Vendor Design: Currently supports 19 credentials (18 Gemini + 1 OpenAI) with 2 models
  • Context-Aware Selection: Intelligent selection across 38 vendor-credential-model combinations based on payload requirements

Enterprise-Grade Features (2024)

Recent comprehensive improvements include:

  • 🔒 Security: AES-GCM encryption for credentials, sensitive data masking in logs (see the sketch after this list)
  • 🔄 Reliability: Exponential backoff retry logic, circuit breaker pattern implementation
  • 📊 Monitoring: Comprehensive health checks with vendor connectivity monitoring
  • ⚡ Performance: Production-optimized logging, conditional detail levels
  • 🧹 Code Quality: DRY principles, centralized utilities, eliminated code duplication
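
As an illustration of the credential-encryption approach, here is a minimal AES-GCM sketch using Go's standard library. The encrypt helper, key source, and ciphertext layout are assumptions for illustration, not the service's actual implementation:

package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
	"io"
)

// encrypt seals plaintext with AES-GCM and prepends the random nonce,
// so decryption can recover it; key must be 16, 24, or 32 bytes.
func encrypt(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

func main() {
	key := make([]byte, 32) // demo key only; real keys come from secure storage
	out, err := encrypt(key, []byte("sk-your-openai-key"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%x\n", out)
}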

Features

  • Multi-Vendor Support: Routes requests to OpenAI or Gemini using OpenAI API compatibility
  • Even Distribution Selection: Fair distribution across all vendor-credential-model combinations
  • Vendor Filtering: Supports explicit vendor selection via ?vendor= query parameter
  • Transparent Proxy: Preserves all original request/response data; only the model field is remapped internally for routing
  • Streaming Support: Properly handles chunked streaming responses for real-time applications
  • Tool Calling: Supports function calling/tools for AI agents with proper validation
  • Enterprise Reliability: Circuit breakers, retry logic, comprehensive health monitoring
  • Security: Encrypted credential storage, sensitive data masking
  • Modular Design: Clean separation of concerns with selector, validator, and client components
  • Configuration Driven: Easily configure available models and credentials via JSON files
  • Health Check: Built-in health check endpoint with service status monitoring
  • Comprehensive Testing: Full test coverage with unit tests for all components
  • 🌐 Public Image URL Support: Automatic downloading and conversion of public image URLs to base64
  • 📄 Advanced File Processing: Comprehensive document processing supporting PDF, Word, Excel, PowerPoint, ZIP archives, and more via markitdown integration

Quick Start

Prerequisites

  • Go 1.21 or higher
  • API keys for OpenAI and/or Google Gemini
  • Make (for build automation)
  • Python 3.8+ with markitdown for file processing (automatically installed via setup)

Installation

  1. Clone the Repository:

    git clone https://github.com/aashari/go-generative-api-router.git
    cd go-generative-api-router
  2. Setup Environment:

    make setup

    This will:

    • Download Go dependencies
    • Install development tools
    • Create configs/credentials.json from the example template
  3. Verify Configuration:

    # Check the existing configuration (credentials may already be set up)
    cat configs/credentials.json | jq length && echo "credentials configured"
    cat configs/models.json | jq length && echo "models configured"
  4. Configure Credentials (if needed): Edit configs/credentials.json with your API keys:

    [
      {
        "platform": "openai",
        "type": "api-key",
        "value": "sk-your-openai-key"
      },
      {
        "platform": "gemini",
        "type": "api-key",
        "value": "your-gemini-key"
      }
    ]
  5. Configure Models (if needed): Edit configs/models.json to define which vendor-model pairs can be selected:

    [
      {
        "vendor": "gemini",
        "model": "gemini-2.0-flash"
      },
      {
        "vendor": "openai",
        "model": "gpt-4o"
      }
    ]
  6. Run the Service:

    make run

    The service will be available at http://localhost:8082

Selection Strategy

The router uses an Even Distribution Selector that ensures fair distribution across all vendor-credential-model combinations. This approach provides true fairness where each combination has exactly equal probability of being selected.

How It Works

  1. Combination Generation: The system creates a flat list of all valid vendor-credential-model combinations
  2. Equal Probability: Each combination gets exactly 1/N probability where N = total combinations
  3. Fair Distribution: Unlike traditional two-stage selection (vendor → model), this ensures no bias toward vendors with fewer models
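
A minimal sketch of the idea, with hypothetical Combination and flatten helpers (the real selector lives in the service's internal packages): flatten every vendor-credential-model triple into one list and pick uniformly at random.

package main

import (
	"fmt"
	"math/rand"
)

// Combination is one selectable vendor-credential-model triple.
type Combination struct {
	Vendor     string
	Credential string // identifier only, never the key itself
	Model      string
}

// flatten builds the full combination list; each entry later gets
// probability exactly 1/len(combos), regardless of vendor size.
func flatten(creds, models map[string][]string) []Combination {
	var combos []Combination
	for vendor, vendorCreds := range creds {
		for _, c := range vendorCreds {
			for _, m := range models[vendor] {
				combos = append(combos, Combination{vendor, c, m})
			}
		}
	}
	return combos
}

func main() {
	creds := map[string][]string{"gemini": {"g1", "g2"}, "openai": {"o1"}}
	models := map[string][]string{"gemini": {"gemini-2.0-flash"}, "openai": {"gpt-4o"}}
	combos := flatten(creds, models)
	pick := combos[rand.Intn(len(combos))] // uniform over all combinations
	fmt.Printf("selected %s/%s via credential %s (from %d combinations)\n",
		pick.Vendor, pick.Model, pick.Credential, len(combos))
}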

Example Distribution

With the current configuration:

  • 18 Gemini credentials × 2 models = 36 combinations
  • 1 OpenAI credential × 2 models = 2 combinations
  • Total: 38 combinations

The context-aware selector intelligently filters combinations based on:

  • Image/Video Support: Routes vision requests to capable models
  • Tool Support: Routes function calling to compatible models
  • Streaming Support: Ensures streaming requests go to streaming-capable models

This reflects the actual resource availability rather than artificial vendor-level balancing.
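
Reusing the Combination type from the sketch above, capability filtering can be approximated as follows; the capability map and flag names are assumptions for illustration:

// filterByCapabilities keeps only combinations whose model supports what
// the payload needs; unknown models default to no capabilities and are
// dropped whenever a capability is required.
func filterByCapabilities(
	combos []Combination,
	needsVision, needsTools, needsStreaming bool,
	caps map[string]struct{ Vision, Tools, Streaming bool },
) []Combination {
	var out []Combination
	for _, c := range combos {
		info := caps[c.Model]
		if needsVision && !info.Vision {
			continue // cannot handle image/video content
		}
		if needsTools && !info.Tools {
			continue // no function-calling support
		}
		if needsStreaming && !info.Streaming {
			continue // cannot stream responses
		}
		out = append(out, c)
	}
	return out
}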

Benefits

  • True Fairness: Each credential-model combination has exactly equal probability
  • Resource Proportional: Distribution reflects actual available resources
  • Scalable: Automatically adapts as credentials/models are added/removed
  • Transparent: Clear logging shows selection and total combination count
  • No Bias: Eliminates bias toward vendors with fewer models per credential

Monitoring Selection

The service logs each selection decision for transparency:

Context-aware selection - Vendor: gemini, Model: gemini-2.5-flash-preview-04-17 (from 38 total combinations, filtered by capabilities)

Check the server logs to verify that selections are spread across combinations and filtered according to payload requirements.

Usage

Basic API Usage

# Health check
curl http://localhost:8082/health

# List available models
curl http://localhost:8082/v1/models

# Chat completion (any model name)
curl -X POST http://localhost:8082/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-preferred-model",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# Process PDF documents
curl -X POST http://localhost:8082/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "document-analyzer",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "Please summarize this research paper:"
          },
          {
            "type": "file_url",
            "file_url": {
              "url": "https://example.com/research-paper.pdf"
            }
          }
        ]
      }
    ]
  }'

# Force specific vendor
curl -X POST "http://localhost:8082/v1/chat/completions?vendor=openai" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [{"role": "user", "content": "Hello from OpenAI!"}]
  }'
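
The same calls can be made programmatically. Here is a minimal Go sketch against the endpoint shown above (error handling kept deliberately short):

package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	payload := []byte(`{
		"model": "my-preferred-model",
		"messages": [{"role": "user", "content": "Hello!"}]
	}`)
	resp, err := http.Post("http://localhost:8082/v1/chat/completions",
		"application/json", bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status, string(body))
}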

Using Example Scripts

Example scripts are provided for common use cases:

# Basic usage examples
./examples/curl/basic.sh

# Streaming examples
./examples/curl/streaming.sh

# Tool calling examples
./examples/curl/tools.sh
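
For streaming, here is a hedged Go sketch that assumes the router emits OpenAI-style server-sent "data:" lines ending with a [DONE] sentinel, as in the OpenAI streaming format:

package main

import (
	"bufio"
	"bytes"
	"fmt"
	"net/http"
	"strings"
)

func main() {
	payload := []byte(`{"model": "my-model", "stream": true,
		"messages": [{"role": "user", "content": "Hello!"}]}`)
	resp, err := http.Post("http://localhost:8082/v1/chat/completions",
		"application/json", bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		if !strings.HasPrefix(line, "data: ") {
			continue // skip blank separators between events
		}
		data := strings.TrimPrefix(line, "data: ")
		if data == "[DONE]" {
			break // end-of-stream sentinel
		}
		fmt.Println(data) // each line is one JSON chunk
	}
}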

Client Libraries

Example implementations are available for multiple languages:

  • Python: examples/clients/python/client.py
  • Node.js: examples/clients/nodejs/client.js
  • Go: examples/clients/go/client.go

Docker Deployment

Build and run using Docker:

# Build and run with Docker Compose
make docker-build
make docker-run

# Stop the service
make docker-stop

Testing

Multi-Vendor Testing

IMPORTANT: This is a multi-vendor service. Always test both vendors:

# Test OpenAI vendor
curl -X POST "http://localhost:8082/v1/chat/completions?vendor=openai" \
  -H "Content-Type: application/json" \
  -d '{"model": "test-openai", "messages": [{"role": "user", "content": "Hello"}]}'

# Test Gemini vendor  
curl -X POST "http://localhost:8082/v1/chat/completions?vendor=gemini" \
  -H "Content-Type: application/json" \
  -d '{"model": "test-gemini", "messages": [{"role": "user", "content": "Hello"}]}'

# Monitor vendor distribution
grep "Context-aware selection" logs/server.log | tail -5

Development Testing

# Run all tests
make test

# Run with coverage
make test-coverage

# Full CI check
make ci-check

Documentation

📚 Complete Documentation

🔧 Development Guides

📖 Cursor AI Context

For Cursor AI development, see the comprehensive guides in .cursor/rules/.

Architecture

Core Components

  • Proxy Handler: Routes requests to selected vendors, handles streaming/non-streaming responses
  • Vendor Selector: Implements even distribution selection across vendor-credential-model combinations
  • Request Validator: Validates OpenAI-compatible requests, preserves original model names
  • Response Processor: Processes vendor responses while maintaining model name transparency
  • Health Monitor: Comprehensive health checks with vendor connectivity monitoring
  • Circuit Breaker: Reliability pattern implementation for vendor communication
  • Retry Logic: Exponential backoff for failed vendor requests
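
A minimal sketch of the retry pattern described above; the attempt count and base delay here are illustrative, not the service's actual configuration:

package main

import (
	"errors"
	"fmt"
	"time"
)

// withRetry retries call with exponential backoff: base, 2*base, 4*base, ...
func withRetry(attempts int, base time.Duration, call func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = call(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(base << i) // double the delay after each failure
		}
	}
	return fmt.Errorf("all %d attempts failed: %w", attempts, err)
}

func main() {
	err := withRetry(3, 100*time.Millisecond, func() error {
		return errors.New("vendor unavailable") // stand-in for a vendor call
	})
	fmt.Println(err)
}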

Key Principles

  1. Transparent Proxy: Original model names preserved in responses
  2. Vendor Agnostic: Unified interface regardless of backend vendor
  3. Fair Distribution: Even probability across all vendor-model combinations
  4. OpenAI Compatibility: 100% compatible with OpenAI API format
  5. Enterprise Reliability: Circuit breakers, retries, comprehensive monitoring

Production Deployment

The service is production-ready with:

  • AWS ECS Deployment: Containerized deployment on AWS
  • Load Balancing: High availability with load balancer integration
  • Monitoring: CloudWatch integration and comprehensive logging
  • Security: Encrypted credentials, sensitive data masking
  • Reliability: Circuit breakers, retry logic, health monitoring

See Deployment Guide for complete deployment instructions.

Contributing

We welcome contributions! Please see our Contributing Guide for details on:

  • Development setup and workflow
  • Code standards and review process
  • Testing requirements
  • Pull request guidelines

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support


Need help? Check the documentation or open an issue on GitHub.
