Implementation showcase demonstrating five different transport protocols for building real-time AI chat applications using Google's Gemini model. This repository explores the performance, complexity, and use-case differences between various communication protocols for streaming AI responses.
For a comprehensive technical guide covering the engineering decisions, performance comparisons, and implementation details of these transport protocols, check out the supporting Medium article: Engineering Transport Layers for GenAI: REST, WebSockets, gRPC and Beyond
This project implements the same multi-turn chat functionality using five different transport protocols, allowing developers to compare and choose the best approach for their specific needs:
- REST HTTP - Traditional request/response pattern
- Streamable HTTP - HTTP chunked transfer encoding
- Server-Sent Events (SSE) - Real-time event streaming
- WebSockets - Full bidirectional communication
- gRPC - High-performance binary protocol
pip install -r requirements.txt
export GENAI_MODEL_ID="gemini-2.0-flash" # Optional, defaults to gemini-2.0-flash
Execute these commands from the project root:
export PYTHONDONTWRITEBYTECODE=1 PYTHONPATH=$PYTHONPATH:.
Important: These environment variables must be set from the root directory of the project to ensure proper module imports and clean Python execution across all protocol implementations.
Choose your protocol and run both server and client from the project root:
# Example: REST HTTP implementation
python protocols/http_rest/server.py # Start server (Terminal 1)
python protocols/http_rest/client.py # Start client (Terminal 2)
# Example: WebSocket implementation
python protocols/websocket/server.py # Start server (Terminal 1)
python protocols/websocket/client.py # Start client (Terminal 2)
# Example: gRPC implementation (requires setup first)
python protocols/grpc/setup.py # Generate Protocol Buffers
python protocols/grpc/server.py # Start server (Terminal 1)
python protocols/grpc/client.py # Start client (Terminal 2)
All clients support the same command set:
| Command | Action |
|---|---|
| `/help` | Show available commands |
| `/new` | Create new chat session |
| `/sessions` | List all active sessions |
| `/info` | Show current session details |
| `/stats` | Display client statistics |
| `/server` | Show server statistics |
| `/health` | Check server health |
| `/docs` | Open API documentation (web protocols) |
| `/demo` | Open interactive demo (where available) |
| `/quit` | Exit client |
All implementations share the same core components:
- Multi-turn Context: Persistent conversation history across all protocols
- Session Management: Independent conversations with UUID-based sessions
- Google Gemini Integration: Consistent AI model across all implementations
- Interactive Clients: Rich CLI clients with session management commands
- Performance Monitoring: Detailed statistics and connection tracking
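The shared session model can be sketched roughly like this (class and method names here are illustrative, not the repo's actual code):

```python
import uuid

class SessionStore:
    """Minimal sketch of UUID-keyed multi-turn sessions (illustrative only)."""

    def __init__(self):
        self.sessions = {}  # session_id -> list of {"role", "content"} turns

    def create(self):
        session_id = str(uuid.uuid4())
        self.sessions[session_id] = []
        return session_id

    def append(self, session_id, role, content):
        # Persist every turn so the model sees the full conversation history
        self.sessions[session_id].append({"role": role, "content": content})

    def history(self, session_id):
        return self.sessions[session_id]

store = SessionStore()
sid = store.create()
store.append(sid, "user", "Hello")
store.append(sid, "model", "Hi! How can I help?")
print(len(store.history(sid)))  # → 2
```

Because every implementation keys conversations by a UUID, the same client commands (`/new`, `/sessions`, `/info`) work identically across all five protocols.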
Traditional request/response pattern for simple chat applications
python protocols/http_rest/server.py # Terminal 1
python protocols/http_rest/client.py # Terminal 2
- ✅ Simple implementation
- ✅ Universal compatibility
- ✅ Easy debugging
- ❌ No real-time streaming
- ❌ Higher latency per message
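The trade-off above follows from the pattern itself: every message is a complete HTTP round trip, so the client blocks until the full response is ready. A stdlib-only toy (the echo handler stands in for the real Gemini-backed endpoint) shows the shape:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

class ChatHandler(BaseHTTPRequestHandler):
    """Toy stand-in for a REST chat endpoint: one full request per message."""

    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        reply = json.dumps({"response": f"echo: {body['message']}"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):  # keep demo output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), ChatHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

req = Request(
    f"http://127.0.0.1:{server.server_port}/chat",
    data=json.dumps({"message": "hi"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urlopen(req) as resp:
    answer = json.loads(resp.read())["response"]
print(answer)  # → echo: hi
server.shutdown()
```

The client sees nothing until the whole body arrives; for long model generations that is the "higher latency per message" cost that the streaming protocols below are designed to avoid.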
HTTP chunked transfer encoding for real-time response streaming
python protocols/streamable_http/server.py # Terminal 1
python protocols/streamable_http/client.py # Terminal 2
- ✅ Real-time streaming responses
- ✅ Standard HTTP compatibility
- ✅ NDJSON protocol
- ✅ Works with any HTTP client
- ❌ One-way communication only
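NDJSON means one JSON object per newline-terminated line, which keeps parsing trivial even though HTTP chunk boundaries need not align with object boundaries. A minimal incremental parser (illustrative; the repo's exact wire format may differ):

```python
import json

def iter_ndjson(chunks):
    """Yield JSON objects from a stream of byte chunks (newline-delimited JSON)."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                yield json.loads(line)

# A chunk boundary can split a JSON object; the buffer handles it
chunks = [b'{"token": "Hel', b'lo"}\n{"token": " world"}\n']
tokens = [obj["token"] for obj in iter_ndjson(chunks)]
print("".join(tokens))  # → Hello world
```

The same buffering logic applies whether the chunks come from `httpx`'s streaming iterator or any other HTTP client, which is why this approach "works with any HTTP client".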
Event-driven real-time streaming with native browser support
python protocols/sse/server.py # Terminal 1
python protocols/sse/client.py # Terminal 2
- ✅ Native browser EventSource API
- ✅ Automatic reconnection
- ✅ Structured event types
- ✅ Built-in error handling
- ❌ One-way communication only
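SSE framing is line-oriented: `event:` and `data:` lines accumulate until a blank line dispatches the event, which is what gives SSE its structured event types. A minimal parser (a sketch of the framing rules, not the repo's client):

```python
def parse_sse(stream_lines):
    """Parse Server-Sent Events framing: a blank line terminates an event."""
    event = {"event": "message", "data": []}
    for line in stream_lines:
        if line == "":  # dispatch the accumulated event
            if event["data"]:
                yield {"event": event["event"], "data": "\n".join(event["data"])}
            event = {"event": "message", "data": []}
        elif line.startswith("event:"):
            event["event"] = line[len("event:"):].strip()
        elif line.startswith("data:"):
            event["data"].append(line[len("data:"):].strip())

lines = ["event: token", "data: Hello", "", "event: done", "data: [end]", ""]
events = list(parse_sse(lines))
print(events[0])  # → {'event': 'token', 'data': 'Hello'}
```

In a browser, the `EventSource` API performs this parsing (plus automatic reconnection) natively, which is SSE's main advantage over raw chunked HTTP.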
Full bidirectional real-time communication
python protocols/websocket/server.py # Terminal 1
python protocols/websocket/client.py # Terminal 2
- ✅ Full bidirectional communication
- ✅ Real-time typing indicators
- ✅ Session broadcasting
- ✅ Persistent connections
- ✅ Interactive web demo
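Session broadcasting is what sets WebSockets apart here: because connections are persistent and bidirectional, the server can push the same event (a new message, a typing indicator) to every client attached to a session. A protocol-agnostic sketch of that fan-out logic (names are illustrative):

```python
class BroadcastHub:
    """Sketch of session broadcasting: push one message to every connection."""

    def __init__(self):
        self.connections = {}  # session_id -> list of send callables

    def join(self, session_id, send):
        self.connections.setdefault(session_id, []).append(send)

    def broadcast(self, session_id, message):
        for send in self.connections.get(session_id, []):
            send(message)  # over a real WebSocket this would be `await ws.send(...)`

received_a, received_b = [], []
hub = BroadcastHub()
hub.join("s1", received_a.append)
hub.join("s1", received_b.append)
hub.broadcast("s1", "typing...")
print(received_a, received_b)  # → ['typing...'] ['typing...']
```

The one-way protocols above cannot do this: with REST, chunked HTTP, or SSE, the server can only answer the client that asked.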
High-performance binary protocol with type safety
python protocols/grpc/setup.py # Generate Protocol Buffers
python protocols/grpc/server.py # Terminal 1
python protocols/grpc/client.py # Terminal 2
- ✅ High performance binary protocol
- ✅ Strong type safety with Protocol Buffers
- ✅ Bidirectional streaming
- ✅ Built-in compression and multiplexing
- ✅ Cross-language compatibility
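The service contract lives in `protocols/grpc/chat.proto`, which `setup.py` compiles into Python stubs. A sketch of what a bidirectional-streaming chat service can look like (message and field names here are illustrative, not necessarily the repo's actual definitions):

```protobuf
syntax = "proto3";

package chat;

// Illustrative only — see protocols/grpc/chat.proto for the real definition.
message ChatMessage {
  string session_id = 1;
  string content = 2;
}

service ChatService {
  // Bidirectional streaming: client and server exchange messages concurrently
  rpc Chat(stream ChatMessage) returns (stream ChatMessage);
}
```

Declaring both sides as `stream` is what enables full-duplex token streaming over a single HTTP/2 connection, with compression and multiplexing handled by the framework.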
The sequence diagrams in `sequence_diagrams/` show how each protocol handles the conversation flow:
- Python 3.8+: Primary development language
- FastAPI: Modern async web framework (REST, SSE, Streamable HTTP, WebSocket)
- Uvicorn: ASGI server for FastAPI applications
- gRPC: High-performance RPC framework
- Google GenAI: Gemini model integration
- Pydantic: Data validation and serialization
- WebSockets: Native Python websockets library
- SSE: sse-starlette for Server-Sent Events
- gRPC: grpcio and grpcio-tools for Protocol Buffers
- HTTP: httpx for async HTTP client operations
genai-transport-protocols/
├── protocols/
│ ├── http_rest/ # Traditional REST API
│ │ ├── server.py
│ │ ├── client.py
│ │ └── README.md
│ ├── streamable_http/ # HTTP chunked encoding
│ │ ├── server.py
│ │ ├── client.py
│ │ └── README.md
│ ├── sse/ # Server-Sent Events
│ │ ├── server.py
│ │ ├── client.py
│ │ └── README.md
│ ├── websocket/ # WebSocket implementation
│ │ ├── server.py
│ │ ├── client.py
│ │ └── README.md
│ └── grpc/ # gRPC implementation
│ ├── server.py
│ ├── client.py
│ ├── setup.py
│ ├── chat.proto
│ └── README.md
├── sequence_diagrams/ # Protocol flow diagrams
│ ├── rest.png
│ ├── streamable_http.png
│ ├── sse.png
│ ├── websockets.png
│ └── grpc.png
├── shared/ # Common utilities
│ ├── io.py # Input/output utilities
│ ├── llm.py # AI model integration
│ ├── logger.py # Logging utilities
│ └── setup.py # Common setup functions
├── requirements.txt # Python dependencies
└── README.md # This file
- Fork the repository
- Create a feature branch
- Add your transport protocol implementation
- Follow the existing patterns for session management and client commands
- Include comprehensive documentation and examples
- Submit a pull request