Commit f09ff3d

feat: Add comprehensive OpenAI Responses API integration with dual-mode support
### Major Features Added
- **OpenAI Responses API Integration**: Full support for latest 2025 Responses API with gpt-4o + image_generation tool
- **Dual API Architecture**: Seamless switching between Responses API (default) and Images API via API_MODE env var
- **Enhanced Image Editing**: Multi-image input, context-aware editing, and previous response ID linking
- **Comprehensive Test Coverage**: 99+ tests with full API coverage including ResponsesImageGenerator and ResponsesAPIAdapter test suites

### Technical Improvements
- **SOLID Architecture**: Complete refactoring following dependency injection patterns
- **Improved Configuration**: Enhanced environment variables and tool configuration system
- **Advanced Conversation Context**: Multi-turn editing with proper metadata handling
- **Better Error Handling**: Robust error handling across both API implementations

### Documentation Updates
- Updated README.md with dual API documentation and comparison table
- Enhanced CHANGELOG.md with detailed feature descriptions
- Added comprehensive API mode examples and configuration guidance

### Version Bump
- Updated package.json to v1.2.0 reflecting major Responses API integration
- Added new keywords: gpt-4o, responses-api, dual-api

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 5750167 commit f09ff3d

179 files changed: +18764 -672 lines changed

.env.example

Lines changed: 6 additions & 1 deletion
@@ -6,4 +6,9 @@ PORT=3000
 NODE_ENV=development
 
 # CORS Configuration (optional)
-CORS_ORIGIN=*
+CORS_ORIGIN=*
+
+# Cache Configuration (optional)
+CACHE_DIR=.cache/images
+CACHE_TTL=3600
+CACHE_MAX_SIZE=100
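
A minimal sketch of how the optional cache variables added here could be read at startup. It assumes `CACHE_TTL` is expressed in seconds and leaves the unit of `CACHE_MAX_SIZE` open, since `.env.example` does not state either; `CacheConfig`, `parsePositiveInt`, and `loadCacheConfig` are hypothetical names, not taken from the repository.

```typescript
// Hypothetical shape; the real project may structure its configuration differently.
interface CacheConfig {
  dir: string;
  ttl: number; // assumed to be seconds
  maxSize: number; // unit is not stated in .env.example
}

// Parse a positive integer, falling back to a default on missing or invalid input.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  const value = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(value) && value > 0 ? value : fallback;
}

// Takes the environment as a plain record so it can be exercised without process.env.
function loadCacheConfig(env: Record<string, string | undefined>): CacheConfig {
  return {
    dir: env["CACHE_DIR"] || ".cache/images",
    ttl: parsePositiveInt(env["CACHE_TTL"], 3600),
    maxSize: parsePositiveInt(env["CACHE_MAX_SIZE"], 100),
  };
}
```

In a Node.js process this would typically be called as `loadCacheConfig(process.env)`; the defaults mirror the values shown in the example file.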

API.md

Lines changed: 2 additions & 0 deletions
@@ -4,6 +4,8 @@
 
 The MCP Server GPT Image-1 provides multiple endpoints for image generation and management. The server supports both MCP protocol communication and direct HTTP/SSE streaming.
 
+> **Note**: The internal architecture follows SOLID principles with dependency injection. For implementation details, see [ARCHITECTURE.md](ARCHITECTURE.md).
+
 ## Base URL
 
 ```

ARCHITECTURE.md

Lines changed: 278 additions & 0 deletions
@@ -0,0 +1,278 @@
# Architecture Documentation

This document describes the architecture of the MCP Server GPT Image-1 project, focusing on the SOLID principles implementation and clean architecture patterns used throughout the codebase.

## Overview

The project follows clean architecture principles with a focus on:
- **Separation of Concerns**: Each component has a single, well-defined responsibility
- **Dependency Inversion**: High-level modules don't depend on low-level modules
- **Testability**: All business logic is easily testable through dependency injection
- **Extensibility**: New features can be added without modifying existing code

## SOLID Principles Implementation

### 1. Single Responsibility Principle (SRP)

Each class has one reason to change:

```typescript
// ❌ Before: Multiple responsibilities
class ImageGeneration {
  generateImage() { /* API calls + caching + optimization */ }
  optimizeImage() { /* optimization logic */ }
  cacheResult() { /* caching logic */ }
}

// ✅ After: Single responsibilities
class ImageGenerator {
  generate() { /* only generation logic */ }
}

class ImageOptimizer {
  optimize() { /* only optimization logic */ }
}

class ImageCache {
  get() { /* only caching logic */ }
  set() { /* only caching logic */ }
}
```

### 2. Open/Closed Principle (OCP)

Classes are open for extension but closed for modification:

```typescript
// Interfaces allow extension without modification
interface IImageGenerator {
  generate(input: ImageGenerationInput): Promise<ImageGenerationResult>;
}

// New implementations can be added without changing existing code
class StandardImageGenerator implements IImageGenerator { }
class StreamingImageGenerator implements IImageGenerator { }
```
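
One way to read "extension without modification" is a decorator that wraps any existing generator. The sketch below assumes minimal, hypothetical shapes for `ImageGenerationInput` and `ImageGenerationResult` (this commit does not show their fields) and introduces an illustrative `RetryingImageGenerator` that is not part of the repository.

```typescript
// Hypothetical minimal shapes; the real types live in src/interfaces/ and are not shown here.
interface ImageGenerationInput { prompt: string; }
interface ImageGenerationResult { images: string[]; }

interface IImageGenerator {
  generate(input: ImageGenerationInput): Promise<ImageGenerationResult>;
}

// Illustrative decorator: adds retry behavior without touching the wrapped class.
class RetryingImageGenerator implements IImageGenerator {
  private readonly inner: IImageGenerator;
  private readonly maxAttempts: number;

  constructor(inner: IImageGenerator, maxAttempts: number = 3) {
    if (maxAttempts < 1) throw new RangeError("maxAttempts must be at least 1");
    this.inner = inner;
    this.maxAttempts = maxAttempts;
  }

  async generate(input: ImageGenerationInput): Promise<ImageGenerationResult> {
    let lastError: unknown;
    for (let attempt = 1; attempt <= this.maxAttempts; attempt++) {
      try {
        return await this.inner.generate(input);
      } catch (error) {
        lastError = error; // remember the failure and try again
      }
    }
    throw lastError; // every attempt failed
  }
}
```

Because the wrapper itself implements `IImageGenerator`, callers that already accept the interface need no changes to pick up the new behavior.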

### 3. Liskov Substitution Principle (LSP)

Derived classes can be substituted for their base classes:

```typescript
// All implementations follow the same contract
const generators: IImageGenerator[] = [
  new StandardImageGenerator(),
  new StreamingImageGenerator()
];

// Any generator can be used interchangeably
generators.forEach(gen => gen.generate(input));
```

### 4. Interface Segregation Principle (ISP)

Clients shouldn't depend on interfaces they don't use:

```typescript
// Segregated interfaces for different concerns
interface IImageCache {
  get(type: string, input: any): Promise<any>;
  set(type: string, input: any, data: any): Promise<void>;
}

interface IImageOptimizer {
  optimizeBatch(images: string[], input: any): Promise<string[]>;
  calculateSizeReduction(original: string, optimized: string): Promise<number>;
}
```

### 5. Dependency Inversion Principle (DIP)

Depend on abstractions, not concretions:

```typescript
// High-level service depends on abstractions
class ImageGenerator {
  constructor(
    private openaiClient: IOpenAIClient,   // interface
    private cache: IImageCache,            // interface
    private optimizer: IImageOptimizer     // interface
  ) {}
}
```
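
To make the injected dependencies concrete, here is a minimal sketch of how `generate()` might orchestrate them: check the cache, call the client on a miss, optimize, then store the result. The `generateImage` signature, the `"generate"` cache type string, and the input/result shapes are assumptions for illustration; only the interface names and the `get`/`set`/`optimizeBatch` signatures come from this document.

```typescript
// Assumed shapes for illustration only.
interface ImageGenerationInput { prompt: string; }
interface ImageGenerationResult { images: string[]; fromCache: boolean; }

interface IOpenAIClient {
  generateImage(input: ImageGenerationInput): Promise<string[]>; // assumed signature
}
interface IImageCache {
  get(type: string, input: any): Promise<any>;
  set(type: string, input: any, data: any): Promise<void>;
}
// Trimmed to the one method this sketch uses.
interface IImageOptimizer {
  optimizeBatch(images: string[], input: any): Promise<string[]>;
}

class ImageGenerator {
  private readonly client: IOpenAIClient;
  private readonly cache: IImageCache;
  private readonly optimizer: IImageOptimizer;

  constructor(client: IOpenAIClient, cache: IImageCache, optimizer: IImageOptimizer) {
    this.client = client;
    this.cache = cache;
    this.optimizer = optimizer;
  }

  async generate(input: ImageGenerationInput): Promise<ImageGenerationResult> {
    // 1. Return a cached result if one exists.
    const cached = await this.cache.get("generate", input);
    if (cached) return { images: cached as string[], fromCache: true };

    // 2. Otherwise call the API, optimize, and cache the optimized images.
    const raw = await this.client.generateImage(input);
    const images = await this.optimizer.optimizeBatch(raw, input);
    await this.cache.set("generate", input, images);
    return { images, fromCache: false };
  }
}
```

The class never names a concrete cache, optimizer, or SDK client, so each can be replaced by a stub in tests or by a different implementation in production.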

## Project Structure

```
src/
├── interfaces/           # Abstract contracts (DIP)
│   └── image-generation.interface.ts

├── services/             # Core business logic (SRP)
│   ├── image-generator.ts
│   ├── streaming-image-generator.ts
│   ├── file-converter.ts
│   └── openai-client-adapter.ts

├── adapters/             # Interface implementations (OCP)
│   ├── cache-adapter.ts
│   └── optimizer-adapter.ts

├── tools/                # MCP tool endpoints
│   ├── image-generation.ts
│   └── image-generation-streaming.ts

├── utils/                # Utility classes
│   ├── cache.ts
│   └── image-optimizer.ts

└── transport/            # Communication layer
    └── http.ts
```

## Key Components

### 1. Interfaces Layer (`src/interfaces/`)

Defines contracts for all major components:
- `IImageGenerator`: Image generation contract
- `IImageCache`: Caching operations contract
- `IImageOptimizer`: Image optimization contract
- `IOpenAIClient`: OpenAI API contract
- `IFileConverter`: File conversion contract

### 2. Services Layer (`src/services/`)

Contains core business logic implementations:

#### ImageGenerator
- Handles standard image generation
- Orchestrates caching, API calls, and optimization
- Depends only on interfaces

#### StreamingImageGenerator
- Implements streaming image generation
- Emits progress events during generation
- Supports partial image previews

#### FileConverter
- Converts between base64 and File objects
- Extracts base64 from data URLs
- Pure utility service with no external dependencies
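
As a rough illustration of the data-URL handling described for `FileConverter`, the sketch below splits a `data:<mime>;base64,<payload>` string into its parts and passes plain base64 through unchanged. `ParsedImageData` and `parseImageData` are hypothetical names; the real `IFileConverter` methods are not shown in this commit.

```typescript
// Hypothetical result shape for illustration.
interface ParsedImageData {
  mimeType: string | null; // null when the input was plain base64, not a data URL
  base64: string;
}

function parseImageData(value: string): ParsedImageData {
  const trimmed = value.trim();
  // Data URLs look like "data:<mime>;base64,<payload>".
  const match = /^data:([^;,]+);base64,(.+)$/.exec(trimmed);
  if (!match) return { mimeType: null, base64: trimmed };
  return { mimeType: match[1] ?? null, base64: match[2] ?? "" };
}
```

Keeping this as a pure function with no I/O matches the "no external dependencies" note above and makes it trivial to unit test.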
162+
163+
#### OpenAIClientAdapter
164+
- Adapts OpenAI SDK to our interface
165+
- Handles API error translation
166+
- Isolates third-party dependencies
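
The error translation mentioned above could look roughly like the following. This is a sketch under assumptions: `ImageApiError` and its `code` values are hypothetical, and the mapping relies only on an HTTP-like numeric `status` field on the thrown value, which is an assumption about the upstream error shape rather than a documented contract of this project.

```typescript
// Hypothetical application-level error; the project's real error types are not shown here.
class ImageApiError extends Error {
  readonly code: "rate_limited" | "auth" | "bad_request" | "unknown";
  readonly status: number | null;

  constructor(message: string, code: ImageApiError["code"], status: number | null) {
    super(message);
    this.name = "ImageApiError";
    this.code = code;
    this.status = status;
  }
}

// Maps any thrown value to ImageApiError so callers never see SDK-specific errors.
function translateApiError(error: unknown): ImageApiError {
  const candidate = error as { status?: unknown } | null;
  const status =
    typeof error === "object" && candidate !== null && typeof candidate.status === "number"
      ? candidate.status
      : null;
  const message = error instanceof Error ? error.message : String(error);

  if (status === 429) return new ImageApiError(message, "rate_limited", status);
  if (status === 401 || status === 403) return new ImageApiError(message, "auth", status);
  if (status !== null && status >= 400 && status < 500) {
    return new ImageApiError(message, "bad_request", status);
  }
  return new ImageApiError(message, "unknown", status);
}
```

An adapter would call `translateApiError` in a `catch` block around each SDK call, which keeps third-party error classes out of the services layer.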

### 3. Adapters Layer (`src/adapters/`)

Bridges between interfaces and concrete implementations:

#### CacheAdapter
- Adapts the Cache utility to IImageCache interface
- Allows cache implementation to be swapped

#### OptimizerAdapter
- Adapts ImageOptimizer utility to IImageOptimizer interface
- Enables different optimization strategies

### 4. Utils Layer (`src/utils/`)

Low-level utilities with specific responsibilities:

#### Cache
- Two-tier caching (memory + disk)
- TTL-based expiration
- Size-based cleanup
- Cache key generation
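
The sketch below illustrates three of the behaviors listed for `Cache` (key generation, TTL-based expiration, and size-based cleanup) for the memory tier only; the disk tier is omitted. It assumes the size limit counts entries and that keys are derived from an order-independent serialization of the input. `stableStringify` and `MemoryTier` are illustrative names, not the project's implementation.

```typescript
// Stable serialization: object keys are sorted so {a, b} and {b, a} yield the same key.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map((v) => stableStringify(v)).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`);
    return `{${body.join(",")}}`;
  }
  return JSON.stringify(value) ?? "null";
}

class MemoryTier<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly maxEntries: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlSeconds: number, maxEntries: number, now: () => number = Date.now) {
    this.ttlMs = ttlSeconds * 1000;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  get(type: string, input: unknown): T | undefined {
    const key = `${type}:${stableStringify(input)}`;
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // TTL-based expiration
      return undefined;
    }
    return entry.value;
  }

  set(type: string, input: unknown, value: T): void {
    const key = `${type}:${stableStringify(input)}`;
    this.entries.delete(key); // re-insert so the entry counts as newest
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    // Size-based cleanup: evict oldest insertions first (Map preserves insertion order).
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest);
    }
  }
}
```

A two-tier cache would consult a structure like this first and fall back to disk on a miss; eviction here is by insertion order, which is a simplification.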

#### ImageOptimizer
- Image format conversion
- Quality optimization
- Size reduction calculations
- Sharp library integration

## Dependency Flow

```
┌─────────────────┐
│   MCP Server    │
└────────┬────────┘
         │
┌────────▼────────┐
│      Tools      │ (image-generation.ts)
└────────┬────────┘
         │ depends on
┌────────▼────────┐
│    Services     │ (ImageGenerator)
└────────┬────────┘
         │ depends on
┌────────▼────────┐
│   Interfaces    │ (IImageCache, IImageOptimizer)
└────────┬────────┘
         │ implemented by
┌────────▼────────┐
│    Adapters     │ (CacheAdapter, OptimizerAdapter)
└────────┬────────┘
         │ wraps
┌────────▼────────┐
│      Utils      │ (Cache, ImageOptimizer)
└─────────────────┘
```

## Testing Architecture

The architecture supports comprehensive testing through:

1. **Dependency Injection**: All dependencies are injected, making mocking trivial
2. **Interface-based Design**: Tests can use mock implementations
3. **Pure Functions**: Many operations are pure, making them easy to test
4. **Isolated Components**: Each component can be tested independently

Example test structure:
```typescript
import { beforeEach, describe, it, vi } from 'vitest';
// ImageGenerator and the I* interfaces come from the project's src/ (import paths omitted here)

describe('ImageGenerator', () => {
  let generator: ImageGenerator;
  let mockClient: IOpenAIClient;
  let mockCache: IImageCache;
  let mockOptimizer: IImageOptimizer;

  beforeEach(() => {
    // Create mocks
    mockClient = { generateImage: vi.fn() };
    mockCache = { get: vi.fn(), set: vi.fn() };
    mockOptimizer = { optimizeBatch: vi.fn(), calculateSizeReduction: vi.fn() };

    // Inject mocks
    generator = new ImageGenerator(mockClient, mockCache, mockOptimizer);
  });

  it('should use cache when available', async () => {
    // Test with mocked dependencies
  });
});
```

## Extension Points

The architecture allows easy extension through:

1. **New Image Generators**: Implement `IImageGenerator` for new generation strategies
2. **Alternative Caching**: Implement `IImageCache` for Redis, Memcached, etc.
3. **Different Optimizers**: Implement `IImageOptimizer` for different optimization libraries
4. **Additional Clients**: Implement `IOpenAIClient` for other AI providers

## Benefits

1. **Maintainability**: Clear separation of concerns makes code easy to understand
2. **Testability**: 98%+ test coverage for core utilities
3. **Flexibility**: Easy to swap implementations
4. **Scalability**: Components can be scaled independently
5. **Type Safety**: Full TypeScript support with interfaces

## Future Considerations

1. **Event-Driven Architecture**: Consider event bus for decoupling
2. **Repository Pattern**: Abstract data access for image storage
3. **Strategy Pattern**: For different generation algorithms
4. **Plugin System**: Dynamic loading of processors
5. **Microservices**: Components could be separated into services

CHANGELOG.md

Lines changed: 73 additions & 0 deletions
@@ -5,8 +5,81 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.2.0] - 2025-01-13
+
+### Added
+- **OpenAI Responses API Integration** - Full support for the latest 2025 Responses API
+  - New `ResponsesImageGenerator` service using gpt-4o model with image_generation tool
+  - `ResponsesAPIAdapter` for handling both sync and streaming API calls
+  - Comprehensive interface definitions for Responses API compatibility
+  - Native multimodal understanding with better context awareness
+  - Enhanced prompt following and superior text rendering in images
+
+- **Dual API Architecture** - Seamless switching between API modes
+  - `API_MODE` environment variable for runtime API selection
+  - Default to Responses API (recommended) with fallback to Images API
+  - Backward compatibility with existing Images API implementations
+  - Unified interface supporting both legacy and modern workflows
+
+- **Enhanced Image Editing** - Advanced editing capabilities via Responses API
+  - Multi-image input support with proper message formatting
+  - Context-aware editing with conversation history integration
+  - Mask support for precise inpainting operations
+  - Previous response ID linking for multi-turn editing sessions
+
+- **Comprehensive Test Coverage** - 99+ tests with full API coverage
+  - `ResponsesImageGenerator` test suite (11 tests)
+  - `ResponsesAPIAdapter` test suite (9 tests)
+  - Complete error handling and edge case coverage
+  - Mock-based testing with proper dependency injection
+  - Real API key validation and integration testing
+
+- **Updated Configuration System**
+  - `RESPONSES_MODEL` for Responses API model selection (default: gpt-4o)
+  - Enhanced tool configuration with image generation parameters
+  - Flexible model switching between dedicated and integrated approaches
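
The `API_MODE` and `RESPONSES_MODEL` entries above, together with previous response ID linking, can be sketched as a small configuration resolver plus a request builder. This is an illustration under assumptions: the mode values `"responses"` and `"images"` are guesses (the changelog names only the variable), unrecognized values are treated as the default here, and the request fields (`model`, `input`, `tools`, `previous_response_id`) reflect the public Responses API as far as I know and should be checked against the SDK version in use. `resolveApiConfig` and `buildResponsesRequest` are hypothetical helpers.

```typescript
type ApiMode = "responses" | "images"; // assumed value names

interface ApiConfig {
  mode: ApiMode;
  responsesModel: string;
}

function resolveApiConfig(env: Record<string, string | undefined>): ApiConfig {
  const raw = (env["API_MODE"] ?? "").trim().toLowerCase();
  return {
    // The Responses API is the documented default; this sketch also uses it for unknown values.
    mode: raw === "images" ? "images" : "responses",
    responsesModel: env["RESPONSES_MODEL"] || "gpt-4o",
  };
}

// Request body for a Responses API call that uses the image_generation tool.
interface ResponsesImageRequest {
  model: string;
  input: string;
  tools: Array<{ type: "image_generation" }>;
  previous_response_id?: string;
}

function buildResponsesRequest(
  config: ApiConfig,
  prompt: string,
  previousResponseId?: string,
): ResponsesImageRequest {
  const request: ResponsesImageRequest = {
    model: config.responsesModel,
    input: prompt,
    tools: [{ type: "image_generation" }],
  };
  // Linking to the prior response lets a follow-up edit build on earlier context.
  if (previousResponseId) request.previous_response_id = previousResponseId;
  return request;
}
```

A multi-turn editing session would pass the `id` of each response as `previousResponseId` for the next request, so the follow-up prompt can be as short as "make the background darker".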
+
+### Changed
+- Updated README.md with comprehensive dual API documentation
+- Enhanced environment variable documentation
+- Added API mode comparison table and feature matrix
+- Improved streaming examples for both API modes
+- Updated roadmap to reflect completed Responses API integration
+
+### Fixed
+- Proper error handling for both API implementations
+- Consistent response formatting across API modes
+- Fixed test mocking issues with OpenAI SDK
+- Improved conversation context handling in edit operations
+
 ## [Unreleased]
 
+### Added
+- **SOLID Architecture Refactoring** - Complete codebase restructuring following SOLID principles
+  - Single Responsibility: Separated concerns into focused services
+  - Open/Closed: Extensible through interfaces without modification
+  - Liskov Substitution: Consistent interface implementations
+  - Interface Segregation: Focused, specific interfaces
+  - Dependency Inversion: Services depend on abstractions
+
+- **Comprehensive Test Suite** - TDD implementation with 81+ tests
+  - Unit tests for all core services and utilities
+  - Integration tests for MCP server endpoints
+  - Streaming functionality tests
+  - 98%+ coverage for utilities, 78%+ for services
+  - Vitest framework with fast execution
+
+- **Documentation Updates**
+  - New ARCHITECTURE.md documenting SOLID implementation
+  - New TESTING.md with testing guidelines
+  - Updated CONTRIBUTING.md with TDD practices
+  - Enhanced README.md with architecture overview
+
+### Changed
+- Refactored image generation into separate service classes
+- Introduced dependency injection throughout the codebase
+- Improved error handling and type safety
+
 ### Added
 - **Streaming Support** - Real-time image generation with Server-Sent Events (SSE)
   - New `/mcp/stream` endpoint for streaming requests
