Releases: pavelsukhachev/mcp-server-gpt-image

v1.2.0: OpenAI Responses API Integration & Dual-Mode Support

13 Jul 16:22

🚀 Major Release: OpenAI Responses API Integration

✨ New Features

🆕 OpenAI Responses API Integration

  • Full support for 2025 Responses API with gpt-4o + image_generation tool
  • Native multimodal understanding with better context awareness
  • Enhanced prompt following and superior text rendering in images
  • Real-time streaming with partial previews via Responses API

🔄 Dual API Architecture

  • Seamless API switching via API_MODE environment variable
  • Default to Responses API (recommended) with fallback to Images API
  • Backward compatibility with existing Images API implementations
  • Unified interface supporting both legacy and modern workflows

🖼️ Enhanced Image Editing

  • Multi-image input support with proper message formatting
  • Context-aware editing with conversation history integration
  • Previous response ID linking for multi-turn editing sessions
  • Advanced mask support for precise inpainting operations
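As a rough illustration of multi-image input and previous response ID linking, the sketch below builds a request body for one editing turn. The field names (`input`, `input_text`, `input_image`, `tools`, `previous_response_id`) follow the public OpenAI Responses API; the helper `buildEditRequest` and its option names are hypothetical and not taken from this project's code.

```typescript
// Hypothetical helper: builds a Responses API request body for an editing turn.
// Only the request field names come from the OpenAI Responses API; everything
// else here is an illustrative assumption.

type InputContent =
  | { type: "input_text"; text: string }
  | { type: "input_image"; image_url: string };

interface EditRequest {
  model: string;
  input: { role: "user"; content: InputContent[] }[];
  tools: { type: "image_generation" }[];
  previous_response_id?: string;
}

interface EditOptions {
  prompt: string;
  imagesBase64: string[];      // PNG images, base64-encoded, without a data: prefix
  previousResponseId?: string; // id returned by the previous turn, if any
  model?: string;
}

function buildEditRequest(opts: EditOptions): EditRequest {
  // One user message carrying the prompt followed by every input image.
  const content: InputContent[] = [
    { type: "input_text", text: opts.prompt },
    ...opts.imagesBase64.map((b64): InputContent => ({
      type: "input_image",
      image_url: `data:image/png;base64,${b64}`,
    })),
  ];

  const request: EditRequest = {
    model: opts.model ?? "gpt-4o",
    input: [{ role: "user", content }],
    tools: [{ type: "image_generation" }],
  };

  // Linking to the previous response lets the API reuse the earlier turn's context.
  if (opts.previousResponseId) {
    request.previous_response_id = opts.previousResponseId;
  }
  return request;
}
```

A first turn would omit `previousResponseId`; later turns would pass the `id` of the response returned by the previous call.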

🧪 Comprehensive Test Coverage

  • 99+ tests with full API coverage for both implementations
  • Complete error handling and edge case coverage
  • Real API validation with integration testing
  • Mock-based testing with proper dependency injection

🔧 Technical Improvements

⚙️ Enhanced Configuration System

  • API_MODE: Switch between 'responses' (default) and 'images'
  • RESPONSES_MODEL: Model selection for Responses API (default: gpt-4o)
  • Flexible model switching between dedicated and integrated approaches
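The two variables above could be resolved along these lines. This is a minimal sketch, not the server's actual code: the function name `resolveApiConfig` is hypothetical, and rejecting unknown `API_MODE` values (rather than falling back silently) is an assumption.

```typescript
type ApiMode = "responses" | "images";

interface ApiConfig {
  apiMode: ApiMode;
  responsesModel: string;
}

// Takes the environment as a plain object so it can be exercised without
// touching process.env; in a Node.js server you would pass process.env.
function resolveApiConfig(env: Record<string, string | undefined>): ApiConfig {
  // Default documented in the release notes: 'responses'.
  const rawMode = (env.API_MODE ?? "responses").trim().toLowerCase();

  if (rawMode === "responses" || rawMode === "images") {
    return {
      apiMode: rawMode,
      // Default documented in the release notes: gpt-4o.
      responsesModel: env.RESPONSES_MODEL?.trim() || "gpt-4o",
    };
  }
  // Assumption: an unrecognised mode is treated as a configuration error.
  throw new Error(`API_MODE must be 'responses' or 'images', got '${rawMode}'`);
}
```

With an empty environment this yields the Responses API with `gpt-4o`; setting `API_MODE=images` selects the Images API.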

🏗️ SOLID Architecture

  • Complete refactoring following dependency injection patterns
  • Single Responsibility principle with focused services
  • Interface-based design for maximum extensibility
  • Test-driven development with comprehensive coverage
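The interface-based, dependency-injected design described above might look roughly like this. All names here (`ImageGenerator`, `ImageService`, `createGenerator`) are hypothetical and chosen for illustration; the project's real interfaces may differ.

```typescript
type ApiMode = "responses" | "images";

interface GeneratedImage {
  base64: string;
  source: ApiMode;
}

// The one abstraction both API backends would implement.
interface ImageGenerator {
  generate(prompt: string): Promise<GeneratedImage>;
}

// The service depends only on the interface, so a test can inject a stub
// instead of a backend that calls OpenAI.
class ImageService {
  private readonly generator: ImageGenerator;

  constructor(generator: ImageGenerator) {
    this.generator = generator;
  }

  async generate(prompt: string): Promise<GeneratedImage> {
    if (prompt.trim() === "") {
      throw new Error("prompt must not be empty");
    }
    return this.generator.generate(prompt);
  }
}

// Picks the backend for the configured mode; the backends are supplied by the
// caller rather than constructed here.
function createGenerator(
  mode: ApiMode,
  backends: Record<ApiMode, ImageGenerator>,
): ImageGenerator {
  return backends[mode];
}
```

Because the service never names a concrete backend, switching modes only changes which object is passed to the constructor.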

📚 Updated Documentation

📖 Comprehensive Guides

  • Dual API documentation with feature comparison table
  • Enhanced environment variables documentation
  • Updated streaming examples for both API modes
  • API mode selection guidance and best practices

📋 API Comparison

Feature           | Responses API (gpt-4o) | Images API (gpt-image-1)
------------------|------------------------|-------------------------
Latest Technology | ✅ 2025 Responses API  | ⚠️ Legacy API
Text in Images    | ✅ Superior            | ✅ Good
Context Awareness | ✅ Excellent           | ⚠️ Limited
Streaming         | ✅ Partial previews    | ⚠️ Final only
Multi-turn        | ✅ Full support        | ⚠️ Basic

🔧 Migration Guide

For New Users

  • Default configuration uses Responses API automatically
  • No extra configuration is needed beyond setting OPENAI_API_KEY

For Existing Users

  • Backward compatible - existing configurations continue to work
  • Optional: set API_MODE=responses explicitly to use the latest features, or API_MODE=images to stay on the Images API
  • Gradual migration supported with dual-mode architecture

📦 Installation

# Clone and install
git clone https://github.com/pavelsukhachev/mcp-server-gpt-image.git
cd mcp-server-gpt-image
npm install

# Configure (uses Responses API by default)
echo "OPENAI_API_KEY=your-api-key-here" > .env

# Run
npm run start:http

🚀 What's Next

With this release, MCP Server GPT Image supports both the Responses API and the legacy Images API, and switches between them through the API_MODE environment variable.

Perfect for:

  • 🎨 Advanced image generation workflows
  • 🔄 Multi-turn conversation contexts
  • ⚡ Real-time streaming applications
  • 🧪 Development and production environments

Full Changelog: https://github.com/pavelsukhachev/mcp-server-gpt-image/blob/main/CHANGELOG.md

v1.1.0 - Streaming, Caching & Optimization

13 Jul 14:14

🚀 New Features

⚡ Real-time Streaming Support

  • Server-Sent Events (SSE) endpoint at /mcp/stream
  • Live progress updates during image generation
  • Partial image previews (1-3 configurable)
  • Seamless error handling in stream
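For readers unfamiliar with Server-Sent Events, the sketch below shows how progress updates and partial previews could be framed on the wire and read back by a client. The `event:`/`data:` framing is standard SSE; the event names (`progress`, `partial_image`) and payload shapes are assumptions, not the documented output of /mcp/stream.

```typescript
interface SseEvent {
  event: string;
  data: unknown;
}

// Serialises one event as an SSE frame. JSON.stringify escapes newlines,
// so the payload always fits on a single data: line.
function formatSseEvent(e: SseEvent): string {
  return `event: ${e.event}\ndata: ${JSON.stringify(e.data)}\n\n`;
}

// Simplified parser for a fully buffered stream: frames are separated by a
// blank line. It ignores comments, id: and retry: fields.
function parseSseStream(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const frame of raw.split("\n\n")) {
    let event = "message"; // SSE default when no event: field is present
    const dataLines: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith("event:")) {
        event = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice("data:".length).trimStart());
      }
    }
    if (dataLines.length > 0) {
      events.push({ event, data: JSON.parse(dataLines.join("\n")) });
    }
  }
  return events;
}
```

A real client would parse incrementally as chunks arrive rather than buffering the whole stream; the buffered version keeps the framing rules easy to see.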

💾 Intelligent Caching System

  • Two-tier caching (memory + disk)
  • Content-based cache keys using SHA256
  • Configurable TTL and size limits
  • New tools: clear_cache and cache_stats
  • Automatic cleanup when limits exceeded
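A content-based cache key can be derived by hashing a canonical form of the request, so that identical requests map to the same entry regardless of parameter order. The sketch below uses Node.js's built-in `crypto` module; the function name `cacheKey` and the choice of which parameters feed the hash are assumptions, not the project's actual scheme.

```typescript
import { createHash } from "node:crypto";

type CacheParams = Record<string, string | number | boolean | undefined>;

// Builds a SHA-256 hex digest over the tool name and its parameters.
// Keys are sorted and undefined values dropped so that logically equal
// requests always serialise to the same string before hashing.
function cacheKey(tool: string, params: CacheParams): string {
  const canonical = Object.keys(params)
    .filter((k) => params[k] !== undefined)
    .sort()
    .map((k) => [k, params[k]]);

  return createHash("sha256")
    .update(JSON.stringify([tool, canonical]))
    .digest("hex");
}
```

The same key could serve both tiers, for example as a map key in memory and as a file name on disk.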

🖼️ Image Optimization Engine

  • Automatic compression with Sharp library
  • Support for JPEG, PNG, and WebP formats
  • Configurable quality/compression levels
  • Typical size reductions of 30-80%
  • Preserves transparency when needed

🔧 Enhanced Configuration

  • New environment variables for cache control
  • output_compression parameter for quality
  • partialImages parameter for streaming
  • format parameter for output format

📝 Documentation

  • Comprehensive API documentation
  • Environment configuration guide
  • Updated examples with streaming demos
  • Complete changelog

🐛 Bug Fixes

  • Removed unsupported response_format parameter
  • Improved error handling for API failures
  • Better base64 encoding for large images
  • Fixed package-lock.json sync issues

📦 Installation

npm install mcp-server-gpt-image@1.1.0

Or clone and build:

git clone https://github.com/pavelsukhachev/mcp-server-gpt-image.git
cd mcp-server-gpt-image
npm install
npm run build

🏃 Quick Start

Claude Desktop

Add to your Claude Desktop config:

{
  "mcpServers": {
    "gpt-image": {
      "command": "node",
      "args": ["/path/to/mcp-server-gpt-image/dist/index.js", "stdio"],
      "env": {
        "OPENAI_API_KEY": "your-openai-api-key-here"
      }
    }
  }
}

HTTP Server

npm run start:http

See the README for full documentation.

MCP Server GPT Image v1.0.0

13 Jul 03:30

🎉 Initial Release

We're excited to announce the first release of MCP Server GPT Image - a Model Context Protocol server that provides access to OpenAI's GPT Image-1 model for advanced image generation and editing.

✨ Features

  • 🎨 Image Generation: Create stunning images from text descriptions
  • ✏️ Image Editing: Modify existing images with text prompts and masks
  • 🔄 Multiple Transports: Support for both stdio and HTTP transports
  • 🐳 Docker Support: Easy deployment with Docker and Docker Compose
  • 📦 Production Ready: Session management, error handling, and health checks
  • 🔒 Secure: Environment-based API key management

🚀 Quick Start

  1. Clone the repository
  2. Set your OpenAI API key in .env
  3. Run npm install && npm run build
  4. Use with Claude Desktop or run as HTTP server

📖 Documentation

🐛 Known Issues

  • Streaming image generation is not yet available (pending Responses API support)
  • Some advanced editing features require specific mask formats

🙏 Acknowledgments

Built with the Model Context Protocol SDK and powered by OpenAI's GPT Image-1.


Full Changelog: https://github.com/pavelsukhachev/mcp-server-gpt-image/commits/v1.0.0