A Flask-based API that generates captions for images using a custom deep learning model (BLIP). The API can:
✅ Generate descriptive captions for uploaded or base64-encoded images
✅ Optionally overlay a bounding box on the image
✅ Process requests quickly and report the processing time in each response
✅ Accept cross-origin requests (CORS enabled)
blip/
├── app.py # Flask server
├── model.py # Image captioning model logic
├── index.html # Simple frontend for testing
| Endpoint | Method | Description |
|---|---|---|
| / | GET | API status and list of endpoints |
| /caption | POST | Generate a caption for an image |
| /caption_with_box | POST | Generate a caption and return the image with a bounding box |
- Form field: `image` (file upload or base64 string)
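As a rough illustration of the base64 variant of the `image` field, the sketch below builds (but does not send) a POST request using only the Python standard library. The helper name `build_caption_request` is hypothetical, and the sketch assumes the server accepts a plain base64 string in a URL-encoded form body; the actual server may expect a different encoding, such as a data-URI prefix.

```python
import base64
import urllib.parse
import urllib.request


def build_caption_request(image_bytes, url="http://localhost:5000/caption"):
    """Build a POST request that carries the image as base64 in the `image` field.

    Assumption: the server reads a plain base64 string from a URL-encoded form.
    """
    # Encode the raw image bytes as an ASCII base64 string.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # Place the string in the `image` form field and URL-encode the body.
    body = urllib.parse.urlencode({"image": encoded}).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

The request can then be sent with `urllib.request.urlopen(request)` once the server is running.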
1️⃣ Clone the repository
git clone https://github.com/sebastianbenjamin/blip.git
cd blip
2️⃣ Install the required packages
pip install -r requirements.txt
3️⃣ Run the server
python app.py
By default, the server listens on all network interfaces (0.0.0.0) on port 5000. From the same machine, reach it at:
http://localhost:5000/
You can test the API using the provided frontend:
Open blip/index.html in your browser
This allows you to upload images and see the generated captions directly.
curl -X POST -F "image=@/path/to/your/image.jpg" http://localhost:5000/caption
curl -X POST -F "image=@/path/to/your/image.jpg" http://localhost:5000/caption_with_box
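For callers who prefer Python over curl, the following is a minimal sketch of the same file upload using only the standard library. The helper name `build_upload_request` is hypothetical, and the part's content type (`application/octet-stream`) is an assumption; it only prepares the multipart/form-data request, so nothing is sent until it is passed to `urllib.request.urlopen`.

```python
import uuid
import urllib.request


def build_upload_request(image_bytes, filename="image.jpg",
                         url="http://localhost:5000/caption"):
    """Build a multipart/form-data POST with the file in the `image` field."""
    # A random boundary separates the parts of the multipart body.
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    # The closing boundary marks the end of the form data.
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=head + image_bytes + tail,
                                  headers=headers, method="POST")
```

Passing `url="http://localhost:5000/caption_with_box"` targets the bounding-box endpoint instead. With the third-party `requests` package installed, `requests.post(url, files={"image": open(path, "rb")})` produces an equivalent upload.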
- Input can be an uploaded image file or a base64-encoded string in the `image` field.
- Response includes processing time in seconds.
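One way a server could produce the processing time mentioned above is to wrap the captioning call in a timer. This is only a sketch of the general technique, not the project's actual code: `timed_caption`, `caption_fn`, and the response keys `caption` and `processing_time` are all hypothetical names.

```python
import time


def timed_caption(caption_fn, image_bytes):
    """Run a captioning function and report how long it took, in seconds.

    `caption_fn` stands in for the model call; the key names are assumptions.
    """
    start = time.perf_counter()
    caption = caption_fn(image_bytes)
    elapsed = time.perf_counter() - start
    # Round to milliseconds so the value is easy to read in a response.
    return {"caption": caption, "processing_time": round(elapsed, 3)}
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations.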
This project is licensed under the MIT License.
👉 Quick Start:
python blip/app.py
Then open blip/index.html in your browser!