Commit 1fd3ac4

added docker setup guide

1 parent 541182f commit 1fd3ac4
File tree

3 files changed: +97 −2 lines changed
README.md

Lines changed: 4 additions & 1 deletion

```diff
@@ -59,6 +59,9 @@ See [docs/ollama_setup.md](docs/ollama_setup.md) on how to setup Ollama locally.
 
 ⚠️ **Warning**: While Ollama provides free local model hosting, please note that vision models from Ollama can be significantly slower in processing documents and may not produce optimal results when handling complex PDF documents. For better accuracy and performance with complex layouts in PDF documents, consider using API-based models like OpenAI or Gemini.
 
+### Setting up Vision Parse with Docker (Optional)
+See [docs/docker_setup.md](docs/docker_setup.md) on how to setup Vision Parse with Docker.
+
 ## 📚 Usage
 
 ### Basic Example Usage
@@ -204,7 +207,7 @@ Vision Parse offers several customization parameters to enhance document processing
 | detailed_extraction | Enable advanced content extraction to extract complex information such as LaTeX equations, tables, images, etc. | bool |
 | enable_concurrency | Enable parallel processing of multiple pages in a PDF document in a single request | bool |
 
-Note: For more details on custom configuration for Vision LLM providers, please refer to [docs/config.md](docs/config.md).
+Note: For more details on custom model configuration i.e. `openai_config`, `gemini_config`, and `ollama_config`; please refer to [docs/config.md](docs/config.md).
 
 ## 📊 Benchmarks
```
docs/docker_setup.md

Lines changed: 92 additions & 0 deletions

# Docker Setup Guide for Vision Parse

This guide explains how to set up Vision Parse using Docker on macOS and Linux systems.

## Prerequisites

- Docker and Docker Compose installed on your system
- NVIDIA GPU (optional, but recommended for better performance)

### macOS

1. Install Docker Desktop for Mac from [Docker Hub](https://hub.docker.com/editions/community/docker-ce-desktop-mac)
2. For Apple Silicon (M1/M2) users, ensure you're using ARM64-compatible images

### Linux

1. Install Docker Engine:
```bash
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
```
2. Install the Docker Compose plugin (this provides the `docker compose` subcommand used below):
```bash
sudo apt-get install docker-compose-plugin
```
3. For GPU support, install the NVIDIA Container Toolkit:
```bash
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
```
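After installing the toolkit, it can help to confirm that containers can actually see the GPU before moving on. A minimal sketch, assuming a Debian/Ubuntu host; the `verify_gpu` helper name and the throwaway `ubuntu` image are illustrative, not part of the project:

```shell
# Hypothetical post-install check (helper name is ours, not the project's):
# confirm the toolkit CLI is on PATH, then try a throwaway GPU container.
verify_gpu() {
  if command -v nvidia-ctk >/dev/null 2>&1; then
    echo "nvidia-ctk found; launching a test container"
    sudo docker run --rm --gpus all ubuntu nvidia-smi || echo "GPU test container failed"
  else
    echo "nvidia-ctk not found; re-run the installation steps above"
  fi
}

verify_gpu
```

If `nvidia-smi` prints your GPU from inside the container, the toolkit is wired up correctly.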
## Environment Setup

Export the required environment variables in your terminal:

```bash
# Required: choose one of the supported models
export MODEL_NAME=llama3.2-vision:11b # select the model name from the list of supported models

# Optional: API keys (required only for specific models)
export OPENAI_API_KEY=your_openai_api_key
export GEMINI_API_KEY=your_gemini_api_key
```
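Since a missing `MODEL_NAME` only surfaces once the container starts, a quick pre-flight check can fail fast instead. A small sketch; the `check_env` helper is our own naming and not part of Vision Parse:

```shell
# Illustrative pre-flight check (not part of the project): fail fast if the
# required MODEL_NAME variable is unset before invoking docker compose.
check_env() {
  if [ -z "${MODEL_NAME:-}" ]; then
    echo "MODEL_NAME is not set" >&2
    return 1
  fi
  echo "Using model: ${MODEL_NAME}"
}

MODEL_NAME=llama3.2-vision:11b check_env
# prints: Using model: llama3.2-vision:11b
```

Chaining it as `check_env && docker compose up -d` keeps the container from starting with an incomplete environment.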
## Running Docker Container

1. If you have an NVIDIA GPU, uncomment the following lines in your docker-compose.yml:
```yaml
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]
```
2. Build and start the container:
```bash
# Build the image
docker compose build

# Start the container in detached mode
docker compose up -d
```
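Because `-d` detaches immediately, it's worth confirming the service actually came up. A hedged sketch; the `check_status` helper is our own naming, and the service name `vision-parse` is taken from the logs command in the troubleshooting section:

```shell
# Illustrative status check (helper name is ours): list compose services,
# falling back to a message if the docker CLI or compose project is absent.
check_status() {
  if command -v docker >/dev/null 2>&1; then
    docker compose ps 2>/dev/null || echo "compose status unavailable"
  else
    echo "docker CLI not found; check the prerequisites above"
  fi
}

check_status
```

A healthy deployment should show the `vision-parse` service with a running state.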
## Troubleshooting

1. If using Ollama-based models, ensure port 11434 is not being used by another service:
```bash
# macOS
lsof -i :11434

# Linux
sudo netstat -tulpn | grep 11434
```
2. Check container logs for any errors:
```bash
docker compose logs vision-parse
```
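If neither `lsof` nor `netstat` is available, bash itself can probe the port through its `/dev/tcp` redirection device. A dependency-free sketch; the `port_in_use` helper name is our own:

```shell
# Sketch of a dependency-free probe using bash's /dev/tcp redirection:
# the function succeeds if something is already listening on the given port.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

if port_in_use 11434; then
  echo "port 11434 is in use (is Ollama already running on the host?)"
else
  echo "port 11434 is free"
fi
```

Note that `/dev/tcp` is a bash feature, so this must be run with bash rather than a plain POSIX `sh`.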
## Stopping the Container

To stop the Vision Parse container:
```bash
docker compose down
```
uv.lock

Lines changed: 1 addition & 1 deletion
