Commit 88da3d3

HF pulling model docs (#3273)
Ticket:CVS-166548
1 parent 9834f6b commit 88da3d3

1 file changed

+150
-0
lines changed

docs/pull_hf_models.md

@@ -0,0 +1,150 @@
*Note:*
This functionality is a work in progress.

# Pulling the models {#ovms_pul}

There is a special mode that makes OVMS pull a model from Hugging Face before starting the service:

```
docker run -d --rm -v <models_repository>:/models openvino/model_server:latest --pull --source_model <model_name_in_HF> --model_repository <path_where_to_store_model_files> --model_name <external_model_name> --task <task> --task_params <task_params>
```

| option               | description                                                                                   |
|----------------------|-----------------------------------------------------------------------------------------------|
| `--pull`             | Instructs the server to run in pulling mode to get the model from the Hugging Face repository |
| `--source_model`     | Specifies the model name in the Hugging Face model repository (optional; if empty, `--model_name` is used) |
| `--model_repository` | Directory where all required model files will be saved                                        |
| `--model_name`       | Name of the model as exposed externally by the server                                         |
| `--task`             | Defines the task the model will support (e.g., text_generation, embedding, rerank)            |
| `--task_params`      | Task-specific parameters; the format is still to be determined (TBD FIXME)                    |

```
docker run -d --rm -v <models_repository>:/models openvino/model_server:latest \
--model_path <path_to_model> --model_name <model_name> --port 9000 --rest_port 8000 --log_level DEBUG
```

The pull prepares all the configuration files needed to serve LLMs with OVMS in the model repository.

# Starting the MediaPipe graph or LLM models
Now you can start the server with a single MediaPipe graph or LLM model that is already present in the local filesystem:

```
docker run -d --rm -v <models_repository>:/models -p 9000:9000 -p 8000:8000 openvino/model_server:latest \
--model_path <path_to_model> --model_name <model_name> --port 9000 --rest_port 8000
```

The server detects the type of the requested servable (model or MediaPipe graph) and loads it accordingly. Detection is based on the presence of a `.pbtxt` file, which defines the MediaPipe graph structure.
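
The detection rule above can be mimicked outside the server, for example to check a directory before mounting it. The following is a minimal sketch in POSIX shell. `servable_type` is a hypothetical helper, and it only looks for a `.pbtxt` file directly inside the given directory, which is an assumption about how the rule applies rather than a description of the OVMS implementation.

```sh
#!/bin/sh
# Sketch only: mimics the documented detection rule, not OVMS internals.

# servable_type DIR
# Prints "mediapipe_graph" if DIR directly contains a .pbtxt file, else "model".
servable_type() {
    for f in "$1"/*.pbtxt; do
        # When nothing matches, the glob stays literal and -e fails.
        if [ -e "$f" ]; then
            echo "mediapipe_graph"
            return 0
        fi
    done
    echo "model"
}
```

For example, `servable_type /models/llama4` would print `mediapipe_graph` if `/models/llama4/graph.pbtxt` exists.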

*Note*: Graphs and LLM-like models currently support neither online model modification nor versioning.

# Starting the LLM model from HF directly

If you do not want to prepare the model repository before starting the server, you can pull and serve the model in one command:

```
docker run -d --rm -v <models_repository>:/models openvino/model_server:latest --source_model <model_name_in_HF> --model_repository <path_where_to_store_model_files> --model_name <ovms_servable_name> --task <task> --task_params <task_params>
```

This downloads the required model files, prepares the OVMS configuration, and starts serving the model.

# Starting the LLM model from local storage

If you have already downloaded the model files from HF but lack the OVMS configuration files, you can start OVMS with:
```
docker run -d --rm -v <models_repository>:/models openvino/model_server:latest --source_model <model_name_in_HF> --model_repository <path_where_to_store_ovms_config_files> --model_path <model_files_path> --model_name <external_model_name> --task <task> --task_params <task_params>
```

# Simplified MediaPipe graphs and LLM models loading

There is now an easier way to specify LLM configurations in `config.json`. In the `model_config_list` section, it is sufficient to specify `name` and `base_path` for each servable; the server detects whether a graph configuration file (`.pbtxt`) is present and loads the servable accordingly.

For example, the `model_config_list` section in `config.json` could look like this:

```json
{
    "model_config_list": [
        {
            "config": {
                "name": "text_generation_model",
                "base_path": "/models/text_generation_model"
            }
        },
        {
            "config": {
                "name": "embedding_model",
                "base_path": "/models/embedding_model"
            }
        },
        {
            "config": {
                "name": "mediapipe_graph",
                "base_path": "/models/mediapipe_graph"
            }
        }
    ]
}
```
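
Because each entry needs only `name` and `base_path`, a file like this can be generated from a list of servable names. The sketch below assumes every servable lives in a directory named after it under a common root. `make_config` is a hypothetical helper, not part of OVMS, and it does no JSON escaping, so names must not contain quotes or backslashes.

```sh
#!/bin/sh
# Sketch only: prints a minimal config.json for servables under one root.

# make_config ROOT NAME...
# Emits one {"name", "base_path"} entry per NAME, with base_path = ROOT/NAME.
make_config() {
    root="${1%/}"
    shift
    printf '{\n  "model_config_list": [\n'
    first=1
    for name in "$@"; do
        # Separate entries with a comma, but not before the first one.
        [ "$first" -eq 1 ] || printf ',\n'
        first=0
        printf '    {"config": {"name": "%s", "base_path": "%s/%s"}}' "$name" "$root" "$name"
    done
    printf '\n  ]\n}\n'
}
```

Running `make_config /models text_generation_model embedding_model mediapipe_graph > config.json` would write a file equivalent to the example above.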

# List models

To check which models are servable from the specified model repository:
```
docker run --rm -v <models_repository>:/models openvino/model_server:latest \
--models_repository /models --list_models
```

For the following directory structure:
```
/models
├── meta
│   ├── llama4
│   │   └── graph.pbtxt
│   ├── llama3.1
│   │   └── graph.pbtxt
├── LLama3.2
│   └── graph.pbtxt
└── resnet
    └── 1
        └── saved_model.pb
```

The output would be:
```
meta/llama4
meta/llama3.1
LLama3.2
resnet
```
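
The mapping from directory structure to servable names can be approximated with a short script. This is a sketch of one plausible rule, not the server's actual logic: it assumes a directory is servable when it directly contains a `.pbtxt` file or a version subdirectory whose name consists only of digits. `is_servable_dir` and `list_servables` are hypothetical names.

```sh
#!/bin/sh
# Sketch only: approximates which directories would be listed as servables.

# Succeeds if DIR looks like a servable under the assumed rule.
is_servable_dir() {
    # Graph-based servable: a .pbtxt file directly inside the directory.
    for f in "$1"/*.pbtxt; do
        [ -f "$f" ] && return 0
    done
    # Versioned model: a subdirectory whose name is all digits (e.g. "1").
    for sub in "$1"/*/; do
        [ -d "$sub" ] || continue
        case "$(basename "$sub")" in
            *[!0-9]*) ;;          # contains a non-digit: not a version directory
            *) return 0 ;;
        esac
    done
    return 1
}

# Prints servable names relative to the repository root, one per line.
list_servables() {
    repo="${1%/}"
    find "$repo" -type d | LC_ALL=C sort | while IFS= read -r dir; do
        # Skip the repository root itself.
        [ "$dir" = "$repo" ] && continue
        if is_servable_dir "$dir"; then
            echo "${dir#"$repo"/}"
        fi
    done
}
```

Running `list_servables /models` on the tree above would print the same four names, sorted bytewise rather than in the order shown.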

# Enable model

To add a specific model to the OVMS configuration file, use either:

```
docker run -d --rm -v <models_repository>:/models openvino/model_server:latest \
--models_repository /models/<model_path> --add_to_config <config_file_path> --model_name <name>
```

when the model is directly inside `/models`, or:

```
docker run -d --rm -v <models_repository>:/models openvino/model_server:latest \
--add_to_config <config_file_path> --model_name <name> --model_path <model_path>
```
when no model repository is specified.

# Disable model

To remove a model from the configuration file, either edit the file manually or use the command:

```
docker run -d --rm -v <models_repository>:/models openvino/model_server:latest \
--remove_from_config <config_file_path> --model_name <name>
```

FIXME TODO TBD
- adjust existing documentation to link to this doc
- `task` and `task_params` need to be updated and explained
- do we want to allow specifying model_path/repository separately in pulling mode?
- explain the relevance of config.json to the model repository (i.e., that config.json works with a specific directory)
