*Note:* This functionality is a work in progress.

# Pulling the models {#ovms_pul}

There is a special mode that makes OVMS pull the model from Hugging Face before starting the service:

```
docker run -d --rm -v <models_repository>:/models openvino/model_server:latest \
--pull --source_model <model_name_in_HF> --model_repository <path_where_to_store_model_files> \
--model_name <external_model_name> --task <task> --task_params <task_params>
```

| Option | Description |
|----------------------|-----------------------------------------------------------------------------------------------|
| `--pull` | Runs the server in pulling mode to get the model from the Hugging Face repository |
| `--source_model` | Name of the model in the Hugging Face repository (optional; if empty, `model_name` is used) |
| `--model_repository` | Directory where all required model files will be saved |
| `--model_name` | Name under which the model is exposed externally by the server |
| `--task` | Task the model will support (e.g. `text_generation`, `embedding`, `rerank`) |
| `--task_params` | Task-specific parameters (format to be determined) |

The pull mode prepares all the configuration files needed to serve LLMs with OVMS in the model repository. The pulled model can then be served with:

```
docker run -d --rm -v <models_repository>:/models openvino/model_server:latest \
--model_path <path_to_model> --model_name <model_name> --port 9000 --rest_port 8000 --log_level DEBUG
```

# Starting the mediapipe graph or LLM models

Now you can start the server with a single MediaPipe graph, or with an LLM model that is already present in the local filesystem:

```
docker run -d --rm -v <models_repository>:/models -p 9000:9000 -p 8000:8000 openvino/model_server:latest \
--model_path <path_to_model> --model_name <model_name> --port 9000 --rest_port 8000
```

The server will detect the type of the requested servable (model or MediaPipe graph) and load it accordingly. This detection is based on the presence of a `.pbtxt` file, which defines the MediaPipe graph structure.

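The detection rule described above can be sketched as a simple directory check. The snippet below is an illustrative approximation of the documented behavior, not the server's actual implementation; the paths are made up:

```shell
# Sketch of the servable-type detection described above: a directory is
# treated as a MediaPipe graph when it contains a .pbtxt file, otherwise
# as a plain model. Illustrative only -- not OVMS code.
detect_servable_type() {
  for f in "$1"/*.pbtxt; do
    [ -e "$f" ] && { echo "mediapipe_graph"; return; }
  done
  echo "model"
}

mkdir -p /tmp/servables/llm /tmp/servables/resnet/1
touch /tmp/servables/llm/graph.pbtxt

detect_servable_type /tmp/servables/llm      # prints "mediapipe_graph"
detect_servable_type /tmp/servables/resnet   # prints "model"
```
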
*Note*: There is currently no online model modification or versioning capability for graphs and LLM-like models.

# Starting the LLM model from HF directly

If you do not want to prepare the model repository beforehand, you can start OVMS in a single command:

```
docker run -d --rm -v <models_repository>:/models openvino/model_server:latest \
--source_model <model_name_in_HF> --model_repository <path_where_to_store_model_files> \
--model_name <ovms_servable_name> --task <task> --task_params <task_params>
```

It will download the required model files, prepare the OVMS configuration, and start serving the model.

# Starting the LLM model from local storage

If you have already downloaded the model files from HF but lack the OVMS configuration files, you can start OVMS with:

```
docker run -d --rm -v <models_repository>:/models openvino/model_server:latest \
--source_model <model_name_in_HF> --model_repository <path_where_to_store_ovms_config_files> \
--model_path <model_files_path> --model_name <external_model_name> --task <task> --task_params <task_params>
```

# Simplified loading of mediapipe graphs and LLM models

There is now an easier way to specify LLM configurations in `config.json`. In the `model_config_list` section, it is sufficient to specify `name` and `base_path`; the server will detect whether a graph configuration file (`.pbtxt`) is present and load the servable accordingly.

For example, the `model_config_list` section in `config.json` could look like this:

```json
{
  "model_config_list": [
    {
      "config": {
        "name": "text_generation_model",
        "base_path": "/models/text_generation_model"
      }
    },
    {
      "config": {
        "name": "embedding_model",
        "base_path": "/models/embedding_model"
      }
    },
    {
      "config": {
        "name": "mediapipe_graph",
        "base_path": "/models/mediapipe_graph"
      }
    }
  ]
}
```
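
Before mounting such a file, a quick local sanity check can catch malformed JSON or missing fields. The following is a plain local sketch (not an OVMS command); the file path is made up:

```shell
# Write a minimal config.json and verify its structure locally.
# This is a plain sanity check, not an OVMS feature.
cat > /tmp/config.json <<'EOF'
{
  "model_config_list": [
    {"config": {"name": "text_generation_model", "base_path": "/models/text_generation_model"}},
    {"config": {"name": "mediapipe_graph", "base_path": "/models/mediapipe_graph"}}
  ]
}
EOF

python3 -c '
import json
cfg = json.load(open("/tmp/config.json"))
for entry in cfg["model_config_list"]:
    c = entry["config"]
    assert "name" in c and "base_path" in c
    print(c["name"])
'
```
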

# List models

To check which models are servable from a given model repository:
```
docker run -d --rm -v <models_repository>:/models openvino/model_server:latest \
--models_repository /models --list_models
```

For the following directory structure:
```
/models
├── meta
│   ├── llama4
│   │   └── graph.pbtxt
│   └── llama3.1
│       └── graph.pbtxt
├── LLama3.2
│   └── graph.pbtxt
└── resnet
    └── 1
        └── saved_model.pb
```

The output would be:
```
meta/llama4
meta/llama3.1
LLama3.2
resnet
```
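
The listing can be approximated locally by scanning the repository for servable markers (`graph.pbtxt` for graphs, a numbered version directory with model files for plain models). This is a sketch of the documented behavior, not OVMS code, and unlike the example above it sorts its output:

```shell
# Approximate --list_models discovery over a local repository tree.
# Illustrative sketch only; output is sorted.
list_models() {
  repo="$1"
  find "$repo" -name graph.pbtxt -o -name saved_model.pb | while read -r f; do
    dir=$(dirname "$f")
    case "$(basename "$dir")" in
      [0-9]*) dir=$(dirname "$dir") ;;   # strip the version directory
    esac
    printf '%s\n' "${dir#$repo/}"
  done | sort -u
}

repo=/tmp/models_repo
mkdir -p "$repo/meta/llama4" "$repo/meta/llama3.1" "$repo/LLama3.2" "$repo/resnet/1"
touch "$repo/meta/llama4/graph.pbtxt" "$repo/meta/llama3.1/graph.pbtxt" \
      "$repo/LLama3.2/graph.pbtxt" "$repo/resnet/1/saved_model.pb"

list_models "$repo"
```
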

# Enable model

To add a model to the OVMS configuration file, use one of the following commands.

When the model is directly inside `/models`:

```
docker run -d --rm -v <models_repository>:/models openvino/model_server:latest \
--models_repository /models/<model_path> --add_to_config <config_file_path> --model_name <name>
```

When no model repository is specified:

```
docker run -d --rm -v <models_repository>:/models openvino/model_server:latest \
--add_to_config <config_file_path> --model_name <name> --model_path <model_path>
```

# Disable model

To remove a model from the configuration file, edit it manually or run:

```
docker run -d --rm -v <models_repository>:/models openvino/model_server:latest \
--remove_from_config <config_file_path> --model_name <name>
```
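
The effect on the configuration file is simply dropping the matching entry from `model_config_list`. For reference, a local sketch of that edit using plain `python3` (not an OVMS command; the file path and model names are made up):

```shell
# Drop one servable entry from a config file by name -- a local sketch
# of what --remove_from_config effectively does (not OVMS code).
cat > /tmp/ovms_config.json <<'EOF'
{"model_config_list": [
  {"config": {"name": "llama", "base_path": "/models/llama"}},
  {"config": {"name": "resnet", "base_path": "/models/resnet"}}
]}
EOF

python3 - <<'EOF'
import json
path = "/tmp/ovms_config.json"
cfg = json.load(open(path))
cfg["model_config_list"] = [
    e for e in cfg["model_config_list"] if e["config"]["name"] != "llama"
]
json.dump(cfg, open(path, "w"), indent=2)
EOF
```
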

FIXME TODO TBD
- adjust existing documentation to link with this doc
- `task` and `task_params` to be updated and explained
- do we want to allow separately specifying model_path/repository in pulling mode?
- we should explain the relevance of config.json to the model repository (i.e. that config.json works with a specific directory)