# Proposal: AI Model Processor

Author: [Chenyu Zhang](https://github.com/chlins)

## Abstract

This proposal introduces an AI model processor to Harbor, which parses AI model metadata based on the predefined [Model Spec](https://github.com/CloudNativeAI/model-spec).
This allows Harbor users to manage, discover, and deploy AI models more efficiently, similar to how Harbor manages container images.

## Motivation

Effective model management is increasingly crucial due to the growing popularity of AI/ML applications.
Harbor currently focuses on container image management, with features such as vulnerability scanning, replication, and signing.
However, Harbor is also natively an OCI artifact registry, so packaging AI models as OCI artifacts and storing them in Harbor is a viable approach.
This would allow Harbor's existing features to be leveraged for improved AI model management, enhancing its applicability in AI/ML workloads and promoting model reusability and traceability.

## Goals

1. Recognize AI model artifacts based on their `artifactType` and display an AI model icon for them.
2. Provide support for parsing AI model metadata based on the [Model Spec](https://github.com/CloudNativeAI/model-spec).
3. Display the README, LICENSE, and files list of the AI model in the UI, via the existing artifact addition mechanism.

## Non-Goals

1. Guaranteeing that the current spec version is stable, with no subsequent changes and no compatibility issues.
Because the model spec is still at a relatively early stage, there may be iterations and updates in the future. (Backward compatibility of the API can still be ensured.)
2. Parsing AI model metadata from artifacts that were not bundled according to the [Model Spec](https://github.com/CloudNativeAI/model-spec).

## Personas and User Stories

This section lists the user stories regarding this enhancement for the different personas interacting with the AI model.

* Personas

Authorized Harbor users with read permission on the artifact.

* User Stories (Model-Specific)

1. As a user with read permission on the model artifact, I can distinguish the AI model by the icon displayed beside the artifact digest in the UI.
2. As a user with read permission on the model artifact, I can view the AI model metadata, such as model architecture, name, family, format, etc.
3. As a user with read permission on the model artifact, I can click the `Readme` tab to view the README of the AI model.
4. As a user with read permission on the model artifact, I can click the `License` tab to view the LICENSE of the AI model.
5. As a user with read permission on the model artifact, I can click the `Files` tab to view the files list of the AI model.

* User Stories (Common Cases for OCI Artifacts)

The following capabilities are acquired automatically because AI model artifacts follow the OCI specification, so no additional work is required.

1. As a user with replication permission on the artifact, I can replicate the AI model to other OCI registries.
2. As a user with preheat permission on the artifact, I can preheat the AI model to the P2P cluster.
3. As a user with retention permission on the artifact, I can set the retention policy for the AI model.
4. As a user with signature permission on the artifact, I can sign the AI model with third-party signing tools such as cosign.
5. As a user with vulnerability permission on the artifact, I can scan the AI model for vulnerabilities. (Depends on the scanner's capability to scan AI models.)

## Schema Changes

No database schema changes are required for this proposal.

## Architecture

A new AI model processor will be added to the artifact processors in Harbor core.


![ai-model-processor](https://github.com/user-attachments/assets/9a771e0a-3180-4c5a-a140-4cbf7195e2a2)
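
Harbor core is written in Go, so the real processor will implement Harbor's processor interface; the Python sketch below only illustrates the dispatch idea under that assumption: manifests are routed to a processor keyed by `artifactType`, and the model processor reads the spec-defined layer annotations. The `register`/`resolve` registry and the method names are hypothetical, not Harbor's actual API.

```python
# Hypothetical sketch of artifactType-based processor dispatch.
# Names are illustrative; Harbor's real registry and interfaces differ.

MODEL_ARTIFACT_TYPE = "application/vnd.cnai.model.manifest.v1+json"
FILEPATH_ANNOTATION = "org.cnai.model.filepath"

_processors = {}

def register(artifact_type, processor):
    """Map an artifactType to the processor that handles it."""
    _processors[artifact_type] = processor

def resolve(manifest):
    """Pick the processor for a manifest; None means no dedicated processor."""
    return _processors.get(manifest.get("artifactType"))

class ModelProcessor:
    """Parses AI model artifacts bundled per the Model Spec."""

    def get_artifact_type(self, manifest):
        return manifest.get("artifactType")

    def abstract_metadata(self, manifest):
        # Collect filepath -> digest from the spec-defined layer annotations.
        return {
            layer["annotations"][FILEPATH_ANNOTATION]: layer["digest"]
            for layer in manifest.get("layers", [])
            if FILEPATH_ANNOTATION in layer.get("annotations", {})
        }

register(MODEL_ARTIFACT_TYPE, ModelProcessor())
```

With this shape, manifests whose `artifactType` is not registered fall through to Harbor's existing default processing, so container images are unaffected.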

## UI

The UI changes are as follows:

1. Add the AI model icon beside the artifact digest on the artifact list page.
2. Add tag labels at the top of the artifact detail page to display AI model metadata such as model architecture, name, family, format, etc.

![model-metadata](https://github.com/user-attachments/assets/86b92b95-3e50-4b53-8f30-c28a334bff36)

3. Add a new tab named `Readme` to the artifact detail page to display the README of the AI model.

![readme](https://github.com/user-attachments/assets/0c8a54fc-2ad0-4aea-92a4-3e0f4e24d912)

4. Add a new tab named `License` to the artifact detail page to display the LICENSE of the AI model.

![license](https://github.com/user-attachments/assets/04f60cd9-6d6c-4ef0-93b6-2d96800b4c5e)

5. Add a new tab named `Files` to the artifact detail page to display the files list of the AI model.

![files](https://github.com/user-attachments/assets/f63af8a3-0921-4295-9e29-63dfa4ea0bd9)

## API

There is no need to introduce new APIs for this proposal, as the existing "addition" API for artifacts can be utilized to extend capabilities for parsing AI model metadata.

The manifest of an artifact bundled according to the [Model Spec](https://github.com/CloudNativeAI/model-spec) might look like the following; note that it may differ depending on the spec version.

```json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "artifactType": "application/vnd.cnai.model.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.cnai.model.config.v1+json",
    "digest": "sha256:eb9e987a8c340a40ba4354dd4ed96f7ef2e885e3d2e9f128676f995c967890ac",
    "size": 277
  },
  "layers": [
    {
      "mediaType": "application/vnd.cnai.model.doc.v1.tar",
      "digest": "sha256:5a96686deb327903f4310e9181ef2ee0bc7261e5181bd23ccdce6c575b6120a2",
      "size": 13312,
      "annotations": {
        "org.cnai.model.filepath": "LICENSE"
      }
    },
    {
      "mediaType": "application/vnd.cnai.model.doc.v1.tar",
      "digest": "sha256:44a6e989cc7084ef35aedf1dd7090204ccc928829c51ce79d7d59c346a228333",
      "size": 5632,
      "annotations": {
        "org.cnai.model.filepath": "README.md"
      }
    },
    {
      "mediaType": "application/vnd.cnai.model.weight.config.v1.tar",
      "digest": "sha256:a4e7c313c8addcc5f8ac3d87d48a9af7eb89bf8819c869c9fa0cad1026397b0c",
      "size": 2560,
      "annotations": {
        "org.cnai.model.filepath": "config.json"
      }
    },
    {
      "mediaType": "application/vnd.cnai.model.weight.config.v1.tar",
      "digest": "sha256:628ce381719b65598622e3f71844192f84e135d937c7b5a8116582edbe3b1f5d",
      "size": 2048,
      "annotations": {
        "org.cnai.model.filepath": "generation_config.json"
      }
    },
    {
      "mediaType": "application/vnd.cnai.model.weight.v1.tar",
      "digest": "sha256:0b2acb3b78edf2ca31de915cb8b294951c89e5e2d1274c44c2a27dabfcc2c5da",
      "size": 988099584,
      "annotations": {
        "org.cnai.model.filepath": "model"
      }
    },
    {
      "mediaType": "application/vnd.cnai.model.weight.config.v1.tar",
      "digest": "sha256:0480097912f4dd530382c69f00d41409bc51f62ea146a04d70c0254791f4ac32",
      "size": 7033344,
      "annotations": {
        "org.cnai.model.filepath": "tokenizer.json"
      }
    },
    {
      "mediaType": "application/vnd.cnai.model.weight.config.v1.tar",
      "digest": "sha256:ebea935e6c2de57780addfc0262c30c2f83afb1457a124fd9b22370e6cb5bc34",
      "size": 9216,
      "annotations": {
        "org.cnai.model.filepath": "tokenizer_config.json"
      }
    },
    {
      "mediaType": "application/vnd.cnai.model.weight.config.v1.tar",
      "digest": "sha256:3a2844a891e19d1d183ac12918a497116309ba9abe0523cdcf1874cf8aebe8e0",
      "size": 2778624,
      "annotations": {
        "org.cnai.model.filepath": "vocab.json"
      }
    }
  ]
}
```

### Endpoints

**Please note that these endpoints rely on the predefined annotation `org.cnai.model.filepath`, defined in the [Model Spec Annotations](https://github.com/CloudNativeAI/model-spec/blob/main/docs/annotations.md#layer-annotation-keys).**

#### GET README

The layer with the following characteristics will be identified as the README.

| Property | Value |
| --- | --- |
| `mediaType` | `application/vnd.cnai.model.doc.v1.tar` |
| `annotations` | `org.cnai.model.filepath`: `README` or `README.md` |

Response Explanation:

| Status Code | Description |
| --- | --- |
| 200 | normal case |
| 404 | layer with README characteristics not found |
| 500 | internal server error (other runtime or unexpected errors) |

`GET /api/v2.0/projects/{project_name}/repositories/{repository_name}/artifacts/{reference}/additions/readme`

`Content-Type: text/markdown; charset=utf-8`
```text
# Llama 3 Model

## Overview

Llama 3 is the next generation of our large language model, designed for enhanced performance, reasoning, and safety. It builds upon the foundations of Llama 2, incorporating architectural improvements and a larger training corpus.

### Key Features

* **Improved Performance:** Llama 3 exhibits superior performance across various benchmarks compared to its predecessor.
* **Enhanced Reasoning:** The model demonstrates improved reasoning abilities, leading to more accurate and coherent responses.
* **Safety-First Approach:** We've prioritized safety in Llama 3's development, incorporating rigorous testing and evaluation.
* **Scalability:** Llama 3 is designed to scale to a variety of hardware configurations, enabling widespread accessibility.

......
```

#### GET LICENSE

The layer with the following characteristics will be identified as the LICENSE.

| Property | Value |
| --- | --- |
| `mediaType` | `application/vnd.cnai.model.doc.v1.tar` |
| `annotations` | `org.cnai.model.filepath`: `LICENSE` or `LICENSE.txt` |

Response Explanation:

| Status Code | Description |
| --- | --- |
| 200 | normal case |
| 404 | layer with LICENSE characteristics not found |
| 500 | internal server error (other runtime or unexpected errors) |

`GET /api/v2.0/projects/{project_name}/repositories/{repository_name}/artifacts/{reference}/additions/license`

`Content-Type: text/plain; charset=utf-8`

```text
MIT License

Copyright (c) Meta Platforms, Inc. and affiliates.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```

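The README and LICENSE lookups follow the same pattern: scan the manifest layers for the doc `mediaType` and a matching `org.cnai.model.filepath` value, and return 404 when no layer matches. A minimal sketch with illustrative names (not Harbor's actual code):

```python
# Illustrative layer lookup for the readme/license additions.
# A None result corresponds to the 404 response described above.

DOC_MEDIA_TYPE = "application/vnd.cnai.model.doc.v1.tar"
FILEPATH_ANNOTATION = "org.cnai.model.filepath"

def find_doc_layer(manifest, candidates):
    """Return the first doc layer whose filepath annotation is in candidates."""
    for layer in manifest.get("layers", []):
        if layer.get("mediaType") != DOC_MEDIA_TYPE:
            continue
        if layer.get("annotations", {}).get(FILEPATH_ANNOTATION) in candidates:
            return layer
    return None

def readme_layer(manifest):
    return find_doc_layer(manifest, {"README", "README.md"})

def license_layer(manifest):
    return find_doc_layer(manifest, {"LICENSE", "LICENSE.txt"})
```

Once the layer is found, its blob is fetched by digest from the registry storage and streamed back with the Content-Type shown above.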
#### GET Files List

Layers with the annotation `org.cnai.model.filepath` will be identified as files. The API constructs a file tree by walking through the manifest layers and returns the files list as a JSON array.

Response Explanation:

| Status Code | Description |
| --- | --- |
| 200 | normal case (an empty list is returned if no layer has the annotation `org.cnai.model.filepath`) |
| 500 | internal server error (other runtime or unexpected errors) |

`GET /api/v2.0/projects/{project_name}/repositories/{repository_name}/artifacts/{reference}/additions/files`

`Content-Type: application/json; charset=utf-8`

```json
[
  {
    "name": "config.json",
    "type": "file",
    "size": 1024
  },
  {
    "name": "tokenizer.json",
    "type": "file",
    "size": 1024
  },
  {
    "name": "README.md",
    "type": "file",
    "size": 1024
  },
  {
    "name": "LICENSE",
    "type": "file",
    "size": 1024
  },
  {
    "name": "model-000001.safetensors",
    "type": "file",
    "size": 1048576
  },
  {
    "name": "model-000002.safetensors",
    "type": "file",
    "size": 1048576
  },
  {
    "name": "dir1",
    "type": "directory",
    "children": [
      {
        "name": "dir2",
        "type": "directory",
        "children": [
          {
            "name": "file1.txt",
            "type": "file",
            "size": 1024
          }
        ]
      },
      {
        "name": "foo.txt",
        "type": "file",
        "size": 1024
      },
      {
        "name": "bar.txt",
        "type": "file",
        "size": 1024
      }
    ]
  }
]
```
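
The tree above can be built by splitting each `org.cnai.model.filepath` annotation on `/` and nesting directory nodes. The following is a sketch of one plausible implementation; the algorithm is an assumption, not prescribed by the spec, though the node fields follow the JSON example above:

```python
# Sketch: build the files-list tree from manifest layer annotations.
# File nodes carry {"name", "type", "size"}; directory nodes carry
# {"name", "type", "children"}, matching the example response above.

FILEPATH_ANNOTATION = "org.cnai.model.filepath"

def build_file_tree(manifest):
    root = []
    for layer in manifest.get("layers", []):
        path = layer.get("annotations", {}).get(FILEPATH_ANNOTATION)
        if not path:
            continue  # layers without the annotation are skipped
        parts = path.split("/")
        nodes = root
        for part in parts[:-1]:
            # Descend into an existing directory node, or create one.
            for node in nodes:
                if node["type"] == "directory" and node["name"] == part:
                    break
            else:
                node = {"name": part, "type": "directory", "children": []}
                nodes.append(node)
            nodes = node["children"]
        nodes.append({"name": parts[-1], "type": "file", "size": layer["size"]})
    return root
```

An empty `layers` list (or layers lacking the annotation) yields an empty array, matching the 200 response described above.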

## References

1. [Model Spec](https://github.com/CloudNativeAI/model-spec) (Cloud Native Artificial Intelligence Model Format Specification)
2. [modctl](https://github.com/CloudNativeAI/modctl) (Command-line tool for managing OCI model artifacts bundled according to the Model Spec)

## To Be Discussed (Future Features)

*The following content is only for discussion of future iterative features, to be implemented as phase 2; it is not part of this proposal's implementation.*

Should we offer the capability to open files such as `config.json` and `tokenizer.json`? And how should we handle model weight files? One approach could be to not display them, or to show only certain parameter information in the header, similar to what Hugging Face does.

Example:

![example](https://github.com/user-attachments/assets/4f1a41a9-0030-4e1e-a1fe-3e22986dd525)