ModelZoo is a system for managing and serving local AI models. It provides a flexible framework for discovering, launching, and managing language, vision, and image-generation models using different runtimes and environments.
ZooKeeper is the entry point of ModelZoo. It's a Flask application that:
- Loads configuration from a YAML file.
- Instantiates the configured zoos and runtimes.
- Provides a web-based user interface to:
  - List available models.
  - Launch models with specific runtimes and configurations.
  - Manage running models (view logs, stop models).
- Embeds a proxy server (`proxy.py`) that forwards requests to the appropriate running model:
  - Supports the OpenAI protocol for text, chat and multi-modal completions (see the chat example after this list).
  - Supports the A1111 protocol for image generation.
- Keeps track of model launch history:
  - Number of times a model has been launched and the last launch time (to sort models by most frequently used).
  - Last used environment and parameters (pre-fills launch configurations based on previous usage for a smoother experience).
- Peers with instances of itself on other nodes to create distributed setups.
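Once a chat model is running, any OpenAI-compatible client can talk to it through the proxy. Below is a minimal sketch using Python's `requests`; it assumes the proxy is reachable on the ZooKeeper port (3333) and exposes the standard `/v1/chat/completions` route, and `my-model` is a placeholder for a model id discovered by one of your zoos:

```python
# Minimal sketch: chat completion through the ZooKeeper proxy.
# Assumptions: the proxy is reachable on the ZooKeeper port (3333) at the
# standard OpenAI route; "my-model" is a placeholder model id.
import requests

resp = requests.post(
    "http://localhost:3333/v1/chat/completions",
    json={
        "model": "my-model",  # hypothetical id; use one your zoo discovered
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```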
- Clone the repository.
- Install dependencies: `pip install -r requirements.txt`
- Create a `config.yaml` configuration file.
- Run the ZooKeeper application: `python ./main.py --config config.yaml`
- Open the ZooKeeper web interface (listening at http://0.0.0.0:3333/ by default) to view, launch and manage models.
ModelZoo is composed of several key components:
- Zoos: Discovery systems that catalog models.
- Models: Data objects representing models.
- Runtimes: Backends that can serve models in specific environments.
- Environments: Named GPU configurations (sets of environment variables).
- EnvironmentSet: A collection of environments combined for model execution.
- ZooKeeper: The web application that interacts with zoos, uses runtimes to spawn models, maintains the launch history, and hosts the proxy.
- Proxy: A hybrid OpenAI-compatible (text+multimodal) and A1111-compatible (image) proxy server.
- ModelHistory: A ZooKeeper component that tracks model launch history, including frequency of use and last used configurations.
- Peers: Instances of ZooKeeper running on other hosts.
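The relationships are roughly: zoos produce models, runtimes consume them under a chosen environment, and ZooKeeper ties everything together behind the proxy. The sketch below is purely illustrative; the class and method signatures are hypothetical, not ModelZoo's actual API:

```python
# Illustrative only -- hypothetical signatures showing how the pieces relate;
# not ModelZoo's actual classes.
from dataclasses import dataclass


@dataclass
class Model:
    model_id: str      # e.g. a file path or a remote API model name
    model_format: str  # e.g. "gguf", "exl2", "litellm"


class Zoo:
    def catalog(self) -> list[Model]:
        """Discover and return the models this zoo knows about."""
        raise NotImplementedError


class Runtime:
    def launch(self, model: Model, env_vars: dict[str, str]) -> None:
        """Spawn a server process for `model` with `env_vars` applied."""
        raise NotImplementedError
```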
ModelZoo (in practice, ZooKeeper) is configured using a YAML file that defines:
- Zoos to be instantiated and their configurations.
- Runtimes to be made available.
- Predefined environments.
- Remote peers for distributed model management.
```yaml
zoos:
  - name: SSD
    class: FolderZoo
    params:
      path: /mnt/ssd0

runtimes:
  - name: LlamaRuntime
    class: LlamaRuntime
    params:
      bin_path: /home/mike/work/llama.cpp/llama-server

envs:
  - name: "P40/0"
    vars:
      CUDA_VISIBLE_DEVICES: 0
  - name: "P40/1"
    vars:
      CUDA_VISIBLE_DEVICES: 1

peers:
  - host: another-host
    port: 3333
```
This example assumes you have some `*.gguf` files under `/mnt/ssd0`, a compiled llama.cpp server binary at the specified path, and a second instance of ModelZoo running on `another-host`.
Zoos are responsible for discovering and cataloging models.
Note that the `name` field is optional and defaults to `class` if not provided, but naming your zoos is strongly encouraged.
The system supports different types of zoos:
- FolderZoo: Discovers models in a specified filesystem folder (see the discovery sketch after this list).
  - Parameters:
    - `path` (str): Path to the folder containing models
  - Example:

    ```yaml
    - name: LocalModels
      class: FolderZoo
      params:
        path: /path/to/models
    ```
- StaticZoo: Returns a predefined list of models.
  - Parameters:
    - `models` (List[Dict]): List of model dictionaries
  - Example:

    ```yaml
    - name: PredefinedModels
      class: StaticZoo
      params:
        models:
          - model_id: chatgpt
            model_name: ChatGPT
            model_format: litellm
    ```
- OpenAIZoo: Fetches models from an OpenAI-compatible API.
  - Parameters:
    - `api_url` (str): Base URL of the OpenAI-compatible API
    - `api_key` (str, optional): API key for authentication
    - `api_key_env` (str, optional): Environment variable name containing the API key
    - `cache` (bool): Whether to cache the model list (default: True)
    - `models` (List[str], optional): Optional list of models to override API exploration
  - Example:

    ```yaml
    - name: OpenAIModels
      class: OpenAIZoo
      params:
        api_url: https://api.openai.com/v1
        api_key_env: OPENAI_API_KEY
        cache: true
    ```
- OllamaZoo: Discovers models from a local or remote Ollama instance.
  - Parameters:
    - `api_url` (str): Base URL of the Ollama API (default: http://localhost:11434)
  - Example:

    ```yaml
    - name: LocalOllama
      class: OllamaZoo
      params:
        api_url: http://localhost:11434
    ```
Each zoo type is designed to accommodate different model discovery and management needs, allowing for flexibility in how models are sourced and cataloged within the ModelZoo system.
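To make the discovery idea concrete, here is a hedged sketch of FolderZoo-style discovery; the function and the exact behaviour (walking a folder for `*.gguf` files) are illustrative, and the dictionary keys mirror the StaticZoo example above:

```python
# Illustrative sketch of FolderZoo-style discovery; ModelZoo's actual
# implementation may differ. Walks a folder and catalogs *.gguf files.
from pathlib import Path


def discover_gguf(path: str) -> list[dict]:
    return [
        {"model_id": str(p), "model_name": p.stem, "model_format": "gguf"}
        for p in sorted(Path(path).rglob("*.gguf"))
    ]


print(discover_gguf("/mnt/ssd0"))  # path from the configuration example above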
Runtimes are responsible for serving models. The `name` field is optional and defaults to `class` if not provided.
- LlamaRuntime: For serving GGUF models with llama-server
  - Compatible model formats: gguf
  - Parameters:
    - `bin_path` (str): Path to the llama.cpp server binary
  - Example:

    ```yaml
    - name: LlamaRuntime
      class: LlamaRuntime
      params:
        bin_path: /path/to/llama-server
    ```
- LlamaSrbRuntime: For serving GGUF models with llama-srb-api
  - Compatible model formats: gguf
  - Parameters:
    - `script_path` (str): Path to the llama-srb-api api.py script
  - Example:

    ```yaml
    - class: LlamaSrbRuntime
      params:
        script_path: /path/to/llama-srb-api/api.py
    ```
- KoboldCppRuntime: For serving GGUF models using KoboldCpp
  - Compatible model formats: gguf
  - Parameters:
    - `bin_path` (str): Path to the KoboldCpp binary
  - Example:

    ```yaml
    - name: KoboldCppRuntime
      class: KoboldCppRuntime
      params:
        bin_path: /path/to/koboldcpp
    ```
- TabbyRuntime: For serving GPTQ and EXL2 models using TabbyAPI
  - Compatible model formats: gptq, exl2
  - Parameters:
    - `script_path` (str): Path to the TabbyAPI start.sh script
  - Example:

    ```yaml
    - name: TabbyRuntime
      class: TabbyRuntime
      params:
        script_path: /path/to/tabby_api/start.sh
    ```
- LiteLLMRuntime: For proxying models using LiteLLM
  - Compatible model formats: litellm (all formats supported by LiteLLM, including OpenAI, Azure, Anthropic, and various open-source models)
  - Parameters:
    - `bin_path` (str, optional): Path to the LiteLLM binary (default: "litellm")
  - Example:

    ```yaml
    - name: LiteLLMRuntime
      class: LiteLLMRuntime
      params:
        bin_path: litellm
    ```
- SDServerRuntime: For serving Stable Diffusion models using stable-diffusion.cpp
  - Compatible model formats: kcppt
  - Parameters:
    - `bin_path` (str): Path to the sd-server binary
  - Example:

    ```yaml
    - name: SDServerRuntime
      class: SDServerRuntime
      params:
        bin_path: /path/to/sd-server
    ```
  - Runtime parameters (see the image-generation sketch below):
    - `sampler_name`: Sampling method (Euler, Euler A, Heun, DPM2, DPM++, LCM)
    - `cfg_scale`: CFG scale for guidance (default: 1.0)
    - `steps`: Number of sampling steps (default: 1)
    - `extra_args`: Additional command line arguments
Each runtime defines compatible model formats and configurable parameters. When launching a model, you can specify additional runtime-specific parameters as needed. The choice of runtime depends on the model format and the specific features required for your use case.
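As an illustration of the image path, a model launched with SDServerRuntime can be driven through the proxy's A1111-compatible endpoint. The sketch below assumes the proxy serves the standard A1111 `/sdapi/v1/txt2img` route on the ZooKeeper port (3333); the request fields mirror the SDServerRuntime runtime parameters listed above:

```python
# Sketch: A1111-style image generation through the proxy.
# Assumption: the proxy exposes the standard A1111 txt2img route on port 3333.
import base64

import requests

resp = requests.post(
    "http://localhost:3333/sdapi/v1/txt2img",
    json={
        "prompt": "a watercolor fox in a forest",
        "sampler_name": "Euler",  # runtime parameters from the list above
        "cfg_scale": 1.0,
        "steps": 4,
    },
)
# A1111-compatible servers return images as base64-encoded strings.
with open("out.png", "wb") as f:
    f.write(base64.b64decode(resp.json()["images"][0]))
```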
Environments are configurations for running models, typically including environment variables like `CUDA_VISIBLE_DEVICES`.
Example:
```yaml
envs:
  - name: "RTX3090"
    vars:
      CUDA_VISIBLE_DEVICES: 0
  - name: "P40"
    vars:
      CUDA_VISIBLE_DEVICES: 1
```
Multiple environments may be predefined in the configuration file, and multiple environments can be selected when launching a model (any conflicting values will be merged with a comma, as sketched below).
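A tiny sketch of that merge rule (illustrative, not ModelZoo's actual code):

```python
# Illustrative sketch of the comma-merge rule for conflicting variables
# across selected environments; not ModelZoo's actual code.
def merge_envs(envs: list[dict[str, str]]) -> dict[str, str]:
    merged: dict[str, str] = {}
    for env in envs:
        for key, value in env.items():
            merged[key] = f"{merged[key]},{value}" if key in merged else str(value)
    return merged


# Selecting both "P40/0" and "P40/1" from the configuration example:
print(merge_envs([{"CUDA_VISIBLE_DEVICES": "0"}, {"CUDA_VISIBLE_DEVICES": "1"}]))
# -> {'CUDA_VISIBLE_DEVICES': '0,1'}  (both GPUs visible to the launched model)
```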
The remote models feature allows you to connect multiple ModelZoo instances and view the running models on remote peers. To configure remote peers:
- Add a `peers` section to your configuration file.
- For each peer, specify the `host` and `port` where the remote ModelZoo instance is running.
Example:
```yaml
peers:
  - host: falcon
    port: 3333
```
The web interface will display the status and running models of each configured peer, allowing you to manage a distributed setup of ModelZoo instances.