Skip to content

Commit 51508ac

Browse files
authored
Update README
1 parent e39f8f3 commit 51508ac

File tree

1 file changed

+26
-18
lines changed

1 file changed

+26
-18
lines changed

README.md

Lines changed: 26 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -7,12 +7,11 @@ https://github.com/user-attachments/assets/3837c4f6-45cb-43f2-9d51-a45f742424d4
77
## Features
88

99
- Uses [E2B](https://e2b.dev) for secure [Desktop Sandbox](https://github.com/e2b-dev/desktop)
10-
- Uses [Llama 3.2](https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_2), [3.3](https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_3/) and [OS-Atlas](https://osatlas.github.io/)
11-
- Operates the computer via a combination of keyboard, mouse, and shell commands
10+
- Supports [Meta Llama](https://www.llama.com/), [OS-Atlas](https://osatlas.github.io/) and [any LLM you want to integrate](#llm-support)!
11+
- Operates the computer via the keyboard, mouse, and shell commands
1212
- Live streams the display of the sandbox on the client computer
13-
- The user can pause the agent and provide feedback and any time
14-
- Designed to work on any operating system or platform
15-
- Supports a number of inference providers, including Hugging Face, Fireworks, OpenRouter, etc.
13+
- User can pause and prompt the agent at any time
14+
- Uses Ubuntu, but designed to work with any operating system
1615

1716
## Design
1817

@@ -21,6 +20,28 @@ https://github.com/user-attachments/assets/3837c4f6-45cb-43f2-9d51-a45f742424d4
2120

2221
The details of the design are laid out in this article: [How I taught an AI to use a computer](https://blog.jamesmurdza.com/how-i-taught-an-ai-to-use-a-computer)
2322

23+
## LLM support
24+
25+
Open Computer Use is designed to easily support new LLMs. The LLM and provider combinations are are defined in [models.py](/blob/master/os_computer_use/models.py). Following the comments in this file, one can easily add any LLM and provider that adheres to the OpenAI API specification.
26+
27+
The list of tested models and providers currently includes:
28+
29+
| **Type** | **Model** | **Providers** |
30+
|-----------------|----------------|---------------------------------------|
31+
| Vision | **Llama 3.2** | **Fireworks**, OpenRouter, Llama API |
32+
| Action | **Llama 3.2** | **Fireworks**, Llama API |
33+
| Action | DeepSeek | DeepSeek |
34+
| Grounding | **OS-Atlas** | **HuggingFace Spaces** |
35+
36+
The following lines of code in [models.py](/blob/master/os_computer_use/models.py) define the default LLMs and providers:
37+
38+
```
39+
vision_model = FireworksProvider(model_names["fireworks"]["llama3.2"])
40+
action_model = FireworksProvider(model_names["fireworks"]["llama3.3"])
41+
```
42+
43+
If you add a new model or provider, please make a PR to this repository!
44+
2445
## Get started
2546

2647
### Prerequisites
@@ -75,16 +96,3 @@ poetry run start
7596
```
7697

7798
The agent will start and prompt you for its first instruction.
78-
79-
## LLM support
80-
81-
Open Computer Use supports a variety of LLMs and LLM providers, which are defined in `models.py`.
82-
83-
The following lines of code can be changed to any valid combination of model and provide, so long as the new vision model supports vision input and the new action model supports tool use.
84-
85-
```
86-
vision_model = FireworksProvider(model_names["fireworks"]["llama3.2"])
87-
action_model = FireworksProvider(model_names["fireworks"]["llama3.3"])
88-
```
89-
90-
If you add models or define a new provider, feel free to make a PR to this repository.

0 commit comments

Comments
 (0)