---
layout: integration
name: Cerebras
description: Use LLMs served by Cerebras API
authors:
  - name: deepset
    socials:
      github: deepset-ai
      twitter: Haystack_AI
      linkedin: https://www.linkedin.com/company/deepset-ai
pypi: https://pypi.org/project/haystack-ai/
repo: https://github.com/deepset-ai/haystack
type: Model Provider
report_issue: https://github.com/deepset-ai/haystack/issues
logo: /logos/cerebras.png
version: Haystack 2.0
toc: true
---

### **Table of Contents**

- [Overview](#overview)
- [Usage](#usage)

## Overview

[Cerebras](https://cerebras.ai/) is a platform for fast and effortless AI training and inference.

## Usage

The [Cerebras API](https://cerebras.ai/inference) is OpenAI-compatible, so you can use it in Haystack through the OpenAI Generator components.

### Using `Generator`

Here's an example of using `llama3.1-8b` served via Cerebras to perform question answering on a web page.
You need to set the environment variable `CEREBRAS_API_KEY` and choose a [compatible model](https://inference-docs.cerebras.ai/introduction).

```python
from haystack import Pipeline
from haystack.utils import Secret
from haystack.components.fetchers import LinkContentFetcher
from haystack.components.converters import HTMLToDocument
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

fetcher = LinkContentFetcher()
converter = HTMLToDocument()
prompt_template = """
According to the contents of this website:
{% for document in documents %}
  {{ document.content }}
{% endfor %}
Answer the given question: {{ query }}
Answer:
"""
prompt_builder = PromptBuilder(template=prompt_template)

# Point the OpenAI-compatible generator at the Cerebras API endpoint
llm = OpenAIGenerator(
    api_key=Secret.from_env_var("CEREBRAS_API_KEY"),
    api_base_url="https://api.cerebras.ai/v1",
    model="llama3.1-8b",
)

pipeline = Pipeline()
pipeline.add_component("fetcher", fetcher)
pipeline.add_component("converter", converter)
pipeline.add_component("prompt", prompt_builder)
pipeline.add_component("llm", llm)

pipeline.connect("fetcher.streams", "converter.sources")
pipeline.connect("converter.documents", "prompt.documents")
pipeline.connect("prompt.prompt", "llm.prompt")

result = pipeline.run(
    {
        "fetcher": {"urls": ["https://cerebras.ai/inference"]},
        "prompt": {"query": "Why should I use Cerebras for serving LLMs?"},
    }
)

print(result["llm"]["replies"][0])
```

### Using `ChatGenerator`

Here's an example of engaging in a multi-turn conversation with `llama3.1-8b`.
You need to set the environment variable `CEREBRAS_API_KEY` and choose a [compatible model](https://inference-docs.cerebras.ai/introduction).

```python
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

# Point the OpenAI-compatible chat generator at the Cerebras API endpoint
generator = OpenAIChatGenerator(
    api_key=Secret.from_env_var("CEREBRAS_API_KEY"),
    api_base_url="https://api.cerebras.ai/v1",
    model="llama3.1-8b",
    generation_kwargs={"max_tokens": 512},
)

messages = []

while True:
    msg = input("Enter your message or Q to exit\n🧑 ")
    if msg == "Q":
        break
    messages.append(ChatMessage.from_user(msg))
    response = generator.run(messages=messages)
    assistant_resp = response["replies"][0]
    print("🤖 " + assistant_resp.content)
    messages.append(assistant_resp)
```