Currently natively supports: OpenAI, Anthropic, Gemini, xAI/Grok, Ollama, Groq, DeepSeek (deepseek.com & Groq), Cohere (more to come)
Also allows a custom URL with ServiceTargetResolver
(see examples/c06-target-resolver.rs)
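For orientation, here is a minimal sketch of such a resolver, modeled on examples/c06-target-resolver.rs. The gateway URL and env-var name are placeholders, and the exact resolver types (`ServiceTarget`, `Endpoint`, `AuthData`, `ModelIden`) should be checked against your genai version:

```rust
use genai::adapter::AdapterKind;
use genai::resolver::{AuthData, Endpoint, ServiceTargetResolver};
use genai::{Client, ModelIden, ServiceTarget};

/// Sketch: route every request to a custom OpenAI-compatible endpoint.
/// "https://my-gateway.example.com/v1/" and "MY_GATEWAY_API_KEY" are placeholders.
fn build_custom_endpoint_client() -> Client {
    let target_resolver = ServiceTargetResolver::from_resolver_fn(
        |service_target: ServiceTarget| -> Result<ServiceTarget, genai::resolver::Error> {
            let ServiceTarget { model, .. } = service_target;
            let endpoint = Endpoint::from_static("https://my-gateway.example.com/v1/");
            let auth = AuthData::from_env("MY_GATEWAY_API_KEY");
            // Keep the requested model name, but force the OpenAI adapter.
            let model = ModelIden::new(AdapterKind::OpenAI, model.model_name);
            Ok(ServiceTarget { endpoint, auth, model })
        },
    );

    Client::builder()
        .with_service_target_resolver(target_resolver)
        .build()
}
```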
Provides a single, ergonomic API to many generative AI providers, such as Anthropic, OpenAI, Gemini, xAI, Ollama, Groq, and more.
NOTE: Try to use the latest version (0.4.0-alpha.4). It is as robust as 0.3.x, but with updated APIs (see below) and additional functionality thanks to many great PRs.
What's new: (`-` fix, `+` addition, `!` change) (see CHANGELOG.md for more)
- `!` API CHANGE: `ChatResponse::content` is now `MessageContent` (as `MessageContent` is now multipart). Minor impact, as the `ChatResponse` public API works as before (`into_text`...). Now simpler, just use `let joined_text: Option<String> = chat_response.content.into_joined_texts()` (see the consolidated sketch after this list)
- `!` API CHANGE: `MessageContent::text(&self)` replaced (because `MessageContent` now flattens multi-part formats) by:
  - `MessageContent::into_joined_texts(self) -> Option<String>`
  - `MessageContent::joined_texts(&self) -> Option<String>`
  - `MessageContent::texts(&self) -> Vec<&str>`
  - `MessageContent::into_texts(self) -> Vec<String>`
- `+` Custom HTTP headers in `ChatOptions` (#78)
- `+` Model namespacing to specify the adapter, e.g., `openai::codex-unknown-model` will use the OpenAI adapter and send `codex-unknown-model` as the model name. The `AdapterKind` and model name can still be overridden by a `ServiceTargetResolver`.
- `+` New Adapters: Zhipu (ChatGLM) (#76), Nebius
- `!` API CHANGE: Now `ChatResponse.content` is a `Vec<MessageContent>` to support responses that include tool calls and text messages. You can use:
  - `let text: &str = chat_response.first_text()` (was `ChatResponse::content_text_as_str()`)
  - `let texts: Vec<&str> = chat_response.texts();`
  - `let texts: Vec<String> = chat_response.into_texts();`
  - `let text: String = chat_response.into_first_text()` (was `ChatResponse::content_text_into_string()`)
  - To get the concatenated string of all messages: `let text: String = content.into_iter().filter_map(|c| c.text_into_string()).collect::<Vec<_>>().join("\n\n")`
- `!` API CHANGE: Now `ChatResponse::into_tool_calls()` and `tool_calls()` return `Vec<ToolCall>` rather than `Option<Vec<ToolCall>>`
- `!` API CHANGE: `MessageContent` - Now use `message_content.text()` and `message_content.into_text()` (rather than `text_as_str`, `text_into_string`)
- `-` Gemini ToolResponse Fix: the Gemini adapter wrongfully tried to parse `ToolResponse.content` (see #59)
- `!` Tool Use Streaming support, thanks to ClanceyLu, PR #58
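Putting the entries above together, here is a minimal sketch of the current accessor surface. The namespaced model name is illustrative, and `first_text()` is printed via `Debug` since its exact return type may differ by version:

```rust
use genai::chat::{ChatMessage, ChatRequest};
use genai::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::default();
    let chat_req = ChatRequest::new(vec![ChatMessage::user("Why is the sky blue?")]);

    // Namespaced model: forces the OpenAI adapter regardless of name-mapping rules.
    let chat_res = client.exec_chat("openai::gpt-4o-mini", chat_req, None).await?;

    // Borrowing accessors first...
    println!("first text: {:?}", chat_res.first_text());
    println!("all texts:  {:?}", chat_res.texts());
    // Tool calls now come back as a plain Vec (empty when there are none).
    println!("tool calls: {}", chat_res.tool_calls().len());

    // ...then a consuming accessor that joins all text parts.
    let joined: Option<String> = chat_res.content.into_joined_texts();
    println!("joined: {joined:?}");
    Ok(())
}
```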
What's new:
- Gemini Thinking Budget support: `ReasoningEffort::Budget(num)` (see the sketch below)
- Gemini `-zero`, `-low`, `-medium`, and `-high` model-name suffixes that set the corresponding budget (`0`, `1k`, `8k`, `24k`)
- When set, `ReasoningEffort::Low, ...` will map to their corresponding budgets `1k`, `8k`, `24k`
API CHANGES (minor):
- `ReasoningEffort` now has an additional `Budget(num)` variant
- `ModelIden::with_name_or_clone` has been deprecated in favor of `ModelIden::from_option_name(Option<String>)`

Check the CHANGELOG for more info.
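A minimal sketch of setting a thinking budget; the `ChatOptions::with_reasoning_effort` builder method and the model name are assumptions to verify against your version:

```rust
use genai::chat::{ChatMessage, ChatOptions, ChatRequest, ReasoningEffort};
use genai::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::default();
    let chat_req = ChatRequest::new(vec![ChatMessage::user("Summarize Rust ownership in two sentences.")]);

    // Explicit budget in tokens. ReasoningEffort::Low/Medium/High map to 1k/8k/24k,
    // and a "-zero"/"-low"/"-medium"/"-high" model-name suffix sets the same budgets.
    let options = ChatOptions::default().with_reasoning_effort(ReasoningEffort::Budget(8_000));

    let chat_res = client
        .exec_chat("gemini-2.5-flash", chat_req, Some(&options)) // model name is illustrative
        .await?;
    println!("{}", chat_res.first_text().unwrap_or("NO ANSWER"));
    Ok(())
}
```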
- ClanceyLu for Tool Use Streaming support PR #58
- @SilasMarvin for fixing content/tools issues with some Ollama models PR #55
- @una-spirito for Gemini `ReasoningEffort::Budget` support
- @jBernavaPrah for adding tracing (it was long overdue). PR #45
- @GustavoWidman for the initial Gemini tool/function support! PR #41
- @AdamStrojek for initial image support PR #36
- @semtexzv for `stop_sequences` Anthropic support PR #34
- @omarshehab221 for de/serialize on structs PR #19
- @tusharmath for making `webc::Error` public PR #12
- @giangndm for making stream Send PR #10
- @stargazing-dino for PR #2 - implement Groq completions
- Check out AIPACK, which wraps this genai library into an agentic runtime to run, build, and share AI Agent Packs. See pro@coder for a simple example of how I use AIPACK/genai for production coding.
Note: Feel free to send me a short description and a link to your application or library using genai.
- Native Multi-AI Provider/Model: OpenAI, Anthropic, Gemini, Ollama, Groq, xAI, DeepSeek (Direct chat and stream) (see examples/c00-readme.rs)
- DeepSeek R1 support, with `reasoning_content` (and stream support), plus DeepSeek Groq and Ollama support (and `reasoning_content` normalization)
- Image Analysis (for OpenAI, Gemini flash-2, Anthropic) (see examples/c07-image.rs)
- Custom Auth/API Key (see examples/c02-auth.rs)
- Model Alias (see examples/c05-model-names.rs)
- Custom Endpoint, Auth, and Model Identifier (see examples/c06-target-resolver.rs)
Examples | Thanks | Library Focus | Changelog | Provider Mapping: ChatOptions | Usage
//! Base examples demonstrating the core capabilities of genai
use genai::chat::printer::{print_chat_stream, PrintChatStreamOptions};
use genai::chat::{ChatMessage, ChatRequest};
use genai::Client;
const MODEL_OPENAI: &str = "gpt-4o-mini"; // o1-mini, gpt-4o-mini
const MODEL_ANTHROPIC: &str = "claude-3-haiku-20240307";
const MODEL_COHERE: &str = "command-light";
const MODEL_GEMINI: &str = "gemini-2.0-flash";
const MODEL_GROQ: &str = "llama-3.1-8b-instant";
const MODEL_OLLAMA: &str = "gemma:2b"; // sh: `ollama pull gemma:2b`
const MODEL_XAI: &str = "grok-beta";
const MODEL_DEEPSEEK: &str = "deepseek-chat";
// NOTE: These are the default environment keys for each AI Adapter Type.
// They can be customized; see `examples/c02-auth.rs`
const MODEL_AND_KEY_ENV_NAME_LIST: &[(&str, &str)] = &[
// -- De/activate models/providers
(MODEL_OPENAI, "OPENAI_API_KEY"),
(MODEL_ANTHROPIC, "ANTHROPIC_API_KEY"),
(MODEL_COHERE, "COHERE_API_KEY"),
(MODEL_GEMINI, "GEMINI_API_KEY"),
(MODEL_GROQ, "GROQ_API_KEY"),
(MODEL_XAI, "XAI_API_KEY"),
(MODEL_DEEPSEEK, "DEEPSEEK_API_KEY"),
(MODEL_OLLAMA, ""),
];
// NOTE: Model to AdapterKind (AI Provider) type mapping rule
// - starts_with "gpt" -> OpenAI
// - starts_with "claude" -> Anthropic
// - starts_with "command" -> Cohere
// - starts_with "gemini" -> Gemini
// - model in Groq models -> Groq
// - For anything else -> Ollama
//
// This can be customized; see `examples/c03-mapper.rs`
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let question = "Why is the sky red?";
let chat_req = ChatRequest::new(vec![
// -- Messages (de/activate to see the differences)
ChatMessage::system("Answer in one sentence"),
ChatMessage::user(question),
]);
let client = Client::default();
let print_options = PrintChatStreamOptions::from_print_events(false);
for (model, env_name) in MODEL_AND_KEY_ENV_NAME_LIST {
// Skip if the environment name is not set
if !env_name.is_empty() && std::env::var(env_name).is_err() {
println!("===== Skipping model: {model} (env var not set: {env_name})");
continue;
}
let adapter_kind = client.resolve_service_target(model)?.model.adapter_kind;
println!("\n===== MODEL: {model} ({adapter_kind}) =====");
println!("\n--- Question:\n{question}");
println!("\n--- Answer:");
let chat_res = client.exec_chat(model, chat_req.clone(), None).await?;
println!("{}", chat_res.content_text_as_str().unwrap_or("NO ANSWER"));
println!("\n--- Answer: (streaming)");
let chat_res = client.exec_chat_stream(model, chat_req.clone(), None).await?;
print_chat_stream(chat_res, Some(&print_options)).await?;
println!();
}
Ok(())
}
- examples/c00-readme.rs - Quick overview code with multiple providers and streaming.
- examples/c01-conv.rs - Shows how to build a conversation flow.
- examples/c02-auth.rs - Demonstrates how to provide a custom `AuthResolver` to provide auth data (i.e., for api_key) per adapter kind (see the sketch after this list).
- examples/c03-mapper.rs - Demonstrates how to provide a custom `AdapterKindResolver` to customize the "model name" to "adapter kind" mapping.
- examples/c04-chat-options.rs - Demonstrates how to set chat generation options such as `temperature` and `max_tokens` at the client level (for all requests) and per-request level.
- examples/c05-model-names.rs - Shows how to get model names per AdapterKind.
- examples/c06-target-resolver.rs - For custom Auth, Endpoint, and Model.
- examples/c07-image.rs - Image Analysis support
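As referenced in the c02-auth.rs item above, here is a minimal sketch of a custom `AuthResolver`. The env-var name is a placeholder; check the resolver signature against your genai version:

```rust
use genai::resolver::{AuthData, AuthResolver};
use genai::{Client, ModelIden};

/// Sketch: resolve auth data per adapter kind / model.
fn build_client_with_custom_auth() -> Client {
    let auth_resolver = AuthResolver::from_resolver_fn(
        |model_iden: ModelIden| -> Result<Option<AuthData>, genai::resolver::Error> {
            let ModelIden { adapter_kind, model_name } = model_iden;
            println!("Resolving auth for {adapter_kind} (model: {model_name})");
            // "MY_SINGLE_API_KEY" is a placeholder; returning None falls back
            // to the adapter's default key resolution.
            Ok(std::env::var("MY_SINGLE_API_KEY").ok().map(AuthData::from_single))
        },
    );

    Client::builder().with_auth_resolver(auth_resolver).build()
}
```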
- genai live coding, code design, & best practices
- Adding Gemini Structured Output (vid-0060)
- Adding OpenAI Structured Output (vid-0059)
- Splitting the JSON value extension trait into its own public crate: value-ext
- (part 1/3) Module, Error, constructors/builders
- (part 2/3) Extension Traits, Project Files, Versioning
- (part 3/3) When to Async? Project Files, Versioning strategy
- Focuses on standardizing chat completion APIs across major AI services.
- Native implementation, meaning no per-service SDKs.
  - Reason: While there are some variations across the various APIs, they all follow the same pattern and high-level flow and constructs. Managing the differences at a lower layer is actually simpler and more cumulative across services than doing SDK gymnastics.
- Prioritizes ergonomics and commonality, with depth being secondary. (If you require a complete client API, consider using async-openai and ollama-rs; they are both excellent and easy to use.)
- Initially, this library will mostly focus on text chat APIs; images and function calling will come later.
- (1) - OpenAI compatibles notes
  - Models: OpenAI, DeepSeek, Groq, Ollama, xAI

| Property | OpenAI Compatibles (1) | Anthropic | Gemini `generationConfig.` | Cohere |
|---|---|---|---|---|
| `temperature` | `temperature` | `temperature` | `temperature` | `temperature` |
| `max_tokens` | `max_tokens` | `max_tokens` (default 1024) | `maxOutputTokens` | `max_tokens` |
| `top_p` | `top_p` | `top_p` | `topP` | `p` |
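For example, a short sketch of setting these options at the request level; the builder-method names follow examples/c04-chat-options.rs and should be treated as assumptions:

```rust
use genai::chat::{ChatMessage, ChatOptions, ChatRequest};
use genai::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::default();
    let chat_req = ChatRequest::new(vec![ChatMessage::user("Write a one-line haiku about Rust.")]);

    // Each option maps to the provider-native property shown in the table above.
    let options = ChatOptions::default()
        .with_temperature(0.7)
        .with_max_tokens(128)
        .with_top_p(0.95);

    let chat_res = client.exec_chat("gpt-4o-mini", chat_req, Some(&options)).await?;
    println!("{}", chat_res.first_text().unwrap_or("NO ANSWER"));
    Ok(())
}
```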
| Property | OpenAI Compatibles (1) | Anthropic `usage.` | Gemini `usageMetadata.` | Cohere `meta.tokens.` |
|---|---|---|---|---|
| `prompt_tokens` | `prompt_tokens` | `input_tokens` (added) | `promptTokenCount` (2) | `input_tokens` |
| `completion_tokens` | `completion_tokens` | `output_tokens` (added) | `candidatesTokenCount` (2) | `output_tokens` |
| `total_tokens` | `total_tokens` | (computed) | `totalTokenCount` (2) | (computed) |
| `prompt_tokens_details` | `prompt_tokens_details` | cached/cache_creation | N/A for now | N/A for now |
| `completion_tokens_details` | `completion_tokens_details` | N/A for now | N/A for now | N/A for now |
- (1): OpenAI compatibles notes
  - Models: OpenAI, DeepSeek, Groq, Ollama, xAI
  - For Groq, the usage properties are under `x_groq.usage.`
  - At this point, Ollama does not emit input/output tokens when streaming due to a limitation of the Ollama OpenAI compatibility layer. (see ollama #4448 - Streaming Chat Completion via OpenAI API should support stream option to include Usage)
  - `prompt_tokens_details` and `completion_tokens_details` will have the value sent by the compatible provider (or None)
- (2): Gemini tokens
  - Right now, with the Gemini Stream API, it's not entirely clear whether the usage reported on each event is cumulative or needs to be summed. Currently, it appears to be cumulative (i.e., the last message carries the total input, output, and total token counts), so that is the assumption. See possible tweet answer for more info.
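To read the normalized usage after a call, here is a hedged sketch, assuming `ChatResponse` exposes a `usage` struct with `Option`-al counts per the table above (field names are assumptions; availability varies by provider as noted):

```rust
use genai::chat::{ChatMessage, ChatRequest};
use genai::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::default();
    let chat_req = ChatRequest::new(vec![ChatMessage::user("Hello!")]);
    let chat_res = client.exec_chat("gpt-4o-mini", chat_req, None).await?;

    // Counts are optional because some providers omit them (e.g., Ollama when streaming).
    let usage = &chat_res.usage;
    println!(
        "prompt: {:?}, completion: {:?}, total: {:?}",
        usage.prompt_tokens, usage.completion_tokens, usage.total_tokens
    );
    Ok(())
}
```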
- Will add more data on ChatResponse and ChatStream, especially metadata about usage.
- Add vision/image support to chat messages and responses.
- Add function calling support to chat messages and responses.
- Add `embed` and `embed_batch`
- Add the AWS Bedrock variants (e.g., Mistral and Anthropic). Most of the work will be on the "interesting" token signature scheme (without having to drag in big SDKs; this might be behind a cargo feature).
- Add the Google VertexAI variants.
- (might) add the Azure OpenAI variant (not sure yet).
- crates.io: crates.io/crates/genai
- GitHub: github.com/jeremychone/rust-genai
- Sponsored by BriteSnow (Jeremy Chone's consulting company)