Gemini embeddings #4756
…AI. Update dependencies and add support for the new Gemini embedding model in the frontend interface.
- Introduced environment variables for specifying the VertexAI embedding model location and dimension.
- Added new supported embedding models for Google Gemini and multilingual embeddings in the backend.
- Updated frontend interface to include the new multilingual embedding model from Google.
Someone is attempting to deploy a commit to the Danswer Team on Vercel. A member of the Team first needs to authorize it.
PR Summary
This PR introduces Google's Gemini embedding models and updates the embedding infrastructure to use Google's Generative AI API instead of Vertex AI.
- Added `gemini-embedding-001` model in `/web/src/components/embedding/interfaces.tsx` with 768 dimensions and $0.15/million pricing
- Replaced `google-cloud-aiplatform` with `google-genai==1.15.0` in `/backend/requirements/model_server.txt` for Gemini support
- Modified `/backend/model_server/encoders.py` to use `genai.Client` with special batch size handling for Gemini models
- Added environment variables `VERTEXAI_EMBEDDING_MODEL_LOCATION` and `VERTEXAI_EMBEDDING_MODEL_DIMENSION` in `/backend/shared_configs/configs.py`
- Potential issue: Duplicate model entries in `SupportedEmbeddingModel` list need to be cleaned up
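For illustration, the two new environment variables could be read along these lines. This is a minimal sketch, not the code in `/backend/shared_configs/configs.py`: the helper name `read_embedding_settings` and both default values are assumptions, since the PR excerpt does not show the real defaults.

```python
import os
from typing import Mapping, Optional, Tuple

# Hypothetical defaults; the PR excerpt does not show the real ones.
DEFAULT_LOCATION = "us-central1"
DEFAULT_DIMENSION = 768  # same value as the dim used for the new models in this PR


def read_embedding_settings(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, int]:
    """Return (location, dimension), falling back to the assumed defaults."""
    env = os.environ if env is None else env
    location = env.get("VERTEXAI_EMBEDDING_MODEL_LOCATION") or DEFAULT_LOCATION
    raw_dim = env.get("VERTEXAI_EMBEDDING_MODEL_DIMENSION") or str(DEFAULT_DIMENSION)
    try:
        dimension = int(raw_dim)
    except ValueError as e:
        # Fail at startup rather than at the first embedding request.
        raise ValueError(
            f"VERTEXAI_EMBEDDING_MODEL_DIMENSION must be an integer, got {raw_dim!r}"
        ) from e
    if dimension <= 0:
        raise ValueError("VERTEXAI_EMBEDDING_MODEL_DIMENSION must be positive")
    return location, dimension
```

Parsing the dimension once at import time keeps a malformed value from surfacing later as an opaque API error.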
4 file(s) reviewed, 6 comment(s)
    {
      provider_type: EmbeddingProvider.GOOGLE,
      model_name: "gemini-embedding-001",
      description: "Google's most recent text gemini embedding model. Worlds best performing embedding model",
style: Description makes an unsubstantiated claim about being 'Worlds best performing embedding model'. Consider using more specific, verifiable performance claims.
Suggested change:
    - description: "Google's most recent text gemini embedding model. Worlds best performing embedding model",
    + description: "Google's most recent text gemini embedding model. Offers strong performance across a wide range of tasks.",
        name="google/gemini-embedding-001",
        dim=768,
        index_name="danswer_chunk_google_gemini_embedding_001",
    ),
    SupportedEmbeddingModel(
        name="google/gemini-embedding-001",
        dim=768,
        index_name="danswer_chunk_gemini_embedding_001",
    ),
logic: Duplicate model entries for 'google/gemini-embedding-001' with different index names. This will cause confusion and potential issues. Keep only one entry with a consistent index name pattern.
Suggested change:
    -     name="google/gemini-embedding-001",
    -     dim=768,
    -     index_name="danswer_chunk_google_gemini_embedding_001",
    - ),
    - SupportedEmbeddingModel(
          name="google/gemini-embedding-001",
          dim=768,
          index_name="danswer_chunk_gemini_embedding_001",
      ),
        name="google/text-multilingual-embedding-002",
        dim=768,
        index_name="danswer_chunk_google_multilingual_embedding_002",
    ),
    SupportedEmbeddingModel(
        name="google/text-multilingual-embedding-002",
        dim=768,
        index_name="danswer_chunk_multilingual_embedding_002",
    ),
logic: Duplicate model entries for 'google/text-multilingual-embedding-002' with different index names. Similar to the Gemini entries, this needs to be consolidated.
Suggested change:
    -     name="google/text-multilingual-embedding-002",
    -     dim=768,
    -     index_name="danswer_chunk_google_multilingual_embedding_002",
    - ),
    - SupportedEmbeddingModel(
          name="google/text-multilingual-embedding-002",
          dim=768,
          index_name="danswer_chunk_multilingual_embedding_002",
      ),
    is_gemini = "gemini" in model.lower()
    batch_size = 1 if is_gemini else VERTEXAI_EMBEDDING_LOCAL_BATCH_SIZE  # This batch size is now fixed by the function
style: Gemini models require a batch size of 1, which could severely impact performance. Consider adding a warning log when Gemini models are used, to inform users about the potential performance implications.
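One possible shape for that warning is sketched below. The helper name `choose_batch_size` and the log wording are assumptions for illustration; only the `"gemini" in model.lower()` check and the batch size of 1 come from the diff.

```python
import logging

logger = logging.getLogger(__name__)


def choose_batch_size(model: str, default_batch_size: int) -> int:
    """Return 1 for Gemini models (with a warning), else the configured default."""
    if "gemini" in model.lower():
        # Same substring check as the diff; the warning is the suggested addition.
        logger.warning(
            "Model %s is embedded one text per request (batch size 1); "
            "expect lower indexing throughput than with batch size %d.",
            model,
            default_batch_size,
        )
        return 1
    return default_batch_size
```

If this runs once per embedding call rather than once per process, the warning will repeat, so it may be worth emitting it only on first use.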
    async def embed_batch(batch: list[str]) -> list[list[float]]:
        embeddings = await client.aio.models.embed_content(
            model=model,
            contents=batch,
            config=EmbedContentConfig(
                task_type=embedding_type,
                output_dimensionality=VERTEXAI_EMBEDDING_MODEL_DIMENSION
            )
        )
        return embeddings.embeddings
logic: embed_batch function could fail silently if embeddings.embeddings is None or malformed. Add explicit error checking.
Suggested change:
      async def embed_batch(batch: list[str]) -> list[list[float]]:
          embeddings = await client.aio.models.embed_content(
              model=model,
              contents=batch,
              config=EmbedContentConfig(
                  task_type=embedding_type,
                  output_dimensionality=VERTEXAI_EMBEDDING_MODEL_DIMENSION
              )
          )
    +     if not embeddings or not embeddings.embeddings:
    +         raise ValueError(f"Failed to get embeddings for batch of size {len(batch)}")
          return embeddings.embeddings
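The same check can be applied where per-batch results are combined. The sketch below is self-contained and does not call the real API: `fake_embed` is a hypothetical stand-in for `embed_batch`, and the use of `asyncio.gather` to run batches concurrently is an assumption about the surrounding code, which the diff does not show.

```python
import asyncio
from typing import Awaitable, Callable, List

EmbedFn = Callable[[List[str]], Awaitable[List[List[float]]]]


def split_into_batches(texts: List[str], batch_size: int) -> List[List[str]]:
    """Split texts into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [texts[i : i + batch_size] for i in range(0, len(texts), batch_size)]


async def embed_all(
    texts: List[str], batch_size: int, embed_fn: EmbedFn
) -> List[List[float]]:
    """Embed every batch concurrently and validate each result before merging."""
    batches = split_into_batches(texts, batch_size)
    # gather returns results in the same order as the awaitables passed in.
    results = await asyncio.gather(*(embed_fn(batch) for batch in batches))
    merged: List[List[float]] = []
    for batch, result in zip(batches, results):
        if not result or len(result) != len(batch):
            raise ValueError(
                f"Expected {len(batch)} embeddings, got {len(result) if result else 0}"
            )
        merged.extend(result)
    return merged


async def fake_embed(batch: List[str]) -> List[List[float]]:
    # Hypothetical stand-in for the API call: one 2-dimensional vector per text.
    return [[float(len(text)), 0.0] for text in batch]


vectors = asyncio.run(embed_all(["a", "bb", "ccc"], 1, fake_embed))
```

Comparing the number of returned vectors with the batch length also catches partial responses, which a plain truthiness check would let through.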
    client = genai.Client(
        vertexai=True,
        project=project_id,
        location=VERTEXAI_EMBEDDING_MODEL_LOCATION,
        credentials=credentials
    )
style: Client initialization should be wrapped in try/except to handle invalid credentials or connection errors gracefully.
Suggested change:
    - client = genai.Client(
    -     vertexai=True,
    -     project=project_id,
    -     location=VERTEXAI_EMBEDDING_MODEL_LOCATION,
    -     credentials=credentials
    - )
    + try:
    +     client = genai.Client(
    +         vertexai=True,
    +         project=project_id,
    +         location=VERTEXAI_EMBEDDING_MODEL_LOCATION,
    +         credentials=credentials
    +     )
    + except Exception as e:
    +     raise RuntimeError(f"Failed to initialize Vertex AI client: {e}") from e
…le API access. This change ensures proper authorization when using service account credentials.
This PR is stale because it has been open 75 days with no activity. Remove stale label or comment or this will be closed in 15 days.
This PR was closed because it has been stalled for 90 days with no activity.
Description
Enable Gemini embedding models
How Has This Been Tested?
Manual test and unit test
Backporting (check the box to trigger backport action)
Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.