
Conversation

ferdinandl007
Contributor

Description

Enable Gemini embedding models

How Has This Been Tested?

Manual test and unit test

Backporting (check the box to trigger backport action)

Note: Verify that the backport action passes; otherwise, resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

Your Name added 2 commits May 22, 2025 13:02
…AI. Update dependencies and add support for the new Gemini embedding model in the frontend interface.
- Introduced environment variables for specifying the VertexAI embedding model location and dimension.
- Added new supported embedding models for Google Gemini and multilingual embeddings in the backend.
- Updated frontend interface to include the new multilingual embedding model from Google.
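The commit notes above mention new environment variables for the VertexAI embedding model location and dimension. A minimal sketch of how they might be read in `backend/shared_configs/configs.py` (the variable names come from the review summary below; the fallback values `"us-central1"` and `768` are illustrative assumptions, not the PR's actual defaults):

```python
import os

# Assumed defaults for illustration only; the PR's real defaults may differ.
VERTEXAI_EMBEDDING_MODEL_LOCATION = os.environ.get(
    "VERTEXAI_EMBEDDING_MODEL_LOCATION", "us-central1"
)
VERTEXAI_EMBEDDING_MODEL_DIMENSION = int(
    os.environ.get("VERTEXAI_EMBEDDING_MODEL_DIMENSION", "768")
)
```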
@ferdinandl007 ferdinandl007 requested a review from a team as a code owner May 22, 2025 13:30

vercel bot commented May 22, 2025

Someone is attempting to deploy a commit to the Danswer Team on Vercel.

A member of the Team first needs to authorize it.

Contributor

@greptile-apps greptile-apps bot left a comment


PR Summary

This PR introduces Google's Gemini embedding models and updates the embedding infrastructure to use Google's Generative AI API instead of Vertex AI.

  • Added gemini-embedding-001 model in /web/src/components/embedding/interfaces.tsx with 768 dimensions and $0.15/million pricing
  • Replaced google-cloud-aiplatform with google-genai==1.15.0 in /backend/requirements/model_server.txt for Gemini support
  • Modified /backend/model_server/encoders.py to use genai.Client with special batch size handling for Gemini models
  • Added environment variables VERTEXAI_EMBEDDING_MODEL_LOCATION and VERTEXAI_EMBEDDING_MODEL_DIMENSION in /backend/shared_configs/configs.py
  • Potential issue: Duplicate model entries in SupportedEmbeddingModel list need to be cleaned up

4 file(s) reviewed, 6 comment(s)
Edit PR Review Bot Settings | Greptile
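The "special batch size handling for Gemini models" called out in the summary can be sketched as follows. This is an assumption-based illustration, not the PR's code: Gemini embedding requests are treated as limited to one input per call, so texts are split into single-item batches, while other VertexAI models keep a larger batch (the value of `VERTEXAI_EMBEDDING_LOCAL_BATCH_SIZE` here is a placeholder):

```python
VERTEXAI_EMBEDDING_LOCAL_BATCH_SIZE = 25  # placeholder value for illustration


def select_batch_size(model: str) -> int:
    # Gemini embedding models are handled one input at a time;
    # all other VertexAI models use the configured local batch size.
    return 1 if "gemini" in model.lower() else VERTEXAI_EMBEDDING_LOCAL_BATCH_SIZE


def make_batches(texts: list[str], model: str) -> list[list[str]]:
    # Split the input texts into batches sized for the given model.
    size = select_batch_size(model)
    return [texts[i : i + size] for i in range(0, len(texts), size)]
```

For example, `make_batches(["a", "b", "c"], "gemini-embedding-001")` yields three single-item batches, whereas a non-Gemini model would keep all three texts in one batch.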

{
provider_type: EmbeddingProvider.GOOGLE,
model_name: "gemini-embedding-001",
description: "Google's most recent text gemini embedding model. Worlds best performing embedding model",

style: Description makes unsubstantiated claim about being 'Worlds best performing embedding model'. Consider using more specific, verifiable performance claims.

Suggested change
description: "Google's most recent text gemini embedding model. Worlds best performing embedding model",
description: "Google's most recent text gemini embedding model. Offers strong performance across a wide range of tasks.",

Comment on lines +233 to +241
name="google/gemini-embedding-001",
dim=768,
index_name="danswer_chunk_google_gemini_embedding_001",
),
SupportedEmbeddingModel(
name="google/gemini-embedding-001",
dim=768,
index_name="danswer_chunk_gemini_embedding_001",
),

logic: Duplicate model entries for 'google/gemini-embedding-001' with different index names. This will cause confusion and potential issues. Keep only one entry with a consistent index name pattern.

Suggested change
name="google/gemini-embedding-001",
dim=768,
index_name="danswer_chunk_google_gemini_embedding_001",
),
SupportedEmbeddingModel(
name="google/gemini-embedding-001",
dim=768,
index_name="danswer_chunk_gemini_embedding_001",
),
name="google/gemini-embedding-001",
dim=768,
index_name="danswer_chunk_gemini_embedding_001",
),

Comment on lines +243 to +251
name="google/text-multilingual-embedding-002",
dim=768,
index_name="danswer_chunk_google_multilingual_embedding_002",
),
SupportedEmbeddingModel(
name="google/text-multilingual-embedding-002",
dim=768,
index_name="danswer_chunk_multilingual_embedding_002",
),

logic: Duplicate model entries for 'google/text-multilingual-embedding-002' with different index names. Similar to the Gemini entries, this needs to be consolidated.

Suggested change
name="google/text-multilingual-embedding-002",
dim=768,
index_name="danswer_chunk_google_multilingual_embedding_002",
),
SupportedEmbeddingModel(
name="google/text-multilingual-embedding-002",
dim=768,
index_name="danswer_chunk_multilingual_embedding_002",
),
name="google/text-multilingual-embedding-002",
dim=768,
index_name="danswer_chunk_multilingual_embedding_002",
),

Comment on lines +231 to 233
is_gemini = "gemini" in model.lower()
batch_size = 1 if is_gemini else VERTEXAI_EMBEDDING_LOCAL_BATCH_SIZE # This batch size is now fixed by the function


style: Gemini models require batch size of 1, which could severely impact performance. Consider adding a warning log when using Gemini models to inform users about potential performance implications.
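The warning the reviewer asks for could look like this. A sketch only, not the PR's code: `resolve_batch_size` and the fallback batch size are hypothetical names/values used for illustration:

```python
import logging

logger = logging.getLogger(__name__)

VERTEXAI_EMBEDDING_LOCAL_BATCH_SIZE = 25  # placeholder value for illustration


def resolve_batch_size(model: str) -> int:
    # Warn when a Gemini model forces the batch size down to 1,
    # since that can noticeably slow down large indexing jobs.
    if "gemini" in model.lower():
        logger.warning(
            "Gemini embedding model %s forces batch size 1; "
            "large indexing jobs may be noticeably slower.",
            model,
        )
        return 1
    return VERTEXAI_EMBEDDING_LOCAL_BATCH_SIZE
```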

Comment on lines +237 to +246
async def embed_batch(batch: list[str]) -> list[list[float]]:
embeddings = await client.aio.models.embed_content(
model=model,
contents=batch,
config=EmbedContentConfig(
task_type=embedding_type,
output_dimensionality=VERTEXAI_EMBEDDING_MODEL_DIMENSION
)
)
return embeddings.embeddings

logic: embed_batch function could fail silently if embeddings.embeddings is None or malformed. Add explicit error checking.

Suggested change
async def embed_batch(batch: list[str]) -> list[list[float]]:
embeddings = await client.aio.models.embed_content(
model=model,
contents=batch,
config=EmbedContentConfig(
task_type=embedding_type,
output_dimensionality=VERTEXAI_EMBEDDING_MODEL_DIMENSION
)
)
return embeddings.embeddings
async def embed_batch(batch: list[str]) -> list[list[float]]:
embeddings = await client.aio.models.embed_content(
model=model,
contents=batch,
config=EmbedContentConfig(
task_type=embedding_type,
output_dimensionality=VERTEXAI_EMBEDDING_MODEL_DIMENSION
)
)
if not embeddings or not embeddings.embeddings:
raise ValueError(f"Failed to get embeddings for batch of size {len(batch)}")
return embeddings.embeddings

Comment on lines +223 to 228
client = genai.Client(
vertexai=True,
project=project_id,
location=VERTEXAI_EMBEDDING_MODEL_LOCATION,
credentials=credentials
)

style: Client initialization should be wrapped in try/except to handle invalid credentials or connection errors gracefully.

Suggested change
client = genai.Client(
vertexai=True,
project=project_id,
location=VERTEXAI_EMBEDDING_MODEL_LOCATION,
credentials=credentials
)
try:
client = genai.Client(
vertexai=True,
project=project_id,
location=VERTEXAI_EMBEDDING_MODEL_LOCATION,
credentials=credentials
)
except Exception as e:
raise RuntimeError(f"Failed to initialize Vertex AI client: {e}") from e

…le API access. This change ensures proper authorization when using service account credentials.

github-actions bot commented Aug 6, 2025

This PR is stale because it has been open 75 days with no activity. Remove stale label or comment or this will be closed in 15 days.

@github-actions github-actions bot added the Stale label Aug 6, 2025

This PR was closed because it has been stalled for 90 days with no activity.

@github-actions github-actions bot closed this Aug 15, 2025