-
Notifications
You must be signed in to change notification settings - Fork 1.9k
fix(processing): Mime types for Image Summarization #5471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(processing): Mime types for Image Summarization #5471
Conversation
The latest updates on your projects. Learn more about Vercel for GitHub.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Summary
This PR fixes a critical bug in the image summarization pipeline where all images were being mislabeled with incorrect MIME types when sent to the LLM. Previously, the _encode_image_for_llm_prompt
function hardcoded all images as 'image/jpeg' regardless of their actual format (PNG, GIF, WEBP, etc.), causing LLMs to fail at processing non-JPEG images since they couldn't properly decode the mismatched format.
The solution introduces proper MIME type detection using magic number analysis through the get_image_type_from_bytes
utility function. The fix includes:
- Dynamic MIME type detection: Replacing the hardcoded 'image/jpeg' with actual format detection based on file headers
- Custom exception handling: Adding
UnsupportedImageFormatError
for graceful handling of unsupported formats - Graceful degradation: When unsupported formats are encountered, the system now logs the issue and skips summarization instead of crashing
This change integrates seamlessly with the existing image summarization pipeline in summarize_image_with_error_handling
, where the new error handling allows the system to continue processing other images even when encountering unsupported formats. The fix addresses the specific error logs mentioned in the PR description where PNG files labeled as JPEG were causing LLM processing failures.
Confidence score: 4/5
- This PR is safe to merge with minimal risk as it fixes a clear bug with proper error handling
- Score reflects a straightforward bug fix with appropriate error handling, though lacks comprehensive test coverage
- Pay close attention to the image processing pipeline and ensure the new MIME type detection works across all supported formats
1 file reviewed, 1 comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 issue found across 1 file
Prompt for AI agents (all 1 issues)
Understand the root cause of the following 1 issues and fix them.
<file name="backend/onyx/file_processing/image_summarization.py">
<violation number="1" location="backend/onyx/file_processing/image_summarization.py:84">
Catching only UnsupportedImageFormatError lets other summarization failures bubble up, contradicting the wrapper’s contract and producing noisy error logs. Include ValueError so typical LLM failures return None instead of raising.
(Based on your team's feedback about avoiding logging raw exceptions that may contain sensitive URLs/tokens; broadening the catch prevents leaking exception strings into logs.) [FEEDBACK_USED]</violation>
</file>
React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai
to give feedback, ask questions, or re-run the review.
return summarize_image_pipeline(llm, image_data, user_prompt, system_prompt) | ||
try: | ||
return summarize_image_pipeline(llm, image_data, user_prompt, system_prompt) | ||
except UnsupportedImageFormatError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Description
[Provide a brief description of the changes in this PR]
We saw a ton of logs within our docprocessing pods that looked like this:
The issue with these logs is that the way we format and sent this message to the LLM states that this is a jpeg file but in reality it is a png file which can be confirmed with the name of the file.
This PR aims to ensure that the file type that is passed through is correctly set in the message that is sent over to the LLM to ensure that the LLM can properly summarize and extract information out of the image if there is any context that is there.
How Has This Been Tested?
[Describe the tests you ran to verify your changes]
Ran locally and tested with files that were not jpeg and validated that they worked.
Backporting (check the box to trigger backport action)
Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.