Add multi-modal content support for various file types in chat models #4157

Draft: wants to merge 1 commit into base: main

@mantrakp04 mantrakp04 commented Mar 11, 2025

  • Extend multi-modal content handling to support PDF, audio, and video uploads
  • Add new content types for documents and media in interfaces
  • Update Anthropic and Google Generative AI chat models to handle additional file types
  • Refactor multi-modal utility functions to support broader content processing
  • Improve flexibility for different LLM models with multi-modal content
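
To make the "new content types for documents and media" item concrete, here is a minimal sketch of how an upload could be classified by MIME type before being handed to a model-specific formatter. All names here (`ContentKind`, `MediaContentPart`, `classifyMimeType`, `toContentPart`) are hypothetical and for illustration only; they are not taken from this PR's diff, and the actual interfaces may differ.

```typescript
// Hypothetical content-part types; illustrative names, not the PR's actual interfaces.
type ContentKind = "image" | "document" | "audio" | "video";

interface MediaContentPart {
  kind: ContentKind;
  mimeType: string;
  data: string; // base64-encoded file payload
}

// Classify an upload by its MIME type; returns undefined for unsupported types.
function classifyMimeType(mimeType: string): ContentKind | undefined {
  const normalized = mimeType.trim().toLowerCase();
  if (normalized.startsWith("image/")) return "image";
  if (normalized === "application/pdf") return "document";
  if (normalized.startsWith("audio/")) return "audio";
  if (normalized.startsWith("video/")) return "video";
  return undefined;
}

// Build a provider-neutral content part, rejecting unsupported uploads early.
function toContentPart(mimeType: string, base64Data: string): MediaContentPart {
  const kind = classifyMimeType(mimeType);
  if (kind === undefined) {
    throw new Error(`Unsupported MIME type: ${mimeType}`);
  }
  return { kind, mimeType, data: base64Data };
}
```

A provider-specific adapter (for example, one per chat model) could then translate `MediaContentPart` into whatever request shape that model expects, which would keep the file-type detection in one place.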

discord post: https://discord.com/channels/1087698854775881778/1349020605197848690

@mantrakp04 mantrakp04 marked this pull request as draft March 11, 2025 15:12
jquinter commented Apr 2, 2025

I think this PR would be very useful to integrate!

+1

marcosmarf27 commented

+1
