This is an offline-only local demo of the batch prediction and context caching pipeline from the GSoC DeepMind Challenge Q4:
This repo demonstrates batch prediction with Gemini APIs, using long context and context caching to answer questions about a single video efficiently. It addresses a common use case: extracting information from large content sources.
Scenario: Extracting information from a video lecture/documentary by asking multiple, potentially interconnected, questions.
Features:
- Batch Prediction: Design and optimization for submitting a batch of questions. This should minimize API calls and improve efficiency. Consider using techniques like dividing the questions into smaller batches to avoid exceeding API limits. 📦
- Long Context Handling: Demonstrate use of Gemini's long context capabilities. Show how to provide the entire video transcript (or relevant segments) as context. Consider strategies for handling transcripts that exceed the maximum context length. 📏
- Context Caching: Implement context caching to store and reuse previous interactions. This can significantly reduce the amount of data sent to the API and improve response times, especially for interconnected questions. Use a suitable caching mechanism (e.g., in-memory cache, persistent storage). 💾
- Interconnected Questions: Handle questions that build upon previous answers. The code should maintain the conversation history and use it to provide more accurate and relevant responses. 🔗
- Output Formatting: Clear and user-friendly output. Present the answers in a structured format, possibly with links to the relevant timestamps in the video. ✨
- Code Documentation: Detailed comments, setup instructions, and usage guidelines. Explain the different components of the code and how they work together. Include instructions on how to obtain and configure an API key. Provide example questions and expected outputs. 📖
- Error Handling: Implement robust error handling to gracefully handle API errors, network issues, and invalid inputs.
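
The batching and caching features above can be illustrated with a minimal offline sketch. Everything in it is an assumption made for illustration and is not taken from this repo's code: the names `chunk`, `AnswerCache`, `answer_all`, and `fake_model` are hypothetical, and `fake_model` is a local stub standing in for a real API call. The cache here is a simple in-memory answer store keyed by transcript and question; it is not the server-side context caching offered by the Gemini API.

```python
import hashlib
from typing import Callable, Dict, Iterator, List, Optional

# Hypothetical model signature: (transcript, questions) -> one answer per question.
Model = Callable[[str, List[str]], List[str]]


def chunk(questions: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` questions."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(questions), size):
        yield questions[start:start + size]


class AnswerCache:
    """In-memory cache keyed by a hash of (transcript, question)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0

    @staticmethod
    def _key(transcript: str, question: str) -> str:
        digest = hashlib.sha256()
        digest.update(transcript.encode("utf-8"))
        digest.update(b"\x00")  # separator so ("ab", "c") != ("a", "bc")
        digest.update(question.encode("utf-8"))
        return digest.hexdigest()

    def get(self, transcript: str, question: str) -> Optional[str]:
        answer = self._store.get(self._key(transcript, question))
        if answer is not None:
            self.hits += 1
        return answer

    def put(self, transcript: str, question: str, answer: str) -> None:
        self._store[self._key(transcript, question)] = answer


def answer_all(
    transcript: str,
    questions: List[str],
    model: Model,
    cache: AnswerCache,
    batch_size: int = 2,
) -> List[str]:
    """Answer questions in batches, calling the model only for uncached ones."""
    answers: Dict[str, str] = {}
    for batch in chunk(questions, batch_size):
        missing: List[str] = []
        for question in batch:
            cached = cache.get(transcript, question)
            if cached is not None:
                answers[question] = cached
            elif question not in missing:
                missing.append(question)
        if missing:
            # One model call per batch, covering only the uncached questions.
            for question, answer in zip(missing, model(transcript, missing)):
                cache.put(transcript, question, answer)
                answers[question] = answer
    return [answers[question] for question in questions]


def fake_model(transcript: str, questions: List[str]) -> List[str]:
    """Offline stub standing in for a real API call (assumption)."""
    return [f"stub answer to: {question}" for question in questions]


if __name__ == "__main__":
    cache = AnswerCache()
    questions = ["What is the topic?", "Who is the speaker?", "What is the topic?"]
    for answer in answer_all("example transcript", questions, fake_model, cache):
        print(answer)
```

Running the same question list a second time against the same `AnswerCache` makes no further model calls, since every answer is served from the cache. A persistent store could replace the dictionary without changing `answer_all`.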
A work-in-progress code implementation of the submitted proposal paper.
poetry install
poetry run python src