feat: add Airweave tool integration with advanced search features #20111
- Add AirweaveToolSpec with 5 tool functions:
  * search_collection: Simple search with default settings
  * advanced_search_collection: Full control over retrieval parameters
  * search_and_generate_answer: RAG-style direct answers
  * list_collections: Discover available collections
  * get_collection_info: Get collection details
- Advanced search features:
  * Multiple retrieval strategies (hybrid, neural, keyword)
  * Temporal relevance weighting for recent content
  * Query expansion for better recall
  * Auto-interpret filters from natural language
  * LLM-based reranking for improved relevance
  * Natural language answer generation
- 13 passing unit tests with comprehensive coverage
- Full documentation with usage examples
- Jupyter notebook example demonstrating all features
- Follows LlamaIndex conventions (gpt-4o-mini, async patterns)
- Compatible with FunctionAgent
- Production tested with real data
Looks good! Just some minor comments: I would recommend replacing function- and class-level imports with top-level imports.
            framework_version: Framework version for analytics

        """
        from airweave import AirweaveSDK
Can we place imports at the top? :)
            List of Document objects containing search results with metadata

        """
        from airweave import SearchRequest
imports at the top :)
        else:
            # Fallback if no answer generated
            return "No answer could be generated from the search results."
Raise an error here? Or maybe a warning? I would personally be in favor of doing:

    -        else:
    -            # Fallback if no answer generated
    -            return "No answer could be generated from the search results."
    +        else:
    +            # Fallback if no answer generated
    +            warnings.warn("No answer could be generated from the search results", UserWarning)
    +            return None

Then annotate the function's return type as Optional[str], so that the user knows that a None result means no answer was generated.
(This also means importing warnings at the top.)
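For reference, the reviewer's suggestion can be sketched as a small standalone function. This is only an illustration under assumptions: `extract_answer` and its `completion` parameter are hypothetical names, not part of the PR; only the warning message, `UserWarning`, and the `Optional[str]` return type come from the review comment.

```python
import warnings
from typing import Optional


def extract_answer(completion: Optional[str]) -> Optional[str]:
    """Return the generated answer, or None (with a warning) if there is none.

    `completion` is a hypothetical stand-in for whatever the search
    response exposes as its generated answer.
    """
    if completion:
        return completion
    # Fallback if no answer generated: warn instead of returning a
    # sentinel string, so callers can distinguish "no answer" from text.
    warnings.warn("No answer could be generated from the search results", UserWarning)
    return None
```

With this shape, a caller checks `if answer is None:` rather than comparing against a fixed fallback string, and the warning still surfaces in logs or test output.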
Description
This PR adds Airweave as a tool integration, enabling LlamaIndex agents to search across data automatically synced from 30+ sources.
What is Airweave?
Airweave is an open-source platform that syncs data from multiple sources (Google Drive, Notion, GitHub, databases, APIs, etc.) and provides unified search with advanced retrieval capabilities.
Implementation:
AirweaveToolSpec with 5 tool functions:
- `search_collection`: Simple search with default settings
- `advanced_search_collection`: Full control over retrieval parameters
- `search_and_generate_answer`: RAG-style direct answers
- `list_collections`: Discover available collections
- `get_collection_info`: Get collection details

Advanced search features:
Testing & Documentation:
Links:
Fixes #20110
New Package?
Did I fill in the `tool.llamahub` section in the `pyproject.toml` and provide a detailed README.md for my new integration or package?
Version Bump?
Did I bump the version in the `pyproject.toml` file of the package I am updating? (Except for the `llama-index-core` package)
Type of Change
Please delete options that are not relevant.
[x] New feature (non-breaking change which adds functionality)
How Has This Been Tested?
Testing details:
All tests use mocked Airweave SDK calls and cover:
Additionally tested with production Airweave instance and real data.
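As a rough illustration of what testing against mocked SDK calls can look like, here is a minimal sketch using `unittest.mock`. It rests on assumptions: the `search_collection` wrapper, the `client.collections.search` call shape, and its `readable_id`/`query`/`limit` parameters are hypothetical stand-ins, not the signatures of the Airweave SDK or of the tests in this PR.

```python
from unittest.mock import MagicMock


def search_collection(client, collection_id: str, query: str, limit: int = 5) -> list:
    """Hypothetical wrapper: run a search and return the result texts."""
    response = client.collections.search(
        readable_id=collection_id, query=query, limit=limit
    )
    return [item["text"] for item in response.results]


def test_search_collection_returns_texts() -> None:
    # The mock replaces the SDK client, so no network call is made.
    client = MagicMock()
    client.collections.search.return_value = MagicMock(
        results=[{"text": "first hit"}, {"text": "second hit"}]
    )

    texts = search_collection(client, "docs", "how do I sync?")

    assert texts == ["first hit", "second hit"]
    # Verify the wrapper forwarded its arguments, including the default limit.
    client.collections.search.assert_called_once_with(
        readable_id="docs", query="how do I sync?", limit=5
    )


test_search_collection_returns_texts()
```

Mocking at the client boundary keeps the unit tests fast and deterministic; the production run mentioned above covers what mocks cannot.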
Suggested Checklist:
`uv run make format; uv run make lint` to appease the lint gods