@EwanTauran EwanTauran commented Oct 17, 2025

Description

This PR adds Airweave as a tool integration, enabling LlamaIndex agents to search across data automatically synced from 30+ sources.

What is Airweave?
Airweave is an open-source platform that syncs data from multiple sources (Google Drive, Notion, GitHub, databases, APIs, etc.) and provides unified search with advanced retrieval capabilities.

Implementation:

  • AirweaveToolSpec with 5 tool functions:

    • search_collection: Simple search with default settings
    • advanced_search_collection: Full control over retrieval parameters
    • search_and_generate_answer: RAG-style direct answers
    • list_collections: Discover available collections
    • get_collection_info: Get collection details
  • Advanced search features:

    • Multiple retrieval strategies (hybrid, neural, keyword)
    • Temporal relevance weighting for recent content
    • Query expansion for better recall
    • Auto-interpret filters from natural language
    • LLM-based reranking for improved relevance
    • Natural language answer generation

Testing & Documentation:

  • 13 passing unit tests with comprehensive coverage
  • Full README with usage examples
  • Jupyter notebook demonstrating all features
  • Production tested with real data

Links:

Fixes #20110

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

  • Yes
  • No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

  • Yes - Set to version 0.1.0 in pyproject.toml
  • No

Type of Change

Please delete options that are not relevant.

[x] New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • I added new unit tests to cover this change

Testing details:

cd llama-index-integrations/tools/llama-index-tools-airweave
uv run pytest tests/ -v
# Result: 13 passed, 0 failed

All tests use mocked Airweave SDK calls and cover:

  • Class inheritance and initialization
  • All 5 tool functions (search, advanced search, RAG answers, list, get info)
  • Edge cases (empty results, missing completions)
  • Both dict and object response parsing

Additionally tested with production Airweave instance and real data.
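A minimal sketch of what a mocked test for "both dict and object response parsing" and "empty results" might look like. It is not the PR's actual test code: `get_field`, `extract_texts`, the `client.search` method, and the `results`/`text` field names are all hypothetical stand-ins chosen for illustration, and the real Airweave SDK may be shaped differently.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


def get_field(item, name, default=None):
    """Read `name` from either a dict or an attribute-style object."""
    if isinstance(item, dict):
        return item.get(name, default)
    return getattr(item, name, default)


def extract_texts(client, query):
    """Hypothetical helper: run a search and collect the result texts."""
    # `search`, `results`, and `text` are assumed names, not the real SDK API.
    response = client.search(query)
    results = get_field(response, "results", []) or []
    return [get_field(result, "text", "") for result in results]


# Mocked client whose response is a plain dict.
dict_client = MagicMock()
dict_client.search.return_value = {"results": [{"text": "alpha"}]}

# Mocked client whose response exposes attributes instead of keys.
obj_client = MagicMock()
obj_client.search.return_value = SimpleNamespace(
    results=[SimpleNamespace(text="beta")]
)

# Mocked client that returns no results at all (the edge case).
empty_client = MagicMock()
empty_client.search.return_value = {"results": []}
```

Because the client is a `MagicMock`, no network call is made, and the same parsing path can be exercised with each response shape.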

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran uv run make format; uv run make lint to appease the lint gods

- Add AirweaveToolSpec with 5 tool functions:
  * search_collection: Simple search with default settings
  * advanced_search_collection: Full control over retrieval parameters
  * search_and_generate_answer: RAG-style direct answers
  * list_collections: Discover available collections
  * get_collection_info: Get collection details

- Advanced search features:
  * Multiple retrieval strategies (hybrid, neural, keyword)
  * Temporal relevance weighting for recent content
  * Query expansion for better recall
  * Auto-interpret filters from natural language
  * LLM-based reranking for improved relevance
  * Natural language answer generation

- 13 passing unit tests with comprehensive coverage
- Full documentation with usage examples
- Jupyter notebook example demonstrating all features
- Follows LlamaIndex conventions (gpt-4o-mini, async patterns)
- Compatible with FunctionAgent
- Production tested with real data
@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Oct 17, 2025
@AstraBert AstraBert left a comment

Looks good! Just some minor comments. I would also recommend replacing function/class-level imports with top-level imports.

framework_version: Framework version for analytics

"""
from airweave import AirweaveSDK

Can we place imports at the top? :)

List of Document objects containing search results with metadata

"""
from airweave import SearchRequest

imports at the top :)

Comment on lines +205 to +207
    else:
        # Fallback if no answer generated
        return "No answer could be generated from the search results."

Raise an error, or maybe a warning? I would personally be in favor of doing:

Suggested change
-   else:
-       # Fallback if no answer generated
-       return "No answer could be generated from the search results."
+   else:
+       # Fallback if no answer generated
+       warnings.warn("No answer could be generated from the search results", UserWarning)
+       return None

Then annotate the function's return type as Optional[str], so that when the result is None the user knows that no answer was generated.

(this also means importing warnings at the top)
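For reference, a self-contained sketch of the pattern suggested in this thread: warn and return None instead of returning a fallback string, with the return type annotated as Optional[str]. The function name `extract_answer` and the `completion` key are hypothetical; the actual method body in the PR is not shown in this conversation.

```python
import warnings
from typing import Optional


def extract_answer(response: dict) -> Optional[str]:
    """Return the generated answer, or None (with a warning) if there is none."""
    # `completion` is an assumed field name used only for this illustration.
    completion = response.get("completion")
    if completion:
        return completion
    # Fallback if no answer generated
    warnings.warn(
        "No answer could be generated from the search results", UserWarning
    )
    return None
```

A caller can then branch on `answer is None` rather than comparing against a sentinel string, and the warning stays visible in logs unless it is explicitly filtered.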

Development

Successfully merging this pull request may close these issues.

[Feature Request]: Add Airweave Integration as a Tool
