Feature: Implement N-gram bloom filter index to improve the performance of `LIKE` queries #17724

**Summary**

Currently, `LIKE` queries in Databend often result in full table scans, particularly when the pattern has a leading wildcard (e.g., `LIKE '%keyword'`) or is otherwise complex. On large datasets this can lead to unacceptably high query latency.

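An N-gram bloom index addresses this problem by pre-processing the text data and indexing its substrings of length N (N-grams) in a bloom filter. The query engine can then check the N-grams of the pattern's literal text against the index to quickly identify potential matches, drastically reducing the number of rows that need to be scanned. Because a bloom filter can return false positives but never false negatives, the index can only rule data out; anything it lets through must still be checked against the actual `LIKE` pattern.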
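To make the idea concrete, here is a minimal Python sketch of the technique. It is not Databend's implementation: the names `ngrams`, `NgramBloom`, `pattern_grams` and `candidate_blocks`, the trigram size, the 1024-bit filter, and the use of salted SHA-256 as the hash family are all assumptions chosen for illustration. It also ignores escape characters in patterns.

```python
import hashlib
import re


def ngrams(text, n=3):
    """Return the set of overlapping n-character substrings of text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def pattern_grams(pattern, n=3):
    """N-grams of the literal pieces of a LIKE pattern (split on % and _)."""
    grams = set()
    for piece in re.split(r"[%_]", pattern):
        grams |= ngrams(piece, n)
    return grams


class NgramBloom:
    """Toy bloom filter over the n-grams of every value in one data block."""

    def __init__(self, num_bits=1024, num_hashes=3, n=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.n = n
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, gram):
        # Derive num_hashes bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        """Record every n-gram of value in the filter."""
        for gram in ngrams(value, self.n):
            for pos in self._positions(gram):
                self.bits |= 1 << pos

    def might_match(self, pattern):
        """Return False only if no value in the block can match pattern."""
        grams = pattern_grams(pattern, self.n)
        if not grams:
            return True  # literals shorter than n: the filter cannot help
        return all(
            (self.bits >> pos) & 1
            for gram in grams
            for pos in self._positions(gram)
        )


# Two hypothetical data blocks, each with its own filter.
blocks = [
    ["error: disk full", "warning: low memory"],
    ["user login ok", "user logout ok"],
]
filters = []
for rows in blocks:
    bloom = NgramBloom()
    for value in rows:
        bloom.add(value)
    filters.append(bloom)


def candidate_blocks(pattern):
    """Indexes of blocks that still need a real scan for this pattern."""
    return [i for i, bloom in enumerate(filters) if bloom.might_match(pattern)]


if __name__ == "__main__":
    print(candidate_blocks("%disk%"))
    print(candidate_blocks("%ab%"))
```

A block that really contains the literal is always returned, so pruning never drops a matching row. A pattern whose literal pieces are all shorter than N produces no N-grams, so every block is returned and the query falls back to a full scan.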

**Benefits of an N-gram bloom index:**

*   **Significant Performance Improvement for LIKE Queries:** Reduces query execution time for `LIKE` queries, especially those with leading wildcards or complex patterns, because data that cannot match is skipped instead of scanned.
*   **Reduced Resource Consumption:** By minimizing full table scans, an N-gram bloom index reduces CPU and I/O usage, leading to more efficient resource utilization.

