Feature: Implement N-gram bloom filter index to improve the performance of LIKE queries
#17724
Labels
C-feature
Category: feature
Summary
Currently, LIKE queries in Databend often result in full table scans, especially when the pattern includes leading wildcards (e.g., LIKE '%keyword') or complex regular expressions. This can lead to unacceptable query latencies, especially on large datasets.
An N-gram bloom index addresses this problem by pre-processing the text data and indexing its substrings (N-grams). The query engine can then use the indexed N-grams to quickly identify potential matches, drastically reducing the number of rows that need to be scanned.
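To make the idea concrete, here is a minimal Python sketch of how a per-block N-gram bloom filter could prune blocks for a LIKE pattern. It is an illustration only, not Databend's implementation: the class `NgramBloomFilter`, its methods, the trigram size `n=3`, the filter size, and the SHA-256 based hashing are all assumptions chosen for readability. Escape sequences in LIKE patterns are ignored.

```python
import hashlib
import re


class NgramBloomFilter:
    """Toy bloom filter over the character n-grams of one block of strings.

    Hypothetical sketch: names and parameters are illustrative assumptions.
    """

    def __init__(self, n=3, num_bits=1024, num_hashes=3):
        self.n = n
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _ngrams(self, text):
        # All substrings of length n; empty if the text is shorter than n.
        return {text[i:i + self.n] for i in range(len(text) - self.n + 1)}

    def _positions(self, gram):
        # Derive num_hashes bit positions by salting the gram with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, text):
        for gram in self._ngrams(text):
            for pos in self._positions(gram):
                self.bits |= 1 << pos

    def might_contain_gram(self, gram):
        return all((self.bits >> pos) & 1 for pos in self._positions(gram))

    def might_match_like(self, pattern):
        # Split the pattern on the LIKE wildcards % and _ to get literal parts.
        # If any n-gram of a literal part is definitely absent, no row in the
        # block can match, so the block can be skipped.
        for literal in re.split(r"[%_]", pattern):
            for gram in self._ngrams(literal):
                if not self.might_contain_gram(gram):
                    return False
        # "True" only means "cannot be ruled out"; the block must still be scanned.
        return True


# Build one filter per block, then keep only blocks that might match.
blocks = [
    ["error: disk full", "warning: low memory"],
    ["user login ok", "user logout ok"],
]
filters = []
for rows in blocks:
    block_filter = NgramBloomFilter()
    for row in rows:
        block_filter.add(row)
    filters.append(block_filter)

candidates = [i for i, f in enumerate(filters) if f.might_match_like("%login%")]
```

A bloom filter never produces false negatives, so a block that really contains a matching row always stays in `candidates`; false positives are possible and only cost an unnecessary scan. Literal parts shorter than `n` yield no N-grams, so in this sketch a pattern such as `'%ab%'` cannot be pruned at all.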
Benefits of N-gram bloom index:
Faster LIKE queries, especially those with leading wildcards or complex patterns.