
Update TieredMergePolicy's default floor segment size #129764

@jpountz


Lucene 10.2 increased the floor segment size from 2MB to 16MB via apache/lucene#14189. I'll copy the PR description here to explain the motivation:

My motivation is that such small segment sizes don't make index structures actually helpful vs. linear scans, so we should avoid them. Furthermore, there has been progress on merging rules for segments below the floor size, in particular merge policies no longer perform quadratic merging (apache/lucene#900) so this change will not make indexing/merging absurdly slow if an application flushes tiny segments.

Finally this likely helps vector search, which likes fewer segments better.

Would it make sense for Elasticsearch to update MergePolicyConfig so that its default floor segment size aligns with Lucene's?
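For readers less familiar with the setting, here is a minimal sketch of what the floor does, assuming the simplified rule that segments below the floor are "rounded up" to the floor size for merge selection. It does not use Lucene; `FloorSegmentSketch`, `flooredSize`, and `countBelowFloor` are hypothetical names made up for illustration, and the real TieredMergePolicy logic is more involved.

```java
import java.util.List;

public class FloorSegmentSketch {
    static final long MB = 1024L * 1024L;

    // Simplified assumption: a segment smaller than the floor is treated
    // as if it were exactly floor-sized when candidate merges are scored.
    static long flooredSize(long segmentBytes, long floorBytes) {
        return Math.max(segmentBytes, floorBytes);
    }

    // Counts the segments that fall below the floor and are therefore
    // treated as equal-sized for merge selection.
    static long countBelowFloor(List<Long> segmentSizes, long floorBytes) {
        return segmentSizes.stream().filter(size -> size < floorBytes).count();
    }

    public static void main(String[] args) {
        // Hypothetical segment sizes: 512KB, 3MB, 10MB, 40MB.
        List<Long> segments = List.of(512 * 1024L, 3 * MB, 10 * MB, 40 * MB);

        long oldFloor = 2 * MB;   // default before Lucene 10.2
        long newFloor = 16 * MB;  // default since Lucene 10.2

        System.out.println("Below 2MB floor:  " + countBelowFloor(segments, oldFloor));
        System.out.println("Below 16MB floor: " + countBelowFloor(segments, newFloor));
    }
}
```

With these made-up sizes, one segment is below the old 2MB floor and three are below the new 16MB floor, so more of the small segments are grouped together as merge candidates.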

Metadata

    Labels

    :Distributed Indexing/Engine (Anything around managing Lucene and the Translog in an open shard.)
    Team:Distributed Indexing (Meta label for Distributed Indexing team)
