This documents the new thread-pool-based merge scheduler, which is disk-space aware and blocks merges when disk space is low. The code changes were mostly introduced in #120869 and #127613. (#130530)
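
For readers of the settings documented in the diff below, the interaction between `indices.merge.disk.watermark.high` and `indices.merge.disk.watermark.high.max_headroom` can be illustrated with a minimal sketch. This is not the Elasticsearch implementation: the class, method, and parameter names are hypothetical, the watermark is assumed to be given as a whole percentage, and the strict comparison at the threshold is an assumption.

```java
// Illustrative sketch only; all names here are hypothetical and not part of Elasticsearch.
public final class MergeDiskWatermarkSketch {

    static final long GB = 1024L * 1024L * 1024L;

    // Free bytes that must remain available for a new merge to start:
    // the free share implied by the percentage watermark, capped by the max headroom.
    static long requiredFreeBytes(long totalBytes, int highWatermarkPercent, long maxHeadroomBytes) {
        long freeByPercent = totalBytes / 100 * (100 - highWatermarkPercent);
        return Math.min(freeByPercent, maxHeadroomBytes);
    }

    // The usage tally includes the temporary space still needed by running merges,
    // so that estimate is subtracted from the available space before comparing.
    static boolean canStartMerge(long totalBytes, long availableBytes, long reservedByRunningMerges,
                                 int highWatermarkPercent, long maxHeadroomBytes) {
        long effectiveFree = availableBytes - reservedByRunningMerges;
        // Assumption: a merge may start only while effective free space is strictly above the threshold.
        return effectiveFree > requiredFreeBytes(totalBytes, highWatermarkPercent, maxHeadroomBytes);
    }

    public static void main(String[] args) {
        // 1000 GB disk: 5% free is 50 GB, which is below the 100 GB headroom cap.
        System.out.println(requiredFreeBytes(1000 * GB, 95, 100 * GB) / GB);
        // 10240 GB disk: 5% free would be about 512 GB, so the 100 GB cap applies.
        System.out.println(requiredFreeBytes(10240 * GB, 95, 100 * GB) / GB);
        System.out.println(canStartMerge(1000 * GB, 60 * GB, 5 * GB, 95, 100 * GB));
        System.out.println(canStartMerge(1000 * GB, 60 * GB, 20 * GB, 95, 100 * GB));
    }
}
```

Under these assumptions, the headroom only matters on large disks, where a percentage watermark alone would hold back merges while hundreds of gigabytes are still free.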

albertzaharovits · web-flow · commit e310d0a4fe76 · 2025-07-03T19:10:51.000+10:00
diff --git a/docs/reference/index-modules/merge.asciidoc b/docs/reference/index-modules/merge.asciidoc
@@ -14,18 +14,32 @@ resources between merging and other activities like search.
 [[merge-scheduling]]
 === Merge scheduling
 
-The merge scheduler (ConcurrentMergeScheduler) controls the execution of merge
-operations when they are needed. Merges run in separate threads, and when the
-maximum number of threads is reached, further merges will wait until a merge
-thread becomes available.
-
-The merge scheduler supports the following _dynamic_ setting:
-
-`index.merge.scheduler.max_thread_count`::
-
-    The maximum number of threads on a single shard that may be merging at once.
-	Defaults to
-    `Math.max(1, Math.min(4, <<node.processors, node.processors>> / 2))` which
-    works well for a good solid-state-disk (SSD). If your index is on spinning
-    platter drives instead, decrease this to 1.
+The merge scheduler controls the execution of merge operations when they are needed.
+Merges run on the dedicated `merge` thread pool.
+Smaller merges are prioritized over larger ones, across all shards on the node.
+Merges are disk IO throttled, so that bursts of merging that occur while merge activity is otherwise low are smoothed out and do not impact indexing throughput.
+There is no limit on the number of merges that can be enqueued for execution on the thread pool.
+However, beyond a certain per-shard limit, and once merging is no longer disk IO throttled at all, indexing for the shard is itself throttled until merging catches up.
+
+The available disk space is monitored periodically, so that no new merge tasks are scheduled for execution when the available disk space is low.
+This prevents the temporary disk space that merges require while executing from completely filling up the disk on the node.
+
+The merge scheduler supports the following *dynamic* settings:
+
+`index.merge.scheduler.max_thread_count`
+:   The maximum number of threads on a **single** shard that may be merging at once. Defaults to `Math.max(1, Math.min(4, <<node.processors, node.processors>> / 2))` which works well for a good solid-state-disk (SSD). If your index is on spinning platter drives instead, decrease this to 1.
+
+`indices.merge.disk.check_interval`
+:   The time interval for checking the available disk space. Defaults to `5s`.
+
+`indices.merge.disk.watermark.high`
+:   Controls the disk usage watermark beyond which no new merge task can start executing. Defaults to `95%`.
+The disk usage tally includes the estimated temporary disk space still required by all the currently executing merge tasks.
+Any merge task scheduled *before* the limit is reached continues executing, even if the limit is exceeded while it runs
+(merge tasks are not aborted).
+
+`indices.merge.disk.watermark.high.max_headroom`
+:   Controls the max headroom for the merge disk usage watermark when the watermark is specified as a percentage or ratio value.
+Defaults to `100GB` when `indices.merge.disk.watermark.high` is not explicitly set.
+This caps the amount of free disk space below which merge scheduling is blocked.
 
diff --git a/docs/reference/modules/threadpool.asciidoc b/docs/reference/modules/threadpool.asciidoc
@@ -79,10 +79,13 @@ There are several thread pools, but the important ones include:
     default maximum size of `min(5, (`<<node.processors,
     `# of allocated processors`>>`) / 2)`.
 
+`merge`::
+    For [merge](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-merge.html) operations of all the shards on the node.
+    Thread pool type is `scaling` with a keep-alive of `5m` and a default maximum size of [`# of allocated processors`](#node.processors).
+
 `force_merge`::
-    For <<indices-forcemerge,force merge>> operations.
-    Thread pool type is `fixed` with a size of `max(1, (`<<node.processors,
-`# of allocated processors`>>`) / 8)` and an unbounded queue size.
+    For waiting on blocking [force merge](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-forcemerge) operations.
+    Thread pool type is `fixed` with a size of `max(1, (`[`# of allocated processors`](#node.processors)`) / 8)` and an unbounded queue size.
 
 `management`::
     For cluster management.