This documents the new thread-pool-based merge scheduler, which is disk-space aware and blocks merges when disk space is low. The code changes were mostly introduced in #120869 and #127613. (#130530)
docs/reference/index-modules/merge.asciidoc
28 additions & 14 deletions

--- a/docs/reference/index-modules/merge.asciidoc
+++ b/docs/reference/index-modules/merge.asciidoc
@@ -14,18 +14,32 @@ resources between merging and other activities like search.
 [[merge-scheduling]]
 === Merge scheduling
 
-The merge scheduler (ConcurrentMergeScheduler) controls the execution of merge
-operations when they are needed. Merges run in separate threads, and when the
-maximum number of threads is reached, further merges will wait until a merge
-thread becomes available.
-
-The merge scheduler supports the following _dynamic_ setting:
-
-`index.merge.scheduler.max_thread_count`::
-
-The maximum number of threads on a single shard that may be merging at once.
-Defaults to
-`Math.max(1, Math.min(4, <<node.processors, node.processors>> / 2))` which
-works well for a good solid-state-disk (SSD). If your index is on spinning
-platter drives instead, decrease this to 1.
+The merge scheduler controls the execution of merge operations when they are needed.
+Merges run on the dedicated `merge` thread pool.
+Smaller merges are prioritized over larger ones, across all shards on the node.
+Merges are disk IO throttled so that bursts, while merging activity is otherwise low, are smoothed out in order to not impact indexing throughput.
+There is no limit on the number of merges that can be enqueued for execution on the thread pool.
+However, beyond a certain per-shard limit, after merging is completely disk IO un-throttled, indexing for the shard will itself be throttled until merging catches up.
+
+The available disk space is periodically monitored, such that no new merge tasks are scheduled for execution when the available disk space is low.
+This is in order to prevent that the temporary disk space, which is required while merges are executed, completely fills up the disk space on the node.
+
+The merge scheduler supports the following *dynamic* settings:
+
+`index.merge.scheduler.max_thread_count`
+: The maximum number of threads on a **single** shard that may be merging at once. Defaults to `Math.max(1, Math.min(4, <<node.processors, node.processors>> / 2))` which works well for a good solid-state-disk (SSD). If your index is on spinning platter drives instead, decrease this to 1.
+
+`indices.merge.disk.check_interval`
+: The time interval for checking the available disk space. Defaults to `5s`.
+
+`indices.merge.disk.watermark.high`
+: Controls the disk usage watermark, which defaults to `95%`, beyond which no merge tasks can start execution.
+The disk usage tally includes the estimated temporary disk space still required by all the currently executing merge tasks.
+Any merge task scheduled *before* the limit is reached continues execution, even if the limit is exceeded while executing
+(merge tasks are not aborted).
+
+`indices.merge.disk.watermark.high.max_headroom`
+: Controls the max headroom for the merge disk usage watermark, in case it is specified as percentage or ratio values.
+Defaults to `100GB` when `indices.merge.disk.watermark.high` is not explicitly set.
+This caps the amount of free disk space before merge scheduling is blocked.
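
To make the documented defaults concrete, here is a minimal Python sketch of how the settings might combine. It is an illustrative model, not Elasticsearch's implementation. All function names are hypothetical. It assumes that `node.processors / 2` is integer division, that `100GB` means 100 × 1024³ bytes, that the watermark is given as a whole percentage, and that a merge may still start when usage sits exactly at the watermark.

```python
GIB = 1024 ** 3  # assumption: "100GB" is interpreted as 100 * 1024^3 bytes


def default_max_thread_count(node_processors: int) -> int:
    """Model of Math.max(1, Math.min(4, node.processors / 2)).

    Assumes integer division, as in Java int arithmetic.
    """
    return max(1, min(4, node_processors // 2))


def required_free_bytes(
    total_bytes: int,
    high_watermark_percent: int = 95,
    max_headroom_bytes: int = 100 * GIB,
) -> int:
    """Free space that must remain available for new merges to start.

    A percentage watermark implies a headroom of (100 - percent)% of the
    disk; max_headroom caps that headroom on very large disks.
    """
    percent_headroom = total_bytes * (100 - high_watermark_percent) // 100
    return min(percent_headroom, max_headroom_bytes)


def can_start_merge(
    total_bytes: int,
    free_bytes: int,
    running_merges_estimate_bytes: int,
    high_watermark_percent: int = 95,
    max_headroom_bytes: int = 100 * GIB,
) -> bool:
    """Hypothetical check performed before scheduling a new merge task.

    The temporary space still needed by already-running merges counts as
    used, so it is subtracted from the free space first. The >= boundary
    is an assumption of this sketch.
    """
    effective_free = free_bytes - running_merges_estimate_bytes
    threshold = required_free_bytes(
        total_bytes, high_watermark_percent, max_headroom_bytes
    )
    return effective_free >= threshold
```

Under these assumptions, the 5% headroom on a 10,000 GiB disk would be 500 GiB, but the `100GB` cap means new merges are blocked only once effective free space drops below 100 GiB. Merges that are already running are unaffected by this check, matching the note that merge tasks are not aborted.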