ESQL: Split large pages on load sometimes (#131053) #132036

Merged
nik9000 merged 3 commits into elastic:8.19 on Jul 28, 2025

Conversation

@nik9000 (Member) commented on Jul 28, 2025

This adds support for splitting `Page`s of large values when loading from single-segment, non-descending hits. This is the hottest code path, as it's how we load data for aggregation. So! We had to make very, very, very sure this doesn't slow down the fast path of loading doc values.

Caveat: this only defends against loading large values via the row-by-row load mechanism that we use for stored fields and `_source`. That covers the most common kinds of large values, mostly `text` and geo fields. If we need to split further on doc values, we'll have to invent something for them specifically. For now, just row-by-row.

This works by flipping the order in which we load row-by-row and column-at-a-time values. Previously we loaded all of the column-at-a-time values first because that was simpler, then we loaded all of the row-by-row values. Now we save the column-at-a-time values for later and instead load row-by-row until the `Page`'s estimated size is larger than a "jumbo" size, which defaults to a megabyte.
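Roughly, the cutoff looks like this (a toy sketch in plain Java; `JUMBO_BYTES` and the byte-array stand-in for the block builders are made up for illustration and are not the real ESQL classes):

```
/** Toy illustration of the "jumbo" cutoff while loading row-by-row values. */
class JumboCutoffSketch {
    static final long JUMBO_BYTES = 1024 * 1024; // illustrative default: one megabyte

    /** Returns how many rows we load before stopping; the row that crosses the cutoff is kept. */
    static int loadRowByRowUntilJumbo(byte[][] storedFieldValues) {
        long estimatedBytes = 0;
        int rows = 0;
        while (rows < storedFieldValues.length) {
            // Pretend this appends a stored field / _source value to a block builder
            // and asks the builder for the page's estimated size so far.
            estimatedBytes += storedFieldValues[rows].length;
            rows++;
            if (estimatedBytes > JUMBO_BYTES) {
                break; // the page just went "jumbo" - stop loading rows here
            }
        }
        return rows;
    }
}
```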

Once we load enough rows that we estimate the page is "jumbo", we then stop loading rows. The `Page` will look like this:

```
| txt1 | int | txt2 | long | double |
|------|-----|------|------|--------|
| XXXX |     | XXXX |      |        |
| XXXX |     | XXXX |      |        |
| XXXX |     | XXXX |      |        |
| XXXX |     | XXXX |      |        |
| XXXX |     | XXXX |      |        |
| XXXX |     | XXXX |      |        | <-- after loading this row
|      |     |      |      |        |     we crossed to "jumbo" size
|      |     |      |      |        |
|      |     |      |      |        |
|      |     |      |      |        | <-- these rows are entirely empty
|      |     |      |      |        |
|      |     |      |      |        |
```

Then we chop the `Page` down to the last row we loaded:

```
| txt1 | int | txt2 | long | double |
|------|-----|------|------|--------|
| XXXX |     | XXXX |      |        |
| XXXX |     | XXXX |      |        |
| XXXX |     | XXXX |      |        |
| XXXX |     | XXXX |      |        |
| XXXX |     | XXXX |      |        |
| XXXX |     | XXXX |      |        |
```

Then we fill in the column-at-a-time columns:

```
| txt1 | int | txt2 | long | double |
|------|-----|------|------|--------|
| XXXX |   1 | XXXX |   11 |    1.0 |
| XXXX |   2 | XXXX |   22 |   -2.0 |
| XXXX |   3 | XXXX |   33 |    1e9 |
| XXXX |   4 | XXXX |   44 |    913 |
| XXXX |   5 | XXXX |   55 | 0.1234 |
| XXXX |   6 | XXXX |   66 | 3.1415 |
```

And then we return *that* `Page`. On the next `Driver` iteration we start from where we left off.
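Putting the three steps together, here is a self-contained toy sketch of the idea (the `ToyPage` record and every name in it are stand-ins invented for illustration, not the actual compute engine types):

```
import java.util.Arrays;

/** Toy model of the split: load row-by-row until "jumbo", chop, then fill in column-at-a-time values. */
class SplitOnLoadSketch {
    static final long JUMBO_BYTES = 1024 * 1024;

    /** Stand-in for a Page: one text column loaded row-by-row, one int column loaded column-at-a-time. */
    record ToyPage(String[] txt, int[] ints) {}

    /** Loads docs [from, to) and stops early once the page's estimated size goes "jumbo". */
    static ToyPage loadOnePage(String[] txtSource, int[] intDocValues, int from, int to) {
        // 1. Row-by-row values first, stopping after the row that crosses the jumbo size.
        String[] txt = new String[to - from];
        long estimatedBytes = 0;
        int rows = 0;
        for (int doc = from; doc < to; doc++) {
            txt[rows] = txtSource[doc];
            estimatedBytes += txtSource[doc].length();
            rows++;
            if (estimatedBytes > JUMBO_BYTES) {
                break; // page is "jumbo" - leave the remaining docs for the next iteration
            }
        }
        // 2. Chop the row-by-row columns down to the rows we actually loaded.
        txt = Arrays.copyOf(txt, rows);
        // 3. Fill in the column-at-a-time columns, but only for the surviving rows.
        int[] ints = Arrays.copyOfRange(intDocValues, from, from + rows);
        return new ToyPage(txt, ints);
    }

    public static void main(String[] args) {
        String[] txtSource = new String[20];
        int[] intDocValues = new int[20];
        for (int i = 0; i < 20; i++) {
            txtSource[i] = "x".repeat(300_000); // ~300 KB per value, so roughly 4 rows per page
            intDocValues[i] = i;
        }
        // The driver loop: each iteration picks up where the previous page left off.
        int next = 0;
        while (next < txtSource.length) {
            ToyPage page = loadOnePage(txtSource, intDocValues, next, txtSource.length);
            next += page.txt().length;
            System.out.println("page of " + page.txt().length + " rows, next doc is " + next);
        }
    }
}
```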

@nik9000 merged commit 6f2578e into elastic:8.19 on Jul 28, 2025
21 of 22 checks passed