@liamzwbao liamzwbao commented Oct 28, 2025

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

  • Updated RleDecoder::reload to return Result instead of panicking.
  • Adjusted all callers to handle the new return type accordingly.
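For readers unfamiliar with the pattern, here is a minimal sketch of what "return Result instead of panicking" looks like in general. It is not the code from this PR: `ToyRleDecoder`, `DecodeError`, and the (length, value) byte-pair format are hypothetical stand-ins, and the real `RleDecoder` decodes Parquet's RLE/bit-packed hybrid format with the crate's own error type.

```rust
use std::fmt;

/// Hypothetical error type standing in for the crate's real error enum.
#[derive(Debug, PartialEq)]
enum DecodeError {
    Eof(String),
}

impl fmt::Display for DecodeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DecodeError::Eof(msg) => write!(f, "EOF: {msg}"),
        }
    }
}

impl std::error::Error for DecodeError {}

type Result<T> = std::result::Result<T, DecodeError>;

/// Toy decoder (assumption): input is a sequence of (run length, value) byte pairs.
struct ToyRleDecoder<'a> {
    data: &'a [u8],
    pos: usize,
    run_len: u8,
    value: u8,
}

impl<'a> ToyRleDecoder<'a> {
    fn new(data: &'a [u8]) -> Self {
        Self { data, pos: 0, run_len: 0, value: 0 }
    }

    /// Loads the next run header.
    /// Returns Ok(false) at a clean end of input and Err on a truncated
    /// header, where a panicking version would have aborted instead.
    fn reload(&mut self) -> Result<bool> {
        let data: &'a [u8] = self.data;
        match &data[self.pos..] {
            [] => Ok(false),
            [_] => Err(DecodeError::Eof("truncated run header".to_string())),
            [len, value, ..] => {
                self.run_len = *len;
                self.value = *value;
                self.pos += 2;
                Ok(true)
            }
        }
    }

    /// A caller adjusted for the new return type: `?` forwards the error.
    fn get_batch(&mut self, out: &mut Vec<u8>) -> Result<usize> {
        let mut read = 0;
        while self.reload()? {
            out.extend(std::iter::repeat(self.value).take(self.run_len as usize));
            read += self.run_len as usize;
        }
        Ok(read)
    }
}
```

With this shape, malformed input surfaces as an `Err` that the caller can report, and each call site only needs a `?` (plus a `Result` return type) to pass it along.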

Are these changes tested?

Covered by existing tests

Are there any user-facing changes?

No

@github-actions github-actions bot added the parquet Changes to the parquet crate label Oct 28, 2025
@liamzwbao liamzwbao marked this pull request as ready for review October 28, 2025 01:43
@liamzwbao
Contributor Author

Hi @alamb @etseidl, this PR is ready for review. PTAL, thanks!

I ran local benchmarks: most test cases showed slight improvements and some showed regressions, but the results varied across runs. I don't believe this change brings any meaningful performance gain or loss overall; the differences are likely measurement noise.

Contributor

@alamb alamb left a comment

Thank you @liamzwbao -- this looks good to me.

I will also run some benchmarks to confirm, but I also don't expect to see anything other than noise

decoder: DictIndexDecoder::new(data, num_levels, num_values),
}
fn new(data: Bytes, num_levels: usize, num_values: Option<usize>) -> Result<Self> {
Ok(Self {

I verified this is a crate-private structure, so this is not a public API change.

alamb commented Oct 28, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1017-gcp #18~24.04.1-Ubuntu SMP Tue Sep 23 17:51:44 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing issue-8632-rle-fix (7e8f0e3) to 6c3e588 diff
BENCH_NAME=arrow_reader
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader
BENCH_FILTER=
BENCH_BRANCH_NAME=issue-8632-rle-fix
Results will be posted here when complete

alamb commented Oct 28, 2025

🤖: Benchmark completed

Details

group                                                                                                      issue-8632-rle-fix                     main
-----                                                                                                      ------------------                     ----
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs                           1.16   1271.5±3.97µs        ? ?/sec    1.00   1097.1±2.03µs        ? ?/sec
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs                          1.07   1285.4±7.96µs        ? ?/sec    1.00  1198.3±13.18µs        ? ?/sec
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                            1.16   1278.4±9.65µs        ? ?/sec    1.00   1104.8±4.17µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, mandatory, no NULLs                                     1.06    504.4±7.65µs        ? ?/sec    1.00    476.1±3.71µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, optional, half NULLs                                    1.00    663.3±1.80µs        ? ?/sec    1.02    676.2±3.06µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, optional, no NULLs                                      1.06    508.3±2.56µs        ? ?/sec    1.00    479.9±3.29µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, mandatory, no NULLs                                          1.00    527.5±3.30µs        ? ?/sec    1.08    567.1±2.88µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, optional, half NULLs                                         1.00    725.1±2.28µs        ? ?/sec    1.02    739.6±2.17µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, optional, no NULLs                                           1.00    540.6±2.30µs        ? ?/sec    1.07    580.1±5.07µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, mandatory, no NULLs                                 1.00    238.3±2.83µs        ? ?/sec    1.15    273.3±2.22µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, optional, half NULLs                                1.03    253.3±1.00µs        ? ?/sec    1.00    246.2±0.85µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, optional, no NULLs                                  1.00    244.2±3.20µs        ? ?/sec    1.18    288.8±3.38µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs                                      1.00    293.6±1.55µs        ? ?/sec    1.20    351.1±3.31µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs, short string                        1.00    283.5±1.12µs        ? ?/sec    1.22    344.7±1.20µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, optional, half NULLs                                     1.00    285.0±0.99µs        ? ?/sec    1.01    287.0±1.46µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, optional, no NULLs                                       1.00    301.0±3.04µs        ? ?/sec    1.24    373.8±2.95µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs     1.06   1035.6±3.40µs        ? ?/sec    1.00    981.4±4.18µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, half NULLs    1.03    861.3±3.20µs        ? ?/sec    1.00    839.1±2.48µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, no NULLs      1.05   1043.0±2.52µs        ? ?/sec    1.00    989.8±3.42µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs                 1.37    415.3±2.78µs        ? ?/sec    1.00    304.0±2.76µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs                1.16    548.6±1.94µs        ? ?/sec    1.00    474.9±2.53µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                  1.35    419.1±1.97µs        ? ?/sec    1.00    310.6±2.39µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, mandatory, no NULLs        1.00    154.5±0.54µs        ? ?/sec    1.31    202.3±0.36µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, half NULLs       1.00    286.3±2.91µs        ? ?/sec    1.20    342.5±0.68µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, no NULLs         1.00    159.7±1.66µs        ? ?/sec    1.30    207.7±0.55µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, mandatory, no NULLs                    1.00     78.7±0.24µs        ? ?/sec    1.51    118.7±0.34µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, half NULLs                   1.00    249.8±0.65µs        ? ?/sec    1.20    300.2±0.87µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, no NULLs                     1.00     84.8±0.39µs        ? ?/sec    1.47    124.7±0.40µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, mandatory, no NULLs                    1.00    688.1±1.95µs        ? ?/sec    1.07    735.0±1.64µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, optional, half NULLs                   1.00    512.1±1.68µs        ? ?/sec    1.14    584.6±2.93µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, optional, no NULLs                     1.00    696.3±4.62µs        ? ?/sec    1.07    741.7±2.27µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, mandatory, no NULLs                                1.17     64.6±4.54µs        ? ?/sec    1.00     55.1±6.05µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, optional, half NULLs                               1.00    205.4±1.35µs        ? ?/sec    1.19    244.5±1.92µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, optional, no NULLs                                 1.23     76.0±6.78µs        ? ?/sec    1.00     61.9±4.98µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, mandatory, no NULLs                     1.00     86.3±0.50µs        ? ?/sec    1.09     94.1±0.18µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, optional, half NULLs                    1.00    217.6±0.82µs        ? ?/sec    1.08    235.4±0.45µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, optional, no NULLs                      1.00     91.7±0.39µs        ? ?/sec    1.09     99.5±0.36µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, mandatory, no NULLs                                 1.01      9.6±0.24µs        ? ?/sec    1.00      9.5±0.14µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, optional, half NULLs                                1.00    179.2±0.48µs        ? ?/sec    1.08    192.6±0.60µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, optional, no NULLs                                  1.02     15.0±0.31µs        ? ?/sec    1.00     14.7±0.16µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, mandatory, no NULLs                     1.00    170.1±0.79µs        ? ?/sec    1.08    184.4±0.88µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, optional, half NULLs                    1.00    334.0±1.04µs        ? ?/sec    1.03    345.0±0.99µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, optional, no NULLs                      1.00    176.0±0.45µs        ? ?/sec    1.08    190.6±2.34µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, mandatory, no NULLs                                 1.13     14.7±0.27µs        ? ?/sec    1.00     13.0±0.39µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, optional, half NULLs                                1.00    258.6±0.74µs        ? ?/sec    1.00    259.7±1.04µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, optional, no NULLs                                  1.00     20.2±0.29µs        ? ?/sec    1.00     20.1±0.81µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, mandatory, no NULLs                     1.00    340.7±2.15µs        ? ?/sec    1.07    363.5±1.14µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, optional, half NULLs                    1.00    330.2±2.81µs        ? ?/sec    1.19    391.9±1.49µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, optional, no NULLs                      1.00    347.2±2.36µs        ? ?/sec    1.07    371.6±1.12µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, mandatory, no NULLs                                 1.08     25.1±0.43µs        ? ?/sec    1.00     23.2±0.47µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, optional, half NULLs                                1.00    171.0±0.65µs        ? ?/sec    1.27    217.7±0.71µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, optional, no NULLs                                  1.13     33.0±0.26µs        ? ?/sec    1.00     29.2±0.46µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, mandatory, no NULLs                           1.00    122.1±0.38µs        ? ?/sec    1.03    126.0±0.79µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, optional, half NULLs                          1.00    122.7±0.55µs        ? ?/sec    1.01    123.8±1.22µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, optional, no NULLs                            1.00    124.2±0.32µs        ? ?/sec    1.03    128.1±0.28µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, mandatory, no NULLs                                1.00    173.2±0.30µs        ? ?/sec    1.06    183.1±1.09µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, optional, half NULLs                               1.00    205.6±0.72µs        ? ?/sec    1.01    208.6±1.34µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, optional, no NULLs                                 1.00    180.6±0.49µs        ? ?/sec    1.04    188.4±1.11µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs                    1.02     77.5±0.20µs        ? ?/sec    1.00     75.6±0.19µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, optional, half NULLs                   1.00    151.6±0.63µs        ? ?/sec    1.02    154.9±0.71µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, optional, no NULLs                     1.02     83.8±0.28µs        ? ?/sec    1.00     82.0±0.43µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, mandatory, no NULLs                           1.00    137.3±0.89µs        ? ?/sec    1.01    138.5±0.53µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, optional, half NULLs                          1.00    187.4±0.49µs        ? ?/sec    1.00    187.3±1.30µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, optional, no NULLs                            1.00    143.1±0.44µs        ? ?/sec    1.01    144.0±0.82µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, mandatory, no NULLs                                1.00     73.1±0.26µs        ? ?/sec    1.01     74.2±0.35µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, optional, half NULLs                               1.00    149.8±0.49µs        ? ?/sec    1.01    151.9±0.63µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, optional, no NULLs                                 1.00     77.8±0.24µs        ? ?/sec    1.03     79.8±0.49µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, mandatory, no NULLs                           1.00    113.7±0.36µs        ? ?/sec    1.02    115.7±1.43µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, optional, half NULLs                          1.01    134.6±1.41µs        ? ?/sec    1.00    133.6±0.57µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, optional, no NULLs                            1.00    117.2±0.38µs        ? ?/sec    1.00    116.9±1.38µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, mandatory, no NULLs                                1.00    170.4±0.71µs        ? ?/sec    1.00    170.3±0.58µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, optional, half NULLs                               1.00    230.1±1.30µs        ? ?/sec    1.03    236.9±3.05µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, optional, no NULLs                                 1.00    174.3±0.46µs        ? ?/sec    1.01    176.6±2.39µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs                    1.00    204.3±0.52µs        ? ?/sec    1.00    203.5±0.39µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, optional, half NULLs                   1.01    252.8±2.31µs        ? ?/sec    1.00    249.9±0.61µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, optional, no NULLs                     1.01    211.3±0.46µs        ? ?/sec    1.00    209.8±0.43µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, mandatory, no NULLs                           1.07    155.1±0.38µs        ? ?/sec    1.00    145.4±0.50µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, optional, half NULLs                          1.02    223.1±2.23µs        ? ?/sec    1.00    218.5±0.53µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, optional, no NULLs                            1.04    157.9±0.48µs        ? ?/sec    1.00    152.4±0.65µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, mandatory, no NULLs                                1.02    111.9±1.26µs        ? ?/sec    1.00    109.7±0.42µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, optional, half NULLs                               1.01    199.9±1.44µs        ? ?/sec    1.00    198.8±0.62µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, optional, no NULLs                                 1.02    121.1±1.81µs        ? ?/sec    1.00    118.8±1.37µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, mandatory, no NULLs                                      1.00     96.7±0.15µs        ? ?/sec    1.01     97.9±0.25µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, optional, half NULLs                                     1.00    101.3±0.56µs        ? ?/sec    1.00    101.2±0.39µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, optional, no NULLs                                       1.00    100.0±2.66µs        ? ?/sec    1.01    100.6±0.28µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, mandatory, no NULLs                                           1.00    132.2±0.19µs        ? ?/sec    1.03    135.7±0.42µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, optional, half NULLs                                          1.00    163.7±0.38µs        ? ?/sec    1.01    165.1±0.47µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, optional, no NULLs                                            1.00    136.7±0.63µs        ? ?/sec    1.02    140.0±0.37µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, mandatory, no NULLs                               1.00     44.0±0.14µs        ? ?/sec    1.00     44.2±0.11µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, optional, half NULLs                              1.01    118.9±0.37µs        ? ?/sec    1.00    117.4±0.24µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, optional, no NULLs                                1.00     48.5±0.16µs        ? ?/sec    1.00     48.7±0.14µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, mandatory, no NULLs                                      1.00    103.3±0.52µs        ? ?/sec    1.02    105.8±0.28µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, optional, half NULLs                                     1.00    150.8±0.36µs        ? ?/sec    1.01    152.6±0.92µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, optional, no NULLs                                       1.00    107.8±0.49µs        ? ?/sec    1.02    109.9±0.26µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, mandatory, no NULLs                                           1.01     38.4±0.11µs        ? ?/sec    1.00     38.2±0.12µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, optional, half NULLs                                          1.00    114.5±0.38µs        ? ?/sec    1.01    115.1±0.40µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, optional, no NULLs                                            1.01     43.4±0.12µs        ? ?/sec    1.00     43.0±0.22µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, mandatory, no NULLs                                      1.00     95.6±0.29µs        ? ?/sec    1.04     99.1±0.18µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, optional, half NULLs                                     1.00     95.0±0.47µs        ? ?/sec    1.02     96.9±0.26µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, optional, no NULLs                                       1.00     97.9±0.39µs        ? ?/sec    1.04    101.8±0.72µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, mandatory, no NULLs                                           1.00    122.6±0.73µs        ? ?/sec    1.08    131.9±1.16µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, optional, half NULLs                                          1.00    151.7±0.21µs        ? ?/sec    1.01    153.7±0.37µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, optional, no NULLs                                            1.00    126.6±0.48µs        ? ?/sec    1.07    135.9±0.27µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, mandatory, no NULLs                               1.02     26.6±0.28µs        ? ?/sec    1.00     26.0±0.20µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, optional, half NULLs                              1.00     99.6±0.23µs        ? ?/sec    1.00     99.9±0.36µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, optional, no NULLs                                1.01     30.9±0.25µs        ? ?/sec    1.00     30.5±0.24µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, mandatory, no NULLs                                      1.00     85.4±0.38µs        ? ?/sec    1.03     88.3±0.46µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, optional, half NULLs                                     1.00    132.2±0.32µs        ? ?/sec    1.02    134.2±0.36µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, optional, no NULLs                                       1.00     89.8±0.31µs        ? ?/sec    1.03     92.3±0.30µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, mandatory, no NULLs                                           1.03     18.5±0.46µs        ? ?/sec    1.00     18.0±0.52µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, optional, half NULLs                                          1.01     95.9±0.37µs        ? ?/sec    1.00     95.0±0.79µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, optional, no NULLs                                            1.00     24.7±0.47µs        ? ?/sec    1.00     24.7±0.45µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, mandatory, no NULLs                                      1.00     87.0±0.67µs        ? ?/sec    1.00     87.1±0.33µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, optional, half NULLs                                     1.00    105.6±0.47µs        ? ?/sec    1.01    106.1±1.05µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, optional, no NULLs                                       1.00     88.6±0.88µs        ? ?/sec    1.01     89.5±0.54µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, mandatory, no NULLs                                           1.00    115.9±0.49µs        ? ?/sec    1.00    115.7±0.40µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, optional, half NULLs                                          1.00    174.2±0.49µs        ? ?/sec    1.00    174.6±0.66µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, optional, no NULLs                                            1.00    119.1±0.62µs        ? ?/sec    1.00    119.1±0.83µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, mandatory, no NULLs                               1.00    149.5±0.39µs        ? ?/sec    1.00    149.5±0.28µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, optional, half NULLs                              1.00    194.9±0.49µs        ? ?/sec    1.00    194.1±0.61µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, optional, no NULLs                                1.00    154.9±0.50µs        ? ?/sec    1.00    154.7±0.37µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, mandatory, no NULLs                                      1.08     99.6±0.96µs        ? ?/sec    1.00     92.2±0.82µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, optional, half NULLs                                     1.02    167.0±2.29µs        ? ?/sec    1.00    163.8±0.35µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, optional, no NULLs                                       1.07    104.4±0.59µs        ? ?/sec    1.00     97.6±0.49µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, mandatory, no NULLs                                           1.14     48.1±3.10µs        ? ?/sec    1.00     42.2±0.68µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, optional, half NULLs                                          1.00    137.6±0.59µs        ? ?/sec    1.00    137.3±0.53µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, optional, no NULLs                                            1.12     54.2±3.97µs        ? ?/sec    1.00     48.5±0.98µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, mandatory, no NULLs                                       1.00     91.8±0.49µs        ? ?/sec    1.05     96.6±2.45µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, optional, half NULLs                                      1.00     96.1±1.11µs        ? ?/sec    1.01     97.3±0.56µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, optional, no NULLs                                        1.00     94.3±0.72µs        ? ?/sec    1.04     98.5±0.27µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, mandatory, no NULLs                                            1.00    123.0±0.26µs        ? ?/sec    1.08    133.4±0.46µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, optional, half NULLs                                           1.00    155.6±0.31µs        ? ?/sec    1.02    158.5±0.43µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, optional, no NULLs                                             1.00    127.2±0.69µs        ? ?/sec    1.08    137.5±0.29µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, mandatory, no NULLs                                1.05     36.2±0.21µs        ? ?/sec    1.00     34.4±0.17µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, optional, half NULLs                               1.02    111.3±0.79µs        ? ?/sec    1.00    108.9±0.32µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, optional, no NULLs                                 1.00     40.8±0.14µs        ? ?/sec    1.01     41.0±0.20µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, mandatory, no NULLs                                       1.00     95.6±0.32µs        ? ?/sec    1.02     97.8±0.30µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, optional, half NULLs                                      1.00    142.4±0.43µs        ? ?/sec    1.01    144.1±0.37µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, optional, no NULLs                                        1.00    100.4±0.36µs        ? ?/sec    1.02    102.2±0.25µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, mandatory, no NULLs                                            1.00     30.5±0.07µs        ? ?/sec    1.00     30.5±0.22µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, optional, half NULLs                                           1.00    106.7±0.76µs        ? ?/sec    1.01    107.8±0.27µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, optional, no NULLs                                             1.00     35.3±0.13µs        ? ?/sec    1.00     35.4±0.29µs        ? ?/sec
arrow_array_reader/ListArray/plain encoded optional strings half NULLs                                     1.03      7.2±0.03ms        ? ?/sec    1.00      7.0±0.04ms        ? ?/sec
arrow_array_reader/ListArray/plain encoded optional strings no NULLs                                       1.00     12.9±0.12ms        ? ?/sec    1.00     13.0±0.13ms        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, mandatory, no NULLs                                     1.07    512.8±4.71µs        ? ?/sec    1.00    480.8±2.95µs        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, optional, half NULLs                                    1.00    662.9±1.88µs        ? ?/sec    1.00    662.6±2.17µs        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, optional, no NULLs                                      1.05   512.9±10.41µs        ? ?/sec    1.00    486.4±3.21µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, mandatory, no NULLs                                          1.00    638.2±1.81µs        ? ?/sec    1.15    736.8±3.84µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, optional, half NULLs                                         1.00    779.6±3.12µs        ? ?/sec    1.05   819.2±13.58µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, optional, no NULLs                                           1.00    648.7±2.02µs        ? ?/sec    1.14    742.7±3.61µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, mandatory, no NULLs                                1.00    300.3±1.30µs        ? ?/sec    1.00    300.3±1.31µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, optional, half NULLs                               1.06    379.6±1.52µs        ? ?/sec    1.00    356.9±1.60µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, optional, no NULLs                                 1.02    312.4±1.11µs        ? ?/sec    1.00    305.3±1.59µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, mandatory, no NULLs                                 1.00    239.5±3.32µs        ? ?/sec    1.14    272.7±2.43µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, optional, half NULLs                                1.02    244.7±0.52µs        ? ?/sec    1.00    238.9±1.35µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, optional, no NULLs                                  1.00    237.5±2.26µs        ? ?/sec    1.19    282.2±3.56µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, mandatory, no NULLs                                      1.11    501.3±1.96µs        ? ?/sec    1.00    452.6±6.23µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, optional, half NULLs                                     1.16    389.4±1.59µs        ? ?/sec    1.00    334.6±0.96µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, optional, no NULLs                                       1.10    509.8±1.79µs        ? ?/sec    1.00    463.1±1.93µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, mandatory, no NULLs                                     1.01    106.9±0.25µs        ? ?/sec    1.00    105.6±0.20µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, optional, half NULLs                                    1.01    106.3±0.19µs        ? ?/sec    1.00    105.4±0.44µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, optional, no NULLs                                      1.01    109.3±0.23µs        ? ?/sec    1.00    108.0±0.18µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, mandatory, no NULLs                                          1.00    140.2±0.26µs        ? ?/sec    1.05    147.2±0.44µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, optional, half NULLs                                         1.00    169.6±0.80µs        ? ?/sec    1.02    172.5±1.31µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, optional, no NULLs                                           1.00    144.9±0.35µs        ? ?/sec    1.04    151.0±0.45µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, mandatory, no NULLs                              1.00     42.3±0.15µs        ? ?/sec    1.04     44.1±0.10µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, optional, half NULLs                             1.02    118.0±0.40µs        ? ?/sec    1.00    116.2±0.51µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, optional, no NULLs                               1.00     47.1±0.08µs        ? ?/sec    1.03     48.5±0.24µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, mandatory, no NULLs                                     1.00    102.6±1.48µs        ? ?/sec    1.03    105.7±0.38µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, optional, half NULLs                                    1.00    150.6±0.29µs        ? ?/sec    1.01    151.8±0.58µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, optional, no NULLs                                      1.00    107.5±0.28µs        ? ?/sec    1.02    109.9±0.24µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, mandatory, no NULLs                                          1.00     38.3±0.09µs        ? ?/sec    1.00     38.3±0.15µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, optional, half NULLs                                         1.00    113.9±0.33µs        ? ?/sec    1.01    115.5±0.29µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, optional, no NULLs                                           1.00     43.3±0.50µs        ? ?/sec    1.00     43.4±0.17µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, mandatory, no NULLs                                     1.00     95.2±0.82µs        ? ?/sec    1.04     99.3±0.23µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, optional, half NULLs                                    1.00     94.9±0.25µs        ? ?/sec    1.02     97.1±0.77µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, optional, no NULLs                                      1.00     98.4±0.39µs        ? ?/sec    1.04    101.9±0.61µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, mandatory, no NULLs                                          1.00    123.1±0.43µs        ? ?/sec    1.07    132.0±0.32µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, optional, half NULLs                                         1.00    150.5±0.37µs        ? ?/sec    1.01    152.6±0.72µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, optional, no NULLs                                           1.00    127.2±0.41µs        ? ?/sec    1.07    136.6±0.69µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, mandatory, no NULLs                              1.10     26.8±0.35µs        ? ?/sec    1.00     24.4±0.25µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, optional, half NULLs                             1.00     98.9±0.96µs        ? ?/sec    1.01     99.5±0.31µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, optional, no NULLs                               1.02     30.7±0.25µs        ? ?/sec    1.00     30.1±0.29µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, mandatory, no NULLs                                     1.00     85.6±0.29µs        ? ?/sec    1.03     87.9±0.62µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, optional, half NULLs                                    1.00    133.4±1.49µs        ? ?/sec    1.01    134.1±0.32µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, optional, no NULLs                                      1.00     90.3±0.42µs        ? ?/sec    1.02     92.0±0.52µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, mandatory, no NULLs                                          1.01     21.4±0.65µs        ? ?/sec    1.00     21.2±0.45µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, optional, half NULLs                                         1.00     97.4±0.32µs        ? ?/sec    1.00     97.6±0.73µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, optional, no NULLs                                           1.02     26.5±0.78µs        ? ?/sec    1.00     26.0±0.51µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, mandatory, no NULLs                                     1.00     86.3±0.29µs        ? ?/sec    1.01     86.9±0.25µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, optional, half NULLs                                    1.00    105.5±0.63µs        ? ?/sec    1.00    105.9±0.33µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, optional, no NULLs                                      1.00     88.8±0.43µs        ? ?/sec    1.01     89.4±0.28µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, mandatory, no NULLs                                          1.01    116.3±0.49µs        ? ?/sec    1.00    115.6±0.32µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, optional, half NULLs                                         1.04    180.9±1.89µs        ? ?/sec    1.00    173.7±0.96µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, optional, no NULLs                                           1.01    120.4±0.50µs        ? ?/sec    1.00    118.8±0.49µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, mandatory, no NULLs                              1.00    148.6±0.22µs        ? ?/sec    1.00    149.1±0.63µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, optional, half NULLs                             1.00    194.8±0.62µs        ? ?/sec    1.00    195.0±0.47µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, optional, no NULLs                               1.00    153.8±0.32µs        ? ?/sec    1.01    154.7±0.41µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, mandatory, no NULLs                                     1.07     99.6±0.65µs        ? ?/sec    1.00     93.0±0.71µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, optional, half NULLs                                    1.01    165.8±0.38µs        ? ?/sec    1.00    164.1±0.56µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, optional, no NULLs                                      1.06    103.9±0.45µs        ? ?/sec    1.00     98.3±1.08µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, mandatory, no NULLs                                          1.15     48.9±1.94µs        ? ?/sec    1.00     42.6±0.73µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, optional, half NULLs                                         1.00    138.5±0.66µs        ? ?/sec    1.00    138.1±0.70µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, optional, no NULLs                                           1.10     55.0±2.00µs        ? ?/sec    1.00     50.0±0.54µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, mandatory, no NULLs                                      1.00    101.0±0.40µs        ? ?/sec    1.02    103.0±0.20µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, optional, half NULLs                                     1.00    100.8±0.15µs        ? ?/sec    1.00    100.8±0.87µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, optional, no NULLs                                       1.01    105.9±0.29µs        ? ?/sec    1.00    104.6±0.18µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, mandatory, no NULLs                                           1.00    135.7±0.36µs        ? ?/sec    1.04    140.9±0.34µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, optional, half NULLs                                          1.00    162.7±0.79µs        ? ?/sec    1.01    164.3±0.65µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, optional, no NULLs                                            1.00    140.2±0.77µs        ? ?/sec    1.04    145.3±0.53µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, mandatory, no NULLs                               1.05     36.2±0.08µs        ? ?/sec    1.00     34.4±0.28µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, optional, half NULLs                              1.01    110.6±0.18µs        ? ?/sec    1.00    109.8±0.27µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, optional, no NULLs                                1.00     40.7±0.07µs        ? ?/sec    1.00     40.6±0.13µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, mandatory, no NULLs                                      1.00     95.5±0.19µs        ? ?/sec    1.02     97.5±0.26µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, optional, half NULLs                                     1.00    143.7±1.05µs        ? ?/sec    1.00    144.2±0.33µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, optional, no NULLs                                       1.00    100.1±0.62µs        ? ?/sec    1.02    102.1±0.38µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, mandatory, no NULLs                                           1.00     30.2±0.06µs        ? ?/sec    1.01     30.4±0.15µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, optional, half NULLs                                          1.00    107.2±0.29µs        ? ?/sec    1.00    107.4±0.35µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, optional, no NULLs                                            1.00     35.2±0.23µs        ? ?/sec    1.00     35.0±0.11µs        ? ?/sec

@alamb
Contributor

alamb commented Oct 28, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1017-gcp #18~24.04.1-Ubuntu SMP Tue Sep 23 17:51:44 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing issue-8632-rle-fix (7e8f0e3) to 6c3e588 diff
BENCH_NAME=arrow_reader_clickbench
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_clickbench
BENCH_FILTER=
BENCH_BRANCH_NAME=issue-8632-rle-fix
Results will be posted here when complete

@alamb
Contributor

alamb commented Oct 28, 2025

🤖: Benchmark completed

group                                issue-8632-rle-fix                     main
-----                                ------------------                     ----
arrow_reader_clickbench/async/Q1     1.00      2.4±0.05ms        ? ?/sec    1.01      2.4±0.03ms        ? ?/sec
arrow_reader_clickbench/async/Q10    1.00     12.3±0.23ms        ? ?/sec    1.03     12.7±0.30ms        ? ?/sec
arrow_reader_clickbench/async/Q11    1.00     14.1±0.25ms        ? ?/sec    1.03     14.5±0.37ms        ? ?/sec
arrow_reader_clickbench/async/Q12    1.00     27.1±0.30ms        ? ?/sec    1.18     31.9±0.31ms        ? ?/sec
arrow_reader_clickbench/async/Q13    1.00     38.4±0.24ms        ? ?/sec    1.17     45.1±0.31ms        ? ?/sec
arrow_reader_clickbench/async/Q14    1.00     36.3±0.33ms        ? ?/sec    1.15     41.7±0.33ms        ? ?/sec
arrow_reader_clickbench/async/Q19    1.00      5.5±0.10ms        ? ?/sec    1.01      5.6±0.09ms        ? ?/sec
arrow_reader_clickbench/async/Q20    1.00    119.4±0.55ms        ? ?/sec    1.34   160.5±14.16ms        ? ?/sec
arrow_reader_clickbench/async/Q21    1.00    138.8±0.80ms        ? ?/sec    1.16   161.2±26.40ms        ? ?/sec
arrow_reader_clickbench/async/Q22    1.00    269.5±5.42ms        ? ?/sec    1.08   292.1±11.89ms        ? ?/sec
arrow_reader_clickbench/async/Q23    1.00    423.7±1.74ms        ? ?/sec    1.00    423.4±3.33ms        ? ?/sec
arrow_reader_clickbench/async/Q24    1.00     43.0±0.47ms        ? ?/sec    1.00     43.2±0.41ms        ? ?/sec
arrow_reader_clickbench/async/Q27    1.00    102.8±0.48ms        ? ?/sec    1.00    102.9±0.49ms        ? ?/sec
arrow_reader_clickbench/async/Q28    1.00    103.2±1.47ms        ? ?/sec    1.00    103.3±0.56ms        ? ?/sec
arrow_reader_clickbench/async/Q30    1.01     52.9±0.38ms        ? ?/sec    1.00     52.6±0.32ms        ? ?/sec
arrow_reader_clickbench/async/Q36    1.00    124.0±0.72ms        ? ?/sec    1.00    123.4±0.56ms        ? ?/sec
arrow_reader_clickbench/async/Q37    1.01     99.1±0.69ms        ? ?/sec    1.00     98.3±0.35ms        ? ?/sec
arrow_reader_clickbench/async/Q38    1.01     37.1±0.31ms        ? ?/sec    1.00     36.9±0.32ms        ? ?/sec
arrow_reader_clickbench/async/Q39    1.00     47.7±0.28ms        ? ?/sec    1.01     48.1±0.47ms        ? ?/sec
arrow_reader_clickbench/async/Q40    1.00     45.8±0.34ms        ? ?/sec    1.00     45.9±0.69ms        ? ?/sec
arrow_reader_clickbench/async/Q41    1.00     35.9±0.34ms        ? ?/sec    1.00     35.9±0.42ms        ? ?/sec
arrow_reader_clickbench/async/Q42    1.00     13.7±0.09ms        ? ?/sec    1.01     13.8±0.11ms        ? ?/sec
arrow_reader_clickbench/sync/Q1      1.01      2.1±0.01ms        ? ?/sec    1.00      2.1±0.01ms        ? ?/sec
arrow_reader_clickbench/sync/Q10     1.02      8.9±0.05ms        ? ?/sec    1.00      8.8±0.05ms        ? ?/sec
arrow_reader_clickbench/sync/Q11     1.03     10.7±0.29ms        ? ?/sec    1.00     10.4±0.08ms        ? ?/sec
arrow_reader_clickbench/sync/Q12     1.00     37.6±0.67ms        ? ?/sec    1.00     37.8±0.29ms        ? ?/sec
arrow_reader_clickbench/sync/Q13     1.00     48.7±0.39ms        ? ?/sec    1.00     48.6±0.33ms        ? ?/sec
arrow_reader_clickbench/sync/Q14     1.00     46.8±0.35ms        ? ?/sec    1.00     46.9±0.46ms        ? ?/sec
arrow_reader_clickbench/sync/Q19     1.02      4.3±0.02ms        ? ?/sec    1.00      4.2±0.02ms        ? ?/sec
arrow_reader_clickbench/sync/Q20     1.00    177.5±1.28ms        ? ?/sec    1.00    177.7±0.79ms        ? ?/sec
arrow_reader_clickbench/sync/Q21     1.00    239.7±1.82ms        ? ?/sec    1.01    241.2±0.89ms        ? ?/sec
arrow_reader_clickbench/sync/Q22     1.00    482.5±3.09ms        ? ?/sec    1.00    483.8±5.67ms        ? ?/sec
arrow_reader_clickbench/sync/Q23     1.00   437.8±14.74ms        ? ?/sec    1.01   440.3±14.08ms        ? ?/sec
arrow_reader_clickbench/sync/Q24     1.00     49.8±0.58ms        ? ?/sec    1.00     49.8±0.67ms        ? ?/sec
arrow_reader_clickbench/sync/Q27     1.00    152.3±0.89ms        ? ?/sec    1.00    152.8±1.44ms        ? ?/sec
arrow_reader_clickbench/sync/Q28     1.00    148.1±0.85ms        ? ?/sec    1.00    148.8±1.17ms        ? ?/sec
arrow_reader_clickbench/sync/Q30     1.01     51.1±0.34ms        ? ?/sec    1.00     50.5±0.39ms        ? ?/sec
arrow_reader_clickbench/sync/Q36     1.01    155.3±1.24ms        ? ?/sec    1.00    153.3±1.36ms        ? ?/sec
arrow_reader_clickbench/sync/Q37     1.01     90.0±0.47ms        ? ?/sec    1.00     89.4±0.48ms        ? ?/sec
arrow_reader_clickbench/sync/Q38     1.00     29.5±0.25ms        ? ?/sec    1.00     29.4±0.20ms        ? ?/sec
arrow_reader_clickbench/sync/Q39     1.01     34.3±0.37ms        ? ?/sec    1.00     33.9±0.38ms        ? ?/sec
arrow_reader_clickbench/sync/Q40     1.00     43.3±0.33ms        ? ?/sec    1.00     43.3±0.35ms        ? ?/sec
arrow_reader_clickbench/sync/Q41     1.00     32.7±0.17ms        ? ?/sec    1.01     33.0±0.36ms        ? ?/sec
arrow_reader_clickbench/sync/Q42     1.01     12.8±0.09ms        ? ?/sec    1.00     12.7±0.11ms        ? ?/sec
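
A note on reading these tables: the number in front of each timing appears to be that side's mean time divided by the faster of the two means, so the faster side always reads 1.00. That is my reading of the output layout, not something stated in the thread, and the helper name `relative_ratios` below is hypothetical. A minimal sketch:

```rust
/// Hypothetical helper: given the mean times of the two sides (same unit),
/// return each one divided by the smaller of the two.
fn relative_ratios(branch: f64, main: f64) -> (f64, f64) {
    let fastest = branch.min(main);
    (branch / fastest, main / fastest)
}

/// Format a ratio the way the table shows it (two decimal places).
fn show(ratio: f64) -> String {
    format!("{:.2}", ratio)
}
```

Feeding in the async Q20 means (119.4 and 160.5) gives 1.00 and 1.34, matching that row. The main side of that row also carries a spread of ±14.16ms, so the ratio is best read together with the spread.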

@alamb
Contributor

alamb commented Oct 28, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1017-gcp #18~24.04.1-Ubuntu SMP Tue Sep 23 17:51:44 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing issue-8632-rle-fix (7e8f0e3) to 6c3e588 diff
BENCH_NAME=arrow_reader_row_filter
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_row_filter
BENCH_FILTER=
BENCH_BRANCH_NAME=issue-8632-rle-fix
Results will be posted here when complete

@alamb
Contributor

alamb commented Oct 28, 2025

🤖: Benchmark completed

group                                                                                issue-8632-rle-fix                     main
-----                                                                                ------------------                     ----
arrow_reader_row_filter/float64 <= 99.0/all_columns/async                            1.00   1710.4±7.73µs        ? ?/sec    1.00   1717.0±9.15µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/all_columns/sync                             1.00   1972.1±9.60µs        ? ?/sec    1.01  1982.3±16.30µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/async                  1.00   1555.9±7.05µs        ? ?/sec    1.02  1579.4±11.52µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/sync                   1.01  1657.6±10.45µs        ? ?/sec    1.00  1643.3±10.96µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/async              1.01   1526.7±6.51µs        ? ?/sec    1.00  1513.1±10.15µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/sync               1.00  1858.0±13.52µs        ? ?/sec    1.01  1871.5±20.10µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/async    1.00   1325.9±8.45µs        ? ?/sec    1.02   1358.0±8.71µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/sync     1.00  1457.3±10.33µs        ? ?/sec    1.01  1467.1±11.54µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/async                             1.02  1729.3±19.24µs        ? ?/sec    1.00  1702.6±12.95µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/sync                              1.01  1991.9±17.92µs        ? ?/sec    1.00  1973.6±11.28µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/async                   1.01  1571.3±21.15µs        ? ?/sec    1.00   1562.3±7.35µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/sync                    1.00   1643.0±9.05µs        ? ?/sec    1.00  1644.6±10.37µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/async                              1.01    952.3±8.58µs        ? ?/sec    1.00    946.4±5.80µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/sync                               1.00    992.5±7.74µs        ? ?/sec    1.01    997.9±6.19µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/async                    1.01   876.4±18.25µs        ? ?/sec    1.00    864.5±5.18µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/sync                     1.00   986.9±10.25µs        ? ?/sec    1.00    985.9±7.07µs        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/async                                 1.00      4.1±0.02ms        ? ?/sec    1.00      4.1±0.02ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/sync                                  1.00      4.1±0.02ms        ? ?/sec    1.00      4.1±0.02ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/async                       1.00      3.6±0.02ms        ? ?/sec    1.00      3.6±0.02ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/sync                        1.01      3.5±0.02ms        ? ?/sec    1.00      3.4±0.02ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/async                                  1.02  1971.3±10.27µs        ? ?/sec    1.00  1941.4±16.29µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/sync                                   1.01      2.2±0.01ms        ? ?/sec    1.00      2.2±0.01ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/async                        1.00   1784.7±8.19µs        ? ?/sec    1.00  1777.9±13.89µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/sync                         1.01  1913.1±11.98µs        ? ?/sec    1.00  1892.9±14.66µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/async                                 1.00   1263.1±9.51µs        ? ?/sec    1.00  1263.2±13.13µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/sync                                  1.00   1395.2±8.35µs        ? ?/sec    1.02   1416.4±8.80µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/async                       1.00   1146.0±5.34µs        ? ?/sec    1.01  1155.5±14.12µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/sync                        1.00   1262.7±8.09µs        ? ?/sec    1.01   1279.3±8.92µs        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/async                             1.03      4.4±0.05ms        ? ?/sec    1.00      4.3±0.03ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/sync                              1.02      5.0±0.05ms        ? ?/sec    1.00      4.9±0.02ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/async                   1.02      3.7±0.01ms        ? ?/sec    1.00      3.6±0.01ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/sync                    1.02      3.5±0.03ms        ? ?/sec    1.00      3.5±0.04ms        ? ?/sec

@etseidl
Contributor

etseidl commented Oct 29, 2025

When I did essentially the same thing as this PR, the only reproducible slowdown was in some of the fixed-length tests. I never really ran that to ground, and am willing to chalk it up to (once again) the arrow_reader bench being too twitchy.

Contributor

@etseidl etseidl left a comment

Thanks @liamzwbao. Just a few (very minor) stylistic nits you are free to ignore 😄

Comment on lines +472 to +475
        let bit_reader = self
            .bit_reader
            .as_mut()
            .ok_or_else(|| general_err!("bit_reader should be set"))?;
Contributor

Suggested change
-        let bit_reader = self
-            .bit_reader
-            .as_mut()
-            .ok_or_else(|| general_err!("bit_reader should be set"))?;
+        let Some(bit_reader) = self.bit_reader.as_mut() else {
+            return Err(general_err!("bit_reader should be set"));
+        };

is a bit more concise.

Contributor Author

Agree, but I'd like to keep this one for consistency, as we already have a lot of ok_or_else in this file.

Contributor

Well, I meant all of them 😅. But it compiles down to the same thing so your choice.
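
For readers comparing the two styles discussed in this thread, here is a minimal sketch that puts them side by side. The `Decoder` type, the `Vec<u8>` standing in for the bit reader, and the `String` error are simplifying assumptions; the real code uses the parquet crate's own reader type, `Result` alias, and `general_err!` macro.

```rust
// Stand-in type (assumption): a decoder whose reader may not be set yet.
struct Decoder {
    bit_reader: Option<Vec<u8>>,
}

impl Decoder {
    // Style already used throughout the file: `ok_or_else` followed by `?`.
    fn reader_ok_or_else(&mut self) -> Result<&mut Vec<u8>, String> {
        let bit_reader = self
            .bit_reader
            .as_mut()
            .ok_or_else(|| "bit_reader should be set".to_string())?;
        Ok(bit_reader)
    }

    // Suggested style: `let ... else` with an explicit early return.
    fn reader_let_else(&mut self) -> Result<&mut Vec<u8>, String> {
        let Some(bit_reader) = self.bit_reader.as_mut() else {
            return Err("bit_reader should be set".to_string());
        };
        Ok(bit_reader)
    }
}
```

Both methods return the same error when the reader is missing and the same reference when it is present, so the choice is about readability and consistency within the file.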

Comment on lines 135 to 141
         match &mut self.decoder {
-            MaybePacked::Packed(d) => d.set_data(encoding, data),
+            MaybePacked::Packed(d) => {
+                d.set_data(encoding, data);
+                Ok(())
+            }
             MaybePacked::Fallback(d) => d.set_data(encoding, data),
         }
Contributor

Suggested change
        match &mut self.decoder {
            MaybePacked::Packed(d) => d.set_data(encoding, data),
            MaybePacked::Packed(d) => {
                d.set_data(encoding, data);
                Ok(())
            }
            MaybePacked::Fallback(d) => d.set_data(encoding, data),
        }

        Ok(match &mut self.decoder {
            MaybePacked::Packed(d) => d.set_data(encoding, data),
            MaybePacked::Fallback(d) => d.set_data(encoding, data)?,
        })

Contributor Author

This will trigger Clippy: passing a unit value to a function. But the following works:

       match &mut self.decoder {
           MaybePacked::Packed(d) => d.set_data(encoding, data),
           MaybePacked::Fallback(d) => d.set_data(encoding, data)?,
       };
       Ok(())
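
A self-contained sketch of the pattern that works, for anyone who wants to try it outside the crate. The `Packed` and `Fallback` types here are hypothetical stand-ins with simplified signatures (one `set_data` returns `()`, the other returns a `Result`), and the Clippy lint in question is, as far as I can tell, `clippy::unit_arg`.

```rust
// Stand-in decoders (assumptions, not the parquet types).
struct Packed;
struct Fallback {
    fail: bool,
}

impl Packed {
    // Infallible: returns `()`.
    fn set_data(&mut self, _data: &[u8]) {}
}

impl Fallback {
    // Fallible: returns a `Result`.
    fn set_data(&mut self, _data: &[u8]) -> Result<(), String> {
        if self.fail {
            Err("bad data".to_string())
        } else {
            Ok(())
        }
    }
}

enum MaybePacked {
    Packed(Packed),
    Fallback(Fallback),
}

fn set_data(decoder: &mut MaybePacked, data: &[u8]) -> Result<(), String> {
    // Both arms evaluate to `()`; `?` returns early if the fallible arm fails.
    match decoder {
        MaybePacked::Packed(d) => d.set_data(data),
        MaybePacked::Fallback(d) => d.set_data(data)?,
    };
    // Returning `Ok(())` separately avoids passing a unit value into `Ok(...)`.
    Ok(())
}
```

Keeping the `match` as a statement and returning `Ok(())` on its own line preserves the single-line arms without wrapping a unit-valued expression in `Ok`.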

         }

-        let _ = self.reload();
+        self.reload()?;
Contributor

I still want to convince myself that ignoring the result is ok here, but this isn't a behavior change.

Contributor

My reading of this change is that the old code ignored the result, but the PR proposes propagating the result.

Contributor

The old code ignored the result...the new code ignores the result less obviously and then returns Ok(()). I was wondering if it would make more sense to test the result and error when false, but that would be a behavior change. I'm not familiar enough with this code to suggest doing that. Maybe just leave a comment here that the result is deliberately ignored. Justification for doing so would be the icing on the cake. 😄

Contributor

> ignores the result less obviously

I think I am going crazy -- this PR changes the function signature to return a Result and then calls self.reload()?

I thought the ? propagates the error now.

How is the error ignored 🤔

Contributor Author

reload() now returns a Result wrapping a bool. When it is Ok, the bool is simply ignored, but any error still propagates through ?.

In the old code, the return value was ignored and the call itself could panic.
In the new code, the return value is still ignored, but a failure surfaces as an Err instead of crashing the program.

It might be better to keep the let _ = and add a comment here.

Contributor

@alamb alamb Oct 30, 2025

thank you -- sorry for my ignorance
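
The distinction that caused the confusion in this thread can be sketched in isolation. The free function `reload` below is a hypothetical stand-in that returns a `Result` wrapping a `bool`, with made-up rules for when it fails; it is not the parquet implementation. Note that in the old code `reload` returned a plain `bool`, so `let _ =` there only discarded the flag.

```rust
// Hypothetical stand-in: Ok(true) if a run was loaded, Ok(false) if the
// input is exhausted, Err if the first byte is an (invented) bad header.
fn reload(input: &[u8]) -> Result<bool, String> {
    match input.first().copied() {
        None => Ok(false),
        Some(0xFF) => Err("malformed run header".to_string()),
        Some(_) => Ok(true),
    }
}

// `let _ =` on the whole call discards the flag AND any error.
fn init_swallowing(input: &[u8]) -> Result<(), String> {
    let _ = reload(input);
    Ok(())
}

// `?` first, then discard: errors propagate, only the flag is dropped.
fn init_propagating(input: &[u8]) -> Result<(), String> {
    let _ = reload(input)?;
    Ok(())
}
```

Writing `let _ = reload(input)?;` rather than a bare `reload(input)?;` makes it visible at the call site that a value is being dropped on purpose, which is the readability point raised above.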

Comment on lines +331 to +333
// Initialize decoder state. The boolean only reports whether the first run contained data,
// and `get`/`get_batch` already interpret that result to drive iteration. We only need
// errors propagated here, so the flag returned is intentionally ignored.
Contributor

❤️❤️❤️❤️❤️
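
To illustrate the reasoning in the code comment quoted above, here is a toy run-length decoder in which initialization only cares about errors while the read path interprets the flag. `ToyRle` and its byte-pair format (count, value) are invented for this sketch and do not reflect the parquet `RleDecoder` layout.

```rust
// Toy decoder (assumption): input is a sequence of (count, value) byte pairs.
struct ToyRle {
    data: Vec<u8>,
    pos: usize,
    remaining: u8, // values left in the current run
    value: u8,     // value of the current run
}

impl ToyRle {
    fn new() -> Self {
        ToyRle { data: Vec::new(), pos: 0, remaining: 0, value: 0 }
    }

    // Load the next run. Ok(false) means no more data; a count byte with no
    // value byte after it is reported as an error instead of panicking.
    fn reload(&mut self) -> Result<bool, String> {
        let (count, value) = match &self.data[self.pos..] {
            [] => return Ok(false),
            [_] => return Err("truncated run".to_string()),
            [count, value, ..] => (*count, *value),
        };
        self.remaining = count;
        self.value = value;
        self.pos += 2;
        Ok(true)
    }

    fn set_data(&mut self, data: Vec<u8>) -> Result<(), String> {
        self.data = data;
        self.pos = 0;
        self.remaining = 0;
        // Only errors matter here; `get` checks the flag itself when needed.
        let _ = self.reload()?;
        Ok(())
    }

    fn get(&mut self) -> Result<Option<u8>, String> {
        // Skip exhausted (or zero-length) runs; stop when reload reports no data.
        while self.remaining == 0 {
            if !self.reload()? {
                return Ok(None);
            }
        }
        self.remaining -= 1;
        Ok(Some(self.value))
    }
}
```

With input `[2, 7, 1, 9]` successive `get` calls yield 7, 7, 9 and then `None`; empty input initializes without error and yields `None` on the first read, which is why dropping the flag during initialization loses nothing in this sketch.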

@alamb alamb merged commit 7e54bb2 into apache:main Oct 31, 2025
16 checks passed
@alamb
Contributor

alamb commented Oct 31, 2025

Thank you @liamzwbao and @etseidl

Labels

parquet Changes to the parquet crate

Development

Successfully merging this pull request may close these issues.

Return error from RleDecoder::reset rather than panic

3 participants