
Array: ViewType gc() has bug when array sum length exceed i32::MAX #8681

@mapleFU


Describe the bug

This bug comes from this optimization: #7873

The cause is that the view layout, described at https://arrow.apache.org/docs/format/Columnar.html#variable-size-binary-view-layout , has only 4 bytes for the offset into a data buffer. If the sum of the value lengths written into one data buffer exceeds i32::MAX, the offsets can no longer be represented, so the built array violates the StringView specification, which causes the bug.

See the line `let mut data_buf = Vec::with_capacity(total_large);` for detail.
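To make the constraint concrete, here is a minimal std-only sketch of how a view for a long (more than 12 bytes) value is packed according to the layout linked above. `make_long_view` is a hypothetical helper written for this issue, not the arrow-rs API; it only shows that an offset past i32::MAX cannot be encoded.

```rust
/// Hypothetical helper (not part of arrow-rs): packs the 16-byte view of a
/// value longer than 12 bytes as described in the Arrow columnar format:
/// bytes 0..4 length, 4..8 prefix, 8..12 buffer index, 12..16 offset.
/// Returns None when the value is short enough to be inlined, or when the
/// length or offset does not fit in a signed 32-bit integer.
fn make_long_view(data: &[u8], buffer_index: i32, offset: usize) -> Option<u128> {
    if data.len() <= 12 {
        return None; // short values are stored inline and use a different encoding
    }
    // The format reserves only 4 bytes each for the length and the offset.
    let length = i32::try_from(data.len()).ok()?;
    let offset = i32::try_from(offset).ok()?;

    let mut prefix = [0u8; 4];
    prefix.copy_from_slice(&data[..4]);

    // Little-endian packing: the length occupies the lowest 32 bits.
    Some(
        (length as u32 as u128)
            | ((u32::from_le_bytes(prefix) as u128) << 32)
            | ((buffer_index as u32 as u128) << 64)
            | ((offset as u32 as u128) << 96),
    )
}
```

With this helper, a value placed at offset 42 round-trips, while a value placed at offset `i32::MAX as usize + 1` yields `None`. That second case is what a single compacted data buffer larger than 2 GiB would need.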

To Reproduce

Provide input whose total length is more than 2 GiB (or 4 GiB); the resulting content is incorrect.

Expected behavior

gc() should produce a valid data buffer.
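One possible way to get there, sketched under assumptions, is to start a new data buffer whenever the current one would grow past the limit. This is only an illustration of the idea and is not taken from the arrow-rs code or from any actual fix; `plan_buffers` and its parameters are hypothetical names.

```rust
/// Hypothetical planner (not the arrow-rs implementation): given the byte
/// length of each non-inlined value, assigns a (buffer_index, offset) pair to
/// each one, opening a new buffer whenever appending the value would push the
/// current buffer past `max_buffer_len`.
/// Returns None if a single value is larger than `max_buffer_len` or an index
/// or offset would not fit in 32 bits.
fn plan_buffers(lengths: &[usize], max_buffer_len: usize) -> Option<Vec<(u32, u32)>> {
    let mut placements = Vec::with_capacity(lengths.len());
    let mut buffer_index: u32 = 0;
    let mut used: usize = 0; // bytes already assigned to the current buffer

    for &len in lengths {
        if len > max_buffer_len {
            return None; // this value can never fit in one buffer
        }
        if used.checked_add(len)? > max_buffer_len {
            // Current buffer is full enough: continue in a fresh one.
            buffer_index = buffer_index.checked_add(1)?;
            used = 0;
        }
        placements.push((buffer_index, u32::try_from(used).ok()?));
        used += len;
    }
    Some(placements)
}
```

In a real gc() the limit would presumably be `i32::MAX as usize`, so every offset stays representable in the view; a small limit such as 8 makes the splitting easy to check by hand.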

Additional context

No

Labels: arrow, bug