This repository was archived by the owner on Oct 10, 2025. It is now read-only.
Fix OverflowFile checkpoint corruption when no data is written #6046
+178
−9
Description
Fixes #6045
Fixes a critical bug where `OverflowFile::checkpoint()` unconditionally allocated a header page even when no data had been written, causing `PrimaryKeyIndexStorageInfo` corruption and database reopen failures.

Problem

When creating a VectorIndex without inserting any data, the database checkpoint completes successfully but corrupts the metadata. Reopening the database fails with an assertion error in `hash_index.cpp:487`.

Minimal Reproduction
Root Cause
In `src/storage/overflow_file.cpp:236`, `OverflowFile::checkpoint()` was unconditionally allocating a page even when no data had been written.

Sequence of events:

1. `PrimaryKeyIndex` (for STRING primary key)
2. `PrimaryKeyIndex` creates an `OverflowFile` (for strings >12 bytes) with `headerPageIdx = INVALID_PAGE_IDX`
3. `OverflowFile::checkpoint()` allocates a page unnecessarily
4. `PrimaryKeyIndexStorageInfo.overflowHeaderPage = 1` (should be `INVALID_PAGE_IDX`)

Solution
Skip checkpoint when `headerChanged == false`, following the same design pattern as `NodeTable::checkpoint()` and `RelTable::checkpoint()`.

The `headerChanged` flag is only set to `true` when actual string data (>12 bytes) is written via `OverflowFileHandle::setStringOverflow()`.

Benefits
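The main effect of the guard is that an untouched overflow file never acquires a header page. The toy model below is a minimal sketch of that behavior, not the code from this PR: `ToyPageAllocator`, `ToyOverflowFile`, `writeString()`, and `INLINE_STRING_LIMIT` are hypothetical stand-ins, and only the `headerChanged` / `headerPageIdx` / `INVALID_PAGE_IDX` names are taken from the description above.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical stand-ins that mirror the names used in the description.
using page_idx_t = std::uint32_t;
constexpr page_idx_t INVALID_PAGE_IDX = std::numeric_limits<page_idx_t>::max();
constexpr std::size_t INLINE_STRING_LIMIT = 12; // longer strings go to overflow

// Hands out consecutive page indices, starting at 0.
struct ToyPageAllocator {
    page_idx_t nextPage = 0;
    page_idx_t allocatePage() { return nextPage++; }
};

class ToyOverflowFile {
public:
    explicit ToyOverflowFile(ToyPageAllocator& alloc) : allocator(alloc) {}

    // Only strings above the inline limit mark the header as changed.
    void writeString(const std::string& value) {
        if (value.size() > INLINE_STRING_LIMIT) {
            headerChanged = true;
        }
    }

    void checkpoint() {
        if (!headerChanged) {
            return; // nothing was written: do not allocate a header page
        }
        if (headerPageIdx == INVALID_PAGE_IDX) {
            headerPageIdx = allocator.allocatePage();
        }
        headerChanged = false;
    }

    page_idx_t getHeaderPageIdx() const { return headerPageIdx; }

private:
    ToyPageAllocator& allocator;
    page_idx_t headerPageIdx = INVALID_PAGE_IDX;
    bool headerChanged = false;
};
```

In this sketch, calling `checkpoint()` on a freshly constructed `ToyOverflowFile` leaves `getHeaderPageIdx()` at `INVALID_PAGE_IDX` and allocates nothing. A page is allocated only after `writeString()` has seen a string longer than 12 bytes.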
Testing
Added comprehensive test suite in `test/storage/overflow_file_checkpoint_test.cpp` with 5 test cases:

- `InMemOverflowFileAlwaysAllocatesHeader` - Verifies in-memory behavior
- `ShortStringsDoNotTriggerOverflow` - Verifies strings ≤12 bytes are inlined
- `LongStringsDoTriggerOverflow` - Verifies strings >12 bytes use overflow
- `EmptyOverflowFileHeaderNotChanged` - Documents the core bug fix
- `VectorIndexCreationSequence` - Documents the bug scenario

All tests pass.
Files Changed
- `src/storage/overflow_file.cpp` - Added early return when `headerChanged == false`
- `test/storage/CMakeLists.txt` - Added new test target
- `test/storage/overflow_file_checkpoint_test.cpp` - New test file

Impact
This fix resolves crashes when: