-
Notifications
You must be signed in to change notification settings - Fork 205
Extensible fragment metadata footer. #5686
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extensible fragment metadata footer. #5686
Conversation
9d3c0ec to
17a1a24
Compare
| case DataDirectoryIdentifier::tile_global_order_offsets: | ||
| if (data_size != 2 * array_schema_->dim_num() * sizeof(uint64_t)) { | ||
| throw FragmentMetadataStatusException( | ||
| "Invalid size of tile global order bound offsets data directory"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add expected/found here so we can potentially cut a round trip out of the bug report.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error would happen only if the fragment metadata file was corrupted. If somebody got it, just the expected/found sizes would not help us with diagnosing it, and we would need the whole array and/or a repro either way.
| serializer.write(DataDirectoryIdentifier::tile_global_order_offsets); | ||
| const auto num_dims = array_schema_->dim_num(); | ||
| serializer.write<uint32_t>(2 * num_dims * sizeof(uint64_t)); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| iassert(gt_offsets_.tile_global_order_min_offsets_.size() == 2 * num_dims); | |
| iassert(gt_offsets_.tile_global_order_max_offsets_.size() == 2 * num_dims); |
and then pass the size() value to write instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (modified).
| } | ||
|
|
||
| /** has_tile_global_order_bounds accessor */ | ||
| bool& has_tile_global_order_bounds() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks so odd... I see it's because the serializer requires friend access without being a friend. I've seen similarly odd non-const accessors in other places. I feel like that was the wrong paradigm to use.
I'm not suggesting anything actionable here, just musing
rroelke
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes look good but I am still having a hard time with the terminology "data directory" and I do think we should pick something else. And yet this is why naming things is the second hard thing in computer science.
What comes to mind is "semi-structured data". This work turns the fragment metadata footer from a highly structured binary file to a semi-structured one.
Fragment metadata can contain semi-structured attributes.
Or "self-describing data" where the entries have a key that tells you what they are. This feels more apt.
Fragment metadata footers can contain self-describing attributes.
Or even just optional, since all fields are optional going forward with this design:
Fragment metadata footers can contain any number of optional attributes which provide additional metadata.
63e346f to
d26a8af
Compare
teo-tsirpanis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed "data directory" to "optional section".
| serializer.write(DataDirectoryIdentifier::tile_global_order_offsets); | ||
| const auto num_dims = array_schema_->dim_num(); | ||
| serializer.write<uint32_t>(2 * num_dims * sizeof(uint64_t)); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (modified).
|
Merging per offline discussion… |
eb415af
into
rr/core-321-fragment-metdata-global-order-bounds
This PR builds on top of #5646, and updates the fragment metadata format specification to an extensible design that will not require another versin bump for future additions.
TYPE: NO_HISTORY