Add standard_dimensions for VCZ #389
Sets default chunk size to min(size, default) for large dimensions. Closes sgkit-dev#368
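A rough sketch of the `min(size, default)` rule described in this commit. The `Dimension` class and the `DEFAULT_CHUNK_SIZE` value here are hypothetical, chosen only for illustration; the actual names and default in the codebase may differ.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical default chunk size; the project's real value may differ.
DEFAULT_CHUNK_SIZE = 10_000


@dataclass
class Dimension:
    """Illustrative stand-in for a schema dimension."""

    size: int
    chunk_size: Optional[int] = None

    def __post_init__(self):
        # If no chunk size was given, never use a chunk larger than
        # the dimension itself.
        if self.chunk_size is None:
            self.chunk_size = min(self.size, DEFAULT_CHUNK_SIZE)


small = Dimension(size=9)
large = Dimension(size=1_000_000)
explicit = Dimension(size=100, chunk_size=10)
```

With this rule a small dimension gets a single chunk covering it entirely, a large dimension is capped at the default, and an explicitly supplied chunk size is left alone.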
Centralise logic around default chunk sizes
Note that this changes the schema output so that we always explicitly list the chunk size for all dimensions:
{
    "format_version": "0.6",
    "dimensions": {
        "variants": {
            "size": 9,
            "chunk_size": 9
        },
        "samples": {
            "size": 3,
            "chunk_size": 3
        },
        "alleles": {
            "size": 4,
            "chunk_size": 4
        },
        "alt_alleles": {
            "size": 3,
            "chunk_size": 3
        },
        "filters": {
            "size": 3,
            "chunk_size": 3
        },
        "ploidy": {
            "size": 2,
            "chunk_size": 2
        },
        "genotypes": {
            "size": 10,
            "chunk_size": 10
        },
        "INFO_AC_dim": {
            "size": 2,
            "chunk_size": 2
        },
        "INFO_AF_dim": {
            "size": 2,
            "chunk_size": 2
        },
        "FORMAT_HQ_dim": {
            "size": 2,
            "chunk_size": 2
        }
    },
    "fields": [
        {
            "name": "variant_contig",
            "dtype": "i1",
            "dimensions": [
                "variants"
            ],
            "description": "An identifier from the reference genome or an angle-bracketed ID string pointing to a contig in the assembly file",
            "compressor": null,
            "filters": null,
            "source": null
        },
I think this is better for now, as the logic around initialisation and defaults was quite tricky. We can always revert to "no chunk size means chunk_size = size" for JSON deserialisation only later, if we want to make the schema output more concise.
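For reference, the "no chunk size means chunk_size = size" fallback mentioned above could look roughly like this if it were applied only when reading JSON. This is a sketch under assumptions: `load_dimensions` is a hypothetical helper, not a function in the codebase, and it only handles the `dimensions` mapping.

```python
import json


def load_dimensions(schema_json: str) -> dict:
    """Parse the "dimensions" mapping of a schema JSON string.

    Hypothetical helper: a dimension without an explicit "chunk_size"
    falls back to a single chunk spanning the whole dimension.
    """
    schema = json.loads(schema_json)
    dims = {}
    for name, spec in schema["dimensions"].items():
        size = spec["size"]
        dims[name] = {
            "size": size,
            # Fallback applies only when the key is absent.
            "chunk_size": spec.get("chunk_size", size),
        }
    return dims


concise = """
{
    "format_version": "0.6",
    "dimensions": {
        "variants": {"size": 9},
        "samples": {"size": 3, "chunk_size": 2}
    }
}
"""
dims = load_dimensions(concise)
```

Here `variants` has no `chunk_size` in the input and so ends up with a chunk size equal to its size, while the explicit value for `samples` is preserved.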
Looks good. I don't think we need a schema version bump, right? These changes look backward compatible.
Looks like we're still on 0.4 without the dimensions for the released version, so I think we're OK.
Fixes #368 and also consolidates dimension handling to some degree