-
Notifications
You must be signed in to change notification settings - Fork 10
Iterate alleles and genotypes at the same time #363
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Iterate alleles and genotypes at the same time #363
Conversation
Looks good overall, but I don't like the include_genotypes parameter - the only reason we wouldn't include genotypes is if they don't exist, right? So we can just check this, rather than passing responsibility to client code. |
The edge case here is if genotypes exist, but they aren't in the schema. If we're ok with wasting work fetching them in this odd case then we can remove it. |
I'm fine with that |
5d6c024
to
32d6362
Compare
Ok, should be good to go now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're over complicating this - can't you define the VCF case using the existing functions?
bio2zarr/icf.py
Outdated
for field in self.metadata.fields: | ||
self.fields[field.full_name] = IntermediateColumnarFormatField(self, field) | ||
if field.name == "GT": | ||
self.gt_field = field | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the continue
for?
alt_field = self.fields["ALT"] | ||
|
||
for ref, alt, gt in zip( | ||
ref_field.iter_values(start, stop), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is too much in one zip - let's break it up into simpler bits
What's wrong with
if self.gt_field is None:
for alleles in self.iter_alleles(start, stop):
yield alleles, None, None
else:
yield from zip(self.iter_alleles(start, stop), self.iter_genotypes(start, stop)):
and just keep the old definitions?
32d6362
to
f19420c
Compare
Comments addressed! |
In preparation for tskit.
Stacked on #362 clean diff at 33d1d31