0.4.0a1
Pre-release
## Alpha release of tsinfer 0.4.0
Features
- `tsinfer` now supports inferring data from a `vcf-zarr` dataset. This allows users
  to infer from VCFs via the optimised and parallel VCF parsing in `bio2zarr`.
- The `VariantData` class can be used to load the VCF data for use in inference.
  `vcf-zarr` `sample_ids` are inserted into individual metadata as
  `variant_data_sample_id` if this key does not already exist.
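The "if this key does not already exist" rule for sample IDs can be illustrated with a minimal sketch. The helper `add_sample_id` is hypothetical and not part of the tsinfer API; it only mimics the described behaviour on plain dictionaries, and the example IDs are made up.

```python
def add_sample_id(metadata, sample_id):
    """Return a copy of `metadata` with `variant_data_sample_id` set,
    unless that key is already present (hypothetical helper)."""
    updated = dict(metadata)
    # setdefault only writes when the key is missing, so an existing value wins.
    updated.setdefault("variant_data_sample_id", sample_id)
    return updated


# Key absent: the sample ID is inserted alongside the existing metadata.
fresh = add_sample_id({"population": "A"}, "sample_0")

# Key present: the user-supplied value is left untouched.
kept = add_sample_id({"variant_data_sample_id": "custom"}, "sample_1")
```

Under this reading, metadata that already carries the key is never overwritten by inference.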
Breaking Changes
- Remove the `uuid` field from `SampleData`. `SampleData` equality is now purely based
  on data. ({pr}748, {user}benjeffery)
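As a rough illustration of what data-based equality means, the sketch below uses a hypothetical `Samples` dataclass. It is a stand-in, not the tsinfer `SampleData` class, and its fields are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Samples:
    """Hypothetical container; equality is derived from the fields only."""
    positions: tuple
    genotypes: tuple


# Two objects built independently from the same data compare equal,
# because no per-instance identifier takes part in the comparison.
a = Samples(positions=(10, 20), genotypes=((0, 1), (1, 1)))
b = Samples(positions=(10, 20), genotypes=((0, 1), (1, 1)))

# A single differing genotype makes the objects unequal.
c = Samples(positions=(10, 20), genotypes=((0, 0), (1, 1)))
```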
Performance improvements
- Reduce memory usage when running `match_samples` against large cohorts
  containing sequences with substantial amounts of error.
  ({pr}761, {user}jeromekelleher)
- `truncate_ancestors` no longer requires loading all the ancestors into RAM.
  ({pr}811, {user}benjeffery)
- Reduce memory requirements of the `generate_ancestors` function by providing
  the `genotype_encoding` ({pr}809) and `mmap_temp_dir` ({pr}808) options
  ({user}jeromekelleher).
- Increase parallelisation of `match_ancestors` by generating parallel groups from
  their implied dependency graph. ({pr}828, {issue}147, {user}benjeffery)
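The general idea of deriving parallel groups from a dependency graph can be sketched as follows. This is a generic level-by-level grouping, not tsinfer's implementation; the function `parallel_groups` and the integer node IDs are assumptions made for illustration.

```python
def parallel_groups(dependencies):
    """Split nodes into ordered groups whose members can run in parallel.

    `dependencies` maps each node to the set of nodes it depends on.
    Every node's dependencies fall in an earlier group.
    """
    remaining = {node: set(deps) for node, deps in dependencies.items()}
    # Include nodes that appear only as a dependency of another node.
    for deps in dependencies.values():
        for dep in deps:
            remaining.setdefault(dep, set())

    groups = []
    done = set()
    while remaining:
        # A node is ready once all of its dependencies have been processed.
        ready = sorted(n for n, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("dependency cycle detected")
        groups.append(ready)
        done.update(ready)
        for node in ready:
            del remaining[node]
    return groups


# Node 0 has no dependencies; 1, 2 and 4 depend only on 0; 3 needs 1 and 2.
example = {0: set(), 1: {0}, 2: {0}, 3: {1, 2}, 4: {0}}
groups = parallel_groups(example)
```

Here `groups` is `[[0], [1, 2, 4], [3]]`: each inner list could be handed to a pool of workers, with a barrier between consecutive lists.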