Methods section on choosing compressor defaults #26
Comments
Sounds good. I'll be preparing for and traveling to conferences for the next couple of weeks, but I will do this once that's all done.
How is this looking @shz9? I'd like to get a draft of this into the document as soon as we can. Keen to get this preprinted in the next few weeks...
I've been busy recently. I will try to get it done this weekend.
OK, so I created the [...]. I'm planning to re-run the experiments as part of this new pipeline to make sure that the whole thing runs from start to finish. Once the results are in, I will describe them in the LaTeX document and open a pull request. I just need some help with the issue raised earlier. Some notes:
Great, thanks @shz9!
I think the simplest thing here is to download the first gigabyte or so of the full data using the HTTP access features in bcftools. See the validation data Makefile in the bio2zarr repo for examples of how to do this. You'll need to tweak the number of lines passed to "head" to suit.
Yes please, that would be excellent
Yep, for any usability tweaks like this, just go for it.
Sorry for the delays. The pipeline is now ready and the figures have been added to the latest version. I added three figures that highlight three aspects of our discussions:
Once you get a chance to provide some feedback on the figures, I can write up our conclusions and open a pull request.
That's great, can you open a PR please? It doesn't need to be final, and it's easier for me to give feedback that way.
@shz9 has done a nice analysis of how various settings affect compression ratios:
sgkit-dev/bio2zarr#74
We should incorporate this into the paper. I've made an initial Methods section, "Choosing default compressor settings", where we can write a few paragraphs discussing what we did and the basic conclusions. (It's interesting to note, e.g., that BitPack didn't do much.)
I guess we want some sort of supplementary figure or table summarising the results as well?
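To make the shape of such a table concrete, here is a minimal sketch of a settings sweep that records one compression ratio per combination. It is not the analysis from the linked issue: it uses only the Python standard library, with `zlib` standing in for the actual Zarr compressors, and it runs on synthetic genotype-like bytes rather than real data. The names `make_genotypes` and `sweep`, the allele-frequency skew, and the chunk sizes are all assumptions made for illustration.

```python
import itertools
import random
import zlib


def make_genotypes(num_variants=2000, num_samples=200, seed=42):
    """Return synthetic diploid genotype calls (0/1), one byte per allele.

    Purely illustrative data: each variant gets an alternate-allele
    frequency skewed toward rare alleles, so most bytes are zero.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(num_variants):
        freq = rng.random() ** 4  # assumed skew toward rare variants
        rows.append(
            bytes(1 if rng.random() < freq else 0 for _ in range(num_samples * 2))
        )
    return b"".join(rows)


def sweep(data, chunk_sizes, levels):
    """Compress `data` chunk by chunk for every (chunk_size, level) pair.

    zlib is a stand-in for the real compressor; the ratio is the
    uncompressed size divided by the total size of the compressed chunks.
    """
    results = []
    for chunk_size, level in itertools.product(chunk_sizes, levels):
        chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
        stored = sum(len(zlib.compress(chunk, level)) for chunk in chunks)
        results.append(
            {"chunk_size": chunk_size, "level": level, "ratio": len(data) / stored}
        )
    return results


if __name__ == "__main__":
    data = make_genotypes()
    for row in sweep(data, chunk_sizes=[1_000, 10_000, 100_000], levels=[1, 5, 9]):
        print(f"{row['chunk_size']:>8} {row['level']:>2} {row['ratio']:8.2f}")
```

Each row of the output corresponds to one cell of a supplementary table. The real pipeline would swap in the actual codecs and the downloaded data.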
We also want to bring the code for doing this into the repo. A suggested sketch:
real_data, and create a Makefile to download the file of interest and create the starting-point Zarr (can copy lots from scaling/Makefile)
src
plot_data
src/plot.py
Basically we want to keep everything here in the repo so it's all nice and reproducible.
How does this sound @shz9?