Commit cfae3ff

Add initial docs system
1 parent 6fda59d commit cfae3ff

9 files changed: 169 additions, 0 deletions

docs/Makefile

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+# Need to set PYTHONPATH so that we pick up the local bio2zarr
+PYPATH=$(shell pwd)/../
+B2Z_VERSION:=$(shell PYTHONPATH=${PYPATH} \
+	python3 -c 'import bio2zarr; print(bio2zarr.__version__.split("+")[0])')
+
+BUILDDIR = _build
+
+dev:
+	PYTHONPATH=${PYPATH} ./build.sh
+
+dist:
+	@echo Building distribution for bio2zarr version ${B2Z_VERSION}
+	cd doxygen && doxygen
+	sed -i -e s/__BIO2ZARR_VERSION__/${B2Z_VERSION}/g _config.yml
+	PYTHONPATH=${PYPATH} ./build.sh
+
+clean:
+	rm -fR $(BUILDDIR)
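
The `B2Z_VERSION` assignment in `docs/Makefile` keeps only the part of `bio2zarr.__version__` that precedes the first `+`. The sketch below isolates that expression in a small Python function. The function name and the version strings are made up for illustration; the real value depends on the installed package.

```python
def public_version(version: str) -> str:
    """Return the part of a version string before the first '+'.

    Mirrors the expression in the Makefile one-liner:
    bio2zarr.__version__.split("+")[0]
    """
    return version.split("+")[0]


# Hypothetical version strings, not taken from any bio2zarr release.
for example in ["0.0.1.dev5+g1234abc", "0.0.1"]:
    print(example, "->", public_version(example))
```

A string with no `+` passes through unchanged, so the same rule should work for tagged releases and development builds alike.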

docs/_config.yml

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+# Book settings
+# Learn more at https://jupyterbook.org/customize/config.html
+
+title: bio2zarr Documentation
+author: sgkit developers
+logo: logo.png
+
+# Force re-execution of notebooks on each build.
+# See https://jupyterbook.org/content/execute.html
+execute:
+  execute_notebooks: force
+
+# Define the name of the latex output file for PDF builds
+latex:
+  latex_documents:
+    targetname: bio2zarr.tex
+
+# Add a bibtex file so that we can create citations
+bibtex_bibfiles:
+  - references.bib
+
+# Information about where the book exists on the web
+repository:
+  url: https://github.com/sgkit-dev/bio2zarr  # Online location of your book
+  path_to_book: docs  # Optional path to your book, relative to the repository root
+  branch: main  # Which branch of the repository should be used when creating links (optional)
+
+# Add GitHub buttons to your book
+# See https://jupyterbook.org/customize/config.html#add-a-link-to-your-repository
+html:
+  use_issues_button: true
+  use_repository_button: true
+
+sphinx:
+  extra_extensions:
+    - sphinx_click.ext

docs/_toc.yml

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+format: jb-book
+root: intro
+chapters:
+- file: cli

docs/build.sh

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+#!/bin/bash
+
+# jupyter-book doesn't have an option to automatically show the
+# saved reports, which makes it difficult to debug the reasons for
+# build failures in CI. This is a simple wrapper to handle that.
+
+REPORTDIR=_build/html/reports
+
+jupyter-book build -Wn --keep-going .
+RETVAL=$?
+if [ $RETVAL -ne 0 ]; then
+    if [ -e $REPORTDIR ]; then
+        echo "Error occurred; showing saved reports"
+        cat $REPORTDIR/*
+    fi
+else
+    # Clear out any old reports
+    rm -f $REPORTDIR/*
+fi
+exit $RETVAL
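
The control flow of `docs/build.sh` (run the build, dump saved reports on failure, clear stale reports on success, pass the exit status through) can also be sketched in Python. This is an illustrative equivalent only, not part of the commit. The function name `run_with_reports` is hypothetical, and the demo uses a stand-in command that simply exits with status 3 rather than invoking `jupyter-book`.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_with_reports(cmd, report_dir):
    """Run cmd; print saved reports on failure, delete them on success.

    Returns the exit status of cmd, as the shell wrapper does.
    """
    report_dir = Path(report_dir)
    retval = subprocess.run(cmd).returncode
    # Only regular files are read or removed, like `cat dir/*` and `rm -f dir/*`.
    reports = []
    if report_dir.exists():
        reports = sorted(p for p in report_dir.iterdir() if p.is_file())
    if retval != 0:
        if report_dir.exists():
            print("Error occurred; showing saved reports")
            for report in reports:
                print(report.read_text())
    else:
        # Clear out any old reports
        for report in reports:
            report.unlink()
    return retval


if __name__ == "__main__":
    # Demo with a stand-in command and a made-up report file.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "example.log").write_text("hypothetical report text\n")
        failing = [sys.executable, "-c", "raise SystemExit(3)"]
        print("exit status:", run_with_reports(failing, tmp))
```

Returning the original status rather than raising keeps the wrapper transparent to whatever calls it, which is the property the shell script relies on in CI.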

docs/cli.md

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+# Command Line Interface
+
+```{eval-rst}
+.. click:: bio2zarr.cli:vcf2zarr
+   :prog: vcf2zarr
+   :show-nested:
+
+.. click:: bio2zarr.cli:plink2zarr
+   :prog: plink2zarr
+   :show-nested:

docs/intro.md

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
+# bio2zarr Documentation
+
+`bio2zarr` efficiently converts common bioinformatics formats to
+[Zarr](https://zarr.readthedocs.io/en/stable/) format. It initially supports converting
+VCF to the [sgkit vcf-zarr specification](https://github.com/pystatgen/vcf-zarr-spec/).
+
+`bio2zarr` is in early alpha development; contributions, feedback and issues are welcome
+at the [GitHub repository](https://github.com/sgkit-dev/bio2zarr).
+
+## Installation
+`bio2zarr` can be installed from PyPI using pip:
+
+```bash
+$ python3 -m pip install bio2zarr
+```
+
+This will install the programs ``vcf2zarr``, ``plink2zarr`` and ``vcf_partition``
+into your local Python path. You may need to update your $PATH to call the
+executables directly.
+
+Alternatively, calling
+```
+$ python3 -m bio2zarr vcf2zarr <args>
+```
+is equivalent to
+
+```
+$ vcf2zarr <args>
+```
+and will always work.
+
+## Basic vcf2zarr usage
+For modest VCF files (up to a few GB), a single command can be used to convert a VCF file
+(or set of VCF files) to Zarr:
+
+```bash
+$ vcf2zarr convert <VCF1> <VCF2> ... <VCFN> <zarr>
+```
+
+For larger files, a multi-step process is recommended.
+
+
+First, convert the VCF into the intermediate format:
+
+```bash
+$ vcf2zarr explode tests/data/vcf/sample.vcf.gz tmp/sample.exploded
+```
+
+Then, optionally, inspect this representation to get a feel for your dataset:
+```bash
+$ vcf2zarr inspect tmp/sample.exploded
+```
+
+Then, optionally, generate a conversion schema to describe the corresponding
+Zarr arrays:
+
+```bash
+$ vcf2zarr mkschema tmp/sample.exploded > sample.schema.json
+```
+
+View and edit the schema, deleting any columns you don't want, or tweaking
+dtypes and compression settings to your taste.
+
+Finally, encode to Zarr:
+```bash
+$ vcf2zarr encode tmp/sample.exploded tmp/sample.zarr -s sample.schema.json
+```
+
+Use the ``-p, --worker-processes`` argument to control the number of workers used
+in the ``explode`` and ``encode`` phases.
+
+
+
+
+```{tableofcontents}
+```
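
The multi-step workflow described in `docs/intro.md` (explode, then mkschema, then encode) can be scripted. The sketch below only assembles the command lines, using the subcommands, paths and `-s` / `--worker-processes` options that appear in the text. The helper name `vcf2zarr_steps` is hypothetical. Placing `--worker-processes` directly after the subcommand, and giving it an integer value, are both assumptions rather than something the docs confirm.

```python
import shlex


def vcf2zarr_steps(vcf, exploded, zarr, schema, workers=None):
    """Return the explode, mkschema and encode commands as argv lists."""
    # Assumption: the option takes an integer and may follow the subcommand.
    opts = [] if workers is None else ["--worker-processes", str(workers)]
    return [
        ["vcf2zarr", "explode", *opts, vcf, exploded],
        # mkschema prints the schema to stdout; the caller is expected to
        # redirect that output into the schema file.
        ["vcf2zarr", "mkschema", exploded],
        ["vcf2zarr", "encode", *opts, exploded, zarr, "-s", schema],
    ]


steps = vcf2zarr_steps(
    "tests/data/vcf/sample.vcf.gz",
    "tmp/sample.exploded",
    "tmp/sample.zarr",
    "sample.schema.json",
    workers=4,
)
for argv in steps:
    print(shlex.join(argv))
```

Each list could be passed to `subprocess.run`, with the `mkschema` step's stdout directed to the schema file so that the `encode` step can read it.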

docs/logo.png

Binary file (105 KB)

docs/references.bib

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+---
+---
+

docs/requirements.txt

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+jupyter-book
+sphinx-click
