This repo is a dataloader tool based on LaminDB for training large-scale models on large amounts of data distributed across many AnnData h5ad files.
- Create the conda/mamba environment: `conda env create -f environment.yml`
- Activate the environment: `conda activate lamin-dataloader`
- Install the package in development mode: `pip install -e .`
- Set up a lamindb instance according to the instructions
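
After the steps above, a quick sanity check can confirm that the activated environment exposes the expected dependencies. The snippet below is only a sketch: it assumes that `environment.yml` installs the `lamindb` and `anndata` Python packages, and the helper name `check_imports` is hypothetical and not part of this repo.

```python
from importlib.util import find_spec


def check_imports(modules=("lamindb", "anndata")):
    """Return a dict mapping each module name to True if it can be found.

    The default module names are an assumption about what environment.yml
    installs; adjust them to match the actual environment.
    """
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Report each module without importing it (find_spec only locates it).
    for name, found in check_imports().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Run it with the `lamin-dataloader` environment activated; a `MISSING` entry suggests the environment was not created or activated as expected.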