Commit 971a0f1

Add Canadian Invertebrates 1.5M documentation for Bioscan dataset API

- Added Canadian Invertebrates 1.5M dataset to the Bioscan dataset API documentation.
- Updated `__init__.py` in bioscan_dataset to expose the new dataset class and loader.
1 parent bfa3099 commit 971a0f1

2 files changed: 27 additions and 5 deletions
bioscan_dataset/__init__.py

Lines changed: 12 additions & 1 deletion
@@ -2,7 +2,18 @@

 __version__ = __meta__.version

-__all__ = ["BIOSCAN1M", "BIOSCAN5M", "load_bioscan1m_metadata", "load_bioscan5m_metadata"]
+__all__ = [
+    "BIOSCAN1M",
+    "BIOSCAN5M",
+    "load_bioscan1m_metadata",
+    "load_bioscan5m_metadata",
+    "CanadianInvertebrates",
+    "load_CanadianInvertebrates_metadata",
+]

 from .bioscan1m import BIOSCAN1M, load_bioscan1m_metadata
 from .bioscan5m import BIOSCAN5M, load_bioscan5m_metadata
+from .CanadianInvertebrates import (
+    CanadianInvertebrates,
+    load_CanadianInvertebrates_metadata,
+)

docs/source/api.rst

Lines changed: 15 additions & 4 deletions
@@ -1,22 +1,23 @@
 API Reference
 =============

-We provide :class:`~bioscan_dataset.BIOSCAN1M` and :class:`~bioscan_dataset.BIOSCAN5M` classes to load the respective `BIOSCAN-1M <BIOSCAN-1M paper_>`_ and `BIOSCAN-5M <BIOSCAN-5M paper_>`_ datasets for use within PyTorch.
+We provide :class:`~bioscan_dataset.BIOSCAN1M` , :class:`~bioscan_dataset.BIOSCAN5M` and :class:`~bioscan_dataset.CanadianInvertebrates` classes to load the respective `BIOSCAN-1M <BIOSCAN-1M paper_>`_ , `BIOSCAN-5M <BIOSCAN-5M paper_>`_ and `Canadian Invertebrates 1.5M <CanadianInvertebrates paper_>`_ datasets for use within PyTorch.
 These classes are subclasses of :class:`torch.utils.data.Dataset` and are designed to be used with PyTorch's :class:`~torch.utils.data.DataLoader` for batching and model training.

-General usage instructions for :class:`~bioscan_dataset.BIOSCAN1M` and :class:`~bioscan_dataset.BIOSCAN5M` are provided in our :doc:`usage <index>` guide.
+General usage instructions for :class:`~bioscan_dataset.BIOSCAN1M` , :class:`~bioscan_dataset.BIOSCAN5M` and :class:`~bioscan_dataset.CanadianInvertebrates` are provided in our :doc:`usage <index>` guide.

 .. tip::
     For new projects, we recommend using :class:`~bioscan_dataset.BIOSCAN5M` instead of :class:`~bioscan_dataset.BIOSCAN1M` since the newer dataset has cleaner labels and images.
     For larger scale projects, :class:`~bioscan_dataset.BIOSCAN5M` is a superset of :class:`~bioscan_dataset.BIOSCAN1M` and will provide five times more samples to train on.
     On the other hand, if 5 million samples is too much to handle, you can ignore the ``"pretrain"`` partition (train using the ``"train"`` partition only), which reduces the dataset to less than 400k samples.

-The accompanying functions :func:`~bioscan_dataset.load_bioscan1m_metadata` and :func:`~bioscan_dataset.load_bioscan5m_metadata` can be used to load the metadata from the CSV files.
+The accompanying functions :func:`~bioscan_dataset.load_bioscan1m_metadata` , :func:`~bioscan_dataset.load_bioscan5m_metadata` and :func:`~bioscan_dataset.load_CanadianInvertebrates_metadata` can be used to load the metadata from the CSV files.
 This produces a :class:`~pandas.DataFrame` in the same format as is used for model training.
-These functions do not need to be manually called when you are using :class:`~bioscan_dataset.BIOSCAN1M` and :class:`~bioscan_dataset.BIOSCAN5M` to work with the datasets.
+These functions do not need to be manually called when you are using :class:`~bioscan_dataset.BIOSCAN1M` , :class:`~bioscan_dataset.BIOSCAN5M` and :class:`~bioscan_dataset.CanadianInvertebrates` to work with the datasets.

 .. _BIOSCAN-1M paper: https://papers.nips.cc/paper_files/paper/2023/hash/87dbbdc3a685a97ad28489a1d57c45c1-Abstract-Datasets_and_Benchmarks.html
 .. _BIOSCAN-5M paper: https://arxiv.org/abs/2406.12723
+.. _CanadianInvertebrates paper: https://www.nature.com/articles/s41597-019-0320-2


 BIOSCAN-1M Dataset
@@ -38,3 +39,13 @@ BIOSCAN-5M Dataset
     :show-inheritance:

 .. autofunction:: bioscan_dataset.load_bioscan5m_metadata
+
+CANADIAN INVERTEBRATES 1.5M Dataset
+-----------------------------------
+
+.. autoclass:: bioscan_dataset.CanadianInvertebrates
+    :members:
+    :special-members: __getitem__
+    :show-inheritance:
+
+.. autofunction:: bioscan_dataset.load_CanadianInvertebrates_metadata
