Add Canadian Invertebrates 1.5M documentation for Bioscan dataset API
- Added Canadian Invertebrates 1.5M dataset to the Bioscan dataset API documentation.
- Updated `__init__.py` in bioscan_dataset to expose the new dataset class and loader.
docs/source/api.rst: 15 additions, 4 deletions
@@ -1,22 +1,23 @@
 API Reference
 =============
 
-We provide :class:`~bioscan_dataset.BIOSCAN1M` and :class:`~bioscan_dataset.BIOSCAN5M` classes to load the respective `BIOSCAN-1M <BIOSCAN-1M paper_>`_ and `BIOSCAN-5M <BIOSCAN-5M paper_>`_ datasets for use within PyTorch.
+We provide :class:`~bioscan_dataset.BIOSCAN1M`, :class:`~bioscan_dataset.BIOSCAN5M` and :class:`~bioscan_dataset.CanadianInvertebrates` classes to load the respective `BIOSCAN-1M <BIOSCAN-1M paper_>`_, `BIOSCAN-5M <BIOSCAN-5M paper_>`_ and `Canadian Invertebrates 1.5M <CanadianInvertebrates paper_>`_ datasets for use within PyTorch.
 These classes are subclasses of :class:`torch.utils.data.Dataset` and are designed to be used with PyTorch's :class:`~torch.utils.data.DataLoader` for batching and model training.
 
-General usage instructions for :class:`~bioscan_dataset.BIOSCAN1M` and :class:`~bioscan_dataset.BIOSCAN5M` are provided in our :doc:`usage <index>` guide.
+General usage instructions for :class:`~bioscan_dataset.BIOSCAN1M`, :class:`~bioscan_dataset.BIOSCAN5M` and :class:`~bioscan_dataset.CanadianInvertebrates` are provided in our :doc:`usage <index>` guide.
 
 .. tip::
     For new projects, we recommend using :class:`~bioscan_dataset.BIOSCAN5M` instead of :class:`~bioscan_dataset.BIOSCAN1M` since the newer dataset has cleaner labels and images.
     For larger scale projects, :class:`~bioscan_dataset.BIOSCAN5M` is a superset of :class:`~bioscan_dataset.BIOSCAN1M` and will provide five times more samples to train on.
     On the other hand, if 5 million samples is too much to handle, you can ignore the ``"pretrain"`` partition (train using the ``"train"`` partition only), which reduces the dataset to less than 400k samples.
 
-The accompanying functions :func:`~bioscan_dataset.load_bioscan1m_metadata` and :func:`~bioscan_dataset.load_bioscan5m_metadata` can be used to load the metadata from the CSV files.
+The accompanying functions :func:`~bioscan_dataset.load_bioscan1m_metadata`, :func:`~bioscan_dataset.load_bioscan5m_metadata` and :func:`~bioscan_dataset.load_CanadianInvertebrates_metadata` can be used to load the metadata from the CSV files.
 This produces a :class:`~pandas.DataFrame` in the same format as is used for model training.
-These functions do not need to be manually called when you are using :class:`~bioscan_dataset.BIOSCAN1M` and :class:`~bioscan_dataset.BIOSCAN5M` to work with the datasets.
+These functions do not need to be manually called when you are using :class:`~bioscan_dataset.BIOSCAN1M`, :class:`~bioscan_dataset.BIOSCAN5M` and :class:`~bioscan_dataset.CanadianInvertebrates` to work with the datasets.
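
The tip in the documented text suggests training on the ``"train"`` partition only and ignoring ``"pretrain"``. The sketch below illustrates that idea with nothing but the Python standard library. It does not call `bioscan_dataset` at all: the in-memory CSV, the `processid` and `split` column names, and the helper functions `load_rows` and `select_split` are all assumptions made for illustration, not the package's actual metadata layout or API.

```python
import csv
import io

# Hypothetical miniature metadata table. The real CSV files are far larger,
# and the column names used here are assumptions for illustration only.
METADATA_CSV = """processid,split
A001,pretrain
A002,train
A003,pretrain
A004,train
A005,val
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts, one per sample."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def select_split(rows, split):
    """Keep only the samples that belong to the requested partition."""
    return [row for row in rows if row["split"] == split]


rows = load_rows(METADATA_CSV)
train_rows = select_split(rows, "train")

# Selecting "train" drops the (much larger) "pretrain" partition entirely.
print(len(rows), len(train_rows))
```

With the dataset classes themselves, the equivalent selection would presumably be made when constructing the dataset rather than by filtering rows by hand; the usage guide referenced in the diff is the place to confirm the exact argument.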