Commit 44a85a2

Merge pull request #62 from bioscan-ml/doc_contributing
DOC: Add CONTRIBUTING.md
2 parents 02d2209 + f3bcb9b commit 44a85a2

1 file changed

+173
-0
lines changed

CONTRIBUTING.md

Lines changed: 173 additions & 0 deletions
@@ -0,0 +1,173 @@
Contributing
============

Thanks for considering contributing to the bioscan-dataset package!

Please take a moment to read these guidelines which will help you contribute
efficiently to the package.


Issues
------

If you have found an issue with the [BIOSCAN-5M data](https://zenodo.org/records/11973457),
such as an invalid image or erroneous label, please
[open an issue on the BIOSCAN-5M repository](https://github.com/bioscan-ml/BIOSCAN-5M/issues/new/choose)
to report it.
If you have found an issue with the [BIOSCAN-1M data](https://zenodo.org/records/8030065),
please first check whether it has already been fixed in BIOSCAN-5M by aligning
the records of the two datasets on the `processid` field, since some known
issues with the BIOSCAN-1M data were fixed in BIOSCAN-5M.
There is currently no intention to update the BIOSCAN-1M data further, but
BIOSCAN-5M will be periodically updated if issues are identified.

If you encounter a bug with the bioscan-dataset package which you would like
to report, please:
1. First, ensure you are using the latest release
   ![PyPI - Version](https://img.shields.io/pypi/v/bioscan-dataset)
   and/or check the [changelog](https://github.com/bioscan-ml/dataset/blob/master/CHANGELOG.rst)
   to see if the issue was already resolved.
2. Secondly, check the [open issues](https://github.com/bioscan-ml/dataset/issues)
   to see if the bug has already been reported.
3. Otherwise, open a [new issue](https://github.com/bioscan-ml/dataset/issues/new/choose),
   describing the version of the package you are using and details about your
   environment, the behaviour you expected, and the behaviour you instead
   encountered.


Pull requests
-------------

Pull requests are welcome!

For enhancements, please discuss with the code owner(s) by Slack, email, or a
GitHub issue before authoring your enhancement. This is to check your proposal
is in line with the goals of the package and avoid wasting your time
implementing code which will not be integrated.

If you have found a bug and are willing to fix it yourself, have found an error
in the documentation, or can provide an improvement to the documentation, you
are welcome to open a pull request without discussing it with the code owners
beforehand.

1. Fork the repository.
2. Clone your fork to the machine where you will develop the codebase.
3. Install the pre-commit stack ([see below](#pre-commit) for details).
4. Check out a new feature or bugfix branch.
   - The branch name should be all in lowercase,
     start with a [commit tag](#commit-messages) (in lowercase),
     followed by a few words joined by hyphens that describe the high-level
     objective of the PR.
5. Implement and commit your changes.
   - Commits should be atomic: don't change multiple things that aren't related to each other in the same commit.
   - Commit messages should be succinctly descriptive of the change they contain.
   - Commit messages should open with an appropriate commit tag or, in some cases, a combination of commit tags.
     ([See below](#commit-messages) for details.)
   - Refactor your commit history to squash any "oops" commits that fix other commits within your PR into the commit they fix.
6. For new features, consider adding an example usage of the feature to README.rst.
   - Note that you don't need to add details of the new feature to CHANGELOG.rst as this is updated at the time of release.
7. [Submit your PR](https://github.com/bioscan-ml/dataset/compare) to the master branch.
8. A maintainer should review your code within a week.
   - If you haven't heard anything after two weeks, feel free to send a reminder or check-in message.
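
Steps 4 and 5 above can be tried out in a throwaway repository. The following is a minimal sketch, not a prescribed workflow: the branch name, file contents, and commit messages are made up for illustration, and the underscore after the tag in the branch name is an assumption modelled on the `doc_contributing` branch named in the merge commit. It records an "oops" fix with `git commit --fixup` and folds it into the commit it corrects with `git rebase --autosquash`.

```bash
# Work in a throwaway repository so nothing real is modified.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example Contributor"       # placeholder identity
git config user.email "contributor@example.com"  # placeholder identity

# A first commit standing in for the existing project history.
echo "bioscan-dataset" > README.rst
git add README.rst
git commit --quiet -m "DOC: Add README"
base="$(git rev-parse HEAD)"

# Step 4: lowercase commit tag, then a few hyphen-joined words
# (hypothetical branch name; the "_" separator is an assumption).
git checkout --quiet -b doc_clarify-install-steps

# Step 5: one atomic commit per logical change.
echo "Install with pip." >> README.rst
git commit --quiet -a -m "DOC: Clarify install steps"
target="$(git rev-parse HEAD)"

# An "oops" fix, recorded as a fixup of the commit it corrects.
echo "Requires Python 3." >> README.rst
git commit --quiet -a --fixup="$target"

# Fold the fixup into its target. GIT_SEQUENCE_EDITOR=true accepts the
# automatically reordered todo list without opening an editor.
GIT_SEQUENCE_EDITOR=true git rebase --quiet --interactive --autosquash "$base"

# Show the resulting history.
git log --oneline
```

After the rebase, the branch holds the initial commit plus a single `DOC: Clarify install steps` commit containing both edits. In a real pull request the rebase target would typically be the branch you intend to merge into rather than a saved commit hash.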


### pre-commit

The repository comes with a [pre-commit](https://pre-commit.com/) stack.
This is a set of git hooks which are executed every time you make a commit.
The hooks catch errors as they occur by checking your Python code is valid and
[flake8](https://flake8.pycqa.org/)-compliant, and will automatically
adjust your code's formatting to conform to a standardized code style
([black](https://github.com/psf/black)).

To set up the pre-commit hooks, run the following shell code from within the repo directory:

```bash
# Install the development dependencies
pip install -r requirements-dev.txt
# Install the pre-commit hooks
pre-commit install
```

Whenever you try to commit code which is flagged by the pre-commit
hooks, the commit will *not happen*. Some of the pre-commit hooks
(such as [black](https://github.com/psf/black) and
[isort](https://github.com/timothycrosley/isort)) will automatically
modify your code to fix the issues. When this happens, you'll have to
stage the changes made by the commit hooks and then try your commit
again. Other pre-commit hooks will not modify your code and will just
tell you about issues which you'll then have to manually fix.
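
The stage-and-retry cycle described above can be reproduced in a throwaway repository. In this minimal sketch, a hand-written hook stands in for the real pre-commit stack (it is an assumption for illustration, not how pre-commit itself is implemented): it strips trailing whitespace from staged `.py` files and blocks the commit if it changed anything. The file name and commit message are likewise made up.

```bash
# Work in a throwaway repository so nothing real is modified.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example Contributor"       # placeholder identity
git config user.email "contributor@example.com"  # placeholder identity

# Hypothetical stand-in for an auto-fixing hook such as black or isort.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Strip trailing whitespace from staged .py files; fail if any file changed.
status=0
for f in $(git diff --cached --name-only -- '*.py'); do
    sed 's/[[:space:]]*$//' "$f" > "$f.tmp"
    if cmp -s "$f" "$f.tmp"; then
        rm "$f.tmp"
    else
        mv "$f.tmp" "$f"
        status=1
    fi
done
exit $status
EOF
chmod +x .git/hooks/pre-commit

# A file with trailing whitespace, which the hook will rewrite.
printf 'x = 1   \n' > example.py
git add example.py

# First attempt: the hook modifies example.py and the commit does not happen.
if git commit --quiet -m "ENH: Add example module"; then
    echo "commit went through on the first attempt"
else
    echo "commit blocked: the hook modified files"
fi

# Stage the changes made by the hook, then try the commit again.
git add -u
git commit --quiet -m "ENH: Add example module"
```

With the real stack the retry looks the same: review what the hooks changed (for example with `git diff`), stage those changes, and repeat the commit.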

You can also manually run the pre-commit stack on all the files at any time:
```bash
pre-commit run --all-files
```
This is particularly useful if you already committed some code before
installing pre-commit and need to run the linter on it later.

To force a commit to go through without passing the pre-commit hooks, use the `--no-verify` flag:
```bash
git commit --no-verify
```

The pre-commit stack which comes with this repository is highly
opinionated, and includes the following operations:

- Code is reformatted to use the [black](https://github.com/psf/black)
  style. Any code inside docstrings will be formatted in the black style using
  [blacken-docs](https://github.com/asottile/blacken-docs).
- Imports are automatically sorted using
  [isort](https://github.com/timothycrosley/isort).
- [flake8](https://flake8.pycqa.org/) is run to check for
  linting errors with [pyflakes](https://github.com/PyCQA/pyflakes)
  (e.g. code does not compile or a variable is used before it is defined),
  and for conformity to the Python style guide
  [PEP-8](https://www.python.org/dev/peps/pep-0008/).
- Several [hooks from pre-commit](https://github.com/pre-commit/pre-commit-hooks)
  are used to screen for non-language-specific git issues, such as incomplete
  git merges or overly large files being committed to the repo.
- Several [hooks from pre-commit specific to Python](https://github.com/pre-commit/pygrep-hooks)
  are used to screen for rST formatting issues, ensure noqa flags always
  specify an error code to ignore, etc.

Once it is set up, the pre-commit stack will run locally on every commit.
The pre-commit stack will also run on GitHub to ensure PRs are conformant.


### Commit messages

Commit messages should be clear and follow a few basic rules. Example:

    ENH: Add <functionality-X> [to <dataset or method name>]

The first line of the commit message starts with a capitalized acronym
(options listed below) indicating what type of commit this is. Then a blank
line, then more text if needed. Lines shouldn't be longer than 72
characters. If the commit is related to a ticket, indicate that with
"See #123", "Closes #123" or similar.

It is also good to describe the motivation for a change, the nature of the
bug (for bug fixes), or some details of what an enhancement does in the commit
message. Messages should be understandable without looking at the code changes.
Simple changes need only be one line long without extended description.
A commit message like `MNT: Fixed another one` is an example of what not to do;
the reader has to go look for context elsewhere to understand the message.
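
As an illustration of the layout described above, the sketch below writes a multi-line message from a here-document in a throwaway repository. The bug and its description are invented; only the structure (tagged first line, blank line, wrapped body, ticket reference) comes from the rules above.

```bash
# Work in a throwaway repository so nothing real is modified.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example Contributor"       # placeholder identity
git config user.email "contributor@example.com"  # placeholder identity

# --allow-empty keeps the sketch focused on the message itself;
# a real commit would of course contain the fix.
git commit --quiet --allow-empty -F - <<'EOF'
BUG: Fix off-by-one error in example index helper

The helper skipped the final record because the loop stopped one
element early. Iterate over the full range so every record is
included.

Closes #123
EOF

# Print the stored message to check how it reads without the diff.
git log -1 --format=%B
```

Reading the message back with `git log` is a quick way to check that it can be understood without looking at the code changes.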

Standard acronyms (commit tags) to start the commit message with are based on
the [commit tags used by numpy](https://numpy.org/doc/2.2/dev/development_workflow.html#writing-the-commit-message)
as follows:

    API: an (incompatible) API change
    BUG: bug fix
    CI: continuous integration
    DEP: deprecate something, or remove a deprecated object
    DEV: development tool or utility
    DOC: documentation
    ENH: enhancement
    MNT: maintenance commit (refactoring, typos, etc.)
    REL: related to releasing bioscan-dataset
    REV: revert an earlier commit
    STY: style fix (whitespace, PEP8)
    TST: addition or modification of tests
    WIP: work in progress, do not merge
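
The format rules for the first line can be checked mechanically. The function below is a hypothetical helper, not part of the repository or its pre-commit stack: it accepts a first line only if it is at most 72 characters long and opens with one of the tags listed above followed by `: ` and some text. Combined tags are deliberately not handled, because their exact format is not specified here.

```bash
# Commit tags listed above.
COMMIT_TAGS="API BUG CI DEP DEV DOC ENH MNT REL REV STY TST WIP"

# check_commit_subject SUBJECT  (hypothetical helper)
# Returns 0 if SUBJECT is at most 72 characters long and starts with one
# of the listed tags followed by ": " and at least one more character;
# returns 1 otherwise.
check_commit_subject() {
    local subject="$1"
    local tag
    # Lines shouldn't be longer than 72 characters.
    [ "${#subject}" -le 72 ] || return 1
    for tag in $COMMIT_TAGS; do
        case "$subject" in
            "$tag: "?*) return 0 ;;
        esac
    done
    return 1
}

# Example usage with made-up first lines.
for subject in "ENH: Add example feature" \
               "enh: Add example feature" \
               "Add example feature"; do
    if check_commit_subject "$subject"; then
        echo "ok:  $subject"
    else
        echo "bad: $subject"
    fi
done
```

Only the first of the three example lines passes. A check like this covers format alone: `MNT: Fixed another one` would pass it too, so whether a message is descriptive enough still needs a human reader.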
