[docs] Fix num_threads/n_jobs semantics and default #6863
base: master
`num_threads` was changed in microsoft#5105 for LightGBM 4.0. The docs weren't updated and I was confused why setting the OpenMP threads didn't have any effect.
Thanks for using LightGBM and taking the time to contribute!
Please see my suggested changes. Also, please don't edit `docs/Parameters.rst` directly. Make edits here instead:
https://github.com/microsoft/LightGBM/blob/master/include/LightGBM/config.h
Then run this from the root of the repo:
python .ci/parameter-generator.py
@@ -240,14 +240,18 @@ Core Parameters
 
 - refer to `Distributed Learning Guide <./Parallel-Learning-Guide.rst>`__ to get more details
 
-- ``num_threads`` :raw-html:`<a id="num_threads" title="Permalink to this parameter" href="#num_threads">🔗︎</a>`, default = ``0``, type = int, aliases: ``num_thread``, ``nthread``, ``nthreads``, ``n_jobs``
+- ``num_threads`` :raw-html:`<a id="num_threads" title="Permalink to this parameter" href="#num_threads">🔗︎</a>`, default = ``None``, type = int, aliases: ``num_thread``, ``nthread``, ``nthreads``, ``n_jobs``
Suggested change:
-- ``num_threads`` :raw-html:`<a id="num_threads" title="Permalink to this parameter" href="#num_threads">🔗︎</a>`, default = ``None``, type = int, aliases: ``num_thread``, ``nthread``, ``nthreads``, ``n_jobs``
+- ``num_threads`` :raw-html:`<a id="num_threads" title="Permalink to this parameter" href="#num_threads">🔗︎</a>`, default = ``0``, type = int, aliases: ``num_thread``, ``nthread``, ``nthreads``, ``n_jobs``
This is not correct. `None` is a Python-specific concept, and LightGBM is not just a Python package. It's a C++ library with a C API, a command line interface (CLI), an R package, and many more extensions.
This parameters page documents the effect of these parameters on the core LightGBM library. It's fine to add a note here that the `scikit-learn` estimators in the Python package behave differently, but this default should not be changed. It really is 0:
LightGBM/include/LightGBM/config.h
Line 239 in 6437645
int num_threads = 0;
+- ``None`` means number of physical cores
Suggested change:
-- ``None`` means number of physical cores
Let's remove this, in favor of a single new note for the `scikit-learn` interface.
 - ``0`` means default number of threads in OpenMP
 
+- Negative integers are interpreted as following joblib's formula (``n_cpus + 1 + n_jobs``), just like scikit-learn (so e.g. -1 means using all cores, physical or logical).
Please move this all the way to the bottom of the docs for this parameter, and change it to something like the following.
- **Note**: For the ``scikit-learn`` estimators in the Python package (like ``LGBMClassifier``), the values for this parameter follow ``scikit-learn``'s interpretation. See `the LGBMModel docs for n_jobs <pythonapi/lightgbm.LGBMModel.html>`__ for details.
These `scikit-learn`-specific details are already documented in the docs for the Python package. Look for `n_jobs` at https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.LGBMModel.html (and the similar pages for `LGBMClassifier` / `LGBMRanker` / `LGBMRegressor`).
I think this is worth calling out in a note pointing to those docs though, to avoid confusion like what you faced.
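For anyone reading along, here is a rough sketch of how the `scikit-learn`-style values discussed in this thread could map to a thread count. It is plain Python and does not call LightGBM. The function name `resolve_n_jobs`, its arguments, and the clamp to a minimum of 1 are assumptions made for illustration, not the package's actual implementation.

```python
from typing import Optional


def resolve_n_jobs(n_jobs: Optional[int], n_cpus: int, n_physical_cores: int) -> int:
    """Illustrative mapping from a scikit-learn-style n_jobs to a thread count.

    Hypothetical helper: a return value of 0 stands for "use the OpenMP default",
    which is what 0 means for num_threads in the core library.
    """
    if n_jobs is None:
        # As proposed in this PR: None means the number of physical cores.
        return n_physical_cores
    if n_jobs < 0:
        # joblib's formula quoted in the diff: n_cpus + 1 + n_jobs.
        # Clamping to at least one thread is an assumption of this sketch.
        return max(n_cpus + 1 + n_jobs, 1)
    # 0 (OpenMP default) and positive values pass through unchanged.
    return n_jobs


if __name__ == "__main__":
    # Example machine: 8 logical CPUs, 4 physical cores (made-up numbers).
    for value in (None, 0, 3, -1, -2):
        print(value, "->", resolve_n_jobs(value, n_cpus=8, n_physical_cores=4))
```

On the example machine above, `-1` resolves to all 8 logical CPUs and `None` to the 4 physical cores, which is the distinction the proposed note is meant to point readers to.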
/AzurePipelines run