[docs] Fix num_threads/n_jobs semantics and default #6863
base: master
@@ -240,14 +240,18 @@ Core Parameters

  - refer to `Distributed Learning Guide <./Parallel-Learning-Guide.rst>`__ to get more details

- - ``num_threads`` :raw-html:`<a id="num_threads" title="Permalink to this parameter" href="#num_threads">🔗︎</a>`, default = ``0``, type = int, aliases: ``num_thread``, ``nthread``, ``nthreads``, ``n_jobs``
+ - ``num_threads`` :raw-html:`<a id="num_threads" title="Permalink to this parameter" href="#num_threads">🔗︎</a>`, default = ``None``, type = int, aliases: ``num_thread``, ``nthread``, ``nthreads``, ``n_jobs``

  - used only in ``train``, ``prediction`` and ``refit`` tasks or in correspondent functions of language-specific packages

  - number of threads for LightGBM

+ - ``None`` means number of physical cores
+
Comment on lines +249 to +250
Suggested change
Let's remove this, in favor of a single new note for the |
  - ``0`` means default number of threads in OpenMP

+ - Negative integers are interpreted as following joblib's formula (``n_cpus + 1 + n_jobs``), just like scikit-learn (so e.g. -1 means using all cores, physical or logical).
Please move this all the way to the bottom of the docs for this parameter, and change it to something like the following.
These I think this is worth calling out in a note pointing to those docs though, to avoid confusion like what you faced.
+
  - for the best speed, set this to the number of **real CPU cores**, not the number of threads (most CPUs use `hyper-threading <https://en.wikipedia.org/wiki/Hyper-threading>`__ to generate 2 threads per CPU core)

  - do not set it too large if your dataset is small (for instance, do not use 64 threads for a dataset with 10,000 rows)
This is not correct. `None` is a Python-specific concept, and LightGBM is not just a Python package. It's a C++ library with a C API, a command line interface (CLI), an R package, and many more extensions.

This parameters page documents the impact of these parameters for the core LightGBM library. It's fine to add a note here that the `scikit-learn` estimators in the Python package specially have a different behavior, but this default should not be changed. It is really 0:

LightGBM/include/LightGBM/config.h
Line 239 in 6437645
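
For anyone following the thread, here is a minimal sketch of the joblib-style formula quoted in the diff (`n_cpus + 1 + n_jobs`). The helper name `resolve_n_jobs` is hypothetical and is not part of LightGBM or joblib. Clamping the result to at least 1 is an assumption, and non-negative values (including `0`, which per the diff means the OpenMP default in the core library) are passed through unchanged.

```python
def resolve_n_jobs(n_jobs: int, n_cpus: int) -> int:
    """Hypothetical helper (not a LightGBM or joblib API).

    Maps a joblib-style ``n_jobs`` value to a thread count using the
    formula quoted in the diff: ``n_cpus + 1 + n_jobs`` for negatives.
    """
    if n_cpus < 1:
        raise ValueError("n_cpus must be a positive integer")
    if n_jobs < 0:
        # Assumption: clamp to 1 so very negative values still give one thread.
        return max(n_cpus + 1 + n_jobs, 1)
    # Non-negative values pass through unchanged; 0 is left for the
    # core library to interpret (OpenMP default, per the docs in the diff).
    return n_jobs


if __name__ == "__main__":
    # Illustrative machine with 8 CPUs.
    for requested in (-1, -2, 4, 0):
        print(requested, "->", resolve_n_jobs(requested, n_cpus=8))
```

With 8 CPUs, `-1` resolves to all 8 and `-2` to 7, which matches the "-1 means using all cores" wording in the proposed docs line.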