6 changes: 5 additions & 1 deletion docs/Parameters.rst
@@ -240,14 +240,18 @@ Core Parameters

- refer to `Distributed Learning Guide <./Parallel-Learning-Guide.rst>`__ to get more details

- ``num_threads`` :raw-html:`<a id="num_threads" title="Permalink to this parameter" href="#num_threads">&#x1F517;&#xFE0E;</a>`, default = ``0``, type = int, aliases: ``num_thread``, ``nthread``, ``nthreads``, ``n_jobs``
- ``num_threads`` :raw-html:`<a id="num_threads" title="Permalink to this parameter" href="#num_threads">&#x1F517;&#xFE0E;</a>`, default = ``None``, type = int, aliases: ``num_thread``, ``nthread``, ``nthreads``, ``n_jobs``
Collaborator

Suggested change
- ``num_threads`` :raw-html:`<a id="num_threads" title="Permalink to this parameter" href="#num_threads">&#x1F517;&#xFE0E;</a>`, default = ``None``, type = int, aliases: ``num_thread``, ``nthread``, ``nthreads``, ``n_jobs``
- ``num_threads`` :raw-html:`<a id="num_threads" title="Permalink to this parameter" href="#num_threads">&#x1F517;&#xFE0E;</a>`, default = ``0``, type = int, aliases: ``num_thread``, ``nthread``, ``nthreads``, ``n_jobs``

This is not correct. None is a Python-specific concept, and LightGBM is not just a Python package. It's a C++ library with a C API, a command line interface (CLI), an R package, and many more extensions.

This parameters page documents the effect of these parameters in the core LightGBM library. It's fine to add a note here that the scikit-learn estimators in the Python package behave differently, but this default should not be changed. It really is 0:

int num_threads = 0;


- used only in ``train``, ``prediction`` and ``refit`` tasks or in correspondent functions of language-specific packages

- number of threads for LightGBM

- ``None`` means number of physical cores

Comment on lines +249 to +250
Collaborator

Suggested change
- ``None`` means number of physical cores

Let's remove this, in favor of a single new note for the scikit-learn interface.

- ``0`` means default number of threads in OpenMP

- Negative integers are interpreted as following joblib's formula (``n_cpus + 1 + n_jobs``), just like scikit-learn (so e.g. -1 means using all cores, physical or logical).
Collaborator

@jameslamb Mar 12, 2025

Please move this all the way to the bottom of the docs for this parameter, and change it to something like the following.

   -  **Note**: For the ``scikit-learn`` estimators in the Python package (like ``LGBMClassifier``), the values for this parameter follow ``scikit-learn``'s interpretation. See `the LGBMModel docs for n_jobs <pythonapi/lightgbm.LGBMModel.html>`__ for details.

These scikit-learn-specific details are already documented in the docs for the Python package. Look for n_jobs at https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.LGBMModel.html (and the similar pages for LGBMClassifier / LGBMRanker / LGBMRegressor).

I think this is worth calling out in a note pointing to those docs though, to avoid confusion like what you faced.
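
For readers unfamiliar with the joblib-style formula quoted in the diff (``n_cpus + 1 + n_jobs``), here is a minimal sketch of how a negative value could be mapped to a thread count. `resolve_n_jobs` is a hypothetical helper written for illustration, not a function from LightGBM or joblib, and the clamp to at least one thread is an assumption rather than something stated in this thread.

```python
def resolve_n_jobs(n_jobs: int, n_cpus: int) -> int:
    """Hypothetical helper (not part of LightGBM or joblib).

    Maps a scikit-learn-style ``n_jobs`` value to a thread count using the
    formula quoted in the diff: ``n_cpus + 1 + n_jobs`` for negative values.
    """
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    if n_jobs < 0:
        # Assumption: never return fewer than 1 thread for very negative values.
        return max(n_cpus + 1 + n_jobs, 1)
    # Non-negative values are passed through unchanged; per the parameter docs,
    # 0 means the default number of threads in OpenMP.
    return n_jobs


print(resolve_n_jobs(-1, 8))  # 8: all cores
print(resolve_n_jobs(-2, 8))  # 7: all cores but one
```

This only illustrates the arithmetic; the actual behavior of the scikit-learn estimators is described in the `n_jobs` docs linked above.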


- for the best speed, set this to the number of **real CPU cores**, not the number of threads (most CPUs use `hyper-threading <https://en.wikipedia.org/wiki/Hyper-threading>`__ to generate 2 threads per CPU core)

- do not set it too large if your dataset is small (for instance, do not use 64 threads for a dataset with 10,000 rows)
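
As a rough illustration of the "real CPU cores" advice in the parameter docs above, the sketch below estimates a physical core count from the logical count. `guess_physical_cores` is a hypothetical helper, and the default of 2 hardware threads per core is an assumption taken from the hyper-threading remark; it will be wrong on machines without hyper-threading or with a different topology.

```python
import os


def guess_physical_cores(logical_cpus: int, threads_per_core: int = 2) -> int:
    """Hypothetical helper: estimate physical cores from logical CPUs.

    Assumes every core exposes ``threads_per_core`` hardware threads.
    """
    if logical_cpus < 1 or threads_per_core < 1:
        raise ValueError("both arguments must be at least 1")
    # Never go below one thread, even on single-CPU machines.
    return max(logical_cpus // threads_per_core, 1)


# os.cpu_count() reports logical CPUs and may return None.
logical = os.cpu_count() or 1

# ``num_threads`` is the core parameter documented above.
params = {"num_threads": guess_physical_cores(logical)}
```

On a small dataset, a lower value than this estimate may still be the better choice, as the last bullet notes.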