spatial_autocorr uses all the cores #957

Open
GloriaLiu28 opened this issue Feb 18, 2025 · 4 comments · May be fixed by #1008
Labels
bug 🐛 Something isn't working

Comments

@GloriaLiu28
Copy link

Hi,
when I run squidpy.gr.spatial_autocorr to calculate Moran's I, it uses all 112 cores of my server, even though I set n_jobs=8. How can I limit the number of cores it uses? I found that old issues about co_occurrence also discussed this. Could you please also add a numba_parallel=False option to squidpy.gr.spatial_autocorr?

@maltekuehl

For a quick fix, you should be able to set the environment variable NUMBA_NUM_THREADS.
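A minimal sketch of that workaround. The key caveat is that numba reads `NUMBA_NUM_THREADS` only once, when its threading layer starts, so the variable has to be set before numba (or anything that imports it, such as squidpy) is imported:

```python
import os

# NUMBA_NUM_THREADS is read once, when numba initializes its threading
# layer, so set it before importing numba/scanpy/squidpy.
os.environ["NUMBA_NUM_THREADS"] = "8"

# Alternatively, after import, numba.set_num_threads(8) lowers the thread
# count at runtime (it cannot exceed the NUMBA_NUM_THREADS limit):
# import numba
# numba.set_num_threads(8)
```

Setting it in the shell (`NUMBA_NUM_THREADS=8 python script.py`) works equally well and avoids the import-order concern.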

@selmanozleyen
Member

Hi @GloriaLiu28, can you check whether this still happens with the most recent version of scanpy?

@flying-sheep I noticed that spatial_autocorr uses Geary's C statistics from scanpy, and some of the Geary's-related kernels use a decorator from scanpy that may set parallel=True: https://github.com/scverse/scanpy/blob/b058b1792b91b1baaf2e5caf395960005098c2ac/src/scanpy/metrics/_gearys_c.py#L233-L234. Here is the decorator: https://github.com/scverse/scanpy/blob/b058b1792b91b1baaf2e5caf395960005098c2ac/src/scanpy/_compat.py#L116. The function is compiled with parallel=True unless _is_in_unsafe_thread_pool() returns True, but I am not sure whether that means spatial_autocorr will run a parallel numba kernel. I think we need to make sure it doesn't.
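A simplified, hypothetical sketch of the pattern being described (only the name `_is_in_unsafe_thread_pool` comes from the linked scanpy source; the decorator body and `gearys_c_kernel` here are illustrative stand-ins, not scanpy's actual code):

```python
import threading


def _is_in_unsafe_thread_pool() -> bool:
    # Simplified paraphrase of scanpy's heuristic: threads spawned by
    # concurrent.futures.ThreadPoolExecutor get default names starting
    # with "ThreadPoolExecutor", and are treated as unsafe for a
    # parallel numba kernel.
    return threading.current_thread().name.startswith("ThreadPoolExecutor")


def maybe_parallel_kernel(func):
    # Hypothetical stand-in for scanpy's decorator: choose the numba
    # parallel flag based on the thread doing the decoration.
    parallel = not _is_in_unsafe_thread_pool()
    # The real decorator would compile: numba.njit(parallel=parallel)(func)
    func.parallel = parallel  # sketch: just record the decision
    return func


@maybe_parallel_kernel
def gearys_c_kernel():
    pass
```

When decoration happens on the main thread (named "MainThread"), the check passes and the kernel would be compiled with parallel=True, which is the behavior in question here.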

@flying-sheep
Member

It will not be parallelized when the current thread’s name starts with ThreadPoolExecutor, which is the case in dask.

joblib seems to use this: https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.ThreadPool

So scanpy's workaround will probably not work there; we should augment it so that it does.
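The naming difference that makes the prefix check miss joblib's workers can be demonstrated with a small standalone snippet:

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from multiprocessing.pool import ThreadPool

# Workers of concurrent.futures.ThreadPoolExecutor are named
# "ThreadPoolExecutor-<pool>_<n>", so a startswith() check catches them.
with ThreadPoolExecutor(max_workers=1) as ex:
    futures_name = ex.submit(lambda: threading.current_thread().name).result()

# joblib's threading backend builds on multiprocessing.pool.ThreadPool,
# whose workers get generic auto-assigned "Thread-<n>" style names.
with ThreadPool(processes=1) as pool:
    pool_name = pool.apply(lambda: threading.current_thread().name)

print(futures_name)  # e.g. "ThreadPoolExecutor-0_0"
print(pool_name)     # e.g. "Thread-3 (worker)" on Python 3.10+
```

Only the first name starts with "ThreadPoolExecutor", so the existing prefix check would classify the joblib-style worker as safe for a parallel kernel.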

I also found the way joblib adapts to manually reduced thread counts, which seems like a nice starting point for us: https://github.com/joblib/joblib/blob/ed0806a497268005ad7dad30f79e1d563927d7c6/joblib/_parallel_backends.py#L65

@ilan-gold
Contributor

@selmanozleyen Ensure that numba only uses one core here going forward.

@selmanozleyen selmanozleyen linked a pull request May 28, 2025 that will close this issue
6 participants