Call numba.set_num_threads when n_jobs is specified (e.g. in scanpy.tl.rank_genes_groups) #2390

Hi ScanPy team,

I emailed @ivirshup, but others should be involved, I think.

This function would be useful if we could specify the number of threads to use: https://scanpy.readthedocs.io/en/stable/generated/scanpy.tl.rank_genes_groups.html

Based on the number of items in the "groupby" field, we could use a basic split-merge approach here: each thread would take several of these items (the calculations are entirely independent of one another), and when each is completed, we would join and concatenate the results.

I'm happy to write up a PR (or participate), but I'd like to hear whether this is something you'd be willing to prioritize. (It's related to a project where Fabian is the PI.)

Best, Evan

Comments
Hey, in principle this sounds good, but I'd like to hear a little bit more about the use case. For context on our side, there are some other paths for speeding up DE available (probably some form of calculating statistics via scverse/anndata#564). There's also increased momentum on more featureful DE in the scverse ecosystem. If you are specifically looking for faster scanpy DE, this makes sense, though there may be some easier paths forward (at least as it seems to me). If you need anything fancier or even just different, it could be good to check in with other efforts, e.g. pertpy.
Hi @ivirshup, thanks for the help. In terms of the use cases here:

(1) Any user doing data processing or interactive analysis could benefit from multithreading here. Consider the two big for-loops that run through all of the genes being compared between the samples, and the for-loop that automatically does this for each "group" in the ScanPy object. I'm a bit confused why Seurat or ScanPy never did this... but then I realized that Pagoda2 didn't either: https://github.com/kharchenkolab/pagoda2/blob/main/R/Pagoda2.R#L900 (there's a bit of multithreading there at the end). Given the file sizes nowadays and the number of "groups", this is getting fairly computationally intensive. It's one of those simple things your biologists will love ("this is so fast now!").

(2) In terms of our use case, an interactive way to run DE via the client is too slow. We've just started to implement the above ourselves.

RE: pertpy, how does this relate to @davidsebfischer and diffxpy?

Best, Evan
I agree it doesn't hurt to have
Diffxpy is currently being reimplemented. Once it is released, it would likely be included in pertpy as an additional method. I.e., pertpy is more general and strives to provide a consistent interface to multiple methods.
My concern is that there will be issues if you keep the current calculations but parallelize over the groups. Within that loop, I believe large amounts of memory can be allocated: if it's "group vs rest", at least one large array is allocated per comparison, and if you parallelize over groups, the max memory usage can grow by a factor of the number of concurrent workers. See scanpy/scanpy/tools/_rank_genes_groups.py, lines 164 to 178 at d26be44.
Another memory-related concern comes from that same code. So while I think we can absolutely make use of more processing power here, I think we need to consider the approach.
What is the interface here? Scanpy computes results for all groups at once, but in most interfaces I've used, you can only really "ask" for one comparison at a time. This could also be much faster if you can just reduce the total computation.
Partly, I'm not sure what comparisons are actually being run. I was also wondering if you'd benefit from something fancier, like a covariate.
As a heads-up, I'm not aware of a timeline here.
Sorry for reviving such an old topic, but, aside from wanting to check up on this, I also wanted to offer some advice to anyone who might stumble upon this in their search for faster marker gene extraction.
I have been having the same problem, and the best solution I ran into is lvm-DE from scvi-tools.
I agree. I have had to write a custom script that computes the pairwise Wilcoxon values and then averages over them, as the link you pointed to suggests. I also tried to parallelize, but ran into the same problems that @ivirshup mentioned. So the compute time to get ranked genes for a large dataset with lots of groups remains very long.
Any advances on this? Or maybe a suggestion for another tool? Thank you for your time.
Sorry, I just realized that, nowadays, the computation is already multithreaded via numba. Though I ran into another problem, specifying the number of cores, that I managed to solve: the n_jobs parameter has no effect on this. The correct form is to call numba.set_num_threads.
Thanks for investigating! I agree, there are too many knobs to turn here; the different parallelization libraries we use each have their own way. Reading numba's docs, it's clear that numba tries to be smart here.
We should probably call numba.set_num_threads when n_jobs is specified (e.g. in scanpy.tl.rank_genes_groups).
Hi there! Based on the latest update of this issue, is there already a multithreaded implementation of rank_genes_groups?
I found that using numba.set_num_threads works. Meanwhile, I find issue #2060 very useful; just applying the replacement suggested there helps as well. I benchmarked time performance on a 50k-cell dataset using my PC: the time consumption decreased from 3m43s to 1m43s.
Can I create a PR based on this, @jamestwebber? It seems that #2060 hasn't made progress in the three years since. @flying-sheep
Happy for you to do so! I only opened the issue as a suggestion, as I didn't have time to make a PR myself.
OK, findings by @Intron7 and me:
OK, so ideally all our dependencies are just implementation details, and we take care of mapping a user's n_jobs setting to each of them. Unless we find a better solution, I'd say it would look something like:

```python
from functools import wraps

import numba
from anndata.compat import DaskArray
from dask.distributed import get_worker

import scanpy as sc


def set_thread_limit(array):
    w = get_worker()
    # no idea how to get N_WORKERS …
    numba.set_num_threads(max(1, sc.settings.n_jobs // (w.nthreads * N_WORKERS)))
    return array


def limit_threads_in_dask(fn):
    @wraps(fn)
    def wrapper(array, *args, **kw):
        if isinstance(array, DaskArray):
            array = array.map_blocks(set_thread_limit)
        return fn(array, *args, **kw)

    return wrapper


@limit_threads_in_dask
def some_func(arr, *args, **kw): ...
```

but of course we'd have to be careful how the