Possible enhancement: multithreaded (via numba) Mann-Whitney tests #2060

Open
1 task done
jamestwebber opened this issue Nov 26, 2021 · 3 comments

Comments

@jamestwebber
Contributor

  • Additional function parameters / changed functionality / changed defaults?

I recently wrote up a parallelized implementation of the Mann-Whitney U test for my own use (gist is here). For the kinds of tests we tend to do in scRNA-seq (lots of different features, 2D arrays), it basically scales with the number of cores you can throw at it. When you're doing a lot of tests this is very nice!

Given that scanpy already has a dependency on numba this would be a pretty simple thing to add, if you want to do so. Thought I would just point it out!

- James
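
To make the idea above concrete, here is a minimal sketch of a per-feature Mann-Whitney U test parallelized over columns with numba. It is not the code from the gist and not scanpy's implementation: the function names `_u_and_p` and `mannwhitneyu_columns` are hypothetical, the inputs are assumed to be 2D float arrays of shape (samples, features) with at least two samples in total, and the p-value uses the two-sided normal approximation with tie correction and no continuity correction. The `try`/`except` fallback is only there so the sketch also runs (serially) where numba is not installed.

```python
import math

import numpy as np

try:
    from numba import njit, prange
except ImportError:  # assumption: fall back to plain Python without numba
    prange = range

    def njit(*args, **kwargs):
        if args and callable(args[0]):
            return args[0]
        return lambda f: f


@njit
def _u_and_p(vals, n1):
    """U statistic and two-sided p-value for one feature.

    vals holds the n1 values of group x followed by the values of group y.
    """
    n = vals.size
    n2 = n - n1
    order = np.argsort(vals)
    rank_sum = 0.0  # sum of (tie-averaged) ranks belonging to group x
    tie_term = 0.0  # sum of t**3 - t over groups of tied values
    i = 0
    while i < n:
        k = i
        while k + 1 < n and vals[order[k + 1]] == vals[order[i]]:
            k += 1
        avg_rank = 0.5 * (i + k) + 1.0  # mean of 1-based ranks i+1 .. k+1
        for m in range(i, k + 1):
            if order[m] < n1:
                rank_sum += avg_rank
        t = k - i + 1
        tie_term += t**3 - t
        i = k + 1
    u = rank_sum - n1 * (n1 + 1) / 2.0
    tie_corr = 1.0 - tie_term / (n**3 - n)
    sd = math.sqrt(tie_corr * n1 * n2 * (n + 1) / 12.0)
    if sd == 0.0:  # all values identical: no evidence either way
        return u, 1.0
    z = (u - n1 * n2 / 2.0) / sd
    return u, math.erfc(abs(z) / math.sqrt(2.0))


@njit(parallel=True)
def mannwhitneyu_columns(x, y):
    """Run the test independently on every column, one column per prange step."""
    n1, n_features = x.shape
    n2 = y.shape[0]
    u = np.empty(n_features)
    p = np.empty(n_features)
    for j in prange(n_features):
        vals = np.empty(n1 + n2)
        vals[:n1] = x[:, j]
        vals[n1:] = y[:, j]
        uj, pj = _u_and_p(vals, n1)
        u[j] = uj
        p[j] = pj
    return u, p
```

Because each feature is ranked independently, the loop over columns has no shared state, which is why this kind of workload can scale with the number of cores.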
@ivirshup
Member

We're always up for improved performance! Would love to see improvements here. (Btw, I think I've already got your gist bookmarked on Twitter.)

Do you have any benchmarks of performance here? Especially against our current implementation.

@jamestwebber
Contributor Author

I haven't benchmarked against scanpy, only against scipy.stats.mannwhitneyu (which can now handle arrays; I know it couldn't before). On my laptop (an 8-core Intel MacBook Pro) it's about a 10x speedup, and with more cores the speedup can be a lot larger.

Even without parallelization, you can get some improvement by just using numba.njit on some of the internal bits (e.g. tiecorrect).
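
As an illustration of JIT-compiling one of the internal bits, here is a minimal sketch of a tie-correction helper under `numba.njit`. It is an assumption of what such a helper could look like, not scanpy's or the gist's code: it computes the same factor as scipy.stats.tiecorrect, 1 - sum(t**3 - t) / (n**3 - n) over groups of tied values, but takes already-sorted data rather than ranks. The import fallback only exists so the sketch still runs where numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # assumption: fall back to plain Python without numba

    def njit(*args, **kwargs):
        if args and callable(args[0]):
            return args[0]
        return lambda f: f


@njit
def tiecorrect_sorted(sorted_vals):
    """Tie-correction factor for a 1D array that is already sorted."""
    n = sorted_vals.size
    if n < 2:
        return 1.0
    total = 0.0
    run = 1  # length of the current run of equal values
    for i in range(1, n):
        if sorted_vals[i] == sorted_vals[i - 1]:
            run += 1
        else:
            total += run**3 - run
            run = 1
    total += run**3 - run  # close the final run
    return 1.0 - total / (n**3 - n)
```

A single pass over sorted data like this avoids the temporary arrays a vectorized NumPy version would allocate, which is the kind of loop numba tends to compile well.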

Of course, your code has a lot of options that I didn't bother with, because I didn't need them. Some of them might be harder to JIT than others.

@flying-sheep
Member

Your changes made it into the wilcoxon flavor of rank_genes_groups via #3529.

Scanpy doesn’t currently have mannwhitneyu, but if you want to contribute it, feel free!
