-
Notifications
You must be signed in to change notification settings - Fork 421
Enable OpenMP in addParticles #4615
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I think The omp parallel region could include the for loop over levels. |
Won't this have threading issues in |
Good point - we still need to do that part in serial. |
for (int lev = 0; lev < other.numLevels(); ++lev) | ||
{ | ||
const auto& plevel_other = other.GetParticles(lev); | ||
for(MFIter mfi = other.MakeMFIter(lev); mfi.isValid(); ++mfi) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not related to OpeMP. But we could use this version MFIter MakeMFIter (int lev, const MFItInfo& info) const
to disable gpu device sync here and above. If we disable it here, we need to add Gpu::streamSynchronizeAll()
after the loop over levels.
Do you want to do this in a follow-up or this PR?
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will try this in a follow-on PR.
I did a quick test in ImpactX with 10M particles and looking at init, e.g., with I see a 2x speedup for
|
Tested scalability a bit more for a 10M particle test, I think this looks reasonable (tapers off quickly, no benefit from threads in the old one):
|
Correctness is fine, too: ImpactX tests all pass locally. cmake --fresh -S . -B build -DImpactX_PYTHON=ON -DpyAMReX_IPO=OFF -DImpactX_PYTHON_IPO=OFF -DImpactX_FFT=ON -DImpactX_amrex_src=$HOME/src/amrex
cmake --build build -j 6
ctest --test-dir build --output-on-failure -j 6 |
Turn on OMP threading in addParticles.
The proposed changes: