This issue is to discuss options to improve single-processor performance (multithreaded, but no MPI yet).
The plan so far:
- Solve `wrap_array` allocates #29 and resolve all type instabilities (solved in Remove type instabilities and allocations #36).
- Investigate if `thread=OrdinaryDiffEq.True()` can improve performance of the time integration method once StackOverflowError in broadcasting JuliaSIMD/StrideArrays.jl#62 is resolved.
- Further optimize PK1 computation for structure dynamics (solved in Compute deformation gradient in matrix form #38).
- Use symmetry of interactions? We could skip half of the fluid-fluid interactions by applying the same force with a flipped sign to the neighbor particle. However, some SPH codes don't do this because it makes the computations and memory accesses less efficient, especially on GPUs. We could also skip the solid-fluid interaction by using symmetry in the fluid-solid interaction.
- A single `@threaded` loop over all particles that includes all fluid-* interactions could potentially improve performance.
- Improve the neighborhood search update. The NHS update is a bottleneck on multiple threads because the implementation does not use multithreading (How to improve neighborhood search #65).
- Rework the neighborhood search. We might be able to get a significant speedup for large simulations by using a contiguous memory layout.
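
To make the symmetry idea from the list above concrete, here is a minimal sketch in plain Julia. It is not taken from this package: `pair_force`, `forces_full`, and `forces_symmetric` are hypothetical names, the setting is 1D, and the force law is a toy linear repulsion standing in for the real SPH interaction. It only illustrates that visiting each pair once and applying the flipped force to the neighbor gives the same result as the full double loop.

```julia
# Hypothetical pair force, for illustration only: linear repulsion inside a
# cutoff radius. A real SPH code would evaluate kernel gradients here.
function pair_force(xi, xj, radius)
    d = xi - xj
    r = abs(d)
    (r == 0 || r >= radius) && return zero(d)
    return sign(d) * (radius - r)
end

# Full loop: every ordered pair (i, j) is visited, so each interaction is
# evaluated twice, but each iteration writes only to f[i].
function forces_full(x, radius)
    f = zeros(eltype(x), length(x))
    for i in eachindex(x), j in eachindex(x)
        i == j && continue
        f[i] += pair_force(x[i], x[j], radius)
    end
    return f
end

# Symmetric loop: visit each unordered pair once (j > i) and apply the
# opposite force to the neighbor. The write to f[j] would be a data race if
# the loop over i were threaded without atomics or per-thread buffers.
function forces_symmetric(x, radius)
    f = zeros(eltype(x), length(x))
    n = length(x)
    for i in 1:n, j in (i + 1):n
        fij = pair_force(x[i], x[j], radius)
        f[i] += fij
        f[j] -= fij
    end
    return f
end

x = [0.0, 0.3, 0.5, 1.2]
f_full = forces_full(x, 0.6)
f_sym = forces_symmetric(x, 0.6)
```

The symmetric version halves the number of force evaluations, but the scattered write to `f[j]` is exactly the kind of memory access that can cancel the gain on multiple threads or on GPUs, which matches the caveat raised in the list.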