-
Notifications
You must be signed in to change notification settings - Fork 33
Open
Description
I recently became aware of 2 bugs inside of Update_Conserved_Variables_3D
kernel that arise when a thread crashes in a simulation with a Density Floor.
The relevant starts here. In the first part of the if-else branch, there is logic for handling the case where a cell has a positive density that is under the density-floor. The logic here seems sensible.
The bug occurs in the other branch, where the density has a negative value. Specifically, the call to Average_Cell_Single_Field to average is problematic on 2 fronts:
- There is a race-condition here just like the one addressed in PR Improving thread crash logic #406 for
Average_Cell_All_Fields
. If we still want to continue to use averaging in this case, we need to make sure the entire grid synchronizes after applying flux-divergence and before we apply averaging. In other words:- we need to split the
Update_Conserved_Variables_3D
kernel into 2 parts (that's what we did in Improving thread crash logic #406) - Or, we could start using cooperative groups. Essentially we would need to create a
cooperative_groups::grid_group
instance and call the sync method to ensure the entire grid gets synchronized.1
- we need to split the
- This only fixes the density in a thread crash. It is also plausible that the energy also crashed (and we don't properly deal with that).
Footnotes
-
Some special steps are required to ensure that all blocks of threads are executed at the same time across the grid. It's also unclear to me whether this introduces any performance overhead ↩
Metadata
Metadata
Assignees
Labels
No labels