Conversation
|
@rohanbabbar04 I didn't mean actual unit tests, but rather some tests I was running myself... if you take this https://github.com/PyLops/pylops-nccl_examples/blob/main/bench_mdd.py and replace the UNSAFE_BROADCAST you will see that things start to hang. I didn't have time to investigate why, but I'm quite sure it has something to do with the comm we do in setitem, as skipping it (when UNSAFE_BROADCAST is used) seems to solve the problem |
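For context, one generic way a communication step inside a `__setitem__`-like method can hang is when not every rank reaches it: a collective only returns once all ranks have called it. Whether that is what happens in this benchmark has not been established in this thread. The sketch below is only an illustration, and it makes several assumptions: it uses plain Python threads in place of MPI/NCCL ranks, models the collective as a `threading.Barrier`, and `run_ranks` and `skipping_ranks` are hypothetical names that do not come from pylops-mpi.

```python
import threading


def run_ranks(n_ranks, skipping_ranks, timeout=0.5):
    """Simulate ranks that each perform one collective call.

    Assumption: the collective is modelled as a barrier that only
    releases once all n_ranks have arrived. Ranks listed in
    skipping_ranks never call it (a hypothetical code path that
    skips the communication on some ranks only).
    """
    barrier = threading.Barrier(n_ranks)
    results = {}

    def rank_body(rank):
        if rank in skipping_ranks:
            # This rank takes a branch that does not communicate.
            results[rank] = "skipped"
            return
        try:
            # Blocks until every rank has arrived, or the timeout expires.
            barrier.wait(timeout=timeout)
            results[rank] = "completed"
        except threading.BrokenBarrierError:
            # Raised when the barrier times out: the rank was stuck waiting.
            results[rank] = "stuck"

    threads = [
        threading.Thread(target=rank_body, args=(rank,))
        for rank in range(n_ranks)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# All ranks communicate: the collective completes everywhere.
all_participate = run_ranks(4, skipping_ranks=set())

# Rank 0 skips the collective: the remaining ranks wait until the timeout.
one_skips = run_ranks(4, skipping_ranks={0})
```

The timeout exists only so the simulation terminates. A real collective has no such timeout by default, so the waiting ranks would simply block, which is what a hang looks like from the outside.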
|
Thanks @mrava87, I will take a look. |
|
Looks like we got |
|
As suspected, I retriggered the CI and everything breaks because now https://github.com/PyLops/pylops-mpi/actions/runs/22067038064/job/66000565531?pr=185 Since we saw this when we were just about to release a new version #187, I suggest the following course of action:
PS: completely unrelated... @rohanbabbar04 in the future, could you please avoid making branches in the main repo and work off your fork instead... once we sort these things out I also suggest we move to PyLops' strategy of having |
|
@rohanbabbar04 let's merge this if you agree, and you can keep looking into the issue of the multi-node multi-GPU setup failing with the BROADCAST partition... |
Sounds good |
__setitem__