
MPI_Allgatherv in mpi_allgather function #186

Open
rohanbabbar04 wants to merge 9 commits into main from allgatherv

Conversation

@rohanbabbar04
Collaborator

Closes #169

  • Use MPI_Allgatherv in mpi_allgather.

@rohanbabbar04
Collaborator Author

I made a change to work around something that was bugging me while implementing MPI_Allgatherv: it works for contiguous arrays only.

In Fredholm, y1 after transpose() becomes a non-contiguous array.

```python
y1 = (
    ncp.matmul(x.transpose(0, 2, 1).conj(), self.G)
    .transpose(0, 2, 1)
    .conj()
)
```

This can be simplified to `y1 = ncp.matmul(self.G.transpose(0, 2, 1).conj(), x)`, whose result is contiguous as well.
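For context, a minimal NumPy sketch of the contiguity behavior at play (`ncp` in the snippet above would be NumPy or CuPy; shown here with NumPy): transposing produces a strided view, while `matmul` materializes a fresh, contiguous output.

```python
import numpy as np

x = np.ones((3, 4, 5))

# transpose() returns a view with permuted strides, not a new buffer,
# so the result is generally not C-contiguous.
xt = x.transpose(0, 2, 1)
print(xt.flags["C_CONTIGUOUS"])  # False

# matmul allocates a fresh output buffer, so its result is C-contiguous
# even when its inputs are strided views.
y = np.matmul(xt, np.ones((3, 4, 4)))
print(y.flags["C_CONTIGUOUS"])  # True
```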

@mrava87
Contributor

mrava87 commented Feb 17, 2026

Thanks @rohanbabbar04

When I suggested we should investigate Allgatherv, I did not foresee the issue with non-contiguous arrays.

What you propose in the Fredholm code is a bit problematic, though. We used that transpose pattern for a real reason: G is usually much bigger than x, so by transposing x twice (before and after G is applied) we effectively do far fewer operations than by transposing G... We do the same in PyLops' original operator, so changing this would require some proper benchmarking to see whether the combination of changes you suggest is actually beneficial 😉
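For the record, the two formulations agree numerically, since (xᴴ G)ᴴ = Gᴴ x; the debate is only about which operand gets transposed and conjugated. A minimal sketch with made-up shapes:

```python
import numpy as np

rng = np.random.default_rng(0)
nsl, nx, ny, nz = 2, 6, 4, 3  # made-up sizes; in practice G is far larger than x
G = rng.standard_normal((nsl, nx, ny)) + 1j * rng.standard_normal((nsl, nx, ny))
x = rng.standard_normal((nsl, nx, nz)) + 1j * rng.standard_normal((nsl, nx, nz))

# Current pattern: transpose/conjugate the (small) x twice, leave G untouched.
y1 = np.matmul(x.transpose(0, 2, 1).conj(), G).transpose(0, 2, 1).conj()

# Proposed pattern: transpose/conjugate the (large) G once instead.
y2 = np.matmul(G.transpose(0, 2, 1).conj(), x)

print(np.allclose(y1, y2))  # True
```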

@rohanbabbar04
Collaborator Author

rohanbabbar04 commented Feb 18, 2026

Ah, yes! In that case we should keep it as before, because when G >> x, transposing and conjugating G would put pressure on memory.

I also think we should keep both Allgather and Allgatherv: whenever the shapes vary across ranks we use Allgatherv; otherwise the preferred method should be Allgather.
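To make the proposed dispatch concrete, a hypothetical sketch (`plan_allgather` and its signature are made up for illustration, not the actual `mpi_allgather` API): pick the fixed-size collective only when every rank contributes the same number of elements, and force contiguity up front since MPI buffer collectives require it.

```python
import numpy as np

def plan_allgather(send_buf, all_sizes):
    # Hypothetical helper: MPI buffer collectives need a contiguous send
    # buffer; ascontiguousarray copies only if the input is non-contiguous.
    send_buf = np.ascontiguousarray(send_buf)
    # Equal sizes on every rank -> Allgather; otherwise -> Allgatherv.
    collective = "Allgather" if len(set(all_sizes)) == 1 else "Allgatherv"
    return send_buf, collective

buf, which = plan_allgather(np.ones((3, 4)).T, [12, 12, 12])
print(which, buf.flags["C_CONTIGUOUS"])  # Allgather True

_, which2 = plan_allgather(np.ones(5), [5, 4, 4])
print(which2)  # Allgatherv
```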

Contributor

@mrava87 mrava87 left a comment


@rohanbabbar04 this looks good!

Just one thing I am a bit unsure about... you mentioned the issue with Fredholm being that the array is non-contiguous and that Allgatherv requires contiguity... but we don't seem to check this in mpi_allgather... so what would happen if we have Fredholm with G arrays that differ in G.shape[0] across ranks? I suspect the code will crash?

You could probably test it first by doing something like

```python
_F = np.arange(par["nsl"] * par["nx"] * par["ny"]).reshape(
```

```python
nsl = par["nsl"] + 1 if rank == 0 else par["nsl"]
```

and then use `nsl` instead of `par["nsl"]` in the rest of the test... this ensures rank 0 has a different size than the other ranks, and we should hit Allgatherv with non-contiguous arrays...

If I am right, fix the code by:

  • either adding a check of contiguity in mpi_allgather, or
  • adding a parameter to mpi_allgather (and the methods that call it) to let the user force allgather even when the arrays don't have the same shape, with the usual padding/unrolling

@rohanbabbar04
Collaborator Author

Yup, what I did was use send_buf.copy() in the allgather, which resolves this issue, but we should change it to ncp.ascontiguousarray(send_buf) so that it only copies when the array is non-contiguous.
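A quick NumPy illustration of why `ascontiguousarray` beats an unconditional `copy()` here (`ncp` stands in for NumPy/CuPy): it is a no-op for already-contiguous input and only copies strided views.

```python
import numpy as np

a = np.ones((4, 3))
# Already C-contiguous: ascontiguousarray returns the very same object,
# so no data is copied.
print(np.ascontiguousarray(a) is a)  # True

b = a.T  # strided view, not C-contiguous
bc = np.ascontiguousarray(b)
print(b.flags["C_CONTIGUOUS"], bc.flags["C_CONTIGUOUS"])  # False True
```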

@rohanbabbar04 rohanbabbar04 requested a review from mrava87 March 13, 2026 19:31
@mrava87
Contributor

mrava87 commented Mar 13, 2026

Alright! I see this for AllGatherv, but what about AllGather: is it also needed? In our current codebase, the Fredholm implementation is likely to create a non-contiguous array, as you stated in one of your comments, right? But we did not do any copy/ascontiguousarray; was that because _prepare_allgather_inputs was effectively doing it?

@rohanbabbar04
Collaborator Author

Yes, you are right; it was already handled by the _prepare_allgather_inputs function.

@mrava87
Contributor

mrava87 commented Mar 15, 2026

@rohanbabbar04 Alright, then I think we agree. Resolve the conflicts and get the CI back up and running, and then we should be almost ready to merge. I also added a few minor extra comments below...

@mrava87
Contributor

mrava87 commented Mar 15, 2026

@tharittk, can you also please have a look at this PR, since it changes a few things you did? I would like to have your input 😄

@mrava87 mrava87 requested a review from tharittk March 15, 2026 19:47
Contributor

@mrava87 mrava87 left a comment


@rohanbabbar04 I left a few more comments and am waiting for @tharittk's input 😄 almost there!

Collaborator

@tharittk tharittk left a comment


Looks good @rohanbabbar04 !

I have a somewhat similar feeling to @mrava87 in that I would much prefer to keep the array unrolling in one place. Currently, your proposed mpi_allgather has two unrolling paths (for the variable-size and non-variable-size cases), and half of the previous _unroll_allgather_recv (the mpi part) becomes obsolete.

Hmm, I think it would be really nice to have them in one place... Maybe extend the original _unroll_allgather_recv to handle both padded and non-padded input?
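One possible shape for such a unified helper (purely a sketch with hypothetical names and signature; the real _unroll_allgather_recv may differ): a single function that splits the flat receive buffer into per-rank chunks, trimming padding when the fixed-size Allgather path was used.

```python
import numpy as np

def unroll_recv(recv, sizes, padded_size=None):
    # Hypothetical unified unroll: `sizes` holds each rank's true element
    # count. With padded_size set (Allgather path), every rank occupies
    # padded_size slots and the tail padding is trimmed; without it
    # (Allgatherv path), chunks are packed back to back.
    chunks, offset = [], 0
    for size in sizes:
        chunks.append(recv[offset : offset + size])
        offset += padded_size if padded_size is not None else size
    return np.concatenate(chunks)

# Allgather path: two ranks padded to 3 slots each (trailing zero is padding).
print(unroll_recv(np.array([1, 2, 3, 4, 5, 0]), [3, 2], padded_size=3))
# Allgatherv path: same data, packed without padding.
print(unroll_recv(np.array([1, 2, 3, 4, 5]), [3, 2]))
```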

@rohanbabbar04
Collaborator Author

Ok, I made some minor changes to _unroll_gather_recv to handle the new logic.
I also introduced a new method that prepares the gathered data for MPI before it goes into Allgather/Allgatherv.

@rohanbabbar04 rohanbabbar04 requested a review from tharittk March 19, 2026 15:14
@rohanbabbar04
Collaborator Author

@mrava87, I think once this is merged we can make a new release, as the fix for the issue with numpy>=2.4 has also not been released yet.

@mrava87
Contributor

mrava87 commented Mar 20, 2026

@rohanbabbar04 agreed! Let me have a final look at this over the weekend (and check with @hongyx11 why the CuPy CI seems to have stopped working).
