Querying for supported/unsupported datatypes #7341

dalcinl · 2025-03-15T06:52:06Z

PR #7319 broke mpi4py tests, not because of a bug, but a small change of behavior regarding unsupported datatypes.
https://github.com/mpi4py/mpi4py-testing/actions/runs/13867883975/job/38810396859

For example, if MPICH was built without Fortran, calling MPI_Type_size on, let's say, MPI_REAL would not fail but would return size=0. After that PR, the MPI_Type_size call fails. How would a user check for the availability of a datatype without having to mess with setting/restoring the ERRORS_RETURN error handler in COMM_WORLD?

Is this something that should be addressed in the MPI Forum, or is there a quick compromise that could be taken here, like the size=0 hack I was abusing before? Maybe for the specific case of Fortran I can use the new MPI_Abi_get_fortran_info, but I'm wondering if the problem of querying datatype availability is broader in scope than just Fortran.

cc @jeffhammond

PS: @hzhou Off-topic, have you opened a PR for the new MPI_LOGICAL<N> datatypes?

jeffhammond · 2025-03-15T06:53:31Z

Type_size is supposed to return UNDEFINED if the type is missing.

jeffhammond · 2025-03-15T09:29:22Z

MPI 5.0 RC, Section 20.4 (document page 850):

MPI applications can discover the size of Fortran types such as MPI_INTEGER and MPI_REAL using MPI_TYPE_SIZE. Lack of support in the implementation for optional predefined datatypes is indicated when the type size returned is MPI_UNDEFINED.
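
From Python, the rule quoted above could be checked roughly as follows. This is a minimal sketch, not code from mpi4py or MPICH: the helper name `size_or_none` is hypothetical, and it assumes the object passed in exposes a `Get_size()` method the way `mpi4py.MPI.Datatype` does, with the caller supplying the library's `MPI_UNDEFINED` value (`MPI.UNDEFINED` in mpi4py).

```python
def size_or_none(datatype, undefined):
    """Return the datatype size in bytes, or None if it is reported as unsupported.

    `datatype` is any object with a Get_size() method (assumed to behave
    like mpi4py.MPI.Datatype); `undefined` is the MPI_UNDEFINED value of
    the MPI library in use.
    """
    size = datatype.Get_size()
    # Per the quoted text, an unsupported optional predefined datatype
    # reports MPI_UNDEFINED as its size.
    if size == undefined:
        return None
    return size
```

With mpi4py this might be called as `size_or_none(MPI.REAL, MPI.UNDEFINED)`. The sketch only covers the sentinel value; it does nothing about a library that raises an error or returns 0 instead.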

dalcinl · 2025-03-15T09:30:45Z

Yes, I forgot about it. MPICH is currently not following these rules.

dalcinl · 2025-03-15T10:08:26Z

MPI applications can discover the size of Fortran types such as MPI_INTEGER and MPI_REAL using MPI_TYPE_SIZE. Lack of support in the implementation for optional predefined datatypes is indicated when the type size returned is MPI_UNDEFINED.

@jeffhammond This was something we added recently, right? In retrospect, I'm wondering if such behavior (i.e., returning MPI_UNDEFINED) is too error-prone. Perhaps a better alternative check would have been MPI_Type_get_name returning an empty string, which is trivial to check via the output resultlen integer. IMHO, getting an empty string from MPI_Type_get_name is much less consequential than getting a negative value (MPI_UNDEFINED) from MPI_Type_size.

jeffhammond · 2025-03-15T10:27:06Z

It was added in September when I had to split out all of the Fortran stuff from the ABI.

https://github.com/mpi-forum/mpi-standard/commit/c03fd3e54c25df91c29c6ed0df5cf34253a00f87

I see no difference between the user needing to check for MPI_UNDEFINED from one function and an empty string from another. It was intentional that the value MPI_UNDEFINED would cause a visible effect in programs, because it must be visible to the user if they are using unsupported datatypes.

Today, if MPI_REAL is used with an implementation that lacks Fortran support, it breaks at compilation time, because the symbol is missing. As you know, we could not do this and have a standard ABI, so we defer the failure to runtime, but it fails just the same.

There are many ways for users to handle this. I'm not sure what your usage is, but since you know that MPI_REAL is often equivalent to MPI_FLOAT, you can perform that substitution in mpi4py, as long as you have a way to detect when users promote REAL to the equivalent of double.

jeffhammond · 2025-03-15T10:29:51Z

Yes, I forgot about it. MPICH is currently not following these rules.

It looks like Hui did a massive refactoring and this feature got lost. I'm sure it's a simple fix and will be available soon enough. We have until June to get everything sorted out 😄

dalcinl · 2025-03-15T10:35:14Z

It was intentional that the value MPI_UNDEFINED would cause a visible effect in programs, because it must be visible to the user if they are using unsupported datatypes.

If we had an alternative mechanism, rather than returning a negative value, the call could just error. That's much more visible, at least with the default error handler (except for IO). Anyway, my MPI_Type_get_name proposal cannot really be used, because users can already set names to empty strings for any (predefined or user-defined) datatype.

dalcinl · 2025-03-15T10:37:11Z

It looks like Hui did a massive refactoring and this feature got lost.

Actually, I think the feature was not implemented as the standard says. Rather, MPI_Type_size was returning 0 (zero) and not MPI_UNDEFINED.

hzhou · 2025-03-15T17:38:42Z

I can fix MPI_Type_size -- yeah, it is easy to fix. It just requires attention.

Returning 0 makes more sense to me. I suspect returning MPI_UNDEFINED will surprise many users. 0 fails more gracefully than MPI_UNDEFINED.

For a user (who is too lazy to peruse the manual), using MPI_Type_size to check a type is intuitive, and not just for existence. For example, one may want to double-check that an implementation is indeed using a matching type. How many users want to retrieve the name of, say, MPI_REAL? If we want to repurpose an obscure API, I would rather just propose a new one, say, MPI_Type_is_supported(MPI_REAL).

hzhou · 2025-03-15T17:39:49Z

PS: @hzhou Off-topic, have you opened a PR for the new MPI_LOGICAL<N> datatypes?

The commits are in #7264. I need to pick the commits into a new PR -- TODO.

dalcinl · 2025-03-15T20:37:06Z

If we want to repurpose an obscure API, I would rather just propose a new one,

We may not need a new one, just MPI_Abi_get_info.
However, as of now,
a) MPI_Abi_get_info just returns INFO_NULL in the case of the MPICH ABI, and
b) MPI_Abi_get_info does not report support for all Fortran datatypes, only some of them.
IMHO, a) is not correct: MPI_Abi_get_info should always be available, both for the standard ABI and for the MPICH ABI.
About b), I think this is easy to fix by adding an mpi_<type>_supported entry for every Fortran datatype, including MPI_CHARACTER.

PS: IMHO, the only API difference between the standard ABI and the MPICH ABI should be the presence of the MPI_ABI_[SUB]VERSION macros in mpi.h.

hzhou · 2025-03-16T13:05:57Z

If you use ISO_C_BINDING, also use the C MPI types. You have to use C types with ISO_C_BINDING anyway.

dalcinl · 2025-03-16T19:03:52Z

I deleted my comments, there is absolutely no point in continuing discussions about this topic.

FWIW, MPICH is currently not following the standard regarding MPI_Type_size returning MPI_UNDEFINED for unavailable datatypes. For the time being, I've implemented a workaround in mpi4py based on exception handling, so I'm in no hurry for a resolution of this issue. Thanks.
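
A workaround of this kind could look roughly like the sketch below. It is not the actual mpi4py change, whose code is not shown in this thread; the helper name `is_supported` is hypothetical. It assumes that a failing query raises an exception derived from `RuntimeError` (as `mpi4py.MPI.Exception` does) and that the datatype object offers `Get_size()`. Treating a size of 0 as "unsupported" is also an assumption that only makes sense for predefined datatypes.

```python
def is_supported(datatype, undefined):
    """Best-effort check that a predefined datatype is usable.

    `datatype` is assumed to behave like mpi4py.MPI.Datatype and
    `undefined` is the library's MPI_UNDEFINED value.
    """
    try:
        size = datatype.Get_size()
    except RuntimeError:
        # The library reports an error for the unsupported type, which is
        # the behavior described in this thread after PR #7319.
        return False
    # Sentinel results: MPI_UNDEFINED (what the standard asks for) or
    # 0 (what MPICH returned earlier, per this thread).
    return size != undefined and size > 0
```

With mpi4py the call might be `is_supported(MPI.REAL, MPI.UNDEFINED)`. Covering all three outcomes in one place means the caller does not depend on which behavior a given MPICH build has.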

jeffhammond · 2025-03-17T07:15:08Z

The fact is, because my proposal to fix a MPI Fortran ABI was rejected by the Forum, we have a difficult situation any time an MPI implementation is built without Fortran support. However, it's not really more difficult than before. The failures just wait until runtime to appear.

Today, when a user tries to use MPICH without Fortran support, MPI_REAL fails to compile. They don't get a working program, or any program at all. Does the MPI ABI make the situation worse by allowing them to build a program that doesn't work nicely?

I think we are overestimating the impact of no-Fortran MPI builds. The only way these exist is when people like us compile them from source. All of the MPI products and all of the package managers are shipping Fortran support in their MPI libraries. The only potential issue I see is that users might need to link libmpifort_abi.so into their C/C++/Python/Rust/whatever programs, but if they want Fortran MPI_REAL, is this not logical?

Returning 0 makes more sense to me. I suspect returning MPI_UNDEFINED will surprise many users. 0 fails more gracefully than MPI_UNDEFINED.

If a type is not defined, then the logical result of a query of its size is MPI_UNDEFINED, not zero.

For a user (who is too lazy to peruse the manual), using MPI_Type_size to check a type is intuitive, and not just for existence. For example, one may want to double-check that an implementation is indeed using a matching type. How many users want to retrieve the name of, say, MPI_REAL? If we want to repurpose an obscure API, I would rather just propose a new one, say, MPI_Type_is_supported(MPI_REAL).

If we decide to add MPI_Type_is_supported, users have to read the spec to know it exists and how to use it. I don't see how this improves on the existing situation with MPI_Type_size.

hzhou linked a pull request Mar 17, 2025 that will close this issue: datatype: allow MPI_Type_size on unspported types #7344 (Open)

jeffhammond mentioned this issue Mar 18, 2025: need to fix MPI_Type_size return value for unsupported datatypes in the ABI mpi-forum/mpi-issues#977 (Closed)

dalcinl mentioned this issue Mar 20, 2025: ABI: How to handle optional datatypes? #6877 (Closed)
