amo/rma=direct is broken #131

@jeffhammond

Description

I suspect this happens because I configured OSHMPI with --enable-amo=direct and --enable-rma=direct, but the library should not crash in that configuration.

Error

~/SHMEM/bale/src/bale_classic$ $HOME/SHMEM/mpich-ch4-ucx/bin/mpirun -n 1 ./build_unknown/bin/transpose_matrix

***************************************************************
Bale Version 3.00 (OpenShmem version 1.4): 2111-07-27.05:08
Running command on 1 PEs: ./build_unknown/bin/transpose_matrix
***************************************************************

Input Graph/Matrix parameters:
----------------------------------------------------
Graph model: FLAT        (-F).
Undirected, Unweighted, No Loops
Number of rows           (-N): 500000
Avg # nnz per row        (-z): 10.00
Edge probability         (-e): 0.000040

Standard options:
----------------------------------------------------
buf_cnt (buffer size)    (-b): 1024
seed                     (-s): 122222
cores_per_node           (-c): 0
Models Mask              (-M): 15

Input matrix:
----------------------------------------------------
	500000 rows
	500000 columns
	4998270 nonzeros

           AGP:   23.690
       Exstack:    0.468
Abort(202008595) on node 0 (rank 0 in comm 0): Fatal error in internal_Test: Request pending due to failure, error stack:
internal_Test(92): MPI_Test(request=0x7f043e0efbf0, flag=0x7fff42e74f0c, status=0x7fff42e74f10) failed
internal_Test(47): Invalid MPI_Request

Application info

https://github.com/jdevinney/bale

./bootstrap.sh 
python3 ./make_bale -f -s -c CC=$HOME/SHMEM/oshmpi-v2-install/bin/oshcc
$HOME/SHMEM/mpich-ch4-ucx/bin/mpirun -n 1 ./build_unknown/bin/transpose_matrix

OSHMPI info

./configure  --enable-amo=direct  --enable-rma=direct CC=$HOME/SHMEM/mpich-ch4-ucx/bin/mpicc CXX=$HOME/SHMEM/mpich-ch4-ucx/bin/mpicxx --prefix=$HOME/SHMEM/oshmpi-v2-install
commit ba66186a4b968c3d4cdb63027d9e45e23456ab1a (HEAD -> mpi-4-configure-test, origin/mpi-4-configure-test)
Author: Jeff Hammond <>
Date:   Tue Jul 27 04:59:34 2021 -0700

    support MPI_VERSION=4
    
    configure test was MPI_VERSION != 3 when the requirement
    is MPI_VERSION >= 3.
    
    resolves issue #129
    
    Signed-off-by: Jeff Hammond <>
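
For context on the commit above, here is a minimal sketch of the difference between the old and new version checks. The helper names `old_check` and `new_check` are hypothetical and are not taken from OSHMPI's configure script; the sketch only assumes the behavior stated in the commit message (the old test rejected anything other than MPI_VERSION 3, the new one accepts 3 or higher).

```shell
#!/bin/sh
# Hypothetical helpers illustrating the commit message; this is NOT
# OSHMPI's actual configure code.

# Old behavior as described: only MPI_VERSION == 3 is accepted.
old_check() {
    [ "$1" -eq 3 ]
}

# Fixed behavior as described: any MPI_VERSION >= 3 is accepted.
new_check() {
    [ "$1" -ge 3 ]
}

# Compare both checks for a few sample major versions.
for v in 2 3 4; do
    if old_check "$v"; then old=accept; else old=reject; fi
    if new_check "$v"; then new=accept; else new=reject; fi
    echo "MPI_VERSION=$v old=$old new=$new"
done
```

With the MPICH 4.0a2 build listed under "MPI info", the major version is 4, which is the case where the two checks disagree.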

MPI info

~/SHMEM/bale/src/bale_classic$ $HOME/SHMEM/mpich-ch4-ucx/bin/mpichversion 
MPICH Version:    	4.0a2
MPICH Release date:	unreleased development copy
MPICH Device:    	ch4:ucx
MPICH configure: 	--prefix=$HOME/SHMEM/mpich-ch4-ucx CC=gcc --without-fortran --with-device=ch4:ucx
MPICH CC: 	gcc    -O2
MPICH CXX: 	g++   -O2
MPICH F77: 	gfortran   -O2
MPICH FC: 	gfortran   -O2
MPICH Custom Information: 
