0: ABORT: Matrix not symmetric when Debug module is on #183

@maouyami

I ran into this error today:

 ```
 [runSCHISM_test]$ mpiexec -n 8 _OLDIO_VL_DEBUG 4
--------------------------------------------------------------------------
By default, for Open MPI 4.0 and later, infiniband ports on a device
are not used by default.  The intent is to use UCX for these devices.
You can override this policy by setting the btl_openib_allow_ib MCA parameter
to true.

  Local host:              eh
  Local adapter:           mlx4_0
  Local port:              1

--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.

  Local host:   eh
  Local device: mlx4_0
--------------------------------------------------------------------------
   2: ABORT:  Matrix not symmetric:          14           2          42  -66360199817.1660       -66360199817.1661   

...

 ``` 

When I turn on the Debug module, this error appears. With the Debug module off, the model gets stuck and keeps running forever.
Is this error related to my grid files?

Just in case, this is my mirror.out file from a run that got stuck.

 ```
Run begins at 20250603, 154725.978
 You are using baroclinic model
 # of tracers in each module:           1           1           0           0
           0           0           0           0           0           0
           0           0
 Total # of tracers=           2
 Index ranges of each module:           1           1           2           2
           3           2           3           2           3           2
           3           2           3           2           3           2
           3           2           3           2           3           2
           3           2
 vclose_surf_frac is:   1.00000000000000     
 done reading param.nml
 done reading vgrid...
 lhas_quad= F
 mnei, mnei_p =            8           9
 lhas_quad= F

Global Grid Size (ne,np,ns,nvrt):       6473      3472      9945         2

**********Augmented Subdomain Sizes**********
 rank     nea      ne     neg     nea2     neg2     npa      np     npg     npa2     npg2     nsa      ns     nsg     nsa2     nsg2
    0     870     793      77     870       0     496     458      38     496       0    1365    1250     115    1365       0
    1     927     771     156     927       0     515     446      69     515       0    1441    1216     225    1441       0
    2     903     831      72     903       0     512     479      33     512       0    1414    1309     105    1414       0
    3     943     831     112     943       0     525     475      50     525       0    1467    1305     162    1467       0
    4     953     831     122     953       0     554     498      56     554       0    1505    1327     178    1505       0
    5     976     827     149     976       0     542     475      67     542       0    1517    1301     216    1517       0
    6     966     827     139     966       0     537     471      66     537       0    1502    1297     205    1502       0
    7     850     762      88     850       0     473     432      41     473       0    1322    1193     129    1322       0
 Max. dot product of 3 axes=  1.1102230E-16
 Max. deviation between ze and zs axes=  2.6777011E-06
 Max. dot prod. between ys and zs axes=  8.0571591E-12

**********Global Boundary Sizes**********
    nope    neta   nland    nvel
       1      28       4     448

**********Augmented Subdomain Boundary Sizes**********
    rank    nope    neta   nland    nvel
       0       0       0       3      87
       1       0       0       1      34
       2       1      28       4      65
       3       0       0       2      58
       4       0       0       3     100
       5       0       0       1      41
       6       0       0       1      42
       7       0       0       1      55
 
 done domain decomp...
 done msg passing table...
 Mass correction flags=           0           0           0
 done reading vgrid ivcor=1...
 Max. & min. sidelength=    50425.2996755058        167.618269747273     
 Max. pframe dev. from radial=   3.3306691E-16
 Max # of points in type II nudging=           0
 done init (1)...
 done init. tracers..
 done initializing cold start
Done initializing outputs
Done initializing time history...
 done computing initial vgrid...
 done computing initial nodal vel...
 done computing initial density...
 time stepping begins...           1         720
 done adjusting wind stress ...
 done flow b.c.
 done MYG-UB...
 done hvis... 
 done backtracking
 done 1st preparation
 done 2nd preparation
 done solver; etatot=   1.28320695781794      ; average |eta|=
  3.695872574360425E-004
 done solving momentum eq...
 done solving w

 ``` 
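
For scale, the two values printed in the `Matrix not symmetric` line differ only in the last printed digit. Below is a minimal Python sketch that puts that difference next to double-precision round-off. It is not SCHISM code: the function name `relative_mismatch` is hypothetical, the tolerance SCHISM actually applies is not shown in this report, and the inputs are the printed values, which are already rounded to four decimals.

```python
import sys

# The two matrix entries copied from the ABORT line above
# (printed values, already rounded to 4 decimals).
a_ij = -66360199817.1660
a_ji = -66360199817.1661


def relative_mismatch(x: float, y: float) -> float:
    """Return |x - y| scaled by the larger magnitude (0.0 if both are zero)."""
    scale = max(abs(x), abs(y))
    return abs(x - y) / scale if scale > 0.0 else 0.0


rel = relative_mismatch(a_ij, a_ji)
eps = sys.float_info.epsilon  # double-precision machine epsilon, about 2.2e-16

print(f"absolute difference: {abs(a_ij - a_ji):.3e}")
print(f"relative difference: {rel:.3e}")
print(f"in units of epsilon: {rel / eps:.1f}")
```

With these inputs the relative difference is on the order of 1e-15, only a few multiples of machine epsilon. So the reported mismatch is tiny compared with the size of the entries themselves, although that alone does not say whether the grid files are involved.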
