Crashes due to MPAS-Ocean state validation do not stop E3SM or standalone MPAS-Ocean runs #7447

@xylar

Description

For a long time, we have been aware that state validation leads to hanging jobs in standalone MPAS-Ocean runs. I have wanted to look into this but haven't found the time. I just saw the same behavior in my E3SM run at:

/lcrc/group/e3sm/ac.xylar/scratch/chrys/20250618.GMPAS-JRA1p5-DIB-PISMF.TL319_SOwISC12to30E3r4.upwind_factor.chrysalis/run

Although the job crashes almost immediately, the allocation is not cancelled:

            764638     debug run.2025 ac.xasay  R    2:48:39      4 chr-[0495-0498] 
            764661     debug run.2025 ac.xasay  R    1:18:41      4 chr-[0490-0493] 

This happened twice in succession, leaving two jobs hanging and blocking scarce resources.
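
Until the underlying problem is fixed, a hung allocation like the two listed above can at least be spotted from the queue listing. The following is a minimal sketch, not part of E3SM or MPAS-Ocean: `parse_elapsed` and `long_running` are hypothetical helper names, and the code assumes the default `squeue` column order (JOBID, PARTITION, NAME, USER, ST, TIME, NODES, NODELIST), which matches the excerpt above.

```python
def parse_elapsed(text):
    """Convert a Slurm elapsed time ([D-]HH:MM:SS or MM:SS) to seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in text.split(":")]
    # Pad missing leading fields (e.g. "MM:SS") with zeros.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def long_running(squeue_text, limit_seconds):
    """Return (job_id, elapsed) for running jobs past limit_seconds.

    Assumes the default squeue column order; header and malformed
    lines are skipped.
    """
    flagged = []
    for line in squeue_text.splitlines():
        fields = line.split()
        if len(fields) < 8 or not fields[0].isdigit():
            continue
        job_id, state, elapsed = fields[0], fields[4], fields[5]
        if state == "R" and parse_elapsed(elapsed) > limit_seconds:
            flagged.append((job_id, elapsed))
    return flagged


# The two lines quoted in this issue, used as sample input.
SAMPLE = """\
            764638     debug run.2025 ac.xasay  R    2:48:39      4 chr-[0495-0498]
            764661     debug run.2025 ac.xasay  R    1:18:41      4 chr-[0490-0493]
"""

if __name__ == "__main__":
    # Flag anything that has been running for more than two hours.
    for job_id, elapsed in long_running(SAMPLE, 2 * 3600):
        print(job_id, elapsed)
```

This only reports candidates based on elapsed time; deciding whether a job has actually crashed still requires checking its log output.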

Metadata

Assignees

No one assigned

    Labels

    MPAS-Ocean standalone (issues and features for standalone MPAS-Ocean code that don't impact E3SM), bug, mpas-ocean
