bartgol
Contributor

@bartgol bartgol commented Oct 23, 2024

This PR prepares for running actions on an SNL-hosted CPU-only machine (a container that runs on mappy).

I'm going to let the AT run on this, since it also removes one leftover instance of kokkos 4 deprecated code (so it's not just adding a new machine). I am planning to have our CI disable deprecated code by default (I'd argue we should do it everywhere, but that's another story), so we'll need the fix anyways.

Note: all the paths in the machine config entry are relative to the container. That said, one can always run any container with local volumes mounted at those paths, so the entry is not, strictly speaking, tied to this container image.

I will share the container recipe somewhere else (only the part I implemented, since SNL adds some secret sauce on top... long story, ping me if you're interested).

After this PR is merged, I will

  • enable the AT2 "listener" script on mappy (where the ghci-snl containers are run)
  • issue another PR to add the eamxx standalone testing actions needed by AT2

* Add entry in config_machines.xml
* Renamed to ghci-snl-cpu
* Fix baselines location
@bartgol
Contributor Author

bartgol commented Oct 23, 2024

Note: these settings seem to work well with AT2, and you can take a look at the current Actions runs on the SNL-hosted container here. The testing correctly handles baselines for both standalone and v1, where baselines are stored on the physical machine (mappy) and mounted during testing (same for the input data folder). The baseline locations are different from the ones used for actual mappy runs, so that we do not interfere with current AT1 runs.
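To illustrate the mounting described above, here is a minimal sketch that assembles a container launch command with bind mounts for the baselines and input data folders. It assumes a Docker-compatible runtime, which this thread does not confirm. The image name, the host paths, and the container paths are all hypothetical placeholders, not the values used on mappy.

```python
import shlex

def container_run_command(image, mounts, command, runtime="docker"):
    """Build a `<runtime> run` argument list with bind mounts.

    `mounts` maps host directories to the container paths that the
    machine entry expects. All concrete paths are the caller's choice.
    """
    argv = [runtime, "run", "--rm"]
    for host_dir, container_dir in mounts.items():
        # -v HOST:CONTAINER bind-mounts a host directory into the container.
        argv += ["-v", f"{host_dir}:{container_dir}"]
    argv.append(image)
    argv += list(command)
    return argv

# Hypothetical image name and paths, for illustration only.
example = container_run_command(
    image="example/ghci-snl-cpu:latest",
    mounts={
        "/host/path/to/baselines": "/container/path/to/baselines",
        "/host/path/to/inputdata": "/container/path/to/inputdata",
    },
    command=["bash"],
)
print(shlex.join(example))
```

Because the machine entry only refers to container-side paths, pointing the host side of each mount at a different directory is enough to keep these baselines separate from the ones used by other runs.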

jgfouca
jgfouca previously approved these changes Oct 24, 2024
@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6229
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS Machine File;AT: AUTOMERGE
PULLREQUESTNUM 3060
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 5ac2f94
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 69c16bb
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5969
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS Machine File;AT: AUTOMERGE
PULLREQUESTNUM 3060
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 5ac2f94
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 69c16bb
TEST_REPO_ALIAS SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: bartgol/ghci-machines-fixes
  • SHA: 5ac2f94
  • Mode: TEST_REPO

Pull Request Author: bartgol

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED (click to expand)

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6229
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS Machine File;AT: AUTOMERGE
PULLREQUESTNUM 3060
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 5ac2f94
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 69c16bb
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5969
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS Machine File;AT: AUTOMERGE
PULLREQUESTNUM 3060
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 5ac2f94
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 69c16bb
TEST_REPO_ALIAS SCREAM

@E3SM-Bot
Collaborator

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
THE LAST COMMIT TO THIS PULL REQUEST HAS NOT BEEN REVIEWED YET!

@E3SM-Bot
Collaborator

All Jobs Finished; status = PASSED, target_sha=26b2b1eae91247d02700e1d3edad459be7c4285e, However Inspection must be performed before merge can occur...

@bartgol bartgol requested a review from jgfouca October 25, 2024 16:01
@E3SM-Bot
Collaborator

The base branch has been updated since the last successful testing.

  • last PASS base branch sha: 26b2b1e
  • current base branch sha : 94ca257
    The AutoTester will discard the last PASS, and re-test the PR from scratch
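The rule the bot applies here can be sketched as a simple comparison of base-branch SHAs. This is only an illustration of the idea, not the AutoTester's actual code: the function name and the prefix-based handling of abbreviated SHAs are assumptions.

```python
def needs_retest(last_pass_base_sha, current_base_sha):
    """Return True when a previous PASS can no longer be trusted.

    A PASS is only meaningful for the base-branch commit it was tested
    against, so any change of the base SHA invalidates it.
    """
    if last_pass_base_sha is None:
        return True  # never passed, so testing is still required
    # Abbreviated and full SHAs are compared on their common prefix
    # (an assumption about how the two forms would be reconciled).
    n = min(len(last_pass_base_sha), len(current_base_sha))
    return last_pass_base_sha[:n].lower() != current_base_sha[:n].lower()

# SHAs taken from the bot messages in this thread.
print(needs_retest("26b2b1e", "94ca257"))
print(needs_retest("26b2b1e", "26b2b1eae91247d02700e1d3edad459be7c4285e"))
```

The first call reports that a retest is needed (the base branch moved), while the second does not, since the short SHA is a prefix of the full one.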

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6232
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS Machine File;AT: AUTOMERGE
PULLREQUESTNUM 3060
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 5ac2f94
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 69c16bb
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5972
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS Machine File;AT: AUTOMERGE
PULLREQUESTNUM 3060
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 5ac2f94
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 69c16bb
TEST_REPO_ALIAS SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: bartgol/ghci-machines-fixes
  • SHA: 5ac2f94
  • Mode: TEST_REPO

Pull Request Author: bartgol

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED

Note: Testing will normally be attempted again in approx. 2 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.

Pull Request Auto Testing has FAILED (click to expand)

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6232
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS Machine File;AT: AUTOMERGE
PULLREQUESTNUM 3060
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 5ac2f94
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 69c16bb
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5972
  • Status: FAILED

Jenkins Parameters

Parameter Name Value
PR_LABELS Machine File;AT: AUTOMERGE
PULLREQUESTNUM 3060
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 5ac2f94
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 69c16bb
TEST_REPO_ALIAS SCREAM
SCREAM_PullRequest_Autotester_Weaver # 6232 PASSED (click to see last 100 lines of console output)

142/157 Test #142: model_initial .........................................................   Passed    5.91 sec
        Start 143: model_restart
143/157 Test #143: model_restart .........................................................   Passed    7.35 sec
        Start 144: restarted_vs_monolithic_check_np1
144/157 Test #144: restarted_vs_monolithic_check_np1 .....................................   Passed    0.11 sec
        Start 145: homme_shoc_cld_spa_p3_rrtmgp_np1
145/157 Test #145: homme_shoc_cld_spa_p3_rrtmgp_np1 ......................................   Passed    6.39 sec
        Start 146: homme_shoc_cld_spa_p3_rrtmgp_baseline_cmp
146/157 Test #146: homme_shoc_cld_spa_p3_rrtmgp_baseline_cmp .............................   Passed    0.16 sec
        Start 147: homme_shoc_cld_spa_p3_rrtmgp_128levels_np1
147/157 Test #147: homme_shoc_cld_spa_p3_rrtmgp_128levels_np1 ............................   Passed    8.89 sec
        Start 148: homme_shoc_cld_spa_p3_rrtmgp_128levels_tend_check_np1
148/157 Test #148: homme_shoc_cld_spa_p3_rrtmgp_128levels_tend_check_np1 .................   Passed    1.49 sec
        Start 149: homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp
149/157 Test #149: homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp ...................   Passed    0.65 sec
        Start 150: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_np1
150/157 Test #150: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_np1 ...............................   Passed   13.32 sec
        Start 151: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_baseline_cmp
151/157 Test #151: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_baseline_cmp ......................   Passed    0.10 sec
        Start 152: homme_shoc_cld_p3_mam_optics_rrtmgp_np1
152/157 Test #152: homme_shoc_cld_p3_mam_optics_rrtmgp_np1 ...............................   Passed   17.81 sec
        Start 153: homme_shoc_cld_p3_mam_optics_rrtmgp_baseline_cmp
153/157 Test #153: homme_shoc_cld_p3_mam_optics_rrtmgp_baseline_cmp ......................   Passed    0.24 sec
        Start 154: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_np1
154/157 Test #154: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_np1 ............   Passed   18.69 sec
        Start 155: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_baseline_cmp
155/157 Test #155: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_baseline_cmp ...   Passed    0.20 sec
        Start 156: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_np1
156/157 Test #156: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_np1 .........................   Passed   38.72 sec
        Start 157: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_baseline_cmp
157/157 Test #157: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_baseline_cmp ................   Passed    0.16 sec

100% tests passed, 0 tests failed out of 157

Label Time Summary:
baseline_cmp = 138.95 sec*proc (23 tests)
baseline_gen = 321.85 sec*proc (25 tests)
bfbhash = 0.90 sec*proc (1 test)
check = 1.01 sec*proc (1 test)
cld = 45.74 sec*proc (7 tests)
cld_fraction = 1.21 sec*proc (1 test)
cxx baseline_cmp = 6.33 sec*proc (2 tests)
diagnostics = 42.78 sec*proc (23 tests)
driver = 84.43 sec*proc (16 tests)
dynamics = 8.26 sec*proc (3 tests)
fail = 30.69 sec*proc (5 tests)
io = 59.94 sec*proc (14 tests)
mam4_aci = 31.00 sec*proc (4 tests)
mam4_constituent_fluxes = 4.67 sec*proc (1 test)
mam4_drydep = 3.84 sec*proc (1 test)
mam4_optics = 4.80 sec*proc (1 test)
mam4_srf_online_emiss = 4.67 sec*proc (1 test)
mam4_wetscav = 17.54 sec*proc (2 tests)
nudging = 9.97 sec*proc (2 tests)
p3 = 107.09 sec*proc (12 tests)
p3_sk = 47.75 sec*proc (2 tests)
physics = 191.34 sec*proc (27 tests)
remap = 6.12 sec*proc (1 test)
rrtmgp = 45.96 sec*proc (11 tests)
shoc = 58.03 sec*proc (13 tests)
spa = 9.63 sec*proc (4 tests)
surface_coupling = 5.26 sec*proc (1 test)

Total Test time (real) = 794.17 sec

Testing '''24be8ebfc9ee38a22e55529b022e736097666e5c''' for test '''full_sp_debug'''

RUN: taskset -c 52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx/ctest-build/full_sp_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx/ctest-build/full_sp_debug -DBUILD_NAME_MOD=full_sp_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DSCREAM_DOUBLE_PRECISION=False -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_sp_debug" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx/ctest-build/full_sp_debug

Testing '''24be8ebfc9ee38a22e55529b022e736097666e5c''' for test '''release'''

RUN: taskset -c 104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx/ctest-build/release/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx/ctest-build/release -DBUILD_NAME_MOD=release -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Release -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/release" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx/ctest-build/release

Testing '''24be8ebfc9ee38a22e55529b022e736097666e5c''' for test '''full_debug'''

RUN: taskset -c 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx/ctest-build/full_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx/ctest-build/full_debug -DBUILD_NAME_MOD=full_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DKokkos_ENABLE_DEBUG_BOUNDS_CHECK=True -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_debug" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx/ctest-build/full_debug
OVERALL STATUS: PASS
Starting analysis on weaver with cmd: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
RUN: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6232/scream/components/eamxx
Completed analysis on weaver'

+ [[ 0 != 0 ]]
+ [[ 1 == 0 ]]
+ [[ weaver == \m\a\p\p\y ]]
+ set +x
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh
[SCREAM_PullRequest_Autotester_Weaver] $ /bin/bash -le /tmp/jenkins14262669893804130874.sh
POST BUILD TASK : SUCCESS
END OF POST BUILD TASK : 0
Finished: SUCCESS
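The three `taskset -c` invocations in the Weaver log above pin each build type to its own block of 52 cores (0-51, 52-103, 104-155). A minimal sketch of that kind of partitioning follows. It is not the logic from `test-all-scream`: the helper names are hypothetical, and the contiguous-block layout is inferred only from the core lists in the log.

```python
def partition_cores(test_names, cores_per_test, first_core=0):
    """Assign each test a contiguous, non-overlapping block of core IDs."""
    assignment = {}
    start = first_core
    for name in test_names:
        assignment[name] = list(range(start, start + cores_per_test))
        start += cores_per_test
    return assignment

def taskset_prefix(core_ids):
    """Format core IDs as the leading arguments of a `taskset -c` call."""
    return ["taskset", "-c", ",".join(str(c) for c in core_ids)]

# Build types and block size as they appear in the log above.
blocks = partition_cores(["full_debug", "full_sp_debug", "release"], 52)
for name, cores in blocks.items():
    print(name, cores[0], cores[-1])
```

Giving each build type a disjoint core set lets the three builds run side by side without competing for the same cores, which is consistent with `SCREAM_BUILD_PARALLEL_LEVEL=52` in each command.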

SCREAM_PullRequest_Autotester_Mappy # 5972 FAILED (click to see last 100 lines of console output)

Creating test directory /home/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5.C.20241025_110149_0badlh
Creating test directory /home/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5.C.20241025_110149_0badlh
Creating test directory /home/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci.C.20241025_110149_0badlh
Creating test directory /home/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics.C.20241025_110149_0badlh
Creating test directory /home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble.C.20241025_110149_0badlh
Creating test directory /home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97.C.20241025_110149_0badlh
Creating test directory /home/e3sm-jenkins/acme/scratch/ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4.C.20241025_110149_0badlh
Creating test directory /home/e3sm-jenkins/acme/scratch/ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2.C.20241025_110149_0badlh
Creating test directory /home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01.C.20241025_110149_0badlh
Creating test directory /home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu.C.20241025_110149_0badlh
Creating test directory /home/e3sm-jenkins/acme/scratch/PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1.C.20241025_110149_0badlh
RUNNING TESTS:
  ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5
  SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3
  SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-drydep
  ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5
  ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1
  SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-wetscav
  ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5
  ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5
  SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci
  SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics
  ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble
  ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97
  ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4
  ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2
  ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01
  ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu
  PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1
Starting CREATE_NEWCASE for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 with 1 procs
Starting CREATE_NEWCASE for test SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 with 1 procs
Starting CREATE_NEWCASE for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-drydep with 1 procs
Starting CREATE_NEWCASE for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5 with 1 procs
Starting CREATE_NEWCASE for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 1 procs
Starting CREATE_NEWCASE for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-wetscav with 1 procs
Starting CREATE_NEWCASE for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5 with 1 procs
Starting CREATE_NEWCASE for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5 with 1 procs
Starting CREATE_NEWCASE for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci with 1 procs
Starting CREATE_NEWCASE for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics with 1 procs
Starting CREATE_NEWCASE for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble with 1 procs
Starting CREATE_NEWCASE for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97 with 1 procs
Starting CREATE_NEWCASE for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 with 1 procs
Starting CREATE_NEWCASE for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 with 1 procs
Starting CREATE_NEWCASE for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01 with 1 procs
Starting CREATE_NEWCASE for test ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu with 1 procs
Starting CREATE_NEWCASE for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 1 procs
Finished CREATE_NEWCASE for test ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu in 0.928038 seconds (PASS)
Starting XML for test ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu with 1 procs
Finished XML for test ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu in 0.341219 seconds (PASS)
Starting SETUP for test ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu with 1 procs
Finished SETUP for test ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu in 0.245660 seconds (FAIL). [COMPLETED 1 of 17]
    Case dir: /home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu.C.20241025_110149_0badlh
    Errors were:
        Traceback (most recent call last):
          File "/home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu.C.20241025_110149_0badlh/./case.setup", line 106, in <module>
            _main_func(__doc__)
          File "/home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu.C.20241025_110149_0badlh/./case.setup", line 102, in _main_func
            case.case_setup(clean=clean, test_mode=test_mode, reset=reset, keep=keep)
          File "/home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5972/scream/cime/CIME/case/case_setup.py", line 501, in case_setup
            run_and_log_case_status(
          File "/home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5972/scream/cime/CIME/status.py", line 79, in run_and_log_case_status
            append_case_status(
          File "/home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5972/scream/cime/CIME/status.py", line 34, in append_case_status
            append_status(
          File "/home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5972/scream/cime/CIME/status.py", line 17, in append_status
            with open(os.path.join(caseroot, sfile), "a") as fd:
        OSError: [Errno 28] No space left on device: '\''/home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu.C.20241025_110149_0badlh/CaseStatus'\'''
++ grep -m1 -A 100000 'Waiting for tests to finish'
+ errors=
+ V1_FAILURES_DETAILS+=
+ set +x
######################################################
FAILS DETECTED:
  SCREAM V1 TESTING FAILED!

######################################################
Build step 'Execute shell' marked build as failure
$ ssh-agent -k
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 2320726 killed;
[ssh-agent] Stopped.
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh

We're having issues with some test-launcher job hanging forever. So let's make sure we clean all penting test-launcher jobs

squeue -o"%.7i %u %40j" | grep e3sm-jenkins | grep test-launcher | awk '{ print $1 }' | xargs -r scancel

[SCREAM_PullRequest_Autotester_Mappy] $ /bin/bash -le /tmp/jenkins15102690354861594643.sh
./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh: line 6: /ascldap/users/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5972/JENKINS_CLEAN_2024-10-25_110204: No space left on device
POST BUILD TASK : FAILURE
END OF POST BUILD TASK : 0
Sending e-mails to: lbertag@sandia.gov
Finished: FAILURE
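The Mappy failure above is an `OSError: [Errno 28] No space left on device` raised during case setup, not a test regression. One way to surface this earlier is a free-space check before any test directories are created. The sketch below uses the standard-library `shutil.disk_usage`; the function names and the threshold are hypothetical, and nothing here is part of the Jenkins scripts or CIME.

```python
import shutil

def has_free_space(path, min_free_bytes):
    """Return True if the filesystem holding `path` has enough free space."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free >= min_free_bytes

def require_free_space(path, min_free_gib):
    """Raise early, with a clear message, instead of failing mid-test."""
    needed = int(min_free_gib * 1024**3)
    if not has_free_space(path, needed):
        free_gib = shutil.disk_usage(path).free / 1024**3
        raise RuntimeError(
            f"{path}: only {free_gib:.1f} GiB free, need {min_free_gib} GiB"
        )

# Hypothetical threshold; "." stands in for the real scratch directory.
require_free_space(".", min_free_gib=0)
```

A check like this would turn a full scratch disk into a single explicit message at the start of the job, rather than a traceback from whichever file write happens to fail first.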

@bartgol
Contributor Author

bartgol commented Oct 25, 2024

Mappy is being slammed with testing lately. Things looked good, and the setting I changed in the last commit should not affect PASS/FAIL (it's for another machine). Merging.

@bartgol bartgol merged commit e0746bf into master Oct 25, 2024
3 of 5 checks passed
@bartgol bartgol deleted the bartgol/ghci-machines-fixes branch October 25, 2024 17:17