Member

@jgfouca jgfouca commented Oct 3, 2024

Switched the rrtmgp interface in eamxx to make maximal use of the pool allocator I wrote for rrtmgp standalone.

I just got a very nice performance result for this branch on pm-gpu:

PASS SMS_Ln362.ne30pg2_ne30pg2.F2010-SCREAMv1.pm-gpu_gnugpu RUN time=157   # YAKL
PASS SMS_Ln362.ne30pg2_ne30pg2.F2010-SCREAMv1.pm-gpu_gnugpu RUN time=163   # Kokkos (before this PR)
PASS SMS_Ln362.ne30pg2_ne30pg2.F2010-SCREAMv1.pm-gpu_gnugpu RUN time=143   # Kokkos (with this PR)
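
To put the three run times above in relative terms, here is a small arithmetic sketch. The function names are hypothetical and not part of the PR. Each number comes from a single run, so small differences may be within run-to-run noise.

```cpp
#include <cstdio>

// Percent reduction in run time going from `before` to `after`.
double percent_reduction(double before, double after) {
  return 100.0 * (before - after) / before;
}

// Prints the two comparisons for the run times reported above.
void print_run_time_comparison() {
  const double yakl = 157.0;           // YAKL
  const double kokkos_before = 163.0;  // Kokkos (before this PR)
  const double kokkos_after = 143.0;   // Kokkos (with this PR)
  std::printf("vs. Kokkos before this PR: %.1f%% less run time\n",
              percent_reduction(kokkos_before, kokkos_after));
  std::printf("vs. YAKL: %.1f%% less run time\n",
              percent_reduction(yakl, kokkos_after));
}
```

Calling `print_run_time_comparison()` reports roughly a 12% reduction relative to Kokkos before this PR and roughly 9% relative to YAKL.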

Contributor

@bartgol bartgol left a comment

You decide how much to do in a single PR, Jim, but I think there are other unnecessary allocations we could remove from the run phase (i.e., all the pool allocs), as well as a pointless setup of the Gauss quadrature data at every time step.

Never mind, pool::alloc_raw is not actually allocating; it's just grabbing memory from the pool. But I think the Gauss quadrature comment still stands.
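
The difference between allocating and grabbing from a pool can be illustrated with a minimal bump-style pool. This is only a sketch under stated assumptions: `BumpPool`, its `alloc_raw` signature, and `reset` are hypothetical and are not the pool allocator used in the rrtmgp interface.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical bump-style pool: one up-front allocation, then pointer arithmetic.
class BumpPool {
public:
  explicit BumpPool(std::size_t num_reals) : buffer_(num_reals), offset_(0) {}

  // Hand out `n` doubles from the pre-allocated buffer. No heap or device
  // allocation happens here. Returns nullptr if the pool is exhausted.
  double* alloc_raw(std::size_t n) {
    if (n > buffer_.size() - offset_) return nullptr;
    double* p = buffer_.data() + offset_;
    offset_ += n;
    return p;
  }

  // Make the whole pool available again (e.g. at the end of a time step).
  void reset() { offset_ = 0; }

  std::size_t used() const { return offset_; }

private:
  std::vector<double> buffer_;  // allocated once, in the constructor
  std::size_t offset_;          // index of the next free slot
};
```

With a pool like this, the only real allocation is in the constructor; every later `alloc_raw` call is a bounds check and an offset update, which is why such calls are cheap in the run phase.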

{0., 0., 0., 0.0311809710}
};

hview_t<RealT**> gauss_wts_host(&gauss_wts_host_raw[0][0],max_gauss_pts,max_gauss_pts);

Contributor

@bartgol bartgol Oct 3, 2024

Are we copying the same gauss quadrature info to device over and over at every time step? Can we do this at init, and then pass around the pre-filled views at runtime? It seems silly to set them up every time...

Member Author

We could do this at init but I don't think this is very expensive at all.

Contributor

I agree that it's not very expensive, but neither is a single device allocation (in the grand scheme of things). It's just something we can remove, and these small optimizations all add up. But yeah, no need to do it here.

Member Author

@bartgol, I don't think there's any allocation happening here. The C arrays will be on the stack and the views just point to that memory.
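
The point about views that merely point at stack memory can be sketched without Kokkos. `HostView2D` and `wrap` below are hypothetical stand-ins for an unmanaged host view and are not the `hview_t` type from the code under review; row-major layout is assumed.

```cpp
#include <cstddef>

// Hypothetical non-owning 2D view: stores a pointer and extents, never allocates.
struct HostView2D {
  const double* data;
  std::size_t rows, cols;

  // Row-major element access.
  double operator()(std::size_t i, std::size_t j) const {
    return data[i * cols + j];
  }
};

// Wrap a caller-provided (e.g. stack) C array without copying it.
template <std::size_t R, std::size_t C>
HostView2D wrap(const double (&raw)[R][C]) {
  return HostView2D{&raw[0][0], R, C};
}
```

Constructing such a view costs only a pointer and two integers. The trade-off is lifetime: the view is valid only while the wrapped array is still in scope.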

Contributor

@bartgol bartgol Oct 4, 2024

There's no alloc, correct, but there are two deep_copy calls, which involve a small kernel launch. Nothing big, but also pointless.
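
The suggestion in this thread (fill the quadrature data once at init, then reuse it at runtime) can be sketched as follows. All names here (`GaussData`, `Radiation`, `copy_from_host`) are hypothetical, the copy is only simulated with a counter, and the weights used in the usage note are placeholders rather than real quadrature values.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for device-resident quadrature data.
struct GaussData {
  std::array<double, 4> weights{};
  int copies = 0;  // counts simulated host-to-device copies

  // Simulated deep_copy; in real code this would be a host-to-device transfer.
  void copy_from_host(const std::array<double, 4>& host) {
    weights = host;
    ++copies;
  }
};

class Radiation {
public:
  // Fill the constant table once, at initialization.
  void init(const std::array<double, 4>& host_weights) {
    gauss_.copy_from_host(host_weights);
  }

  // The run phase only reads the pre-filled data; no copy happens here.
  double run_step() const {
    double sum = 0.0;
    for (double w : gauss_.weights) sum += w;
    return sum;
  }

  int copies() const { return gauss_.copies; }

private:
  GaussData gauss_;
};
```

After `init({0.25, 0.25, 0.25, 0.25})`, any number of `run_step()` calls leaves `copies()` at 1, which is the saving being discussed: small per step, but paid on every step otherwise.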

Collaborator

E3SM-Bot commented Oct 3, 2024

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6104
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 3028
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3cf1406
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA a269ef9
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5874
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 3028
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3cf1406
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA a269ef9
TEST_REPO_ALIAS SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: jgfouca/reduce_rrtmgp_interf_allocs
  • SHA: 3cf1406
  • Mode: TEST_REPO

Pull Request Author: jgfouca

Collaborator

E3SM-Bot commented Oct 3, 2024

Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED

Note: Testing will normally be attempted again in approx. 2 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.

Pull Request Auto Testing has FAILED (click to expand)

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6104
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 3028
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3cf1406
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA a269ef9
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5874
  • Status: FAILED

Jenkins Parameters

Parameter Name Value
PR_LABELS AT: AUTOMERGE
PULLREQUESTNUM 3028
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3cf1406
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA a269ef9
TEST_REPO_ALIAS SCREAM
SCREAM_PullRequest_Autotester_Weaver # 6104 PASSED (click to see last 100 lines of console output)

        Start 143: model_restart
143/157 Test #143: model_restart .........................................................   Passed    6.74 sec
        Start 144: restarted_vs_monolithic_check_np1
144/157 Test #144: restarted_vs_monolithic_check_np1 .....................................   Passed    0.10 sec
        Start 145: homme_shoc_cld_spa_p3_rrtmgp_np1
145/157 Test #145: homme_shoc_cld_spa_p3_rrtmgp_np1 ......................................   Passed    5.92 sec
        Start 146: homme_shoc_cld_spa_p3_rrtmgp_baseline_cmp
146/157 Test #146: homme_shoc_cld_spa_p3_rrtmgp_baseline_cmp .............................   Passed    0.11 sec
        Start 147: homme_shoc_cld_spa_p3_rrtmgp_128levels_np1
147/157 Test #147: homme_shoc_cld_spa_p3_rrtmgp_128levels_np1 ............................   Passed    8.54 sec
        Start 148: homme_shoc_cld_spa_p3_rrtmgp_128levels_tend_check_np1
148/157 Test #148: homme_shoc_cld_spa_p3_rrtmgp_128levels_tend_check_np1 .................   Passed    1.29 sec
        Start 149: homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp
149/157 Test #149: homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp ...................   Passed    0.61 sec
        Start 150: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_np1
150/157 Test #150: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_np1 ...............................   Passed   18.34 sec
        Start 151: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_baseline_cmp
151/157 Test #151: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_baseline_cmp ......................   Passed    0.14 sec
        Start 152: homme_shoc_cld_p3_mam_optics_rrtmgp_np1
152/157 Test #152: homme_shoc_cld_p3_mam_optics_rrtmgp_np1 ...............................   Passed   17.42 sec
        Start 153: homme_shoc_cld_p3_mam_optics_rrtmgp_baseline_cmp
153/157 Test #153: homme_shoc_cld_p3_mam_optics_rrtmgp_baseline_cmp ......................   Passed    0.17 sec
        Start 154: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_np1
154/157 Test #154: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_np1 ............   Passed   11.72 sec
        Start 155: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_baseline_cmp
155/157 Test #155: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_baseline_cmp ...   Passed    0.14 sec
        Start 156: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_np1
156/157 Test #156: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_np1 .........................   Passed   31.76 sec
        Start 157: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_baseline_cmp
157/157 Test #157: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_baseline_cmp ................   Passed    0.18 sec

100% tests passed, 0 tests failed out of 157

Label Time Summary:
baseline_cmp = 138.13 sec*proc (23 tests)
baseline_gen = 317.77 sec*proc (25 tests)
bfbhash = 0.93 sec*proc (1 test)
check = 0.96 sec*proc (1 test)
cld = 34.22 sec*proc (7 tests)
cld_fraction = 1.17 sec*proc (1 test)
cxx baseline_cmp = 8.19 sec*proc (2 tests)
diagnostics = 43.50 sec*proc (23 tests)
driver = 69.48 sec*proc (16 tests)
dynamics = 8.03 sec*proc (3 tests)
fail = 30.45 sec*proc (5 tests)
io = 50.30 sec*proc (14 tests)
mam4_aci = 20.52 sec*proc (4 tests)
mam4_constituent_fluxes = 4.05 sec*proc (1 test)
mam4_drydep = 3.48 sec*proc (1 test)
mam4_optics = 4.15 sec*proc (1 test)
mam4_srf_online_emiss = 4.05 sec*proc (1 test)
mam4_wetscav = 16.91 sec*proc (2 tests)
nudging = 8.76 sec*proc (2 tests)
p3 = 89.59 sec*proc (12 tests)
p3_sk = 32.34 sec*proc (2 tests)
physics = 155.57 sec*proc (27 tests)
remap = 2.85 sec*proc (1 test)
rrtmgp = 34.37 sec*proc (11 tests)
shoc = 45.60 sec*proc (13 tests)
spa = 7.73 sec*proc (4 tests)
surface_coupling = 4.70 sec*proc (1 test)

Total Test time (real) = 745.76 sec

Testing '''3cf1406f74f2875b316cf3d06bc2622b196800a9''' for test '''full_sp_debug'''

RUN: taskset -c 52-103 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx/ctest-build/full_sp_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx/ctest-build/full_sp_debug -DBUILD_NAME_MOD=full_sp_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DSCREAM_DOUBLE_PRECISION=False -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_sp_debug" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx/ctest-build/full_sp_debug

Testing '''3cf1406f74f2875b316cf3d06bc2622b196800a9''' for test '''full_debug'''

RUN: taskset -c 0-51 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx/ctest-build/full_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx/ctest-build/full_debug -DBUILD_NAME_MOD=full_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DKokkos_ENABLE_DEBUG_BOUNDS_CHECK=True -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_debug" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx/ctest-build/full_debug

Testing '''3cf1406f74f2875b316cf3d06bc2622b196800a9''' for test '''release'''

RUN: taskset -c 104-155 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx/ctest-build/release/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx/ctest-build/release -DBUILD_NAME_MOD=release -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Release -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/release" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx/ctest-build/release
OVERALL STATUS: PASS
Starting analysis on weaver with cmd: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
RUN: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6104/scream/components/eamxx
Completed analysis on weaver'

  • [[ 0 != 0 ]]
  • [[ 1 == 0 ]]
  • [[ weaver == \m\a\p\p\y ]]
  • set +x
    Performing Post build task...
    Match found for : : True
    Logical operation result is TRUE
    Running script : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh
[SCREAM_PullRequest_Autotester_Weaver] $ /bin/bash -le /tmp/jenkins7354508283837767135.sh
POST BUILD TASK : SUCCESS
END OF POST BUILD TASK : 0
Sending e-mails to: lbertag@sandia.gov
Finished: SUCCESS

SCREAM_PullRequest_Autotester_Mappy # 5874 FAILED (click to see last 100 lines of console output)

        ERROR: BUILD FAIL: cmake config e3sm failed, cat /home/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-drydep.C.20241003_163420_6mfdkf/bld/e3sm.bldlog.241003-163754

Starting MODEL_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics with 16 procs
Finished SHAREDLIB_BUILD for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5 in 307.468479 seconds (PASS)
Finished MODEL_BUILD for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97 in 140.978448 seconds (FAIL). [COMPLETED 4 of 17]
Case dir: /home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97.C.20241003_163420_6mfdkf
Errors were:
Building test for ERS in directory /home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97.C.20241003_163420_6mfdkf
ERROR: BUILD FAIL: build e3sm failed, cat /home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97.C.20241003_163420_6mfdkf/bld/e3sm.bldlog.241003-163738

Starting MODEL_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci with 16 procs
Finished MODEL_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci in 11.422833 seconds (FAIL). [COMPLETED 5 of 17]
Case dir: /home/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci.C.20241003_163420_6mfdkf
Errors were:
Building test for SMS in directory /home/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci.C.20241003_163420_6mfdkf
ERROR: BUILD FAIL: cmake config e3sm failed, cat /home/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci.C.20241003_163420_6mfdkf/bld/e3sm.bldlog.241003-163959

Finished SHAREDLIB_BUILD for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 in 333.347719 seconds (FAIL). [COMPLETED 6 of 17]
Case dir: /home/e3sm-jenkins/acme/scratch/ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4.C.20241003_163420_6mfdkf
Errors were:
Building test for ERP in directory /home/e3sm-jenkins/acme/scratch/ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4.C.20241003_163420_6mfdkf
ERROR: /home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5874/scream/share/build/buildlib.mct FAILED, cat /home/e3sm-jenkins/acme/scratch/ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4.C.20241003_163420_6mfdkf/bld/case2bld/mct.bldlog.241003-163853

Finished SHAREDLIB_BUILD for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 329.945750 seconds (FAIL). [COMPLETED 7 of 17]
Case dir: /home/e3sm-jenkins/acme/scratch/ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1.C.20241003_163420_6mfdkf
Errors were:
Building test for ERP in directory /home/e3sm-jenkins/acme/scratch/ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1.C.20241003_163420_6mfdkf
ERROR: /home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5874/scream/share/build/buildlib.mct FAILED, cat /home/e3sm-jenkins/acme/scratch/ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1.C.20241003_163420_6mfdkf/bld/case2bld/mct.bldlog.241003-163908

Starting MODEL_BUILD for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 with 16 procs
Starting MODEL_BUILD for test SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 with 16 procs
Finished MODEL_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics in 27.029606 seconds (FAIL). [COMPLETED 8 of 17]
Case dir: /home/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics.C.20241003_163420_6mfdkf
Errors were:
Building test for SMS in directory /home/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics.C.20241003_163420_6mfdkf
ERROR: BUILD FAIL: build e3sm failed, cat /home/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics.C.20241003_163420_6mfdkf/bld/e3sm.bldlog.241003-163945

Starting MODEL_BUILD for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5 with 16 procs
Finished MODEL_BUILD for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5 in 43.615745 seconds (FAIL). [COMPLETED 9 of 17]
Case dir: /home/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5.C.20241003_163420_6mfdkf
Errors were:
Building test for ERS in directory /home/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5.C.20241003_163420_6mfdkf
ERROR: BUILD FAIL: build e3sm failed, cat /home/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5.C.20241003_163420_6mfdkf/bld/e3sm.bldlog.241003-163944

Starting MODEL_BUILD for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5 with 16 procs
Finished MODEL_BUILD for test ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5 in 15.674897 seconds (FAIL). [COMPLETED 10 of 17]
Case dir: /home/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5.C.20241003_163420_6mfdkf
Errors were:
Building test for ERS in directory /home/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5.C.20241003_163420_6mfdkf
ERROR: BUILD FAIL: cmake config e3sm failed, cat /home/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5.C.20241003_163420_6mfdkf/bld/e3sm.bldlog.241003-164012

Starting MODEL_BUILD for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01 with 16 procs
Finished MODEL_BUILD for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 in 22.659191 seconds (FAIL). [COMPLETED 11 of 17]
Case dir: /home/e3sm-jenkins/acme/scratch/ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5.C.20241003_163420_6mfdkf
Errors were:
Building test for ERS in directory /home/e3sm-jenkins/acme/scratch/ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5.C.20241003_163420_6mfdkf
ERROR: BUILD FAIL: build e3sm failed, cat /home/e3sm-jenkins/acme/scratch/ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5.C.20241003_163420_6mfdkf/bld/e3sm.bldlog.241003-164011

Finished MODEL_BUILD for test SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 in 22.668641 seconds (FAIL). [COMPLETED 12 of 17]
Case dir: /home/e3sm-jenkins/acme/scratch/SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3.C.20241003_163420_6mfdkf
Errors were:
Building test for SMS in directory /home/e3sm-jenkins/acme/scratch/SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3.C.20241003_163420_6mfdkf
ERROR: BUILD FAIL: build e3sm failed, cat /home/e3sm-jenkins/acme/scratch/SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3.C.20241003_163420_6mfdkf/bld/e3sm.bldlog.241003-164011

Finished MODEL_BUILD for test ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble in 166.800041 seconds (FAIL). [COMPLETED 13 of 17]
Case dir: /home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble.C.20241003_163420_6mfdkf
Errors were:
Building test for ERS in directory /home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble.C.20241003_163420_6mfdkf
ERROR: BUILD FAIL: build e3sm failed, cat /home/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble.C.20241003_163420_6mfdkf/bld/e3sm.bldlog.241003-163818'

  • errors=
  • V1_FAILURES_DETAILS+=
  • set +x
    ######################################################
    FAILS DETECTED:
    SCREAM V1 TESTING FAILED!

######################################################
Build step 'Execute shell' marked build as failure
$ ssh-agent -k
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 3529209 killed;
[ssh-agent] Stopped.
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh

We're having issues with some test-launcher jobs hanging forever. So let's make sure we clean all pending test-launcher jobs

squeue -o"%.7i %u %40j" | grep e3sm-jenkins | grep test-launcher | awk '{ print $1 }' | xargs -r scancel

[SCREAM_PullRequest_Autotester_Mappy] $ /bin/bash -le /tmp/jenkins17975183460604494346.sh
POST BUILD TASK : SUCCESS
END OF POST BUILD TASK : 0
Sending e-mails to: lbertag@sandia.gov
Finished: FAILURE

Contributor

bartgol commented Oct 4, 2024

Hijacking the conversation: @brhillman, while reviewing this PR, I noticed that sfc_emis is only used as an input in rte_lw, but we keep setting it to the constant 0.98 at runtime. Is this one of those cases where we currently have a constant, but it may become a variable coming from somewhere else at some point? If not, can we do away with the sfc_emis view altogether, and just hardcode the constant throughout rrtmgp and rte?

Collaborator

E3SM-Bot commented Oct 4, 2024

Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.

Collaborator

E3SM-Bot commented Oct 4, 2024

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6113
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS AT: RETEST;AT: AUTOMERGE
PULLREQUESTNUM 3028
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3cf1406
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA a269ef9
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5879
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS AT: RETEST;AT: AUTOMERGE
PULLREQUESTNUM 3028
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3cf1406
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA a269ef9
TEST_REPO_ALIAS SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: jgfouca/reduce_rrtmgp_interf_allocs
  • SHA: 3cf1406
  • Mode: TEST_REPO

Pull Request Author: jgfouca

Collaborator

E3SM-Bot commented Oct 4, 2024

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED (click to expand)

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6113
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS AT: RETEST;AT: AUTOMERGE
PULLREQUESTNUM 3028
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3cf1406
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA a269ef9
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5879
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS AT: RETEST;AT: AUTOMERGE
PULLREQUESTNUM 3028
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3cf1406
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA a269ef9
TEST_REPO_ALIAS SCREAM

@E3SM-Bot E3SM-Bot merged commit 9b1b4c7 into master Oct 4, 2024
6 checks passed
@E3SM-Bot E3SM-Bot deleted the jgfouca/reduce_rrtmgp_interf_allocs branch October 4, 2024 19:26
oksanaguba added a commit that referenced this pull request Oct 23, 2024
…uce_rrtmgp_interf_allocs"

This reverts commit 9b1b4c7, reversing
changes made to d981484.