hagertnl
Collaborator

Description

Add in a tested Frontier config for Benchpark.

Adding/modifying a system (docs: Adding a System)

Adding all necessary files for Frontier.

@hagertnl
Collaborator Author

Current error that I need help figuring out is:

$ pwd
....../test_workspace/test_expt/OlcfFrontier-8b2fd05/workspace
$ ramble --disable-progress-bar --workspace-dir . workspace setup
... this fails!
From the output log:
...
==> *******************************************
==> ********** Running spack Command **********
==> **     command: /lustre/orion/stf243/proj-shared/hagertnl/Scratch/benchpark-testing/test_workspace/spack/bin/spack install
==> **     with args: ['--add', '--keep-stage', 'cce18.0.1']
==> *******************************************
==>
==> Error: Unknown namespace: cce18.0
==> Error: Command exited with status 1:
    '/.../benchpark-testing/test_workspace/spack/bin/spack' 'install' '--add' '--keep-stage' 'cce18.0.1'
==> Error: Error running spack command: /.../benchpark-testing/test_workspace/spack/bin/spack install --add --keep-stage cce18.0.1
==> Error: For more details, see the log file: /.../benchpark-testing/test_workspace/test_expt/OlcfFrontier-8b2fd05/workspace/logs/setup.2025-02-20_13.22.46/saxpy.problem.saxpy_problem_single_node_rocm_caliper_none_128.out

Not sure where "cce18.0" is coming from, it's not in any of my configs.
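
A possible source of the `cce18.0` string: Spack specs accept a `namespace.package` form, and the spec passed to `spack install` here has no `@` between the compiler name and its version. If the dotted token is split at its last dot, `cce18.0.1` yields the namespace `cce18.0`, which matches the error text. The sketch below only mimics that split with plain string handling. `split_qualified_name` is a hypothetical helper, not Spack's parser, and whether `cce@18.0.1` is the right spec for this system is an assumption that still needs checking.

```python
def split_qualified_name(token: str) -> tuple[str, str]:
    """Split a dotted name into (namespace, package) at the last dot.

    Illustrative stand-in only; this is not Spack's parser.
    """
    namespace, _, package = token.rpartition(".")
    return namespace, package


# The spec string from the failing command, with no "@" before the version.
print(split_qualified_name("cce18.0.1"))  # ('cce18.0', '1')

# With an explicit "@", the name part has no dots left to misread.
name, _, version = "cce@18.0.1".partition("@")
print(split_qualified_name(name), version)  # ('', 'cce') 18.0.1
```

If this reading is right, the string to look for in the configs is `cce18.0.1` (without `@`), not `cce18.0`.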

@hagertnl
Collaborator Author

The commands I used to get to this error:

$ benchpark system init --dest=test_frontier olcf-frontier compiler=cce18.0.1
$ benchpark experiment init --dest=test_expt saxpy +rocm
$ benchpark setup ./test_expt/ ./test_frontier/ ./test_workspace
$ cd test_workspace/test_expt/test_frontier/workspace/
$ ramble --disable-progress-bar --workspace-dir . workspace setup

@github-actions github-actions bot added the system New or modified system config label May 26, 2025
Collaborator

@pearce8 pearce8 left a comment


@rfhaque Please work with @hagertnl to get this working and merged.

@pearce8 pearce8 added the changes requested Changes requested label May 26, 2025
@pearce8 pearce8 mentioned this pull request Jun 16, 2025
9 tasks
Collaborator

@scheibelp scheibelp left a comment


Compiler specification changed with the merge of #953.

- cray-pmi/6.1.15
- libfabric
- xpmem
"""
Collaborator


These defs appear similar to the compilers used by https://github.com/LLNL/benchpark/blob/develop/systems/llnl-elcapitan/system.py. The compiler specifications for benchpark changed in #953, so they would need to be rewritten using the compiler_def/compiler_section_for functions introduced there (along with merge_dicts to get the final config). I think the elcap system definition linked here would be a good example to extrapolate from (if you want me to detail more explicitly what those changes would look like, let me know).
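
To illustrate the "merge to get the final config" step in general terms, here is a minimal sketch of combining per-compiler config fragments into one nested dict. It does not use Benchpark's API: the signatures of `compiler_def`, `compiler_section_for`, and `merge_dicts` are not shown in this thread, so `deep_merge` below is a hypothetical stand-in, and the fragment contents are placeholders rather than Frontier's actual compiler entries.

```python
from copy import deepcopy


def deep_merge(*fragments: dict) -> dict:
    """Recursively merge dicts; later fragments win on conflicting leaves.

    Hypothetical helper for illustration, not Benchpark's merge_dicts.
    """
    merged: dict = {}
    for fragment in fragments:
        for key, value in fragment.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Both sides are dicts: merge their contents instead of replacing.
                merged[key] = deep_merge(merged[key], value)
            else:
                merged[key] = deepcopy(value)
    return merged


# Placeholder fragments, one per compiler; the values are made up.
cce_section = {"packages": {"cce": {"externals": [{"spec": "cce@<version>"}]}}}
gcc_section = {"packages": {"gcc": {"externals": [{"spec": "gcc@<version>"}]}}}

final_config = deep_merge(cce_section, gcc_section)
print(sorted(final_config["packages"]))  # ['cce', 'gcc']
```

The point of merging recursively is that each compiler's section can be built independently under the same top-level key without one overwriting the other.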
