davidozog
Member

SOS currently opens two libfabric endpoints (EPs) per process, which can lead to RX/TX resource exhaustion in scale-up scenarios, such as running one process per core when the CPU has more cores than the NIC has contexts (see #1127). This PR merges those two endpoints into one, which benefits scaling (i.e., it raises the maximum PPN by 2x in typical cases) and seems to improve overall performance due to better runtime progress.
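To make the 2x PPN claim concrete, here is a rough back-of-the-envelope sketch. It assumes each endpoint consumes exactly one NIC context and that nothing else on the node uses contexts, which is a simplification. The function name `max_ppn` and the context count of 160 are hypothetical and not taken from SOS or any particular NIC.

```python
def max_ppn(nic_contexts: int, eps_per_process: int) -> int:
    """Upper bound on processes per node (PPN).

    Assumption: every endpoint a process opens consumes exactly one
    NIC context, and no other consumer competes for contexts.
    """
    if nic_contexts < 0 or eps_per_process <= 0:
        raise ValueError("need nic_contexts >= 0 and eps_per_process > 0")
    return nic_contexts // eps_per_process


# Hypothetical NIC exposing 160 contexts (illustrative number only).
contexts = 160
two_ep_limit = max_ppn(contexts, eps_per_process=2)     # two EPs per process
single_ep_limit = max_ppn(contexts, eps_per_process=1)  # merged single EP

print(f"two EPs per process: at most {two_ep_limit} PPN")
print(f"one EP per process:  at most {single_ep_limit} PPN")
```

Under these assumptions, halving the endpoints per process doubles the PPN ceiling, which matches the "2x in typical cases" estimate.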

In this PR I propose that we make single-EP the default behavior for SOS over the OFI transport. The optimization can be disabled by setting the new environment variable SHMEM_OFI_DISABLE_SINGLE_EP.

This change also enables FI_MANUAL_PROGRESS for a few providers, such as OPX, TCP, and sockets, when SOS is configured with both --enable-ofi-manual-progress and --enable-manual-progress.

@davidozog
Member Author

Polite ping! Any thoughts on this change? FWIW, I'm seeing a large performance improvement and much better support for FI_MANUAL_PROGRESS with this change.

Collaborator

@markbrown314 markbrown314 left a comment

I left a nitpick comment, but it is fine otherwise.

@davidozog davidozog force-pushed the pr/ofi_single_endpoint branch from 75808b5 to 2eab81c April 10, 2025 20:59
@davidozog davidozog merged commit d68f419 into Sandia-OpenSHMEM:main Apr 10, 2025
36 checks passed
@davidozog davidozog deleted the pr/ofi_single_endpoint branch April 15, 2025 13:34