CI: Target IB-capable nodes for tests #11173

Open
Alexey-Rivkin wants to merge 8 commits into openucx:master from Alexey-Rivkin:gtest_ib_infra

@Alexey-Rivkin
Contributor

Alexey-Rivkin commented Feb 9, 2026

What?

Update UCX gtest k8s jobs to request CX8 RDMA devices and IB network attachment.

Why?

Ensure tests run on IB-capable nodes with required RDMA access.

How?

Add annotations, limits/requests with rdma/hca_cx8: 1, and caps_add: [IPC_LOCK] in test_matrix.yaml.
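
A rough sketch of what such an entry could look like is shown below. Only `rdma/hca_cx8: 1` and `caps_add: [IPC_LOCK]` come from the description above. The surrounding layout, the `runs_on_dockers` key, the job name, the shape of the `annotations` list, and the network attachment name are assumptions made for illustration and may not match the actual diff.

```yaml
# Hypothetical test_matrix.yaml fragment (illustrative only).
# Taken from the PR description: rdma/hca_cx8: 1 and caps_add: [IPC_LOCK].
# Everything else (key layout, names, annotation value) is assumed.
runs_on_dockers:                          # assumed top-level key
  - name: gtest-ib                        # hypothetical job name
    annotations:                          # assumed shape of the annotation list
      - key: k8s.v1.cni.cncf.io/networks  # Multus annotation key, assumed to be the one used
        value: ib-network                 # placeholder network attachment name
    limits:
      rdma/hca_cx8: 1                     # one CX8 RDMA device per pod
    requests:
      rdma/hca_cx8: 1                     # kept equal to the limit
    caps_add: [IPC_LOCK]                  # permit locking (pinning) memory
```

Kubernetes treats device resources such as `rdma/hca_cx8` as extended resources, so when both `requests` and `limits` are given they have to be equal. `IPC_LOCK` matters because RDMA memory registration pins pages, which is otherwise bounded by the container's locked-memory limit.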

@Alexey-Rivkin
Contributor Author

/build

Alexey-Rivkin changed the title from "Gtest ib infra" to "CI: Target IB-capable nodes for tests" on Feb 9, 2026
Alexey-Rivkin force-pushed the gtest_ib_infra branch 20 times, most recently from ffa991a to 09c7c1a, on February 12, 2026 11:06
Signed-off-by: Alexey Rivkin <arivkin@nvidia.com>
Signed-off-by: Alexey Rivkin <arivkin@nvidia.com>
Gtest fails when running in the k8s environment because an
unlimited max_threads causes resource exhaustion. Setting the
CPU affinity limits max_threads to 2 dynamically.

Signed-off-by: Alexey Rivkin <arivkin@nvidia.com>
Alexey-Rivkin marked this pull request as ready for review on February 15, 2026 13:13
Signed-off-by: Alexey Rivkin <arivkin@nvidia.com>
Signed-off-by: Alexey Rivkin <arivkin@nvidia.com>
Signed-off-by: Alexey Rivkin <arivkin@nvidia.com>
@Alexey-Rivkin
Contributor Author

/build

@Alexey-Rivkin
Contributor Author

/build

@Alexey-Rivkin
Contributor Author

/build

1 similar comment
@dpressle
Contributor

dpressle commented Mar 1, 2026

/build

dpressle self-requested a review on March 1, 2026 14:52
@dpressle
Contributor

dpressle commented Mar 2, 2026

/build

2 participants