Skip to content

Comments

Use cgroup CPU limit to set concurrency#40997

Merged
ggreenway merged 33 commits intoenvoyproxy:mainfrom
premnathnenavath:cgroup-cpu-detection
Oct 22, 2025
Merged

Use cgroup CPU limit to set concurrency#40997
ggreenway merged 33 commits intoenvoyproxy:mainfrom
premnathnenavath:cgroup-cpu-detection

Conversation

@premnathnenavath
Copy link
Contributor

@premnathnenavath premnathnenavath commented Sep 5, 2025

Commit Message:
This addresses container CPU limit issues reported in #40914
Implement CPU detection to improve behavior in containerized environments where cgroup limits are more restrictive than affinity/hardware thread counts.

Additional Description:
Envoy currently ignores cgroup CPU limits when determining worker thread count, leading to overprovisioning in containers:

  • Hardware/Affinity reports 64 CPUs but container has resources.limits.cpu: "2"
  • Envoy creates 64 worker threads instead of 2, causing CPU throttling

This adds cgroup-aware CPU detection following Go's proven algorithm:

effective_cpu = min(hardware_threads, affinity_count, cgroup_limit)
worker_threads = max(1, effective_cpu) 

Risk Level:
High - This changes CPU detection behavior in containerized environments, potentially affecting worker thread count and performance characteristics.

Testing:
Docs Changes:
Release Notes:
Platform Specific Features:

This adds Go-style GOMAXPROCS algorithm to Envoy's CPU detection:
- Combines hardware threads, CPU affinity, and cgroup limits
- Supports both cgroup v1 (cpu.cfs_quota_us) and v2 (cpu.max)
- Uses leaf-only cgroup detection matching Go runtime behavior
- Changes minimum CPU count from 2 to 1 for single-core containers
- Includes comprehensive test suite with filesystem mocking

The implementation improves Envoy's behavior in Kubernetes and Docker
environments where cgroup limits are more restrictive than hardware
thread counts.

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
Fix formatting issues identified by clang-format to ensure
compliance with Envoy's coding standards.

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
- Remove Go-specific references from comments and documentation
- Replace 'hw_threads' with 'hardware_threads' for better readability
- Rename test from 'GoMaxprocsAlgorithm' to 'CpuCountIntegration'
- Add missing technical terms to spelling dictionary (CFS, CPUSET, TASKSET, hw)
- Apply code formatting fixes for long lines
- Simplify comments to focus on container-aware CPU detection functionality
- Keep algorithm behavior unchanged, only improve documentation and naming

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
- Add CgroupCpuUtil for cgroup v1/v2 CPU limit parsing
- Enhance getCpuCount() with 5-step container-aware algorithm
- Add comprehensive test coverage with filesystem mocking
- Respect hardware threads, CPU affinity, and cgroup constraints

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
@kyessenov
Copy link
Contributor

@ggreenway as a senior approver (please feel free to delegate for first pass, while we recover repokitteh).

Copy link
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at the code yet, but I agree that changing the default from hw-cpus to container-allowed-cpus makes sense. We should add a runtime guard for this, and a clear release note in "incompatible behavior". But I can't think of any situations where the current behavior is better than the new behavior.

- Add extensive comments explaining cgroup v1/v2 precedence and behavior
- Improve algorithm documentation with step-by-step explanations
- Add detailed examples for CPU limit calculations (1.5 CPUs → 2 CPUs)
- Enhance error handling comments for safety and clarity
- Fix formatting and code style consistency
- Reference Go runtime cgroup implementation for context
- Add two-pass cgroup path parsing with proper v1/v2 precedence
- Fix missing words in comments and improve readability

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
@premnathnenavath
Copy link
Contributor Author

/retest

- Enable container-aware CPU detection by default with runtime guard
- Add envoy.reloadable_features.enable_cgroup_cpu_detection flag (enabled by default)
- Update behavior_changes release note to reflect new default behavior
- Maintain backward compatibility with runtime flag override option
- Position runtime guard in correct alphabetical order in runtime_features.cc

This addresses reviewer feedback to make container-aware CPU detection
the default behavior while providing an escape hatch for rollback.

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
@repokitteh-read-only
Copy link

CC @envoyproxy/runtime-guard-changes: FYI only for changes made to (source/common/runtime/runtime_features.cc).

🐱

Caused by: #40997 was synchronize by premnathnenavath.

see: more, trace.

@premnathnenavath
Copy link
Contributor Author

/retest

Copy link
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this change a lot. Thanks for clearly documenting the cgroup behavior and expected content of various files.

Ideally I'd like there to be an integration test for this, but I need to do some research to see if that's possible in our CI environment or not.

/wait

- Change runtime guard to restart feature: envoy_restart_features_enable_cgroup_cpu_detection
- Update API to return absl::optional<uint32_t> without hw_threads parameter
- Remove step comments as requested by reviewer
- Add Linux-only and --concurrency behavior notes to changelog
- Add proper ENVOY_LOG_MISC warning logs for malformed cgroup files
- Refactor to use optional throughout, remove UINT32_MAX sentinels
- Change string constants to constexpr absl::string_view
- Add Go source permalink with specific line number
- Remove isV1Available/isV2Available functions for simpler API
- Apply clang-format improvements for consistent code style
- Update all tests to use explicit absl::optional<uint32_t> types
- Simplify test mocks to match new direct file reading approach

All reviewer feedback has been addressed and code passes formatting checks.

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
@nezdolik
Copy link
Member

looks like ci is still failing

@premnathnenavath
Copy link
Contributor Author

Im working on it and the linux documentation. Will get back by this week

This commit introduces a robust cgroup CPU limit detection system that works
across both cgroup v1 and v2 hierarchies with comprehensive error handling.

Key features:
- Cross-platform CPU limit detection (cgroup v1/v2, Docker, Kubernetes, systemd)
- Modular architecture with separate helper functions for each cgroup version
- Optional-based API design using absl::optional<CpuLimitResult>
- Comprehensive test suite with 81 test cases covering edge cases
- Performance optimizations with file content caching
- Detailed logging for debugging container environments

Changes:
- Add new cgroup_cpu_util.h/.cc with getCpuLimit() and helper functions
- Implement accessCgroupV1Files/accessCgroupV2Files for file access
- Implement readActualLimitsV1/readActualLimitsV2 for parsing
- Add comprehensive unit tests covering all scenarios
- Update BUILD files for new test dependencies
- Enhance options_impl_test.cc integration

This enables Envoy to automatically detect CPU limits in containerized
environments, improving resource management and performance tuning.

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
- Resolved conflicts in changelogs/current.yaml by keeping both behavior changes
- Resolved conflicts in source/common/runtime/runtime_features.cc by including both runtime guards
- Successfully merged 176 commits from upstream main
- Maintained our cgroup CPU detection functionality alongside new upstream features

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
@premnathnenavath
Copy link
Contributor Author

premnathnenavath commented Oct 1, 2025

The implementation almost perfectly matches go behaviour

Optimizations of zero heap, BSS allocation and in place parsing for cgroup detection were skipped for simplicity since pathological escaped and huge paths are rare. The impact is however no GC, efficient stack memory allocation & performance in such scenarios. example netblue30/firejail#6450

Let me get back on this if the complication actually makes sense for performance. Meanwhile please review the PR

- Simplified API by removing CpuLimitResult struct wrapper
- Changed return type from absl::optional<CpuLimitResult> to absl::optional<double>
- Removed 4 unused functions and 40+ unused test cases
- Added missing words to spelling dictionary
- Fixed formatting issues
- Net reduction: 334 lines of cleaner code

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
@premnathnenavath
Copy link
Contributor Author

/retest

1 similar comment
@premnathnenavath
Copy link
Contributor Author

/retest

Copy link
Member

@phlax phlax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace empty --test_tag_filters= with --test_tag_filters=- in CI configs
to address reviewer concerns about broad filter overrides.

The empty filter (--test_tag_filters=) completely clears ALL tag filtering,
potentially breaking other CI filtering logic. The new approach using
--test_tag_filters=- means 'exclude nothing' and is more targeted,
allowing engflow_only tests to run in CI while preserving other
tag filtering behavior.

This implements the positive filtering approach suggested in code review
to make cgroup CPU detection tests run only in appropriate CI environments.

Signed-off-by: Premnath Nenavath <premnath.nenavath@gmail.com>

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
Remove broad CI config tag filter overrides and only keep the override
for the EngFlow RBE configuration (rbe-envoy-engflow).

Based on code review feedback, we should only run cgroup CPU detection
tests in environments where we know ahead of time that the necessary
cgroup resources are available. EngFlow RBE is the only such environment
we can trust.

This ensures:
- Local builds exclude engflow_only tests (for development)
- Most CI environments exclude engflow_only tests (unknown resources)
- Only EngFlow RBE includes engflow_only tests (known good resources)

Addresses reviewer concerns about making assumptions about resource
availability in different CI environments.

Signed-off-by: Premnath Nenavath <premnath.nenavath@gmail.com>

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
@premnathnenavath
Copy link
Contributor Author

/retest

Remove redundant CI environment check from cgroup test since tag filtering
already controls when these tests run. The engflow_only tag ensures tests
only execute in appropriate EngFlow RBE environments.

Signed-off-by: Premnath Nenavath <premnath.nenavath@samsara.com>
Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
This change adds CPU detection from cgroup limits to enable container-aware
concurrency configuration. When running in containers with CPU limits (e.g.,
Docker --cpus flag), Envoy can now detect and use those limits instead of
the host's CPU count.

Implementation:
- Added CgroupCpuUtil to read CPU limits from cgroup v1 and v2 filesystems
- Whitelisted /proc/self/mountinfo, /proc/self/cgroup, and /sys/fs/cgroup/*
  paths in filesystem security checks to allow reading cgroup files
- Added integration test requiring Docker with CPU limits
- Created bazel-test-cgroup.sh wrapper to run tests with Docker --cpus=2
- Added GitHub Actions workflow for automated testing

Testing:
- Integration test validates CPU limit detection in Docker containers
- Filesystem whitelist tests verify cgroup path access
- Tests run in privileged Docker containers with CPU limits set

Fixes envoyproxy/envoy#issue-number

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
Update actions/cache from v3 to v4 and actions/upload-artifact from v3 to v4
to fix deprecation warning. The v3 versions were deprecated on April 16, 2024.

Signed-off-by: Premnath Nenavath <premnath.nenavath@example.com>
Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
Changes:
- Add --test_tag_filters=containerized to bazel-test-cgroup.sh to explicitly
  run only containerized tests (overrides global -engflow_only filter)
- Quote DOCKER_EXTRA_ARGS in tools/docker_wrapper.sh to fix shellcheck warnings
- Remove trailing whitespace from .github/workflows/cgroup-tests.yml

This fixes the GitHub Actions error 'No test targets were found' while
maintaining the expected behavior that containerized tests are excluded
from regular local builds.

Tested-on: EC2 (working)
Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
The bazel-test-docker.sh was using 'envoyproxy/envoy-build' but the
correct image name from .bazelrc is 'envoyproxy/envoy-build-ubuntu'.

This was causing GitHub Actions to fail with:
  docker: Error response from daemon: manifest for
  envoyproxy/envoy-build:f4a881a1205e8e6db1a57162faf3df7aed88eae8
  not found: manifest unknown

Fixed by updating IMAGE variable to use envoy-build-ubuntu.

Tested-on: EC2 (test passes)
Fixes: GitHub Actions Docker image not found error
Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
…ix YAML formatting

- Add -containerized to test_tag_filters for asan, tsan, msan, test-coverage, and fuzz-coverage configs in .bazelrc
- Fix YAML indentation in .github/workflows/cgroup-tests.yml to comply with GitHub Actions formatting standards
- Ensures containerized tests only run when explicitly requested via bazel-test-cgroup.sh
- Prevents Docker-in-Docker issues during sanitizer and coverage test runs

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
The compile_time_options CI target was overriding the global
test_tag_filters with its own -nofips filter, which allowed
containerized tests to run. Added -containerized to both
test_tag_filters in the compile_time_options target to prevent
cgroup integration tests from running in those builds.

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
@premnathnenavath
Copy link
Contributor Author

/retest

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
- Changed test tag from 'containerized' to 'runtime-cpu' to match upstream
- Updated all test_tag_filters in .bazelrc to use -runtime-cpu
- Removed .github/workflows/cgroup-tests.yml (upstream uses main CI)
- Removed tools/bazel-test-cgroup.sh (upstream uses ci/do_ci.sh cpu-detection)
- Reverted tools/docker_wrapper.sh and tools/bazel-test-docker.sh to upstream
- Updated test comments to reflect Docker CPU limits execution
- Added 'no-remote' tag to prevent RBE execution
- Added quotes around 'cgroups' in comments for clarity

This aligns with maintainer's approach in PR envoyproxy#41646.

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
The compile_time_options CI target was still using -containerized
instead of -runtime-cpu, causing the cgroup test to run in
compile-time-options builds where it shouldn't.

Updated both test_tag_filters in the compile_time_options target
to use -runtime-cpu instead of -containerized.

Signed-off-by: premnath.nenavath <premnath.nenavath@gomotive.com>
@premnathnenavath
Copy link
Contributor Author

Thanks @phlax for adding the IT in main. @ggreenway @phlax please review again

@phlax
Copy link
Member

phlax commented Oct 22, 2025

/retest flakey tsan

Copy link
Member

@phlax phlax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ci lgtm, thanks @premnathnenavath for working on this and iterating

Copy link
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Thanks for all the iterating on the integration test.

@ggreenway ggreenway merged commit 2ed4ed0 into envoyproxy:main Oct 22, 2025
29 checks passed
eric846 pushed a commit to envoyproxy/nighthawk that referenced this pull request Oct 28, 2025
- Update the ENVOY_COMMIT and ENVOY_SHA in bazel/repositories.bzl to the latest Envoy's commit.
- Update .bazelrc to envoyproxy/envoy#41705, envoyproxy/envoy#40997
- Update .bazelversion to envoyproxy/envoy#41661
- Update ci/docker-compose.yml to envoyproxy/envoy#41646
- Modifications to python test files to accommodate bazelbuild/bazel#18128
- Adjustments to docker image creation to avoid copying symlinks from virtualenv

Signed-off-by: tomjzzhang <4367421+tomjzzhang@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants