Summary
E2E test instrumentation shows that 10.9% of GitHub API calls (51 of 466) are duplicates: the same endpoint called more than once while processing a single webhook event. This tracking issue coordinates the optimization effort.
Problem Statement
When processing GitHub webhook events, Pipelines-as-Code makes redundant API calls due to:
- No caching for blob/file content fetches
- No caching for repository tree traversals
- Multiple event handlers independently fetching the same PR/commit data
- Provider instances not being reused across processing paths
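As an illustration of the first cause, here is a minimal sketch of a blob cache keyed by SHA. It is not the project's implementation: `BlobFetcher`, `BlobCache`, and `NewBlobCache` are hypothetical names, and the fetcher stands in for whatever function performs the real "get blob" call. The sketch relies on Git blobs being content-addressed, so a cached entry never needs invalidation.

```go
package main

import (
	"fmt"
	"sync"
)

// BlobFetcher is a hypothetical stand-in for the function that performs the
// real GitHub "get blob" API call; it is not the project's actual signature.
type BlobFetcher func(sha string) ([]byte, error)

// BlobCache memoizes blob content by SHA. Git blobs are content-addressed,
// so a given SHA always maps to the same bytes.
type BlobCache struct {
	mu    sync.Mutex
	blobs map[string][]byte
	fetch BlobFetcher
}

// NewBlobCache wraps fetch with an in-memory cache.
func NewBlobCache(fetch BlobFetcher) *BlobCache {
	return &BlobCache{blobs: make(map[string][]byte), fetch: fetch}
}

// Get returns the cached content for sha and calls the fetcher only on a
// miss. Errors are not cached, so a failed fetch is retried on the next call.
// The lock is held during the fetch, which keeps the sketch simple but
// serializes concurrent lookups.
func (c *BlobCache) Get(sha string) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if data, ok := c.blobs[sha]; ok {
		return data, nil
	}
	data, err := c.fetch(sha)
	if err != nil {
		return nil, err
	}
	c.blobs[sha] = data
	return data, nil
}

func main() {
	apiCalls := 0
	cache := NewBlobCache(func(sha string) ([]byte, error) {
		apiCalls++ // counts simulated API round trips
		return []byte("content of " + sha), nil
	})
	for _, sha := range []string{"abc123", "abc123", "def456", "abc123"} {
		if _, err := cache.Get(sha); err != nil {
			fmt.Println("fetch failed:", err)
		}
	}
	fmt.Println("API calls made:", apiCalls) // prints "API calls made: 2"
}
```

The same shape would apply to tree lookups, with the tree SHA as the key.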
Impact
- 466 API calls observed across 52 e2e test runs
- 51 calls (10.9%) are wasted duplicates
- Unnecessary rate limit consumption, especially problematic for:
  - High-traffic repositories
  - Organizations with many repos sharing the same GitHub App
  - GitHub Enterprise instances with stricter rate limits
Analysis Data
API Calls by Operation Type
| Operation | Total | Duplicates | Waste % |
|---|---|---|---|
| `get_blob` | 136 | 18 | 13.2% |
| `get_tekton_tree` | 54 | 8 | 14.8% |
| `get_root_tree` | 50 | 4 | 8.0% |
| `get_commit_files` | 34 | 7 | 20.6% |
| `list_check_runs_for_ref` | 34 | 3 | 8.8% |
| `get_pull_request` | 24 | 3 | 12.5% |
| `list_pull_request_files` | 23 | 5 | 21.7% |
Worst Offending Tests
| Test | Total Calls | Duplicates |
|---|---|---|
| Github PullRequest (multi-pipeline) | 21 | 9 (43%) |
| Github PullRequest | 19 | 7 (37%) |
| Github Single Comment Strategy Webhook | 15 | 5 (33%) |
| Github PullRequest onWebhook | 14 | 5 (36%) |
Sub-Issues
- #2376 - Add blob caching to GitHub provider (Priority 1)
- #2377 - Add tree caching to GitHub provider (Priority 2)
- #2378 - Centralize PR data fetching to prevent duplicate calls (Priority 3)
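For the third sub-issue, one possible shape is a per-event holder that fetches the pull request once and hands the stored result to every handler. This is only a sketch under assumptions: `EventContext`, `NewEventContext`, `GetPullRequest`, and the trimmed `PullRequest` type are hypothetical and do not come from the codebase.

```go
package main

import (
	"fmt"
	"sync"
)

// PullRequest holds only the fields this sketch needs; it is hypothetical.
type PullRequest struct {
	Number int
	Title  string
}

// EventContext is a hypothetical holder created once per webhook event.
// Handlers share the same *EventContext instead of calling the API directly.
type EventContext struct {
	once  sync.Once
	pr    *PullRequest
	err   error
	fetch func() (*PullRequest, error)
}

// NewEventContext stores the fetch function without calling it.
func NewEventContext(fetch func() (*PullRequest, error)) *EventContext {
	return &EventContext{fetch: fetch}
}

// GetPullRequest runs the fetch on first use and returns the stored result
// (including any error) on every later call for the same event.
func (e *EventContext) GetPullRequest() (*PullRequest, error) {
	e.once.Do(func() {
		e.pr, e.err = e.fetch()
	})
	return e.pr, e.err
}

func main() {
	apiCalls := 0
	ev := NewEventContext(func() (*PullRequest, error) {
		apiCalls++ // counts simulated API round trips
		return &PullRequest{Number: 42, Title: "example"}, nil
	})

	// Two independent handlers ask for the same PR data.
	for _, handler := range []string{"match-pipelines", "report-status"} {
		pr, err := ev.GetPullRequest()
		if err != nil {
			fmt.Println(handler, "failed:", err)
			continue
		}
		fmt.Println(handler, "sees PR", pr.Number)
	}
	fmt.Println("API calls made:", apiCalls) // prints "API calls made: 1"
}
```

Because `sync.Once` also stores a failed result, a real implementation would need to decide whether a transient error should be retried within the same event.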
Expected Outcome
After implementing all optimizations:
- Current efficiency (share of calls that are not duplicates): 89.1%
- Target efficiency: 98%+
- Estimated API calls saved per webhook event: 3-9 calls
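The efficiency figures above are the share of calls that are not duplicates. The sketch below shows one way such numbers could be derived from recorded calls. The `APICall` record shape is an assumption, since the instrumentation output format is not shown in this issue; only the totals 466, 51, 415, and 6 are taken from it.

```go
package main

import "fmt"

// APICall is a hypothetical record shape; the real instrumentation output
// format is not shown in this issue.
type APICall struct {
	EventID  string // webhook event (or test run) the call belongs to
	Endpoint string // for example, HTTP method plus URL path
}

// CountDuplicates returns the total number of calls and how many of them
// repeat an endpoint already seen within the same event.
func CountDuplicates(calls []APICall) (total, duplicates int) {
	seen := make(map[APICall]bool)
	for _, c := range calls {
		if seen[c] {
			duplicates++
		} else {
			seen[c] = true
		}
	}
	return len(calls), duplicates
}

// Efficiency is the percentage of calls that are not duplicates.
func Efficiency(total, duplicates int) float64 {
	if total == 0 {
		return 100
	}
	return 100 * float64(total-duplicates) / float64(total)
}

func main() {
	calls := []APICall{
		{"event-1", "GET /pulls/1"},
		{"event-1", "GET /pulls/1"}, // duplicate within event-1
		{"event-2", "GET /pulls/1"}, // same endpoint, different event
	}
	total, dups := CountDuplicates(calls)
	fmt.Println(total, dups) // prints "3 1"

	// Totals reported in this issue.
	fmt.Printf("current: %.1f%%\n", Efficiency(466, 51)) // prints "current: 89.1%"
	fmt.Printf("target:  %.1f%%\n", Efficiency(415, 6))  // prints "target:  98.6%"
}
```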
How to Measure
Re-run e2e tests with PAC_API_INSTRUMENTATION_DIR set and compare:
# Before optimization
Total API calls: 466
Duplicate calls: 51
# After optimization (target)
Total API calls: ~415
Duplicate calls: ~6
Related Files
- `pkg/provider/github/github.go` - Provider struct and API methods
- `pkg/provider/github/parse_payload.go` - Event parsing logic
- `test/pkg/github/instrumentation.go` - API call instrumentation