-
Notifications
You must be signed in to change notification settings - Fork 7
feat: implement SummaryRule backfill functionality #876
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Defines backfill spec with start/end datetime and operationId tracking - Uses serial execution to prevent overwhelming Kusto with expensive queries - Implements progress tracking via spec mutation with natural completion - Includes 6-task implementation plan with comprehensive testing - Maintains backwards compatibility with existing SummaryRules - Adds 30-day max lookback validation and proper error handling
- Add BackfillSpec struct with start, end, and operationId fields - Add optional Backfill field to SummaryRuleSpec with validation - Add 30-day max lookback period constant - Generate updated CRD manifests with kubebuilder validation - Maintain backwards compatibility with existing SummaryRules
- Add HasActiveBackfill() to check for active backfill configuration - Add IsBackfillComplete() to detect completion - Add GetNextBackfillWindow() for time window calculation with interval alignment - Add AdvanceBackfillProgress() to move backfill position forward - Add operation management methods (Set/Get/Clear BackfillOperation) - Add RemoveBackfill() for cleanup when complete - Include comprehensive unit tests covering all edge cases - Handle ingestion delay correctly in window calculations
- Add handleBackfillExecution() method for backfill operation management - Add handleBackfillOperationStatus() to track running backfill operations - Add submitNextBackfillWindow() to submit next backfill window - Integrate backfill execution into main Run() loop alongside normal execution - Handle backfill completion and automatic cleanup - Support operation retry and error handling for backfill operations - Log backfill progress for visibility and debugging
- Add TestSummaryRuleTask_BackfillIntegration with 9 test scenarios - Cover operation submission, progress advancement, error handling - Test retry logic, completion detection, and edge cases - Include ingestion delay support and missing operation handling - Follow existing test patterns with proper mocking and assertions
…s advancement - Add GetOperation function field to SummaryRuleTask for single-operation queries - Implement parameterized KQL query following databaseExists pattern for injection prevention - Update handleBackfillOperationStatus to use efficient single-operation lookup - Factor out shared calculateTimeWindow method for backfill operations - Add comprehensive comments explaining why NextExecutionWindow cannot share logic - Fix AdvanceBackfillProgress to properly reverse delay calculation - Update all backfill tests to mock new GetOperation function - Maintain backward compatibility with existing execution patterns
- Update CRD reference with comprehensive backfill documentation - Add backfill field to SummaryRule key fields and examples - Create detailed backfill feature section with workflow explanation - Add examples directory with three comprehensive YAML examples: - Basic hourly aggregation with 30-day backfill - Advanced example with ingestion delay and criteria - Large-scale daily migration example - Include comprehensive troubleshooting guide covering: - Common issues and solutions - Progress monitoring techniques - Performance optimization strategies - Advanced troubleshooting commands - Add examples reference to main documentation index - Document limitations, best practices, and monitoring approaches
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements a comprehensive backfill feature for SummaryRules that enables historical data processing by specifying date ranges. The feature allows SummaryRules to automatically advance through time windows from a start datetime to an end datetime while continuing normal interval execution in parallel.
Key changes:
- API Extension: Added
BackfillSpec
to the SummaryRule CRD with start/end datetime fields and operation tracking - Core Logic: Implemented backfill helper methods for window calculation, progress tracking, and completion detection
- Task Integration: Extended SummaryRuleTask to handle backfill operations alongside normal execution with serial processing to prevent Kusto overload
Reviewed Changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.
Show a summary per file
File | Description |
---|---|
operator/manifests/crds/summaryrules_crd.yaml |
Added backfill field schema with validation rules to CRD manifest |
kustomize/bases/summaryrules_crd.yaml |
Identical backfill field addition to base CRD definition |
api/v1/summaryrule_types.go |
Core backfill implementation with BackfillSpec type and helper methods |
api/v1/summaryrule_types_test.go |
Comprehensive unit tests for all backfill helper methods |
api/v1/zz_generated.deepcopy.go |
Auto-generated deepcopy methods for BackfillSpec |
ingestor/adx/tasks.go |
Task execution integration with backfill operation handling |
ingestor/adx/tasks_test.go |
Integration tests for backfill task execution |
docs/crds.md |
Updated CRD documentation with backfill feature details |
docs/index.md |
Added link to examples directory |
docs/examples/*.yaml |
Example configurations for basic, advanced, and migration scenarios |
docs/examples/backfill-troubleshooting.md |
Comprehensive troubleshooting guide |
docs/examples/README.md |
Overview of backfill examples and quick start guide |
.github/plans/summaryrules/backfill.md |
Detailed implementation plan and design decisions |
- Format long KQL queries using multi-line strings for better readability - Add comprehensive documentation to helper functions with actual value examples - Decompose large functions into single-responsibility helpers - Extract shared utilities (ensureClockIsSet, getIngestionDelay, capWindowEndToCurrentTime) - Maintain architectural separation between backfill and normal execution time calculations - All tests passing, no behavioral changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the detailed review feedback! I've addressed all four comments with comprehensive improvements:
Summary of Changes
Code Quality Improvements:
- Multi-line KQL formatting - Enhanced readability of complex Kusto queries
- Enhanced documentation - Added actual values (30 days) to constant comments
- Function decomposition - Broke down large functions into focused, single-responsibility helpers
- Strategic code reuse - Identified and implemented reuse opportunities while preserving architectural differences
Testing & Validation:
- All existing tests continue to pass
- No behavioral changes introduced
- Function decomposition enables better testability
Architecture Preservation:
- Maintained documented separation between
calculateTimeWindow
andNextExecutionWindow
- Created shared utilities for peripheral concerns (clock validation, delay calculation)
- Followed coding guidelines for focused single-purpose functions
The refactoring significantly improves code maintainability while preserving all existing functionality. Ready for another review cycle!
Summary
This PR implements a comprehensive backfill feature for SummaryRules that allows users to process historical data by specifying a date range. The feature enables SummaryRules to automatically advance through time windows from a specified start datetime until reaching an end datetime.
Features
Core Backfill Functionality
start
andend
datetimes in SummaryRule specsAPI Design
Key Capabilities
Implementation Details
Commits Included
feat: add backfill specification to SummaryRule CRD
- Core API extensionfeat: add backfill helper methods to SummaryRule
- Time window calculation logicfeat: integrate backfill execution into SummaryRuleTask
- Task manager integrationtest: add comprehensive unit tests for backfill task integration
- Full test coveragerefactor: optimize backfill operation status checking and fix progress advancement
- Performance improvementsdocs: add backfill feature documentation and examples
- User documentationArchitecture Decisions
Performance Optimizations
Testing
Unit Test Coverage
Validation Scenarios
Documentation
User-Facing Documentation
Example Usage
Safety & Reliability
Backward Compatibility
Error Resilience
Resource Protection
Next Steps
Related Issues
This PR addresses the need for historical data processing in SummaryRules, enabling users to:
Note: This is a draft PR for initial review. Please provide feedback on the implementation approach and any concerns before marking ready for review.