Releases: greensh16/rudu
v1.4.0
[1.4.0] - 2025-08-18
Major Features Added
Memory Limiting System
- Memory usage limits with `--memory-limit MB` option for resource-constrained environments
- Real-time memory monitoring using RSS (Resident Set Size) tracking
- Graceful degradation - automatically disables caching when approaching 95% of memory limit
- Early termination - stops scanning when memory limit is exceeded to prevent system issues
- Platform-aware monitoring - bypasses limits gracefully on platforms without RSS support
- Configurable check intervals with hidden `--memory-check-interval-ms` option for fine-tuning
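As a rough illustration of RSS-based monitoring with a platform fallback, here is a minimal sketch that reads the `VmRSS` field from `/proc/self/status` on Linux. The function names `parse_vm_rss_kb` and `current_rss_bytes` are hypothetical and are not taken from rudu's source; rudu may obtain RSS in a different way.

```rust
use std::fs;

/// Extracts the `VmRSS` value (in kB) from the text of `/proc/self/status`.
/// Returns `None` if the field is missing or not a number.
/// (Hypothetical helper for illustration, not rudu's actual code.)
fn parse_vm_rss_kb(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("VmRSS:"))
        .and_then(|rest| rest.split_whitespace().next())
        .and_then(|num| num.parse().ok())
}

/// Current resident set size in bytes, or `None` where `/proc/self/status`
/// cannot be read. A caller can treat `None` as "monitoring unsupported"
/// and continue without enforcing a limit.
fn current_rss_bytes() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    parse_vm_rss_kb(&status).map(|kb| kb * 1024)
}
```

Returning `Option<u64>` keeps the unsupported-platform case explicit, so the scanning code can bypass limits instead of failing.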
HPC Cluster Support
- Memory-conscious scanning designed for High-Performance Computing environments
- Job scheduler integration with examples for SLURM, PBS/Torque, and LSF
- Resource-constrained operation that respects allocated memory limits
- Batch job compatibility with conservative memory usage patterns
Enhanced Features
Memory Management
- Intelligent cache disabling when memory pressure is detected
- Partial result handling when scans are terminated early due to memory limits
- Memory status reporting in scan results with `MemoryLimitStatus` enum
- Cross-platform compatibility with fallback behavior on unsupported systems
CLI Improvements
- New `--memory-limit` option for setting memory usage limits in megabytes
- Enhanced help text with clear memory limiting documentation
- Memory status output showing when limits are approached or exceeded
- Profile integration showing memory usage alongside performance metrics
Performance Improvements
Memory Efficiency
- Reduced memory allocations when operating under memory constraints
- Optimized data structures for memory-limited environments
- Throttled memory checks to minimize monitoring overhead (default: 200ms intervals)
- Smart caching decisions based on available memory headroom
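The throttling idea can be sketched as a small gate that only allows a memory check once the configured interval has elapsed. The type `CheckThrottle` and its methods are assumptions made for this example; only the 200 ms default comes from the notes above.

```rust
use std::time::{Duration, Instant};

/// Decides whether enough time has passed to justify another memory check.
/// (Hypothetical type for illustration, not rudu's actual code.)
struct CheckThrottle {
    interval: Duration,
    last_check: Option<Instant>,
}

impl CheckThrottle {
    fn new(interval: Duration) -> Self {
        Self { interval, last_check: None }
    }

    /// Returns true, and records `now`, if no check has run yet or the
    /// interval has elapsed since the last recorded check.
    fn should_check(&mut self, now: Instant) -> bool {
        let due = match self.last_check {
            None => true,
            Some(prev) => now.saturating_duration_since(prev) >= self.interval,
        };
        if due {
            self.last_check = Some(now);
        }
        due
    }
}
```

A scan loop could call `should_check(Instant::now())` per directory and only query RSS when it returns true, which keeps the monitoring cost roughly constant regardless of how many entries are visited.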
Resource Management
- Thread pool optimization when memory limits are active
- Incremental scanning with memory-aware cache management
- Early exit strategies to prevent resource exhaustion
- Memory-conscious progress reporting with reduced overhead
Use Cases and Examples
HPC Integration
# SLURM job with 2GB memory allocation
#SBATCH --mem=2G
rudu /lustre/project --memory-limit 1800 --threads 4
# PBS job with conservative memory usage
#PBS -l mem=1gb
rudu /data --memory-limit 900 --no-cache
# Memory-constrained deep scan
rudu /filesystem --memory-limit 256 --depth 5 --profile
Memory Monitoring Behavior
Memory Usage | System Behavior |
---|---|
< 95% limit | Normal operation with all features enabled |
95-100% limit | Disables caching, reduces memory allocations |
> 100% limit | Terminates scan early, returns partial results |
Platform unsupported | Disables monitoring, continues normally |
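The table above can be read as a simple classification of current RSS against the limit. The sketch below is one possible encoding; the enum `MemoryState` and its variant names are invented for this example (the variants of the real `MemoryLimitStatus` are not listed in these notes), and it assumes 1 MB = 1024 × 1024 bytes.

```rust
/// Hypothetical status type; the real `MemoryLimitStatus` variants may differ.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum MemoryState {
    /// Below 95% of the limit: all features enabled.
    Normal,
    /// 95% to 100% of the limit: caching should be disabled.
    NearLimit,
    /// Above the limit: the scan should stop and return partial results.
    Exceeded,
    /// RSS is unavailable on this platform: monitoring is skipped.
    Unsupported,
}

/// Maps an RSS reading (in bytes) and a limit (in MB) to a state.
/// Assumes 1 MB = 1024 * 1024 bytes; u128 avoids overflow in the products.
fn classify(rss_bytes: Option<u64>, limit_mb: u64) -> MemoryState {
    let rss = match rss_bytes {
        Some(v) => v as u128,
        None => return MemoryState::Unsupported,
    };
    let limit = limit_mb as u128 * 1024 * 1024;
    if rss > limit {
        MemoryState::Exceeded
    } else if rss * 100 >= limit * 95 {
        MemoryState::NearLimit
    } else {
        MemoryState::Normal
    }
}
```

Integer arithmetic is used so the 95% boundary is exact rather than subject to floating-point rounding.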
Documentation Updates
- New "Memory Limiting for HPC Clusters" section in README
- Comprehensive usage examples for different HPC schedulers
- Best practices guide for memory-constrained environments
- Platform compatibility matrix for memory monitoring support
- Integration examples with SLURM, PBS, and LSF job schedulers
CLI Options
- `--memory-limit MB` - Set memory usage limit in megabytes
- `--memory-check-interval-ms MS` - Hidden option for tuning check frequency
Backward Compatibility
- Full backward compatibility with all existing command-line options
- No breaking changes to existing APIs or output formats
- Optional memory limiting - all existing workflows continue to work unchanged
- Graceful fallback on platforms without memory monitoring support
Bug Fixes and Stability Improvements
Caching System Fixes
- Cache test reliability improvements with better test isolation and cleanup
- Cache file handling robustness improvements for edge cases
- Memory-mapped cache stability enhancements
- Cache invalidation logic fixes for better reliability
Code Quality and Linting
- Clippy warnings resolved - all code now passes strict linting requirements
- Code formatting standardized across all modules and benchmarks
- Benchmark consistency improvements across all performance tests
- Example code cleaned up and validated
CI/CD Pipeline Improvements
- GitHub Actions workflow optimization for faster CI runs
- Test reliability improvements with better resource management
- Build process streamlining and dependency management
Documentation and Examples
- New tutorial documentation added in `docs/basic-usage.md`
- Comprehensive exclusion guide in `docs/exclude_tutorial.md` with 490+ lines of examples
- Memory monitor demo example showing practical memory limiting usage
- Cache disable demo for testing memory-constrained environments
- Enhanced benchmarking with new overhead benchmark suite
Rudu v1.3.0
Major Features Added
Intelligent Caching System
- Memory-mapped cache files for near-instantaneous repeated scans
- Automatic cache invalidation based on directory modification times
- Configurable TTL with `--cache-ttl` option (default: 7 days)
- Cache location fallback from local directory to XDG cache directory
- Graceful cache corruption handling with automatic fallback
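To make the TTL rule concrete, here is a minimal sketch of an expiry check. The names `DEFAULT_CACHE_TTL` and `is_cache_expired` are hypothetical, and treating a future timestamp as stale is a design choice of this example, not documented rudu behavior; only the 7-day default comes from the notes.

```rust
use std::time::{Duration, SystemTime};

/// Default TTL stated in the release notes: 7 days.
const DEFAULT_CACHE_TTL: Duration = Duration::from_secs(7 * 24 * 60 * 60);

/// Returns true if a cache written at `written_at` is older than `ttl`
/// when observed at `now`. (Hypothetical helper for illustration.)
fn is_cache_expired(written_at: SystemTime, now: SystemTime, ttl: Duration) -> bool {
    match now.duration_since(written_at) {
        Ok(age) => age > ttl,
        // The cache timestamp lies in the future (for example after a clock
        // change); this sketch treats it as stale so the scan is redone.
        Err(_) => true,
    }
}
```

Passing `now` as a parameter instead of calling `SystemTime::now()` inside keeps the function deterministic and easy to test.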
Incremental Scanning
- Skip unchanged directories based on metadata comparison (mtime, nlink)
- Preserves cached aggregated values for unchanged subtrees
- Dramatic performance improvements for repeated scans (3-10x faster)
- Intelligent cache hit/miss tracking with profiling integration
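The skip-unchanged logic can be sketched as a comparison of a directory's current `(mtime, nlink)` pair against the pair stored with its cached total. All type and function names below (`DirStamp`, `CachedDir`, `reuse_cached_total`, `stamp_of`) are assumptions for illustration and are not rudu's actual data structures.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Metadata used to decide whether a directory changed since it was cached.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct DirStamp {
    mtime_secs: i64,
    nlink: u64,
}

/// A cached directory entry: its stamp plus the aggregated size below it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CachedDir {
    stamp: DirStamp,
    total_bytes: u64,
}

/// Returns the cached total when the directory's stamp is unchanged,
/// or `None` when the directory is unknown or must be rescanned.
fn reuse_cached_total(
    cache: &HashMap<PathBuf, CachedDir>,
    dir: &Path,
    current: DirStamp,
) -> Option<u64> {
    cache
        .get(dir)
        .filter(|entry| entry.stamp == current)
        .map(|entry| entry.total_bytes)
}

/// Reads the stamp of a directory on Unix-like systems.
#[cfg(unix)]
fn stamp_of(dir: &Path) -> std::io::Result<DirStamp> {
    use std::os::unix::fs::MetadataExt;
    let meta = std::fs::metadata(dir)?;
    Ok(DirStamp { mtime_secs: meta.mtime(), nlink: meta.nlink() })
}
```

Note that a directory's mtime reflects changes to its own entries only, not edits deeper in the tree, so a real implementation would need to apply a check like this at each level rather than at the root alone.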
Performance Profiling
- Detailed timing breakdowns with `--profile` flag
- Memory usage tracking (RSS) for each phase
- Cache hit/miss statistics for optimization insights
- JSON export support for automated performance analysis
- Phase-by-phase analysis (Setup, Cache-load, WalkDir, Disk I/O, etc.)
Enhanced Features
Advanced Threading
- Work-stealing algorithms for uneven directory structures
- Local thread pool optimization when `--threads` is specified
- Multiple thread pool strategies (experimental `--threads-strategy`)
- NUMA-aware processing improvements
Improved CLI
- New caching options: `--no-cache`, `--cache-ttl`
- Performance profiling: `--profile`
- Enhanced help text with performance guidance
- Better error handling for cache operations
Documentation
- Comprehensive performance guide in `docs/performance.md`
- Detailed benchmark results with cache performance metrics
- Optimization strategies for different use cases
- Troubleshooting guide for common performance issues
Performance Improvements
Caching Performance
- O(1) cache loading using memory-mapped files
- Sub-millisecond cache access for small to medium projects
- Efficient cache serialization with bincode
- Automatic cache compression for large datasets
Scanning Optimizations
- Reduced memory allocations through better data structure reuse
- Improved I/O patterns for better cache locality
- Optimized parent path traversal with caching
- Single-pass inode counting during directory traversal
Updated rudu
Added the ability to select the number of CPUs used for parallel file scanning.
Updated version of rudu
Now includes:
- Option to show directory and file ownership (--show-owner)
- Option to output to csv file (--output report.csv)
- Uses DashMap for better parallelisation in file/directory scanning
- Moved parts of main.rs into separate files for a cleaner look
- Added comments throughout code and Rustdoc-style module-level comments in each file.
Initial release of rudu
This is the initial release of rudu.
rudu is a high-performance, Rust-powered replacement for the traditional Unix du (disk usage) command. It was built to provide a safer, faster, and more extensible alternative for scanning and analyzing directory sizes — especially for large-scale or deep filesystem structures.