feat: Advanced Model Management with Hot Loading and Storage Tiers
🔥 HOT LOADING & MEMORY TIER MANAGEMENT:
New Model Manager Features:
✅ Hot loading/unloading of models without a system restart
✅ Intelligent memory tier placement (RAM/SWAP/STORAGE)
✅ Automatic memory optimization with LRU eviction
✅ Storage tier caching for fast model swapping
✅ Real-time memory monitoring across all tiers
✅ Force-tier loading for performance optimization
Memory Management:
- RAM Tier: 6.0GB limit (80% of system RAM)
- SWAP Tier: 7.0GB limit (60% of system swap)
- STORAGE Tier: Unlimited disk-based caching
- Automatic tier selection based on available memory
- LRU eviction when a tier's memory limit is exceeded
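
The placement rules above can be sketched roughly as follows. This is a
minimal illustration, not the code in this commit: the class
TieredModelManager, its methods, and the llama-7b and bert-large sizes are
assumptions made up for the example. Only the 6.0GB/7.0GB limits, the tier
names, and the force_tier parameter come from this message. In this sketch,
automatic placement picks the first tier with room, and LRU eviction (demotion
to the next tier down) happens only when a tier is forced.

```python
from collections import OrderedDict

TIER_ORDER = ("RAM", "SWAP", "STORAGE")  # fastest to slowest


class TieredModelManager:
    """Hypothetical sketch of tiered placement with LRU eviction."""

    def __init__(self, ram_limit_gb=6.0, swap_limit_gb=7.0):
        # Limits from this commit message; STORAGE is treated as unlimited.
        self.limits = {"RAM": ram_limit_gb, "SWAP": swap_limit_gb,
                       "STORAGE": float("inf")}
        # One OrderedDict per tier: model name -> size in GB.
        # The oldest entry is the least recently used one.
        self.tiers = {name: OrderedDict() for name in TIER_ORDER}

    def used(self, tier):
        return sum(self.tiers[tier].values())

    def locate(self, model):
        for tier in TIER_ORDER:
            if model in self.tiers[tier]:
                return tier
        return None

    def touch(self, model):
        # Mark a model as most recently used within its tier.
        tier = self.locate(model)
        if tier is not None:
            self.tiers[tier].move_to_end(model)

    def unload(self, model):
        tier = self.locate(model)
        if tier is not None:
            del self.tiers[tier][model]

    def load(self, model, size_gb, force_tier=None):
        self.unload(model)  # a reload replaces any existing placement
        if force_tier is None:
            # Automatic selection: first tier with enough free space.
            tier = next(t for t in TIER_ORDER
                        if self.used(t) + size_gb <= self.limits[t])
        else:
            tier = force_tier
            self._make_room(tier, size_gb)
        self.tiers[tier][model] = size_gb
        return tier

    def _make_room(self, tier, size_gb):
        if size_gb > self.limits[tier]:
            raise ValueError(f"{size_gb}GB can never fit in {tier}")
        while self.used(tier) + size_gb > self.limits[tier]:
            # Evict the least recently used model and demote it one tier.
            victim, victim_size = self.tiers[tier].popitem(last=False)
            lower = TIER_ORDER[TIER_ORDER.index(tier) + 1]
            self._make_room(lower, victim_size)
            self.tiers[lower][victim] = victim_size


manager = TieredModelManager()
print(manager.load("gpt2-small", 0.5))   # fits in RAM
print(manager.load("gpt-j-6b", 6.0))     # RAM would overflow, goes to SWAP
print(manager.load("llama-7b", 7.0))     # assumed size; falls to STORAGE
```

Forcing a tier, for example manager.load("bert-large", 5.8, force_tier="RAM"),
would demote gpt2-small to SWAP in this sketch to stay under the RAM limit.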
Enhanced MCP Tools:
- manage_model_loading: Hot load/unload with tier control
- get_memory_status: Real-time memory usage across tiers
- hot_swap_models: Instant model swapping for optimization
- optimize_memory: Intelligent memory optimization strategies
Usage Examples:
- Load model to specific tier: force_tier='RAM'
- Hot swap models: unload to storage, load new model
- Memory optimization: aggressive/balanced strategies
- Real-time monitoring: RAM/SWAP/Storage usage
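
As a rough illustration of the usage above, a client might build MCP
"tools/call" requests like this. The tool names, force_tier, and the
aggressive/balanced strategy values come from this message; every other
argument name (action, model_name, unload_model, load_model, strategy) and the
tool_call helper are assumptions for the sketch and may not match the real
tool schemas.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC request ids: 1, 2, 3, ...


def tool_call(name, **arguments):
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Load a model into a specific tier (argument names other than
# force_tier are hypothetical).
load = tool_call("manage_model_loading", action="load",
                 model_name="gpt2-small", force_tier="RAM")

# Hot swap: unload one model to storage, load another (hypothetical args).
swap = tool_call("hot_swap_models", unload_model="gpt2-small",
                 load_model="bert-large")

# Memory optimization with one of the two strategies named above.
optimize = tool_call("optimize_memory", strategy="balanced")

# Real-time monitoring takes no arguments in this sketch.
status = tool_call("get_memory_status")

print(json.dumps(load, indent=2))
```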
Test Results:
✅ Hot loading: gpt2-small->RAM, gpt-j-6b->SWAP, llama-7b->STORAGE
✅ Memory tracking: 0.5GB RAM, 6.0GB SWAP usage
✅ Hot swapping: gpt2-small->STORAGE, bert-large->RAM
✅ Optimization: balanced strategy applied 1 optimization
This enables dynamic, tier-aware model management on
resource-constrained Jetson devices.