Improved Hugging Face Model Downloading Logs

UtkarshTheDev · UtkarshTheDev · commit ec8da0bb260f · 2025-05-16T19:05:20.000+05:30
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -2,7 +2,21 @@
 
 All notable changes to LocalLab will be documented in this file.
 
-## [0.6.2] - 2024-05-04
+## [0.6.3] - 2025-05-16
+
+### Improved
+
+- Enhanced the model downloading experience by using Hugging Face's native progress bars instead of the custom logger
+- Fixed an issue where Hugging Face download logs were intercepted by the custom logger
+- Ensured Hugging Face progress bars display in their original, visually appealing format
+- Improved configuration of Hugging Face Hub progress bars for a better visual experience
+- Completely bypassed custom logging for Hugging Face download-related logs
+- Configured the transformers library to use native progress bars for model downloads
+- Disabled logger propagation for Hugging Face-related modules during downloads
+- Added proper spacing before and after progress bars for better readability
+- Enhanced progress bar detection to catch all download-related progress indicators
+
+## [0.6.2] - 2025-05-04
 
 ### Improved
 
@@ -16,13 +30,13 @@ All notable changes to LocalLab will be documented in this file.
 - Updated default values for all optimization settings to be enabled by default
 - Ensured consistency between displayed optimization settings and saved configuration
 
-## [0.6.1] - 2024-05-02
+## [0.6.1] - 2025-05-02
 
 ### Fixed
 
 - Fixed CLI config environment variable issue
 
-## [0.6.0] - 2024-05-02
+## [0.6.0] - 2025-05-02
 
 ### Added
 
@@ -39,7 +53,7 @@ All notable changes to LocalLab will be documented in this file.
 - Fixed client error with `do_sample` parameter by adding it to all client methods
 - Updated client package version to 1.0.9 to reflect these fixes
 
-## [0.5.9] - 2024-05-01
+## [0.5.9] - 2025-05-01
 
 ### Fixed
 
@@ -59,7 +73,7 @@ All notable changes to LocalLab will be documented in this file.
 - Maintained top and bottom borders for visual separation while removing side borders
 - Enhanced overall visual consistency across all banners
 
-## [0.5.8] - 2024-05-01
+## [0.5.8] - 2025-05-01
 
 ### Added
 
@@ -92,7 +106,7 @@ All notable changes to LocalLab will be documented in this file.
 - Increased retry counts for better reliability
 - Added top_k parameter to all generation methods
 
-## [0.5.7] - 2024-05-01
+## [0.5.7] - 2025-05-01
 
 ### Improved
 
@@ -103,7 +117,7 @@ All notable changes to LocalLab will be documented in this file.
 - Improved overall visual consistency and readability across all UI elements
 - Enhanced color scheme for better visual appeal and readability
 
-## [0.5.6] - 2024-05-01
+## [0.5.6] - 2025-05-01
 
 ### Fixed
 
@@ -113,15 +127,15 @@ All notable changes to LocalLab will be documented in this file.
 - Enhanced logging during model downloads for better readability
 - Improved visual clarity of download progress information
 
-## [0.5.5] - 2024-04-30
+## [0.5.5] - 2025-04-30
 
 ### Fixed
 
 - Fixed extra spacing in the boundary of status banners
 - Improved alignment of INITIALIZING and RUNNING status boxes
 - Enhanced visual consistency across all UI elements
 
-## [0.5.4] - 2024-04-30
+## [0.5.4] - 2025-04-30
 
 ### Improved
 
@@ -131,7 +145,7 @@ All notable changes to LocalLab will be documented in this file.
 - Added automatic width adjustment for banners based on content length
 - Fine-tuned color scheme to ensure all logs remain visible while not competing with important banners
 
-## [0.5.3] - 2024-04-30
+## [0.5.3] - 2025-04-30
 
 ### Improved
 
@@ -149,7 +163,7 @@ All notable changes to LocalLab will be documented in this file.
 - Improved overall visual consistency across all UI elements
 - Made server status much easier to distinguish at a glance
 
-## [0.5.2] - 2024-04-30
+## [0.5.2] - 2025-04-30
 
 ### Fixed
 
@@ -161,7 +175,7 @@ All notable changes to LocalLab will be documented in this file.
 - Added proper cleanup of existing handlers before adding new ones
 - Improved compatibility with different terminal environments
 
-## [0.5.1] - 2024-04-21
+## [0.5.1] - 2025-04-21
 
 ### Added
 
@@ -181,7 +195,7 @@ All notable changes to LocalLab will be documented in this file.
 - Added proper timeout handling in streaming operations
 - Enhanced connection state management
 
-## [0.5.0] - 2024-04-21
+## [0.5.0] - 2025-04-21
 
 ### Fixed
 
@@ -191,15 +205,15 @@ All notable changes to LocalLab will be documented in this file.
 - Improved package import reliability
 - Ensured both LocalLabClient and SyncLocalLabClient are properly exported
 
-## [0.5.01] - 2024-04-21
+## [0.5.01] - 2025-04-21
 
 ### Fixed
 
 - Fixed SyncLocalLabClient not being exported from locallab_client package
 - Added proper exports for both LocalLabClient and SyncLocalLabClient in package **init**.py
 - Ensured both sync and async clients are available through the main package import
 
-## [0.4.50] - 2024-04-21
+## [0.4.50] - 2025-04-21
 
 ### Changed
 
@@ -208,7 +222,7 @@ All notable changes to LocalLab will be documented in this file.
 - Changed client package structure to use direct imports instead of nested packages
 - Improved client package documentation with correct import examples
 
-## [0.4.49] - 2024-04-21
+## [0.4.49] - 2025-04-21
 
 ### Fixed
 
@@ -221,7 +235,7 @@ All notable changes to LocalLab will be documented in this file.
 - Improved task cancellation with proper timeout handling
 - Enhanced force exit mechanism to ensure clean termination
 
-## [0.4.48] - 2024-03-15
+## [0.4.48] - 2025-03-15
 
 ### Client Library Changes (v0.2.1)
 
@@ -270,7 +284,7 @@ All notable changes to LocalLab will be documented in this file.
 - Removed text cleaning and formatting from all generation endpoints
 - Improved error handling in streaming responses
 
-## [0.4.47] - 2024-03-15
+## [0.4.47] - 2025-03-15
 
 ### Added
 
@@ -295,7 +309,7 @@ All notable changes to LocalLab will be documented in this file.
 - Enhanced Python client with better error handling for streaming
 - Added proper error message propagation from server to client
 
-## [0.4.46] - 2024-03-14
+## [0.4.46] - 2025-03-14
 
 ### Added
 
@@ -310,7 +324,7 @@ All notable changes to LocalLab will be documented in this file.
 - Improved error handling in streaming generation
 - Enhanced token cleanup for better readability
 
-## [0.4.45] - 2024-03-14
+## [0.4.45] - 2025-03-14
 
 ### Fixed
 
@@ -319,7 +333,7 @@ All notable changes to LocalLab will be documented in this file.
 - Bumped client package version to 1.0.2
 - Updated documentation with correct client initialization examples
 
-## [0.4.31] - 2024-03-14
+## [0.4.31] - 2025-03-14
 
 ### Fixed
 
diff --git a/locallab/__init__.py b/locallab/__init__.py
@@ -2,7 +2,7 @@
 LocalLab - A lightweight AI inference server for running LLMs locally
 """
 
-__version__ = "0.6.2"  # Updated to improve model downloading experience and fix CLI settings
+__version__ = "0.6.3"  # Updated to improve model downloading experience and fix CLI settings
 
 # Only import what's necessary initially, lazy-load the rest
 from .logger import get_logger
diff --git a/locallab/logger/__init__.py b/locallab/logger/__init__.py
@@ -113,30 +113,36 @@ def format(self, record):
         # HuggingFace progress bars use tqdm which writes directly to stdout/stderr
         # We need to completely bypass our logger for these messages
 
+        # First, check if this is a HuggingFace-related log
+        is_hf_log = False
+        if hasattr(record, 'name') and isinstance(record.name, str):
+            # HuggingFace Hub logs typically come from these modules
+            hf_modules = ['huggingface_hub', 'filelock', 'transformers', 'tqdm', 'accelerate', 'bitsandbytes']
+            is_hf_log = any(module in record.name for module in hf_modules)
+
+        # Also check if the message contains download-related content
+        is_download_log = False
+        if hasattr(record, 'msg') and isinstance(record.msg, str):
+            download_patterns = ['download', 'fetch', 'safetensors', '.bin', '.json', 'model-', 'pytorch_model',
+                               'fetching', 'files:', 'it/s', 'b/s', '%', 'mb/s', 'gb/s']  # lowercase: matched against msg.lower()
+            is_download_log = any(pattern in str(record.msg).lower() for pattern in download_patterns)
+
+        # If this is a HuggingFace download log or tqdm progress bar, skip it completely
+        # This ensures HuggingFace's native progress bars are displayed correctly
+        if is_hf_log or is_download_log or (hasattr(record, 'msg') and '%' in str(record.msg) and ('/' in str(record.msg))):
+            return ""
+
         # Check if we're currently downloading a model
         try:
             from ..utils.progress import is_model_downloading
 
-            # Check if this is a HuggingFace progress bar log
-            is_hf_progress_log = False
-            if hasattr(record, 'name') and isinstance(record.name, str):
-                # HuggingFace Hub logs typically come from these modules
-                hf_modules = ['huggingface_hub', 'filelock', 'transformers', 'tqdm']
-                is_hf_progress_log = any(module in record.name for module in hf_modules)
-
-            # If we're downloading a model and this is a HuggingFace log, skip our formatting
-            if is_model_downloading() and is_hf_progress_log:
-                # Return empty string to skip this log in our logger
-                # HuggingFace will handle displaying its own progress bars
-                return ""
-
-            # For non-HuggingFace logs during model download, only show critical and model-related logs
-            elif is_model_downloading() and record.levelno < logging.ERROR:
+            # During model downloads, only show critical logs and important model-related logs
+            if is_model_downloading() and record.levelno < logging.ERROR:
                 # Check if this is a model-related log that should be shown
                 is_model_log = False
                 if hasattr(record, 'msg') and isinstance(record.msg, str):
-                    model_patterns = ['model', 'download', 'tokenizer', 'weight']
-                    is_model_log = any(pattern in record.msg.lower() for pattern in model_patterns)
+                    model_patterns = ['model loaded', 'tokenizer loaded', 'loading complete']
+                    is_model_log = any(pattern in str(record.msg).lower() for pattern in model_patterns)
 
                 # Skip non-critical and non-model logs during model download
                 if not is_model_log:
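The formatter change above suppresses records by returning an empty string from `format()`. With a stock `logging.StreamHandler`, an empty formatted message is still written, followed by the line terminator, so whether this produces blank lines depends on LocalLab's handler, which the diff does not show. One alternative, sketched here under that assumption, is to drop the records with a `logging.Filter`. `HFDownloadFilter`, `HF_LOGGER_PREFIXES`, and `make_record` are hypothetical names, not part of LocalLab; the prefix list copies `hf_modules` from the hunk. The sketch matches logger names by prefix rather than by substring, so an unrelated logger whose name merely contains `transformers` is kept.

```python
import logging

# Logger-name prefixes treated as download-related (copied from hf_modules in the diff).
HF_LOGGER_PREFIXES = (
    "huggingface_hub", "filelock", "transformers", "tqdm", "accelerate", "bitsandbytes",
)


class HFDownloadFilter(logging.Filter):
    """Hypothetical filter that drops records from Hugging Face related loggers."""

    def filter(self, record: logging.LogRecord) -> bool:
        name = record.name
        # Match the logger itself or any child, e.g. "transformers.modeling_utils".
        is_hf = any(name == p or name.startswith(p + ".") for p in HF_LOGGER_PREFIXES)
        return not is_hf  # False tells the logging module to discard the record


def make_record(name: str, msg: str) -> logging.LogRecord:
    """Build a LogRecord by hand so the filter can be tried without any handlers."""
    return logging.LogRecord(name, logging.INFO, "example.py", 0, msg, None, None)
```

Attaching it with `handler.addFilter(HFDownloadFilter())` would discard matching records before they reach the formatter, which leaves the formatter free of download-specific checks.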
diff --git a/locallab/model_manager.py b/locallab/model_manager.py
@@ -22,9 +22,21 @@
 import tempfile
 import json
 
-# Configure HuggingFace Hub progress bars
+# Configure HuggingFace Hub progress bars to use native display
+# This ensures we see the visually appealing progress bars from HuggingFace
 configure_hf_hub_progress()
 
+# Also configure transformers to use HuggingFace Hub's progress bars
+try:
+    import transformers
+    transformers.utils.logging.enable_progress_bar()
+    # Set transformers logging to only show warnings and errors
+    transformers.logging.set_verbosity_warning()
+except ImportError:
+    logger.debug("Could not configure transformers progress bars")
+except Exception as e:
+    logger.debug(f"Error configuring transformers progress bars: {str(e)}")
+
 QUANTIZATION_SETTINGS = {
     "fp16": {
         "load_in_8bit": False,
@@ -274,10 +286,31 @@ async def _load_model_with_optimizations(self, model_id: str):
                 # Access the module's global variable
                 import locallab.utils.progress
                 locallab.utils.progress.is_downloading = True
+
+                # Ensure HuggingFace Hub's progress bars are enabled
+                from huggingface_hub.utils import enable_progress_bars as hf_enable_progress_bars
+                hf_enable_progress_bars()
+
+                # Configure transformers to use progress bars
+                import transformers
+                transformers.utils.logging.enable_progress_bar()
+
+                # Also ensure tqdm is properly configured for nice display
+                import tqdm
+                tqdm.tqdm.monitor_interval = 0  # Disable monitor thread which can cause issues
+
+                # Temporarily disable our custom logger for HuggingFace logs
+                import logging
+                for logger_name in ['tqdm', 'huggingface_hub', 'transformers', 'filelock']:
+                    logging.getLogger(logger_name).handlers = []  # Remove any handlers
+                    logging.getLogger(logger_name).propagate = False  # Don't propagate to parent loggers
             except:
                 # Fallback if import fails
                 pass
 
+            # Add an empty line before progress bars start
+            print()
+
             # Load tokenizer first
             logger.info(f"Loading tokenizer for {model_id}...")
             self.tokenizer = AutoTokenizer.from_pretrained(
@@ -1078,10 +1111,31 @@ async def load_custom_model(self, model_name: str, fallback_model: Optional[str]
                 # Access the module's global variable
                 import locallab.utils.progress
                 locallab.utils.progress.is_downloading = True
+
+                # Ensure HuggingFace Hub's progress bars are enabled
+                from huggingface_hub.utils import enable_progress_bars as hf_enable_progress_bars
+                hf_enable_progress_bars()
+
+                # Configure transformers to use progress bars
+                import transformers
+                transformers.utils.logging.enable_progress_bar()
+
+                # Also ensure tqdm is properly configured for nice display
+                import tqdm
+                tqdm.tqdm.monitor_interval = 0  # Disable monitor thread which can cause issues
+
+                # Temporarily disable our custom logger for HuggingFace logs
+                import logging
+                for logger_name in ['tqdm', 'huggingface_hub', 'transformers', 'filelock']:
+                    logging.getLogger(logger_name).handlers = []  # Remove any handlers
+                    logging.getLogger(logger_name).propagate = False  # Don't propagate to parent loggers
             except:
                 # Fallback if import fails
                 pass
 
+            # Add an empty line before progress bars start
+            print()
+
             self.tokenizer = AutoTokenizer.from_pretrained(model_name)
             logger.info(f"Tokenizer loaded successfully")
 
diff --git a/locallab/utils/progress.py b/locallab/utils/progress.py
@@ -171,11 +171,27 @@ def configure_hf_hub_progress():
         # 3. Make sure we're NOT overriding HuggingFace's progress callback
         # This is critical - we want to use their native implementation
         from huggingface_hub import file_download
-        if hasattr(file_download, "_tqdm_callback") and file_download._tqdm_callback == custom_progress_callback:
-            # Reset to default if we previously set it to our custom callback
+        if hasattr(file_download, "_tqdm_callback"):
+            # Reset to default - we don't want any custom callback
             file_download._tqdm_callback = None
 
-        # 4. Set a flag to indicate we're using HuggingFace's native progress bars
+        # 4. Ensure HuggingFace Hub's own logging is properly configured
+        # This ensures HF's own progress bars are displayed correctly
+        import huggingface_hub
+        if hasattr(huggingface_hub, "enable_progress_bars"):
+            huggingface_hub.enable_progress_bars()
+
+        # 5. Configure tqdm directly to ensure proper display
+        import tqdm
+        tqdm.tqdm.monitor_interval = 0  # Disable monitor thread which can cause issues
+
+        # 6. Ensure we're not capturing tqdm output in our logger
+        # This is critical for allowing tqdm to directly write to stdout
+        import logging
+        tqdm_logger = logging.getLogger("tqdm")
+        tqdm_logger.setLevel(logging.WARNING)  # Only show warnings and errors from tqdm
+
+        # 7. Set a flag to indicate we're using HuggingFace's native progress bars
         global is_downloading
         is_downloading = True
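The last hunk sets the module-level `is_downloading` flag to `True` inside `configure_hf_hub_progress()`, and the diff does not show where it is cleared. If the flag is meant to be true only while a download is running, one way to scope it is a context manager, sketched below. The body of `is_model_downloading` is an assumption (the diff only shows that the function is imported by the logger), and `download_in_progress` is a hypothetical helper that is not part of LocalLab.

```python
from contextlib import contextmanager

# Stand-in for the module-level flag in locallab/utils/progress.py.
is_downloading = False


def is_model_downloading() -> bool:
    """Assumed implementation: report the current value of the flag."""
    return is_downloading


@contextmanager
def download_in_progress():
    """Set the flag for the duration of a download and restore it afterwards."""
    global is_downloading
    previous = is_downloading
    is_downloading = True
    try:
        yield
    finally:
        # Restore the earlier value even when the download fails.
        is_downloading = previous
```

With this shape, log suppression that depends on `is_model_downloading()` would end as soon as the `with download_in_progress():` block exits.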