Skip to content

Feature: Improve download safety and search accuracy #11

@slahiri

Description

@slahiri

Feature Request

Based on analysis of similar extensions, here are improvements to make downloads safer and search more accurate.


1. Temp File Downloads (Safety)

Problem: If a download fails mid-way or is interrupted, we may end up with a corrupted partial file.

Solution: Download to a .tmp file first, then rename on success.

temp_path = full_path + '.tmp'
try:
    with open(temp_path, 'wb') as file:
        for chunk in response.iter_content(chunk_size=1024*1024):
            file.write(chunk)
    # Only rename if download completed successfully
    shutil.move(temp_path, full_path)
except Exception as e:
    # Clean up temp file on failure
    if os.path.exists(temp_path):
        os.remove(temp_path)
    raise e

Benefits:

  • Prevents corrupted partial files
  • Safe to retry failed downloads
  • No manual cleanup needed

2. Filename Component Extraction (Better Search)

Problem: Current HuggingFace search uses full filename which may not match repo naming.

Solution: Extract components from filename for smarter search queries.

def extract_model_components(filename):
    """
    Extract components from model filename for better search
    
    Example: 'flux-dev-fp16-v2.safetensors'
    Returns: {
        'core_name': 'flux',
        'version': 'v2',
        'tags': ['dev', 'fp16']
    }
    """
    name_without_ext = re.sub(r'\.[^/.]+$', '', filename)
    parts = re.split(r'[-_]', name_without_ext)
    
    core_name = []
    version = None
    tags = []
    
    for part in parts:
        if re.match(r'v?\d+(-\d+)?', part):
            version = part
        elif re.match(r'[a-zA-Z]+', part):
            if not core_name:
                core_name.append(part)
            else:
                tags.append(part)
    
    return {
        'core_name': '_'.join(core_name),
        'version': version,
        'tags': tags
    }

Search strategy:

  1. Search: {core_name}_{version} (e.g., "flux_v2")
  2. Search: {core_name} (e.g., "flux")
  3. Search: {core_name}_{tags} (e.g., "flux_dev_fp16")

Benefits:

  • Better match rate on HuggingFace
  • Handles renamed/versioned files
  • More flexible than exact filename search

3. Workflow Hash for Change Detection

Problem: Re-scanning unchanged workflows wastes time.

Solution: Hash workflow content to detect changes.

import hashlib
import json

def get_workflow_hash(workflow_data):
    """Generate hash of workflow to detect changes"""
    prompt_str = json.dumps(workflow_data, sort_keys=True)
    return hashlib.md5(prompt_str.encode()).hexdigest()

# Usage
current_hash = get_workflow_hash(workflow)
if current_hash != last_workflow_hash:
    # Workflow changed, rescan
    models = scan_workflow(workflow)
    last_workflow_hash = current_hash

Benefits:

  • Skip redundant scans
  • Faster UI response
  • Less API calls

Implementation Priority

  1. High: Temp file downloads - Prevents data corruption
  2. Medium: Filename component extraction - Improves search success rate
  3. Low: Workflow hash - Performance optimization

Reference

Inspired by analysis of: https://github.com/ciri/comfyui-model-downloader


4. Token Source Flexibility

Problem: Users must manually paste API tokens in the UI, which is inconvenient for CI/CD, Docker, and shared machines.

Solution: Support multiple token sources - plain text, environment variable, or file.

TOKEN_SOURCE_PLAIN = 'plain'
TOKEN_SOURCE_ENV = 'environment_variable'
TOKEN_SOURCE_FILE = 'file'

def get_token(value, source_type):
    """
    Get token from various sources
    
    Args:
        value: Token string, env var name, or file path
        source_type: One of TOKEN_SOURCE_PLAIN, TOKEN_SOURCE_ENV, TOKEN_SOURCE_FILE
    """
    if source_type == TOKEN_SOURCE_ENV:
        return os.environ.get(value, '')
    
    elif source_type == TOKEN_SOURCE_FILE:
        # Support common token file locations
        if value == 'auto':
            # Check default locations
            default_paths = [
                os.path.expanduser('~/.huggingface/token'),
                os.path.expanduser('~/.cache/huggingface/token'),
            ]
            for path in default_paths:
                if os.path.exists(path):
                    with open(path, 'r') as f:
                        return f.read().strip()
            return ''
        
        if os.path.exists(value):
            with open(value, 'r') as f:
                return f.read().strip()
        return ''
    
    # Plain text
    return value

Use cases:

  • HF_TOKEN env var for CI/CD pipelines
  • ~/.huggingface/token file (created by huggingface-cli login)
  • Docker containers with tokens mounted as files
  • Shared machines where tokens shouldn't be in UI

Implementation Priority: Medium - Nice quality-of-life improvement


Reference

Token flexibility inspired by: https://github.com/stavsap/comfyui-downloader

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions