Description

Running the portable ComfyUI build on Windows with the comfyui_HiDream-Sampler node: startup completes and the dev-nf4 model loads and caches normally, but every generation attempt (both txt2img and img2img) fails during prompt encoding with:

ImportError: DLL load failed while importing cuda_utils: The specified module could not be found.

The error is raised from Triton's NVIDIA backend, which compiles driver.c into a cuda_utils extension the first time GPTQModel's TritonV2 dequant kernels run. There is also a separate, non-fatal import failure at startup for a stray custom_nodes\ComfyUI folder. Full log:
D:\ComfyUI_windows_portable>.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --use-flash-attention
Adding extra search path checkpoints D:\webui_forge_cu121_torch231\webui\models\Stable-diffusion
Adding extra search path configs D:\webui_forge_cu121_torch231\webui\models\Stable-diffusion
Adding extra search path vae D:\webui_forge_cu121_torch231\webui\models\VAE
Adding extra search path vae_approx D:\webui_forge_cu121_torch231\webui\models\VAE-approx
Adding extra search path loras D:\webui_forge_cu121_torch231\webui\models\Lora
Adding extra search path loras D:\webui_forge_cu121_torch231\webui\models\LyCORIS
Adding extra search path hypernetworks D:\webui_forge_cu121_torch231\webui\models\hypernetworks
Adding extra search path diffusers D:\webui_forge_cu121_torch231\webui\models\diffusers
Adding extra search path controlnet D:\webui_forge_cu121_torch231\webui\models\ControlNet
Adding extra search path clip D:\webui_forge_cu121_torch231\webui\models\text_encoder
Adding extra search path embeddings D:\webui_forge_cu121_torch231\webui\embeddings
Adding extra search path upscale_models D:\webui_forge_cu121_torch231\webui\models\ESRGAN
Adding extra search path upscale_models D:\webui_forge_cu121_torch231\webui\models\RealESRGAN
Adding extra search path upscale_models D:\webui_forge_cu121_torch231\webui\models\SwinIR
[START] Security scan
[DONE] Security scan
ComfyUI-Manager: installing dependencies done.
** ComfyUI startup time: 2025-05-01 14:50:32.336
** Platform: Windows
** Python version: 3.12.9 (tags/v3.12.9:fdb8142, Feb 4 2025, 15:27:58) [MSC v.1942 64 bit (AMD64)]
** Python executable: D:\ComfyUI_windows_portable\python_embeded\python.exe
** ComfyUI Path: D:\ComfyUI_windows_portable\ComfyUI
** ComfyUI Base Folder Path: D:\ComfyUI_windows_portable\ComfyUI
** User directory: D:\ComfyUI_windows_portable\ComfyUI\user
** ComfyUI-Manager config path: D:\ComfyUI_windows_portable\ComfyUI\user\default\ComfyUI-Manager\config.ini
** Log path: D:\ComfyUI_windows_portable\ComfyUI\user\comfyui.log
Prestartup times for custom nodes:
1.5 seconds: D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui-manager
Checkpoint files will always be loaded safely.
Total VRAM 24564 MB, total RAM 65129 MB
pytorch version: 2.7.0+cu128
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4090 : cudaMallocAsync
Using Flash Attention
Python version: 3.12.9 (tags/v3.12.9:fdb8142, Feb 4 2025, 15:27:58) [MSC v.1942 64 bit (AMD64)]
ComfyUI version: 0.3.30
ComfyUI frontend version: 1.17.11
[Prompt Server] web root: D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\comfyui_frontend_package\static
Traceback (most recent call last):
File "D:\ComfyUI_windows_portable\ComfyUI\nodes.py", line 2128, in load_custom_node
module_spec.loader.exec_module(module)
File "", line 995, in exec_module
File "", line 1132, in get_code
File "", line 1190, in get_data
FileNotFoundError: [Errno 2] No such file or directory: 'D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI\__init__.py'
Cannot import D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI module for custom nodes: [Errno 2] No such file or directory: 'D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI\__init__.py'
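(Side note: the traceback above is ComfyUI treating a stray custom_nodes\ComfyUI folder as a node pack. A minimal check sketch, assuming the folder is an accidental leftover rather than a real node pack; the path is copied from the error above:)

```python
# Check whether the folder behind the FileNotFoundError above exists and
# whether it actually looks like a node pack (i.e. contains an __init__.py).
import os

stray = r"D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI"
print("exists:", os.path.isdir(stray))
print("has __init__.py:", os.path.isfile(os.path.join(stray, "__init__.py")))
# If it exists without __init__.py, moving it out of custom_nodes should
# silence the "Cannot import ... custom_nodes\ComfyUI" startup error.
```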
Loading: ComfyUI-Manager (V3.31.13)
[ComfyUI-Manager] network_mode: public
ComfyUI Revision: 163 [a97f2f85] *DETACHED | Released on '2025-04-24'
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/github-stats.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/extension-node-map.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
PyTorch version 2.7.0+cu128 available.
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/alter-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/model-list.json
INFO ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for correctness.
Optimum library found. GPTQ model loading enabled (requires suitable backend).
HiDream: Successfully registered with ComfyUI memory management
HiDream Sampler Node Initialized
Available Models: ['full-nf4', 'dev-nf4', 'fast-nf4', 'full', 'dev', 'fast']
Import times for custom nodes:
0.0 seconds: D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\websocket_image_save.py
0.0 seconds (IMPORT FAILED): D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI
0.1 seconds: D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui-manager
0.8 seconds: D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler
Starting server
To see the GUI go to: http://127.0.0.1:8188
FETCH ComfyRegistry Data: 5/83
FETCH ComfyRegistry Data: 10/83
FETCH ComfyRegistry Data: 15/83
FETCH ComfyRegistry Data: 20/83
FETCH ComfyRegistry Data: 25/83
FETCH ComfyRegistry Data: 30/83
FETCH ComfyRegistry Data: 35/83
FETCH ComfyRegistry Data: 40/83
got prompt
Failed to validate prompt for output 2:
- HiDreamSamplerAdvanced 19:
- Value 77.0 bigger than max of 5.0: llama_weight
- Value 256 bigger than max of 218: max_length_openclip
Output will be ignored
Successfully parsed resolution: 1024x1024
Using fixed resolution: 1024x1024 (1024 × 1024 (Square))
HiDream: Initial VRAM usage: 0.00 MB
Loading model for dev-nf4...
--- Loading Model Type: dev-nf4 ---
Model Path: azaneko/HiDream-I1-Dev-nf4
NF4: True, Requires BNB: False, Requires GPTQ deps: True
Using Uncensored LLM: None
(Start VRAM: 0.00 MB)
Cache check for key: dev-nf4_standard
Cache contains: []
[1a] Preparing LLM (GPTQ): ModelCloud/Meta-Llama-3.1-8B-Instruct-gptq-4bit
Setting max memory limit: 9GiB of 24.0GiB
Using device_map='auto'.
[1b] Loading Tokenizer: ModelCloud/Meta-Llama-3.1-8B-Instruct-gptq-4bit...
D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\huggingface_hub\file_download.py:144: UserWarning: huggingface_hub cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\fdlou\.cache\huggingface\hub\models--ModelCloud--Meta-Llama-3.1-8B-Instruct-gptq-4bit. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the HF_HUB_DISABLE_SYMLINKS_WARNING environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
Tokenizer loaded.
FETCH ComfyRegistry Data: 45/83
[1c] Loading Text Encoder: ModelCloud/Meta-Llama-3.1-8B-Instruct-gptq-4bit... (May download files)
FETCH ComfyRegistry Data: 50/83
FETCH ComfyRegistry Data: 55/83
FETCH ComfyRegistry Data: 60/83
FETCH ComfyRegistry Data: 65/83
FETCH ComfyRegistry Data: 70/83
FETCH ComfyRegistry Data: 75/83
FETCH ComfyRegistry Data: 80/83
FETCH ComfyRegistry Data [DONE]
[ComfyUI-Manager] default cache updated: https://api.comfy.org/nodes
FETCH DATA from: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json [DONE]
[ComfyUI-Manager] All startup tasks have been completed.
INFO Kernel: Auto-selection: adding candidate TritonV2QuantLinear
loss_type=None was set in the config but it is unrecognised. Using the default loss: ForCausalLMLoss.
INFO Format: Converting checkpoint_format from gptq to internal gptq_v2.
INFO Format: Converting GPTQ v1 to v2
INFO Format: Conversion complete: 0.008009910583496094s
INFO Optimize: TritonV2QuantLinear compilation triggered.
✅ Text encoder loaded! (VRAM: 5467.26 MB)
[2] Preparing Transformer from: azaneko/HiDream-I1-Dev-nf4
Type: NF4
Loading Transformer... (May download files)
D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\huggingface_hub\file_download.py:144: UserWarning: huggingface_hub cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\fdlou\.cache\huggingface\hub\models--azaneko--HiDream-I1-Dev-nf4. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the HF_HUB_DISABLE_SYMLINKS_WARNING environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
Moving Transformer to CUDA...
✅ Transformer loaded! (VRAM: 14646.96 MB)
[3] Preparing Scheduler: FlashFlowMatchEulerDiscreteScheduler (Default shift: 6.0)
Using Scheduler: FlashFlowMatchEulerDiscreteScheduler
[4] Loading Pipeline from: azaneko/HiDream-I1-Dev-nf4
Passing pre-loaded components...
Fetching 24 files: 100%|█████████████████████████████████████████████████████████████████████████████████████| 24/24 [07:49<00:00, 19.56s/it]
Keyword arguments {'transformer': None} are not expected by HiDreamImagePipeline and will be ignored.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 99.86it/s]
Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 24.68it/s]
Pipeline structure loaded.
[5] Finalizing Pipeline...
Assigning transformer...
Moving pipeline object to CUDA (final check)...
Attempting CPU offload for NF4...
✅ CPU offload enabled.
✅ Pipeline ready! (VRAM: 12642.45 MB)
Model dev-nf4 loaded & cached!
Selected Shift Value: 0.0 (Override: 0.0, Default: 6.0)
Using model's default scheduler type: FlashFlowMatchEulerDiscreteScheduler with shift=0.0
Creating Generator on: cuda:0
--- Starting Generation ---
Model: dev-nf4, Res: 1024x1024, Steps: 28, CFG: 0.0, Shift: 0.0, Seed: 42
Using standard sequence lengths: CLIP-L: 77, OpenCLIP: 150, T5: 256, Llama: 256
Skipping pipe.to(cuda:0) (CPU offload enabled).
Executing pipeline inference...
!!! ERROR during execution: DLL load failed while importing cuda_utils: The specified module could not be found.
Traceback (most recent call last):
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hidreamsampler.py", line 679, in generate
pipeline_output = pipe(
^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image.py", line 646, in call
) = self.encode_prompt(
^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image.py", line 331, in encode_prompt
prompt_embeds, pooled_prompt_embeds = self._encode_prompt(
^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image.py", line 480, in _encode_prompt
llama3_prompt_embeds = self._get_llama3_prompt_embeds(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image.py", line 278, in _get_llama3_prompt_embeds
outputs = self.text_encoder_4(
^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\accelerate\hooks.py", line 176, in new_forward
output = module._old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\utils\generic.py", line 965, in wrapper
output = func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\utils\deprecation.py", line 172, in wrapped_func
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 821, in forward
outputs: BaseModelOutputWithPast = self.model(
^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\utils\generic.py", line 965, in wrapper
output = func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 571, in forward
layer_outputs = decoder_layer(
^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 318, in forward
hidden_states, self_attn_weights = self.self_attn(
^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 252, in forward
query_states = self.q_proj(hidden_states).view(hidden_shape).transpose(1, 2)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\accelerate\hooks.py", line 176, in new_forward
output = module._old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\qlinear\tritonv2.py", line 146, in forward
out = QuantLinearFunction.apply(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\autograd\function.py", line 575, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\amp\autocast_mode.py", line 510, in decorate_fwd
return fwd(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\triton_utils\dequant.py", line 134, in forward
output = quant_matmul(input, qweight, scales, qzeros, g_idx, bits, pack_bits, maxq)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\triton_utils\dequant.py", line 125, in quant_matmul
W = dequant(input.dtype, qweight, scales, qzeros, g_idx, bits, pack_bits, maxq)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\triton_utils\dequant.py", line 109, in dequant
dequant_kernel[grid](
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\jit.py", line 345, in
return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\autotuner.py", line 171, in run
ret = self.fn.run(
^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\jit.py", line 607, in run
device = driver.active.get_current_device()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\driver.py", line 23, in getattr
self._initialize_obj()
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\driver.py", line 20, in _initialize_obj
self._obj = self._init_fn()
^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\driver.py", line 9, in _create_driver
return actives[0]()
^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\backends\nvidia\driver.py", line 412, in init
self.utils = CudaUtils() # TODO: make static
^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\backends\nvidia\driver.py", line 90, in init
mod = compile_module_from_src(Path(os.path.join(dirname, "driver.c")).read_text(), "cuda_utils")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\backends\nvidia\driver.py", line 72, in compile_module_from_src
mod = importlib.util.module_from_spec(spec)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "", line 813, in module_from_spec
File "", line 1293, in create_module
File "", line 488, in _call_with_frames_removed
ImportError: DLL load failed while importing cuda_utils: The specified module could not be found.
Original image dimensions: 1024x1024, aspect ratio: 1.000
Selected target resolution: 1024x1024
Processed to: 1024x1024 (divisible by 16)
HiDream: Initial VRAM usage: 12682.58 MB
Clearing img2img cache before loading dev-nf4...
Removing 'dev-nf4'...
Cache cleared.
Loading model for dev-nf4 img2img...
--- Loading Model Type: dev-nf4 ---
Model Path: azaneko/HiDream-I1-Dev-nf4
NF4: True, Requires BNB: False, Requires GPTQ deps: True
Using Uncensored LLM: True
(Start VRAM: 48.45 MB)
Cache check for key: dev-nf4_uncensored
Cache contains: []
[1a] Preparing Uncensored LLM (GPTQ): shuttercat/DarkIdol-Llama3.1-NF4-GPTQ
Setting max memory limit: 9GiB of 24.0GiB
Using device_map='auto'.
[1b] Loading Tokenizer: shuttercat/DarkIdol-Llama3.1-NF4-GPTQ...
D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\huggingface_hub\file_download.py:144: UserWarning: huggingface_hub cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\fdlou\.cache\huggingface\hub\models--shuttercat--DarkIdol-Llama3.1-NF4-GPTQ. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the HF_HUB_DISABLE_SYMLINKS_WARNING environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
Tokenizer loaded.
[1c] Loading Text Encoder: shuttercat/DarkIdol-Llama3.1-NF4-GPTQ... (May download files)
Fetching 2 files: 100%|███████████████████████████████████████████████████████████████████████████████████████| 2/2 [03:56<00:00, 118.40s/it]
INFO Kernel: Auto-selection: adding candidate TritonV2QuantLinear
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00, 1.18s/it]
INFO Format: Converting checkpoint_format from gptq to internal gptq_v2.
INFO Format: Conversion complete: 0.003000974655151367s
✅ Text encoder loaded! (VRAM: 5515.72 MB)
[2] Preparing Transformer from: azaneko/HiDream-I1-Dev-nf4
Type: NF4
Loading Transformer... (May download files)
Moving Transformer to CUDA...
✅ Transformer loaded! (VRAM: 14695.41 MB)
[3] Preparing Scheduler: FlashFlowMatchEulerDiscreteScheduler (Default shift: 6.0)
Using Scheduler: FlashFlowMatchEulerDiscreteScheduler
[4] Loading Pipeline from: azaneko/HiDream-I1-Dev-nf4
Passing pre-loaded components...
Keyword arguments {'transformer': None} are not expected by HiDreamImagePipeline and will be ignored.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 153.62it/s]
Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 33.98it/s]
Pipeline structure loaded.
[5] Finalizing Pipeline...
Assigning transformer...
Moving pipeline object to CUDA (final check)...
Attempting CPU offload for NF4...
✅ CPU offload enabled.
✅ Pipeline ready! (VRAM: 12690.91 MB)
Creating img2img pipeline from loaded txt2img pipeline...
Model dev-nf4 loaded & cached for img2img!
Selected Shift Value: 0.0 (Override: 0.0, Default: 6.0)
Using model's default scheduler: FlashFlowMatchEulerDiscreteScheduler with shift=0.0
Creating Generator on: cuda:0
--- Starting Img2Img Generation ---
Model: dev-nf4 (uncensored), Input Size: 1024x1024
Denoising: 0.8000000000000002, Steps: 28, CFG: 0.0, Shift: 0.0, Seed: 532986874756016
Skipping pipe.to(cuda:0) (CPU offload enabled).
Executing pipeline inference...
!!! ERROR during execution: DLL load failed while importing cuda_utils: The specified module could not be found.
Traceback (most recent call last):
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hidreamsampler.py", line 1472, in generate
output_images = pipe(
^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image_to_image.py", line 96, in call
) = self.encode_prompt(
^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image.py", line 331, in encode_prompt
prompt_embeds, pooled_prompt_embeds = self._encode_prompt(
^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image.py", line 480, in _encode_prompt
llama3_prompt_embeds = self._get_llama3_prompt_embeds(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image.py", line 278, in _get_llama3_prompt_embeds
outputs = self.text_encoder_4(
^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\accelerate\hooks.py", line 176, in new_forward
output = module._old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\utils\generic.py", line 965, in wrapper
output = func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\utils\deprecation.py", line 172, in wrapped_func
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 821, in forward
outputs: BaseModelOutputWithPast = self.model(
^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\utils\generic.py", line 965, in wrapper
output = func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 571, in forward
layer_outputs = decoder_layer(
^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 318, in forward
hidden_states, self_attn_weights = self.self_attn(
^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 252, in forward
query_states = self.q_proj(hidden_states).view(hidden_shape).transpose(1, 2)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\accelerate\hooks.py", line 176, in new_forward
output = module._old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\qlinear\tritonv2.py", line 146, in forward
out = QuantLinearFunction.apply(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\autograd\function.py", line 575, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\amp\autocast_mode.py", line 510, in decorate_fwd
return fwd(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\triton_utils\dequant.py", line 134, in forward
output = quant_matmul(input, qweight, scales, qzeros, g_idx, bits, pack_bits, maxq)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\triton_utils\dequant.py", line 125, in quant_matmul
W = dequant(input.dtype, qweight, scales, qzeros, g_idx, bits, pack_bits, maxq)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\triton_utils\dequant.py", line 109, in dequant
dequant_kernel[grid](
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\jit.py", line 345, in
return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\autotuner.py", line 171, in run
ret = self.fn.run(
^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\jit.py", line 607, in run
device = driver.active.get_current_device()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\driver.py", line 23, in getattr
self._initialize_obj()
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\driver.py", line 20, in _initialize_obj
self._obj = self._init_fn()
^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\driver.py", line 9, in _create_driver
return actives[0]()
^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\backends\nvidia\driver.py", line 412, in init
self.utils = CudaUtils() # TODO: make static
^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\backends\nvidia\driver.py", line 90, in init
mod = compile_module_from_src(Path(os.path.join(dirname, "driver.c")).read_text(), "cuda_utils")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\backends\nvidia\driver.py", line 72, in compile_module_from_src
mod = importlib.util.module_from_spec(spec)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "", line 813, in module_from_spec
File "", line 1293, in create_module
File "", line 488, in _call_with_frames_removed
ImportError: DLL load failed while importing cuda_utils: The specified module could not be found.
Prompt executed in 1402.83 seconds
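To isolate the failure from ComfyUI, here is a minimal repro sketch. It drives the same attribute chain that appears in the traceback (triton.runtime.driver.active.get_current_device(), which lazily compiles driver.c into a cuda_utils extension in triton\backends\nvidia\driver.py and imports it). Note this leans on Triton's internal API as shown above, so it is an assumption that may not hold across Triton versions:

```python
# repro_triton.py - run with the portable interpreter:
#   D:\ComfyUI_windows_portable\python_embeded\python.exe repro_triton.py
import triton
from triton.runtime import driver

print("triton version:", triton.__version__)

# Touching driver.active triggers Triton's lazy NVIDIA-backend init, which
# compiles and imports the cuda_utils extension; on this machine that import
# is what raises:
#   ImportError: DLL load failed while importing cuda_utils
device = driver.active.get_current_device()
print("current CUDA device:", device)
```

If this fails the same way outside ComfyUI, the problem is the Triton/Windows toolchain in the embedded Python rather than anything in the HiDream Sampler node itself.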