DLL load failed while importing cuda_utils: The specified module could not be found. #135

Open
@Code4SAFrankie

Description

Got this error:

D:\ComfyUI_windows_portable>.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --use-flash-attention
Adding extra search path checkpoints D:\webui_forge_cu121_torch231\webui\models\Stable-diffusion
Adding extra search path configs D:\webui_forge_cu121_torch231\webui\models\Stable-diffusion
Adding extra search path vae D:\webui_forge_cu121_torch231\webui\models\VAE
Adding extra search path vae_approx D:\webui_forge_cu121_torch231\webui\models\VAE-approx
Adding extra search path loras D:\webui_forge_cu121_torch231\webui\models\Lora
Adding extra search path loras D:\webui_forge_cu121_torch231\webui\models\LyCORIS
Adding extra search path hypernetworks D:\webui_forge_cu121_torch231\webui\models\hypernetworks
Adding extra search path diffusers D:\webui_forge_cu121_torch231\webui\models\diffusers
Adding extra search path controlnet D:\webui_forge_cu121_torch231\webui\models\ControlNet
Adding extra search path clip D:\webui_forge_cu121_torch231\webui\models\text_encoder
Adding extra search path embeddings D:\webui_forge_cu121_torch231\webui\embeddings
Adding extra search path upscale_models D:\webui_forge_cu121_torch231\webui\models\ESRGAN
Adding extra search path upscale_models D:\webui_forge_cu121_torch231\webui\models\RealESRGAN
Adding extra search path upscale_models D:\webui_forge_cu121_torch231\webui\models\SwinIR
[START] Security scan
[DONE] Security scan

ComfyUI-Manager: installing dependencies done.

** ComfyUI startup time: 2025-05-01 14:50:32.336
** Platform: Windows
** Python version: 3.12.9 (tags/v3.12.9:fdb8142, Feb 4 2025, 15:27:58) [MSC v.1942 64 bit (AMD64)]
** Python executable: D:\ComfyUI_windows_portable\python_embeded\python.exe
** ComfyUI Path: D:\ComfyUI_windows_portable\ComfyUI
** ComfyUI Base Folder Path: D:\ComfyUI_windows_portable\ComfyUI
** User directory: D:\ComfyUI_windows_portable\ComfyUI\user
** ComfyUI-Manager config path: D:\ComfyUI_windows_portable\ComfyUI\user\default\ComfyUI-Manager\config.ini
** Log path: D:\ComfyUI_windows_portable\ComfyUI\user\comfyui.log

Prestartup times for custom nodes:
1.5 seconds: D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui-manager

Checkpoint files will always be loaded safely.
Total VRAM 24564 MB, total RAM 65129 MB
pytorch version: 2.7.0+cu128
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4090 : cudaMallocAsync
Using Flash Attention
Python version: 3.12.9 (tags/v3.12.9:fdb8142, Feb 4 2025, 15:27:58) [MSC v.1942 64 bit (AMD64)]
ComfyUI version: 0.3.30
ComfyUI frontend version: 1.17.11
[Prompt Server] web root: D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\comfyui_frontend_package\static
Traceback (most recent call last):
File "D:\ComfyUI_windows_portable\ComfyUI\nodes.py", line 2128, in load_custom_node
module_spec.loader.exec_module(module)
File "", line 995, in exec_module
File "", line 1132, in get_code
File "", line 1190, in get_data
FileNotFoundError: [Errno 2] No such file or directory: 'D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI\init.py'

Cannot import D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI module for custom nodes: [Errno 2] No such file or directory: 'D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI\__init__.py'

Loading: ComfyUI-Manager (V3.31.13)

[ComfyUI-Manager] network_mode: public

ComfyUI Revision: 163 [a97f2f85] *DETACHED | Released on '2025-04-24'

[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/github-stats.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/extension-node-map.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
PyTorch version 2.7.0+cu128 available.
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/alter-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/model-list.json

INFO ENV: Auto setting CUDA_DEVICE_ORDER=PCI_BUS_ID for correctness.
Optimum library found. GPTQ model loading enabled (requires suitable backend).
HiDream: Successfully registered with ComfyUI memory management

HiDream Sampler Node Initialized
Available Models: ['full-nf4', 'dev-nf4', 'fast-nf4', 'full', 'dev', 'fast']

Import times for custom nodes:
0.0 seconds: D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\websocket_image_save.py
0.0 seconds (IMPORT FAILED): D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI
0.1 seconds: D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui-manager
0.8 seconds: D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler

Starting server

To see the GUI go to: http://127.0.0.1:8188
FETCH ComfyRegistry Data: 5/83
FETCH ComfyRegistry Data: 10/83
FETCH ComfyRegistry Data: 15/83
FETCH ComfyRegistry Data: 20/83
FETCH ComfyRegistry Data: 25/83
FETCH ComfyRegistry Data: 30/83
FETCH ComfyRegistry Data: 35/83
FETCH ComfyRegistry Data: 40/83
got prompt
Failed to validate prompt for output 2:
* HiDreamSamplerAdvanced 19:
  - Value 77.0 bigger than max of 5.0: llama_weight
  - Value 256 bigger than max of 218: max_length_openclip
Output will be ignored
Successfully parsed resolution: 1024x1024
Using fixed resolution: 1024x1024 (1024 × 1024 (Square))
HiDream: Initial VRAM usage: 0.00 MB
Loading model for dev-nf4...
--- Loading Model Type: dev-nf4 ---
Model Path: azaneko/HiDream-I1-Dev-nf4
NF4: True, Requires BNB: False, Requires GPTQ deps: True
Using Uncensored LLM: None
(Start VRAM: 0.00 MB)
Cache check for key: dev-nf4_standard
Cache contains: []

[1a] Preparing LLM (GPTQ): ModelCloud/Meta-Llama-3.1-8B-Instruct-gptq-4bit
Setting max memory limit: 9GiB of 24.0GiB
Using device_map='auto'.
[1b] Loading Tokenizer: ModelCloud/Meta-Llama-3.1-8B-Instruct-gptq-4bit...
D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\huggingface_hub\file_download.py:144: UserWarning: huggingface_hub cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\fdlou\.cache\huggingface\hub\models--ModelCloud--Meta-Llama-3.1-8B-Instruct-gptq-4bit. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the HF_HUB_DISABLE_SYMLINKS_WARNING environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
Tokenizer loaded.
FETCH ComfyRegistry Data: 45/83
[1c] Loading Text Encoder: ModelCloud/Meta-Llama-3.1-8B-Instruct-gptq-4bit... (May download files)
FETCH ComfyRegistry Data: 50/83
FETCH ComfyRegistry Data: 55/83
FETCH ComfyRegistry Data: 60/83
FETCH ComfyRegistry Data: 65/83
FETCH ComfyRegistry Data: 70/83
FETCH ComfyRegistry Data: 75/83
FETCH ComfyRegistry Data: 80/83
FETCH ComfyRegistry Data [DONE]
[ComfyUI-Manager] default cache updated: https://api.comfy.org/nodes
FETCH DATA from: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json [DONE]
[ComfyUI-Manager] All startup tasks have been completed.
INFO Kernel: Auto-selection: adding candidate TritonV2QuantLinear
loss_type=None was set in the config but it is unrecognised.Using the default loss: ForCausalLMLoss.
INFO Format: Converting checkpoint_format from gptq to internal gptq_v2.
INFO Format: Converting GPTQ v1 to v2
INFO Format: Conversion complete: 0.008009910583496094s
INFO Optimize: TritonV2QuantLinear compilation triggered.
✅ Text encoder loaded! (VRAM: 5467.26 MB)

[2] Preparing Transformer from: azaneko/HiDream-I1-Dev-nf4
Type: NF4
Loading Transformer... (May download files)
D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\huggingface_hub\file_download.py:144: UserWarning: huggingface_hub cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\fdlou\.cache\huggingface\hub\models--azaneko--HiDream-I1-Dev-nf4. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the HF_HUB_DISABLE_SYMLINKS_WARNING environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
Moving Transformer to CUDA...
✅ Transformer loaded! (VRAM: 14646.96 MB)

[3] Preparing Scheduler: FlashFlowMatchEulerDiscreteScheduler (Default shift: 6.0)
Using Scheduler: FlashFlowMatchEulerDiscreteScheduler

[4] Loading Pipeline from: azaneko/HiDream-I1-Dev-nf4
Passing pre-loaded components...
Fetching 24 files: 100%|█████████████████████████████████████████████████████████████████████████████████████| 24/24 [07:49<00:00, 19.56s/it]
Keyword arguments {'transformer': None} are not expected by HiDreamImagePipeline and will be ignored.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 99.86it/s]
Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 24.68it/s]
Pipeline structure loaded.

[5] Finalizing Pipeline...
Assigning transformer...
Moving pipeline object to CUDA (final check)...
Attempting CPU offload for NF4...
✅ CPU offload enabled.
✅ Pipeline ready! (VRAM: 12642.45 MB)
Model dev-nf4 loaded & cached!
Selected Shift Value: 0.0 (Override: 0.0, Default: 6.0)
Using model's default scheduler type: FlashFlowMatchEulerDiscreteScheduler with shift=0.0
Creating Generator on: cuda:0

--- Starting Generation ---
Model: dev-nf4, Res: 1024x1024, Steps: 28, CFG: 0.0, Shift: 0.0, Seed: 42
Using standard sequence lengths: CLIP-L: 77, OpenCLIP: 150, T5: 256, Llama: 256
Skipping pipe.to(cuda:0) (CPU offload enabled).
Executing pipeline inference...
!!! ERROR during execution: DLL load failed while importing cuda_utils: The specified module could not be found.
Traceback (most recent call last):
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hidreamsampler.py", line 679, in generate
pipeline_output = pipe(
^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image.py", line 646, in call
) = self.encode_prompt(
^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image.py", line 331, in encode_prompt
prompt_embeds, pooled_prompt_embeds = self._encode_prompt(
^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image.py", line 480, in _encode_prompt
llama3_prompt_embeds = self._get_llama3_prompt_embeds(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image.py", line 278, in _get_llama3_prompt_embeds
outputs = self.text_encoder_4(
^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\accelerate\hooks.py", line 176, in new_forward
output = module._old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\utils\generic.py", line 965, in wrapper
output = func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\utils\deprecation.py", line 172, in wrapped_func
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 821, in forward
outputs: BaseModelOutputWithPast = self.model(
^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\utils\generic.py", line 965, in wrapper
output = func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 571, in forward
layer_outputs = decoder_layer(
^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 318, in forward
hidden_states, self_attn_weights = self.self_attn(
^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 252, in forward
query_states = self.q_proj(hidden_states).view(hidden_shape).transpose(1, 2)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\accelerate\hooks.py", line 176, in new_forward
output = module._old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\qlinear\tritonv2.py", line 146, in forward
out = QuantLinearFunction.apply(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\autograd\function.py", line 575, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\amp\autocast_mode.py", line 510, in decorate_fwd
return fwd(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\triton_utils\dequant.py", line 134, in forward
output = quant_matmul(input, qweight, scales, qzeros, g_idx, bits, pack_bits, maxq)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\triton_utils\dequant.py", line 125, in quant_matmul
W = dequant(input.dtype, qweight, scales, qzeros, g_idx, bits, pack_bits, maxq)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\triton_utils\dequant.py", line 109, in dequant
dequant_kernel[grid](
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\jit.py", line 345, in
return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\autotuner.py", line 171, in run
ret = self.fn.run(
^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\jit.py", line 607, in run
device = driver.active.get_current_device()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\driver.py", line 23, in getattr
self._initialize_obj()
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\driver.py", line 20, in _initialize_obj
self._obj = self._init_fn()
^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\driver.py", line 9, in _create_driver
return actives[0]()
^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\backends\nvidia\driver.py", line 412, in init
self.utils = CudaUtils() # TODO: make static
^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\backends\nvidia\driver.py", line 90, in init
mod = compile_module_from_src(Path(os.path.join(dirname, "driver.c")).read_text(), "cuda_utils")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\backends\nvidia\driver.py", line 72, in compile_module_from_src
mod = importlib.util.module_from_spec(spec)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "", line 813, in module_from_spec
File "", line 1293, in create_module
File "", line 488, in _call_with_frames_removed
ImportError: DLL load failed while importing cuda_utils: The specified module could not be found.
Original image dimensions: 1024x1024, aspect ratio: 1.000
Selected target resolution: 1024x1024
Processed to: 1024x1024 (divisible by 16)
HiDream: Initial VRAM usage: 12682.58 MB
Clearing img2img cache before loading dev-nf4...
Removing 'dev-nf4'...
Cache cleared.
Loading model for dev-nf4 img2img...
--- Loading Model Type: dev-nf4 ---
Model Path: azaneko/HiDream-I1-Dev-nf4
NF4: True, Requires BNB: False, Requires GPTQ deps: True
Using Uncensored LLM: True
(Start VRAM: 48.45 MB)
Cache check for key: dev-nf4_uncensored
Cache contains: []

[1a] Preparing Uncensored LLM (GPTQ): shuttercat/DarkIdol-Llama3.1-NF4-GPTQ
Setting max memory limit: 9GiB of 24.0GiB
Using device_map='auto'.
[1b] Loading Tokenizer: shuttercat/DarkIdol-Llama3.1-NF4-GPTQ...
D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\huggingface_hub\file_download.py:144: UserWarning: huggingface_hub cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\fdlou\.cache\huggingface\hub\models--shuttercat--DarkIdol-Llama3.1-NF4-GPTQ. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the HF_HUB_DISABLE_SYMLINKS_WARNING environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
Tokenizer loaded.
[1c] Loading Text Encoder: shuttercat/DarkIdol-Llama3.1-NF4-GPTQ... (May download files)
Fetching 2 files: 100%|███████████████████████████████████████████████████████████████████████████████████████| 2/2 [03:56<00:00, 118.40s/it]
INFO Kernel: Auto-selection: adding candidate TritonV2QuantLinear
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00, 1.18s/it]
INFO Format: Converting checkpoint_format from gptq to internal gptq_v2.
INFO Format: Conversion complete: 0.003000974655151367s
✅ Text encoder loaded! (VRAM: 5515.72 MB)

[2] Preparing Transformer from: azaneko/HiDream-I1-Dev-nf4
Type: NF4
Loading Transformer... (May download files)
Moving Transformer to CUDA...
✅ Transformer loaded! (VRAM: 14695.41 MB)

[3] Preparing Scheduler: FlashFlowMatchEulerDiscreteScheduler (Default shift: 6.0)
Using Scheduler: FlashFlowMatchEulerDiscreteScheduler

[4] Loading Pipeline from: azaneko/HiDream-I1-Dev-nf4
Passing pre-loaded components...
Keyword arguments {'transformer': None} are not expected by HiDreamImagePipeline and will be ignored.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 153.62it/s]
Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 33.98it/s]
Pipeline structure loaded.

[5] Finalizing Pipeline...
Assigning transformer...
Moving pipeline object to CUDA (final check)...
Attempting CPU offload for NF4...
✅ CPU offload enabled.
✅ Pipeline ready! (VRAM: 12690.91 MB)
Creating img2img pipeline from loaded txt2img pipeline...
Model dev-nf4 loaded & cached for img2img!
Selected Shift Value: 0.0 (Override: 0.0, Default: 6.0)
Using model's default scheduler: FlashFlowMatchEulerDiscreteScheduler with shift=0.0
Creating Generator on: cuda:0

--- Starting Img2Img Generation ---
Model: dev-nf4 (uncensored), Input Size: 1024x1024
Denoising: 0.8000000000000002, Steps: 28, CFG: 0.0, Shift: 0.0, Seed: 532986874756016
Skipping pipe.to(cuda:0) (CPU offload enabled).
Executing pipeline inference...
!!! ERROR during execution: DLL load failed while importing cuda_utils: The specified module could not be found.
Traceback (most recent call last):
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hidreamsampler.py", line 1472, in generate
output_images = pipe(
^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image_to_image.py", line 96, in call
) = self.encode_prompt(
^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image.py", line 331, in encode_prompt
prompt_embeds, pooled_prompt_embeds = self._encode_prompt(
^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image.py", line 480, in _encode_prompt
llama3_prompt_embeds = self._get_llama3_prompt_embeds(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui_HiDream-Sampler\hi_diffusers\pipelines\hidream_image\pipeline_hidream_image.py", line 278, in _get_llama3_prompt_embeds
outputs = self.text_encoder_4(
^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\accelerate\hooks.py", line 176, in new_forward
output = module._old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\utils\generic.py", line 965, in wrapper
output = func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\utils\deprecation.py", line 172, in wrapped_func
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 821, in forward
outputs: BaseModelOutputWithPast = self.model(
^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\utils\generic.py", line 965, in wrapper
output = func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 571, in forward
layer_outputs = decoder_layer(
^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 318, in forward
hidden_states, self_attn_weights = self.self_attn(
^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 252, in forward
query_states = self.q_proj(hidden_states).view(hidden_shape).transpose(1, 2)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1751, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1762, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\accelerate\hooks.py", line 176, in new_forward
output = module._old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\qlinear\tritonv2.py", line 146, in forward
out = QuantLinearFunction.apply(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\autograd\function.py", line 575, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\amp\autocast_mode.py", line 510, in decorate_fwd
return fwd(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\triton_utils\dequant.py", line 134, in forward
output = quant_matmul(input, qweight, scales, qzeros, g_idx, bits, pack_bits, maxq)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\triton_utils\dequant.py", line 125, in quant_matmul
W = dequant(input.dtype, qweight, scales, qzeros, g_idx, bits, pack_bits, maxq)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\gptqmodel\nn_modules\triton_utils\dequant.py", line 109, in dequant
dequant_kernel[grid](
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\jit.py", line 345, in
return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\autotuner.py", line 171, in run
ret = self.fn.run(
^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\jit.py", line 607, in run
device = driver.active.get_current_device()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\driver.py", line 23, in getattr
self._initialize_obj()
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\driver.py", line 20, in _initialize_obj
self._obj = self._init_fn()
^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\runtime\driver.py", line 9, in _create_driver
return actives[0]()
^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\backends\nvidia\driver.py", line 412, in init
self.utils = CudaUtils() # TODO: make static
^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\backends\nvidia\driver.py", line 90, in init
mod = compile_module_from_src(Path(os.path.join(dirname, "driver.c")).read_text(), "cuda_utils")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\triton\backends\nvidia\driver.py", line 72, in compile_module_from_src
mod = importlib.util.module_from_spec(spec)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "", line 813, in module_from_spec
File "", line 1293, in create_module
File "", line 488, in _call_with_frames_removed
ImportError: DLL load failed while importing cuda_utils: The specified module could not be found.
Prompt executed in 1402.83 seconds
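
Both runs fail at the same point: as soon as the GPTQ text encoder hits its first Triton kernel, Triton tries to build and import its `cuda_utils` helper extension (compiled on the fly from `triton\backends\nvidia\driver.c`, per the traceback), and the resulting module cannot load because a dependent DLL is not found. A minimal sketch to reproduce this outside ComfyUI, using only the calls already visible in the traceback; run it with the portable build's own interpreter (`python_embeded\python.exe`):

```python
# repro_triton.py - hypothetical helper name; a minimal sketch assuming the
# portable install layout above. It forces Triton's lazy NVIDIA driver
# initialization, which is the exact step that raises
# "DLL load failed while importing cuda_utils" in both tracebacks.
from triton.runtime import driver

# jit.py line 607 in the traceback makes this same call; it triggers
# CudaUtils(), which compiles driver.c into the cuda_utils extension
# and then imports it.
print(driver.active.get_current_device())
```

If this raises the same ImportError, the problem is the Triton-on-Windows toolchain inside the embedded Python rather than anything in comfyui_HiDream-Sampler, and fixing the Triton installation (or switching GPTQ to a non-Triton kernel backend) would be the place to look.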
