[rank0]: File "/home/guomingz/.local/lib/python3.12/site-packages/transformers/modeling_utils.py", line 279, in _wrapper
[rank0]: return func(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/guomingz/.local/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4401, in from_pretrained
[rank0]: ) = cls._load_pretrained_model(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/guomingz/.local/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4830, in _load_pretrained_model
[rank0]: disk_offload_index, cpu_offload_index = _load_state_dict_into_meta_model(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank0]: return func(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/guomingz/.local/lib/python3.12/site-packages/transformers/modeling_utils.py", line 777, in _load_state_dict_into_meta_model
[rank0]: shard_and_distribute_module(
[rank0]: File "/home/guomingz/.local/lib/python3.12/site-packages/transformers/integrations/tensor_parallel.py", line 669, in shard_and_distribute_module
[rank0]: param = torch.nn.Parameter(param)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/usr/local/lib/python3.12/dist-packages/torch/nn/parameter.py", line 46, in __new__
[rank0]: return torch.Tensor._make_subclass(cls, data, requires_grad)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: RuntimeError: Only Tensors of floating point and complex dtype can require gradients
[rank0]:[W424 16:00:01.852231233 ProcessGroupNCCL.cpp:1497] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
If I unset WORLD_SIZE, the code snippet above runs successfully.
nv-guomingz changed the title "Failed to load starcoder model with transformer 4.51.3, it's a similar issue like #37737" → "Failed to load santacoder model with transformer 4.51.3, it's a similar issue like #37737" (Apr 24, 2025)
nv-guomingz pushed a commit to nv-guomingz/transformers that referenced this issue (Apr 24, 2025)
System Info
Hi, I just installed transformers 4.51.3 on a Linux system with an NVIDIA GPU.
When I tried to run the code snippet below
with the following environment variables set,
I got the following error message:
If I unset WORLD_SIZE, the code snippet above runs successfully.
This looks similar to issue #37737.
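For anyone who wants to check the WORLD_SIZE observation without changing their shell environment, here is a minimal sketch. It assumes the variable is read when `from_pretrained` is called, which the traceback suggests but does not confirm. `env_unset` and `load_without_world_size` are hypothetical helper names, not part of transformers, and the model id is the one from the Reproduction section.

```python
import os
from contextlib import contextmanager


@contextmanager
def env_unset(name):
    """Temporarily remove an environment variable and restore it on exit."""
    saved = os.environ.pop(name, None)
    try:
        yield
    finally:
        # Restore the variable only if it was set before we entered.
        if saved is not None:
            os.environ[name] = saved


def load_without_world_size(model_id="bigcode/starcoder"):
    """Hypothetical helper: load the model with WORLD_SIZE hidden.

    Assumption: transformers checks WORLD_SIZE at call time, so hiding it
    here avoids the tensor-parallel loading path seen in the traceback.
    """
    # Imported lazily so env_unset can be used without transformers installed.
    import transformers

    with env_unset("WORLD_SIZE"):
        return transformers.AutoModelForCausalLM.from_pretrained(
            model_id, device_map="auto"
        )
```

Calling `load_without_world_size()` in the failing environment should show whether WORLD_SIZE alone triggers the error. This only helps to narrow down the cause and is not a fix for the underlying loading problem.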
Who can help?
@Cyrilvallez
Reproduction
pip install transformers
`import transformers
model = transformers.AutoModelForCausalLM.from_pretrained('bigcode/starcoder', device_map='auto')`
Expected behavior
The model loads successfully.