When running merge_lora.sh I get 'RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory'

```
[2025-07-16 00:38:57,576] [INFO] [real_accelerator.py:239:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Loading Qwen2-VL from base model...
Fetching 5 files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 66365.57it/s]
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:02<00:00,  2.41it/s]
Loading additional Qwen2-VL weights...
Traceback (most recent call last):
  File "/qwen_finetune/src/merge_lora_weights.py", line 22, in <module>
    merge_lora(args)
  File "/qwen_finetune/src/merge_lora_weights.py", line 6, in merge_lora
    processor, model = load_pretrained_model(model_path=args.model_path, model_base=args.model_base,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/qwen_finetune/src/utils.py", line 61, in load_pretrained_model
    non_lora_trainables = torch.load(os.path.join(model_path, 'non_lora_state_dict.bin'), map_location='cpu')
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda/envs/train/lib/python3.11/site-packages/torch/serialization.py", line 1432, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda/envs/train/lib/python3.11/site-packages/torch/serialization.py", line 763, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
```

For some LoRAs the merge worked after a few retries, but the merged models gave bad results, so I'm not sure whether that is related. For others the error persists on every attempt. I also had to manually copy `config.json` into the checkpoint folders, since it wasn't present there.
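
In case it helps narrow this down: since PyTorch 1.6, `torch.save` writes a zip archive by default, and "failed finding central directory" generally means the file being loaded is not a complete zip (for example, truncated or still being written). Below is a minimal sketch that checks a checkpoint file using only the standard library. The helper name `check_checkpoint_file` is hypothetical and not part of this repo, and I'm only assuming that `non_lora_state_dict.bin` is the damaged file because it is the one named in the traceback.

```python
import os
import zipfile


def check_checkpoint_file(path):
    """Return a short diagnosis string for a torch.save-style checkpoint file.

    Hypothetical helper for illustration; it does not load any tensors.
    """
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    # zipfile.is_zipfile looks for the end-of-central-directory record,
    # which is the same structure the PyTorch reader fails to find.
    if not zipfile.is_zipfile(path):
        return "not a valid zip archive (possibly truncated or still being written)"
    with zipfile.ZipFile(path) as zf:
        bad_member = zf.testzip()  # name of the first corrupt member, or None
    return "ok" if bad_member is None else f"corrupt member: {bad_member}"
```

Running `check_checkpoint_file(os.path.join(model_path, "non_lora_state_dict.bin"))` on each checkpoint folder should show whether the file itself is incomplete, as opposed to a problem in the merge script.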
