System Info
- `Accelerate` version: 0.34.2
- Platform: Linux-5.15.0-1057-aws-x86_64-with-glibc2.31
- `accelerate` bash location: /fsx/umar/miniconda3/envs/memory-efficient-transformers/bin/accelerate
- Python version: 3.10.14
- Numpy version: 1.26.4
- PyTorch version (GPU?): 2.3.1+cu121 (True)
- PyTorch XPU available: False
- PyTorch NPU available: False
- PyTorch MLU available: False
- PyTorch MUSA available: False
- System RAM: 1999.99 GB
- GPU type: NVIDIA H100 80GB HBM3
- `Accelerate` default config:
Not found
Information
- The official example scripts
- My own modified scripts
Tasks
- One of the scripts in the examples/ folder of Accelerate or an officially supported `no_trainer` script in the `examples` folder of the `transformers` repo (such as `run_no_trainer_glue.py`)
- My own task or dataset (give details below)
Reproduction
```python
from torch.utils.data import DataLoader
from transformers import DataCollatorForLanguageModeling

# accelerator, model, optimizer, scheduler and dl_eval are created earlier in the script
dataset_streaming = True
ds_train = ...  # Dataset loaded with streaming=True (i.e. an iterable dataset)
train_batch_size = 12
collator = DataCollatorForLanguageModeling(...)
dataloader_num_workers = 4
dataloader_prefetch_factor = 10

dl_trainer = DataLoader(
    ds_train,
    batch_size=train_batch_size,
    collate_fn=collator,
    shuffle=not dataset_streaming,  # a streaming dataset cannot be shuffled by the DataLoader
    drop_last=True,
    num_workers=dataloader_num_workers,
    prefetch_factor=dataloader_prefetch_factor,
    pin_memory=True,
)

model, optimizer, scheduler, dl_eval, dl_trainer = accelerator.prepare(
    model, optimizer, scheduler, dl_eval, dl_trainer
)

for _, batch in enumerate(dl_trainer):
    training_loop()
```
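For completeness, here is a rough, self-contained sketch along the same lines (the toy dataset, model and optimizer are placeholders, not my actual setup). It assumes the stateful dataloader is enabled via `DataLoaderConfiguration(use_stateful_dataloader=True)` (which requires `torchdata`), since that appears to be how the `_update_state_dict` path in the traceback below is reached, and that the script is launched with more than one process (e.g. `accelerate launch --num_processes 2 repro.py`) so that the dispatching dataloader is used:

```python
# Self-contained sketch (placeholder dataset/model, assumed config) that should
# exercise the same code path: IterableDataset + num_workers > 0 + stateful dataloader.
import torch
from torch.utils.data import DataLoader, IterableDataset

from accelerate import Accelerator
from accelerate.utils import DataLoaderConfiguration


class ToyIterableDataset(IterableDataset):
    """Stands in for a `datasets` dataset loaded with streaming=True."""

    def __iter__(self):
        for i in range(1_000):
            yield {"x": torch.randn(16), "y": torch.tensor(i % 2)}


accelerator = Accelerator(
    dataloader_config=DataLoaderConfiguration(use_stateful_dataloader=True),
)

model = torch.nn.Linear(16, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

dl_train = DataLoader(
    ToyIterableDataset(),
    batch_size=12,
    drop_last=True,
    num_workers=4,       # num_workers > 0 is what triggers the problem
    prefetch_factor=10,
    pin_memory=True,
)

model, optimizer, dl_train = accelerator.prepare(model, optimizer, dl_train)

# Iterating the prepared dataloader is where the KeyError from the traceback shows up.
for batch in dl_train:
    loss = torch.nn.functional.cross_entropy(model(batch["x"]), batch["y"])
    accelerator.backward(loss)
    optimizer.step()
    optimizer.zero_grad()
```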
A `DataLoader` initialized with `num_workers > 0` results in the following error when iterating through the wrapper `DataLoader` returned by `accelerator.prepare`:
```
[rank0]: for _, batch in batch_enumerator:
[rank0]: File "/fsx/umar/miniconda3/envs/memory-efficient-transformers/lib/python3.10/site-packages/tqdm/std.py", line 1181, in __iter__
[rank0]: for obj in iterable:
[rank0]: File "/fsx/umar/miniconda3/envs/memory-efficient-transformers/lib/python3.10/site-packages/accelerate/data_loader.py", line 798, in __iter__
[rank0]: next_batch, next_batch_info = self._fetch_batches(main_iterator)
[rank0]: File "/fsx/umar/miniconda3/envs/memory-efficient-transformers/lib/python3.10/site-packages/accelerate/data_loader.py", line 751, in _fetch_batches
[rank0]: self._update_state_dict()
[rank0]: File "/fsx/umar/miniconda3/envs/memory-efficient-transformers/lib/python3.10/site-packages/accelerate/data_loader.py", line 479, in _update_state_dict
[rank0]: self.adjust_state_dict_for_prefetch()
[rank0]: File "/fsx/umar/miniconda3/envs/memory-efficient-transformers/lib/python3.10/site-packages/accelerate/data_loader.py", line 459, in adjust_state_dict_for_prefetch
[rank0]: if self.dl_state_dict["_sampler_iter_yielded"] > 0:
[rank0]: KeyError: '_sampler_iter_yielded'
```
I also tried the latest development version of accelerate (https://github.com/huggingface/accelerate@9f9951325c69f0a6c7c8ab00df2ab8af23b3c1fa), but I still get the same error.
@muellerzr is aware of this issue.
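In case it helps triage: below is a rough, untested local monkey-patch (only a sketch, based purely on the attribute and method names visible in the traceback) that I imagine would silence the crash by defaulting the sampler key that an iterable dataset's state dict never contains. It probably leaves the saved state slightly wrong for resumption, so it is not a real fix:

```python
# Untested sketch of a local workaround, applied right after accelerator.prepare(...).
# An IterableDataset has no sampler, so the dataloader state dict lacks
# '_sampler_iter_yielded'; default it to 0 before the prefetch adjustment runs.
_orig_adjust = dl_trainer.adjust_state_dict_for_prefetch

def _patched_adjust_state_dict_for_prefetch():
    dl_trainer.dl_state_dict.setdefault("_sampler_iter_yielded", 0)
    return _orig_adjust()

dl_trainer.adjust_state_dict_for_prefetch = _patched_adjust_state_dict_for_prefetch
```

A variant of the same idea inside accelerate itself would be to read the key with `self.dl_state_dict.get("_sampler_iter_yielded", 0)` instead of indexing it directly, though I don't know whether that is the right behaviour when resuming from an iterable dataset.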
Expected behavior
I'd like to be able to prefetch multiple samples, and that is only possible by setting `num_workers` to a value greater than 0.