System Info
- `Accelerate` version: 0.34.2
- Platform: Linux-5.15.0-1057-aws-x86_64-with-glibc2.31
- `accelerate` bash location: /fsx/umar/miniconda3/envs/memory-efficient-transformers/bin/accelerate
- Python version: 3.10.14
- Numpy version: 1.26.4
- PyTorch version (GPU?): 2.3.1+cu121 (True)
- PyTorch XPU available: False
- PyTorch NPU available: False
- PyTorch MLU available: False
- PyTorch MUSA available: False
- System RAM: 1999.99 GB
- GPU type: NVIDIA H100 80GB HBM3
- `Accelerate` default config:
Not found
Information
- The official example scripts
- My own modified scripts
Tasks
- One of the scripts in the examples/ folder of Accelerate or an officially supported `no_trainer` script in the `examples` folder of the `transformers` repo (such as `run_no_trainer_glue.py`)
- My own task or dataset (give details below)
Reproduction
```python
from torch.utils.data import DataLoader
from transformers import DataCollatorForLanguageModeling

# accelerator, model, optimizer, scheduler and dl_eval are created earlier in the script
dataset_streaming = True
ds_train = ...  # Dataset loaded with streaming=True (i.e. an iterable dataset)
train_batch_size = 12
collator = DataCollatorForLanguageModeling(...)
dataloader_num_workers = 4
dataloader_prefetch_factor = 10

dl_trainer = DataLoader(
    ds_train,
    batch_size=train_batch_size,
    collate_fn=collator,
    shuffle=not dataset_streaming,  # a streaming dataset cannot be shuffled by the DataLoader
    drop_last=True,
    num_workers=dataloader_num_workers,
    prefetch_factor=dataloader_prefetch_factor,
    pin_memory=True,
)

model, optimizer, scheduler, dl_eval, dl_trainer = accelerator.prepare(
    model, optimizer, scheduler, dl_eval, dl_trainer
)

for _, batch in enumerate(dl_trainer):
    training_loop()
```
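For completeness, here is a rough, self-contained sketch along the same lines (the toy dataset, model and optimizer are placeholders, not my actual setup). It assumes the stateful dataloader is enabled via `DataLoaderConfiguration(use_stateful_dataloader=True)` (which requires `torchdata`), since that appears to be how the `_update_state_dict` path in the traceback below is reached, and that the script is launched with more than one process (e.g. `accelerate launch --num_processes 2 repro.py`) so that the dispatching dataloader is used:

```python
# Self-contained sketch (placeholder dataset/model, assumed config) that should
# exercise the same code path: IterableDataset + num_workers > 0 + stateful dataloader.
import torch
from torch.utils.data import DataLoader, IterableDataset

from accelerate import Accelerator
from accelerate.utils import DataLoaderConfiguration


class ToyIterableDataset(IterableDataset):
    """Stands in for a `datasets` dataset loaded with streaming=True."""

    def __iter__(self):
        for i in range(1_000):
            yield {"x": torch.randn(16), "y": torch.tensor(i % 2)}


accelerator = Accelerator(
    dataloader_config=DataLoaderConfiguration(use_stateful_dataloader=True),
)

model = torch.nn.Linear(16, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

dl_train = DataLoader(
    ToyIterableDataset(),
    batch_size=12,
    drop_last=True,
    num_workers=4,       # num_workers > 0 is what triggers the problem
    prefetch_factor=10,
    pin_memory=True,
)

model, optimizer, dl_train = accelerator.prepare(model, optimizer, dl_train)

# Iterating the prepared dataloader is where the KeyError from the traceback shows up.
for batch in dl_train:
    loss = torch.nn.functional.cross_entropy(model(batch["x"]), batch["y"])
    accelerator.backward(loss)
    optimizer.step()
    optimizer.zero_grad()
```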
A `DataLoader` initialized with `num_workers > 0` results in the following error when iterating through the wrapper `DataLoader` returned by `accelerator.prepare`:
```
[rank0]: for _, batch in batch_enumerator:
[rank0]: File "/fsx/umar/miniconda3/envs/memory-efficient-transformers/lib/python3.10/site-packages/tqdm/std.py", line 1181, in __iter__
[rank0]: for obj in iterable:
[rank0]: File "/fsx/umar/miniconda3/envs/memory-efficient-transformers/lib/python3.10/site-packages/accelerate/data_loader.py", line 798, in __iter__
[rank0]: next_batch, next_batch_info = self._fetch_batches(main_iterator)
[rank0]: File "/fsx/umar/miniconda3/envs/memory-efficient-transformers/lib/python3.10/site-packages/accelerate/data_loader.py", line 751, in _fetch_batches
[rank0]: self._update_state_dict()
[rank0]: File "/fsx/umar/miniconda3/envs/memory-efficient-transformers/lib/python3.10/site-packages/accelerate/data_loader.py", line 479, in _update_state_dict
[rank0]: self.adjust_state_dict_for_prefetch()
[rank0]: File "/fsx/umar/miniconda3/envs/memory-efficient-transformers/lib/python3.10/site-packages/accelerate/data_loader.py", line 459, in adjust_state_dict_for_prefetch
[rank0]: if self.dl_state_dict["_sampler_iter_yielded"] > 0:
[rank0]: KeyError: '_sampler_iter_yielded'
```
I also tried the latest development version of accelerate (https://github.com/huggingface/accelerate@9f9951325c69f0a6c7c8ab00df2ab8af23b3c1fa), but I still get the same error.
@muellerzr is aware of this issue.
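In case it helps triage: below is a rough, untested local monkey-patch (only a sketch, based purely on the attribute and method names visible in the traceback) that I imagine would silence the crash by defaulting the sampler key that an iterable dataset's state dict never contains. It probably leaves the saved state slightly wrong for resumption, so it is not a real fix:

```python
# Untested sketch of a local workaround, applied right after accelerator.prepare(...).
# An IterableDataset has no sampler, so the dataloader state dict lacks
# '_sampler_iter_yielded'; default it to 0 before the prefetch adjustment runs.
_orig_adjust = dl_trainer.adjust_state_dict_for_prefetch

def _patched_adjust_state_dict_for_prefetch():
    dl_trainer.dl_state_dict.setdefault("_sampler_iter_yielded", 0)
    return _orig_adjust()

dl_trainer.adjust_state_dict_for_prefetch = _patched_adjust_state_dict_for_prefetch
```

A variant of the same idea inside accelerate itself would be to read the key with `self.dl_state_dict.get("_sampler_iter_yielded", 0)` instead of indexing it directly, though I don't know whether that is the right behaviour when resuming from an iterable dataset.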
Expected behavior
I'd like to be able to prefetch multiple samples, and that is only possible by setting `num_workers` to a value greater than 0.