CUDA Error while training in DDP and using workers in dataloaders

### Bug description

When training on multiple GPUs (e.g. 2) with DDP and a `DataLoader` whose `num_workers` is greater than zero, a DataLoader worker process aborts with `CUDA error: initialization error`.

Note that passing only `train_loader` to `trainer.fit` (no validation loader) works fine.

### What version are you seeing the problem on?

v2.2

### How to reproduce the bug

```python
import os
import torch
import torch.nn as nn
import torch.nn.functional as F
import pytorch_lightning as pl
from torch.utils.data import DataLoader
from torch.nn.utils.rnn import pad_sequence

class Model(pl.LightningModule):

    def __init__(self):
        super().__init__()
        self.a = nn.Linear(1, 1)

    def calc_loss(self, x):
        loss = F.mse_loss(self.a(x.view(-1, 1)), x.view(-1, 1))
        return loss

    def training_step(self, batch, batch_idx):
        loss = self.calc_loss(batch)
        self.log('train_loss', loss)
        return loss

    def validation_step(self, batch, batch_idx):
        loss = self.calc_loss(batch)
        self.log('val_loss', loss)
        return loss

    def configure_optimizers(self):
        optimizer = torch.optim.AdamW(self.parameters())
        return optimizer


def collate_char_batch(batch: list[str]):
    # Encode each string as a float tensor of code points, then pad to a common length.
    x_tensors = [torch.tensor([ord(c) for c in s], dtype=torch.float) for s in batch]
    return pad_sequence(x_tensors, batch_first=True)

def main():
    model = Model()

    train_loader = DataLoader(['123456'] * 3000000, batch_size=2600, shuffle=True, collate_fn=collate_char_batch, pin_memory=True, num_workers=1)
    val_loader = DataLoader(['123456'] * 3000000, batch_size=2600, collate_fn=collate_char_batch, pin_memory=True, num_workers=1)

    trainer = pl.Trainer(
        strategy='ddp',
        default_root_dir=os.path.dirname(__file__),
        max_epochs=10,
    )
    trainer.fit(model, train_loader, val_loader)  # passing only the train_loader is fine

if __name__ == '__main__':
    main()
```
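
Since the crash only appears when a validation loader is passed, and the log below shows it right after the sanity check finishes, one way to narrow it down might be to skip the sanity check and see whether the first training batch still fails. The sketch below only builds the `Trainer` keyword arguments. `build_trainer_kwargs` is a hypothetical helper written for this report, not part of Lightning; `num_sanity_val_steps` is an existing `Trainer` argument.

```python
import os


def build_trainer_kwargs(root_dir: str, skip_sanity_check: bool) -> dict:
    """Return keyword arguments for pl.Trainer (hypothetical helper for this report)."""
    kwargs = {
        "strategy": "ddp",
        "default_root_dir": root_dir,
        "max_epochs": 10,
    }
    if skip_sanity_check:
        # num_sanity_val_steps=0 disables the validation sanity check before training.
        kwargs["num_sanity_val_steps"] = 0
    return kwargs


kwargs = build_trainer_kwargs(os.getcwd(), skip_sanity_check=True)
# Usage in the script above: trainer = pl.Trainer(**kwargs)
```

If the crash disappears with the sanity check disabled, that would suggest the state left behind by the sanity validation run is involved; if it persists, the validation loader itself is the more likely trigger.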


### Error messages and logs

```
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/2
Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/2
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 2 processes
----------------------------------------------------------------------------------------------------

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [2,3]
LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [2,3]

  | Name | Type   | Params
--------------------------------
0 | a    | Linear | 2
--------------------------------
2         Trainable params
0         Non-trainable params
2         Total params
0.000     Total estimated model params size (MB)
Sanity Checking: |          | 0/? [00:00<?, ?it/s]
/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:441: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=39` in the `DataLoader` to improve performance.
Sanity Checking DataLoader 0: 100%|██████████| 2/2 [00:00<00:00, 16.95it/s]
/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/connectors/logger_connector/result.py:441: It is recommended to use `self.log('val_loss', ..., sync_dist=True)` when logging on epoch level in distributed setting to accumulate the metric across devices.
/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:441: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=39` in the `DataLoader` to improve performance.
Epoch 0:   0%|          | 0/577 [00:00<?, ?it/s]
terminate called after throwing an instance of 'c10::Error'
  what():  CUDA error: initialization error
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Exception raised from c10_cuda_check_implementation at ../c10/cuda/CUDAException.cpp:44 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f3288b81d87 in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x7f3288b3275f in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) + 0x118 (0x7f3288f8d8a8 in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libc10_cuda.so)
frame #3: c10::cuda::ExchangeDevice(int) + 0x8a (0x7f3288f8dd0a in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libc10_cuda.so)
frame #4: <unknown function> + 0xfa4fea (0x7f323e4cffea in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libtorch_cuda.so)
frame #5: <unknown function> + 0x543010 (0x7f32874bf010 in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0x649bf (0x7f3288b669bf in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #7: c10::TensorImpl::~TensorImpl() + 0x21b (0x7f3288b5fc8b in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #8: c10::TensorImpl::~TensorImpl() + 0x9 (0x7f3288b5fe39 in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #9: <unknown function> + 0x80b718 (0x7f3287787718 in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libtorch_python.so)
frame #10: THPVariable_subclass_dealloc(_object*) + 0x2f6 (0x7f3287787a96 in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libtorch_python.so)
frame #11: /workspace/venvs/py311_torch/bin/python() [0x56e57e]
frame #12: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #13: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #14: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #15: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #16: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #17: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #18: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #19: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #20: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #21: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #22: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #23: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #24: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #25: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #26: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #27: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #28: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #29: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #30: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #31: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #32: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #33: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #34: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #35: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #36: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #37: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #38: /workspace/venvs/py311_torch/bin/python() [0x56d278]
frame #39: /workspace/venvs/py311_torch/bin/python() [0x52561a]
frame #40: /workspace/venvs/py311_torch/bin/python() [0x522e55]
frame #41: /workspace/venvs/py311_torch/bin/python() [0x60e6be]
frame #42: /workspace/venvs/py311_torch/bin/python() [0x57da37]
frame #43: PyObject_GetIter + 0x18 (0x52bbb8 in /workspace/venvs/py311_torch/bin/python)
frame #44: _PyEval_EvalFrameDefault + 0x2334 (0x53ef04 in /workspace/venvs/py311_torch/bin/python)
frame #45: _PyFunction_Vectorcall + 0x173 (0x565f23 in /workspace/venvs/py311_torch/bin/python)
frame #46: _PyEval_EvalFrameDefault + 0x4a81 (0x541651 in /workspace/venvs/py311_torch/bin/python)
frame #47: _PyFunction_Vectorcall + 0x173 (0x565f23 in /workspace/venvs/py311_torch/bin/python)
frame #48: /workspace/venvs/py311_torch/bin/python() [0x56d8e6]
frame #49: _PyObject_MakeTpCall + 0x23b (0x52f0ab in /workspace/venvs/py311_torch/bin/python)
frame #50: _PyEval_EvalFrameDefault + 0x6be (0x53d28e in /workspace/venvs/py311_torch/bin/python)
frame #51: _PyFunction_Vectorcall + 0x173 (0x565f23 in /workspace/venvs/py311_torch/bin/python)
frame #52: /workspace/venvs/py311_torch/bin/python() [0x56d8e6]
frame #53: _PyObject_MakeTpCall + 0x23b (0x52f0ab in /workspace/venvs/py311_torch/bin/python)
frame #54: _PyEval_EvalFrameDefault + 0x6be (0x53d28e in /workspace/venvs/py311_torch/bin/python)
frame #55: _PyFunction_Vectorcall + 0x173 (0x565f23 in /workspace/venvs/py311_torch/bin/python)
frame #56: PyObject_CallOneArg + 0x47 (0x56b797 in /workspace/venvs/py311_torch/bin/python)
frame #57: /workspace/venvs/py311_torch/bin/python() [0x655dfd]
frame #58: PyObject_GetIter + 0x18 (0x52bbb8 in /workspace/venvs/py311_torch/bin/python)
frame #59: /workspace/venvs/py311_torch/bin/python() [0x55badb]
frame #60: PyObject_Vectorcall + 0x35 (0x54ac95 in /workspace/venvs/py311_torch/bin/python)
frame #61: _PyEval_EvalFrameDefault + 0x6be (0x53d28e in /workspace/venvs/py311_torch/bin/python)
frame #62: _PyFunction_Vectorcall + 0x173 (0x565f23 in /workspace/venvs/py311_torch/bin/python)
frame #63: PyObject_CallOneArg + 0x47 (0x56b797 in /workspace/venvs/py311_torch/bin/python)

Traceback (most recent call last):
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1133, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/queue.py", line 180, in get
    self.not_empty.wait(remaining)
  File "/usr/lib/python3.11/threading.py", line 331, in wait
    gotit = waiter.acquire(True, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 246611) is killed by signal: Aborted.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/workspace/password_researches/password-models/test_ddp_dataloader.py", line 53, in <module>
    main()
  File "/workspace/password_researches/password-models/test_ddp_dataloader.py", line 50, in main
    trainer.fit(model, train_loader, val_loader)  # passing only the train_loader is fine
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 544, in fit
    call._call_and_handle_interrupt(
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/call.py", line 43, in _call_and_handle_interrupt
    return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 105, in launch
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 580, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 987, in _run
    results = self._run_stage()
              ^^^^^^^^^^^^^^^^^
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 1033, in _run_stage
    self.fit_loop.run()
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/loops/fit_loop.py", line 205, in run
    self.advance()
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/loops/fit_loop.py", line 363, in advance
    self.epoch_loop.run(self._data_fetcher)
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 140, in run
    self.advance(data_fetcher)
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 212, in advance
    batch, _, __ = next(data_fetcher)
                   ^^^^^^^^^^^^^^^^^^
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/loops/fetchers.py", line 133, in __next__
    batch = super().__next__()
            ^^^^^^^^^^^^^^^^^^
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/loops/fetchers.py", line 60, in __next__
    batch = next(self.iterator)
            ^^^^^^^^^^^^^^^^^^^
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/utilities/combined_loader.py", line 341, in __next__
    out = next(self._iterator)
          ^^^^^^^^^^^^^^^^^^^^
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/utilities/combined_loader.py", line 78, in __next__
    out[i] = next(self.iterators[i])
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
           ^^^^^^^^^^^^^^^^^
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1329, in _next_data
    idx, data = self._get_data()
                ^^^^^^^^^^^^^^^^
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1285, in _get_data
    success, data = self._try_get_data()
                    ^^^^^^^^^^^^^^^^^^^^
  File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1146, in _try_get_data
    raise RuntimeError(f'DataLoader worker (pid(s) {pids_str}) exited unexpectedly') from e
RuntimeError: DataLoader worker (pid(s) 246611) exited unexpectedly
[rank: 1] Child process with PID 246225 terminated with code 1. Forcefully terminating all other processes to avoid zombies 🧟
[1]    246152 killed     CUDA_LAUNCH_BLOCKING=1 CUDA_VISIBLE_DEVICES=2,3 python test_ddp_dataloader.py
```
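
Frames #3 and #7 to #10 of the C++ trace show a tensor being deallocated inside the DataLoader worker, with `ExchangeDevice` failing there. That pattern may indicate that the worker, which is forked by default on Linux with Python 3.11, inherited a reference to a CUDA tensor from the parent process. This is a guess from the trace, not a confirmed root cause. A possible workaround to test is starting the workers with the `spawn` method through the `multiprocessing_context` argument of `DataLoader`. The helper name `make_loader` and the small dataset are assumptions for illustration.

```python
import torch
from torch.utils.data import DataLoader
from torch.nn.utils.rnn import pad_sequence


def collate_char_batch(batch):
    # Same collate logic as in the reproduction script.
    x_tensors = [torch.tensor([ord(c) for c in s], dtype=torch.float) for s in batch]
    return pad_sequence(x_tensors, batch_first=True)


def make_loader(data, batch_size, shuffle=False, num_workers=1, start_method="spawn"):
    # multiprocessing_context is only accepted when num_workers > 0.
    context = start_method if num_workers > 0 else None
    return DataLoader(
        data,
        batch_size=batch_size,
        shuffle=shuffle,
        collate_fn=collate_char_batch,
        pin_memory=True,
        num_workers=num_workers,
        multiprocessing_context=context,
    )


# Constructing the loader does not start any worker; workers start on iteration.
val_loader = make_loader(["123456"] * 10, batch_size=4)
```

With `spawn`, the script must keep its `if __name__ == '__main__':` guard, and the dataset and `collate_fn` must be picklable. Worker startup is also slower than with `fork`, so this is meant as a diagnostic step rather than a fix.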


### Environment

```
<details>
  <summary>Current environment</summary>

* CUDA:
        - GPU:
                - NVIDIA GeForce RTX 3090
                - NVIDIA GeForce RTX 3090
                - NVIDIA GeForce RTX 3090
                - NVIDIA GeForce RTX 3090
                - NVIDIA GeForce RTX 3090
                - NVIDIA GeForce RTX 3090
        - available:         True
        - version:           12.1
* Lightning:
        - lightning-utilities: 0.8.0
        - pytorch-lightning: 2.2.1
        - torch:             2.2.1
        - torch-tb-profiler: 0.4.3
        - torchdata:         0.7.1
        - torchmetrics:      0.11.4
        - torchtext:         0.17.1
        - torchvision:       0.17.1
* Packages:
        - absl-py:           1.4.0
        - aiohttp:           3.8.4
        - aiosignal:         1.3.1
        - alabaster:         0.7.13
        - anyio:             3.6.2
        - argon2-cffi:       21.3.0
        - argon2-cffi-bindings: 21.2.0
        - arrow:             1.2.3
        - asttokens:         2.2.1
        - async-timeout:     4.0.2
        - attrs:             22.2.0
        - babel:             2.13.0
        - backcall:          0.2.0
        - beautifulsoup4:    4.11.2
        - bleach:            6.0.0
        - bloom-filter2:     2.0.0
        - bottleneck:        1.3.7
        - cachetools:        5.3.0
        - certifi:           2022.12.7
        - cffi:              1.15.1
        - charset-normalizer: 2.1.1
        - cmake:             3.25.0
        - comm:              0.1.2
        - contourpy:         1.0.7
        - cppimport:         22.8.2
        - cycler:            0.11.0
        - debugpy:           1.6.6
        - decorator:         5.1.1
        - defusedxml:        0.7.1
        - docutils:          0.20.1
        - executing:         1.2.0
        - fastjsonschema:    2.16.3
        - filelock:          3.9.0
        - fonttools:         4.39.0
        - fqdn:              1.5.1
        - frozenlist:        1.3.3
        - fsspec:            2023.12.2
        - google-auth:       2.16.2
        - google-auth-oauthlib: 1.0.0
        - grpcio:            1.51.3
        - huggingface-hub:   0.20.1
        - idna:              3.4
        - imagesize:         1.4.1
        - ipykernel:         6.21.3
        - ipython:           8.11.0
        - ipython-genutils:  0.2.0
        - ipywidgets:        8.0.4
        - isoduration:       20.11.0
        - jedi:              0.18.2
        - jinja2:            3.1.2
        - joblib:            1.3.2
        - jsonpointer:       2.3
        - jsonschema:        4.17.3
        - jupyter:           1.0.0
        - jupyter-client:    8.0.3
        - jupyter-console:   6.6.3
        - jupyter-core:      5.2.0
        - jupyter-events:    0.6.3
        - jupyter-server:    2.4.0
        - jupyter-server-terminals: 0.4.4
        - jupyterlab-pygments: 0.2.2
        - jupyterlab-widgets: 3.0.5
        - kiwisolver:        1.4.4
        - lightning-utilities: 0.8.0
        - lit:               15.0.7
        - llvmlite:          0.41.0
        - mako:              1.2.4
        - markdown:          3.4.1
        - markupsafe:        2.1.2
        - matplotlib:        3.7.1
        - matplotlib-inline: 0.1.6
        - mistune:           2.0.5
        - mpmath:            1.2.1
        - multidict:         6.0.4
        - nbclassic:         0.5.3
        - nbclient:          0.7.2
        - nbconvert:         7.2.10
        - nbformat:          5.7.3
        - nest-asyncio:      1.5.6
        - networkx:          3.0
        - notebook:          6.5.3
        - notebook-shim:     0.2.2
        - numba:             0.58.0
        - numexpr:           2.8.7
        - numpy:             1.25.2
        - nvidia-cublas-cu11: 11.10.3.66
        - nvidia-cublas-cu12: 12.1.3.1
        - nvidia-cuda-cupti-cu11: 11.7.101
        - nvidia-cuda-cupti-cu12: 12.1.105
        - nvidia-cuda-nvrtc-cu11: 11.7.99
        - nvidia-cuda-nvrtc-cu12: 12.1.105
        - nvidia-cuda-runtime-cu11: 11.7.99
        - nvidia-cuda-runtime-cu12: 12.1.105
        - nvidia-cudnn-cu11: 8.5.0.96
        - nvidia-cudnn-cu12: 8.9.2.26
        - nvidia-cufft-cu11: 10.9.0.58
        - nvidia-cufft-cu12: 11.0.2.54
        - nvidia-curand-cu11: 10.2.10.91
        - nvidia-curand-cu12: 10.3.2.106
        - nvidia-cusolver-cu11: 11.4.0.1
        - nvidia-cusolver-cu12: 11.4.5.107
        - nvidia-cusparse-cu11: 11.7.4.91
        - nvidia-cusparse-cu12: 12.1.0.106
        - nvidia-nccl-cu11:  2.14.3
        - nvidia-nccl-cu12:  2.19.3
        - nvidia-nvjitlink-cu12: 12.2.140
        - nvidia-nvtx-cu11:  11.7.91
        - nvidia-nvtx-cu12:  12.1.105
        - oauthlib:          3.2.2
        - packaging:         23.0
        - pandas:            2.2.1
        - pandocfilters:     1.5.0
        - parso:             0.8.3
        - patsy:             0.5.5
        - pexpect:           4.8.0
        - pickleshare:       0.7.5
        - pillow:            9.4.0
        - pip:               24.0
        - platformdirs:      3.1.1
        - prometheus-client: 0.16.0
        - prompt-toolkit:    3.0.38
        - protobuf:          4.22.1
        - psutil:            5.9.4
        - ptyprocess:        0.7.0
        - pure-eval:         0.2.2
        - pyasn1:            0.4.8
        - pyasn1-modules:    0.2.8
        - pybind11:          2.11.1
        - pybind11-stubgen:  2.5
        - pycparser:         2.21
        - pygments:          2.14.0
        - pyparsing:         3.0.9
        - pypinyin:          0.50.0
        - pyrsistent:        0.19.3
        - python-dateutil:   2.8.2
        - python-json-logger: 2.0.7
        - pytorch-lightning: 2.2.1
        - pytz:              2022.7.1
        - pyyaml:            6.0
        - pyzmq:             25.0.1
        - qtconsole:         5.4.1
        - qtpy:              2.3.0
        - regex:             2023.10.3
        - requests:          2.28.1
        - requests-oauthlib: 1.3.1
        - rfc3339-validator: 0.1.4
        - rfc3986-validator: 0.1.1
        - rsa:               4.9
        - safetensors:       0.4.2
        - scikit-learn:      1.4.1.post1
        - scipy:             1.11.3
        - seaborn:           0.13.2
        - send2trash:        1.8.0
        - sentencepiece:     0.1.97
        - setuptools:        65.5.0
        - six:               1.16.0
        - sniffio:           1.3.0
        - snowballstemmer:   2.2.0
        - soupsieve:         2.4
        - sphinx:            7.2.6
        - sphinxcontrib-applehelp: 1.0.7
        - sphinxcontrib-devhelp: 1.0.5
        - sphinxcontrib-htmlhelp: 2.0.4
        - sphinxcontrib-jsmath: 1.0.1
        - sphinxcontrib-qthelp: 1.0.6
        - sphinxcontrib-serializinghtml: 1.1.9
        - stack-data:        0.6.2
        - statsmodels:       0.14.1
        - sympy:             1.11.1
        - tensorboard:       2.16.2
        - tensorboard-data-server: 0.7.0
        - tensorboard-plugin-wit: 1.8.1
        - terminado:         0.17.1
        - threadpoolctl:     3.2.0
        - tinycss2:          1.2.1
        - tokenizers:        0.15.0
        - torch:             2.2.1
        - torch-tb-profiler: 0.4.3
        - torchdata:         0.7.1
        - torchmetrics:      0.11.4
        - torchtext:         0.17.1
        - torchvision:       0.17.1
        - tornado:           6.2
        - tqdm:              4.66.2
        - traitlets:         5.9.0
        - transformers:      4.38.2
        - triton:            2.2.0
        - typing-extensions: 4.10.0
        - tzdata:            2023.3
        - uri-template:      1.2.0
        - urllib3:           1.26.13
        - wcwidth:           0.2.6
        - webcolors:         1.12
        - webencodings:      0.5.1
        - websocket-client:  1.5.1
        - werkzeug:          2.2.3
        - wheel:             0.40.0
        - widgetsnbextension: 4.0.5
        - yarl:              1.8.2
        - zhdate:            0.1
* System:
        - OS:                Linux
        - architecture:
                - 64bit
                - ELF
        - processor:         x86_64
        - python:            3.11.8
        - release:           5.15.0-91-generic
        - version:           #101-Ubuntu SMP Tue Nov 14 13:30:08 UTC 2023

</details>
```


### More info

_No response_

cc @carmocca @justusschock @awaelchli

CUDA Error while training in DDP and using workers in dataloaders #19598
