Bug description
When training a model on multiple GPUs (e.g. 2) with the DataLoader's num_workers set to a value greater than zero, the DataLoader worker crashes with "CUDA error: initialization error".
Notably, passing only train_loader to trainer.fit works fine; the error appears when val_loader is passed as well.
What version are you seeing the problem on?
v2.2
How to reproduce the bug
import os
import torch
import torch.nn as nn
import torch.nn.functional as F
import pytorch_lightning as pl
from torch.utils.data import DataLoader
from torch.nn.utils.rnn import pad_sequence


class Model(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.a = nn.Linear(1, 1)

    def calc_loss(self, x):
        loss = F.mse_loss(self.a(x.view(-1, 1)), x.view(-1, 1))
        return loss

    def training_step(self, batch, batch_idx):
        loss = self.calc_loss(batch)
        self.log(f'train_loss', loss)
        return loss

    def validation_step(self, batch, batch_idx):
        loss = self.calc_loss(batch)
        self.log(f'val_loss', loss)
        return loss

    def configure_optimizers(self):
        optimizer = torch.optim.AdamW(self.parameters())
        return optimizer


def collate_char_batch(batch: list[str]):
    vocab = lambda s: [ord(c) for c in s]
    x_tensors = [torch.tensor(vocab(list(s)), dtype=torch.float) for s in batch]
    return pad_sequence(x_tensors, True)


def main():
    model = Model()
    train_loader = DataLoader(['123456'] * 3000000, 2600, True, collate_fn=collate_char_batch, pin_memory=True, num_workers=1)
    val_loader = DataLoader(['123456'] * 3000000, 2600, collate_fn=collate_char_batch, pin_memory=True, num_workers=1)
    trainer = pl.Trainer(
        strategy='ddp',
        default_root_dir=os.path.dirname(__file__),
        max_epochs=10,
    )
    trainer.fit(model, train_loader, val_loader)  # passing only the train_loader is fine


if __name__ == '__main__':
    main()
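For orientation, here is a small standard-library-only sketch of the data the script above feeds to the model. It encodes each string as character codes the way collate_char_batch does (leaving out the torch padding step) and estimates the number of training steps per epoch per process. It assumes the dataset is split evenly across the 2 DDP processes, which is consistent with the 0/577 progress bar in the log. The helper names encode_chars and steps_per_epoch are hypothetical and not part of the original script.

```python
import math


def encode_chars(batch):
    """Map each string to its character codes as floats (the collate step before padding)."""
    return [[float(ord(c)) for c in s] for s in batch]


def steps_per_epoch(num_samples, batch_size, world_size):
    """Batches each process sees per epoch, assuming an even split across processes."""
    per_rank = math.ceil(num_samples / world_size)
    return math.ceil(per_rank / batch_size)


# Same sample string, dataset size, and batch size as in the reproduction script.
encoded = encode_chars(['123456'] * 2)
steps = steps_per_epoch(3_000_000, 2600, 2)

print(encoded[0])  # character codes of '1'..'6'
print(steps)       # steps per epoch on each of the 2 processes
```

Since every sample is the same six-character string, every batch has the same shape and no padding actually occurs, so the crash is unlikely to depend on the data itself.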
Error messages and logs
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/2
Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/2
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 2 processes
----------------------------------------------------------------------------------------------------
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [2,3]
LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [2,3]
| Name | Type | Params
--------------------------------
0 | a | Linear | 2
--------------------------------
2 Trainable params
0 Non-trainable params
2 Total params
0.000 Total estimated model params size (MB)
Sanity Checking: | | 0/? [00:00<?, ?it/s]/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:441: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=39` in the `DataLoader` to improve performance.
Sanity Checking DataLoader 0: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 16.95it/s]/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/connectors/logger_connector/result.py:441: It is recommended to use `self.log('val_loss', ..., sync_dist=True)` when logging on epoch level in distributed setting to accumulate the metric across devices.
/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:441: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=39` in the `DataLoader` to improve performance.
Epoch 0: 0%| | 0/577 [00:00<?, ?it/s]terminate called after throwing an instance of 'c10::Error'
what(): CUDA error: initialization error
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Exception raised from c10_cuda_check_implementation at ../c10/cuda/CUDAException.cpp:44 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f3288b81d87 in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x7f3288b3275f in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) + 0x118 (0x7f3288f8d8a8 in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libc10_cuda.so)
frame #3: c10::cuda::ExchangeDevice(int) + 0x8a (0x7f3288f8dd0a in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libc10_cuda.so)
frame #4: <unknown function> + 0xfa4fea (0x7f323e4cffea in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libtorch_cuda.so)
frame #5: <unknown function> + 0x543010 (0x7f32874bf010 in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0x649bf (0x7f3288b669bf in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #7: c10::TensorImpl::~TensorImpl() + 0x21b (0x7f3288b5fc8b in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #8: c10::TensorImpl::~TensorImpl() + 0x9 (0x7f3288b5fe39 in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libc10.so)
frame #9: <unknown function> + 0x80b718 (0x7f3287787718 in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libtorch_python.so)
frame #10: THPVariable_subclass_dealloc(_object*) + 0x2f6 (0x7f3287787a96 in /workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/lib/libtorch_python.so)
frame #11: /workspace/venvs/py311_torch/bin/python() [0x56e57e]
frame #12: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #13: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #14: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #15: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #16: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #17: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #18: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #19: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #20: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #21: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #22: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #23: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #24: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #25: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #26: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #27: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #28: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #29: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #30: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #31: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #32: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #33: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #34: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #35: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #36: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #37: /workspace/venvs/py311_torch/bin/python() [0x56e54c]
frame #38: /workspace/venvs/py311_torch/bin/python() [0x56d278]
frame #39: /workspace/venvs/py311_torch/bin/python() [0x52561a]
frame #40: /workspace/venvs/py311_torch/bin/python() [0x522e55]
frame #41: /workspace/venvs/py311_torch/bin/python() [0x60e6be]
frame #42: /workspace/venvs/py311_torch/bin/python() [0x57da37]
frame #43: PyObject_GetIter + 0x18 (0x52bbb8 in /workspace/venvs/py311_torch/bin/python)
frame #44: _PyEval_EvalFrameDefault + 0x2334 (0x53ef04 in /workspace/venvs/py311_torch/bin/python)
frame #45: _PyFunction_Vectorcall + 0x173 (0x565f23 in /workspace/venvs/py311_torch/bin/python)
frame #46: _PyEval_EvalFrameDefault + 0x4a81 (0x541651 in /workspace/venvs/py311_torch/bin/python)
frame #47: _PyFunction_Vectorcall + 0x173 (0x565f23 in /workspace/venvs/py311_torch/bin/python)
frame #48: /workspace/venvs/py311_torch/bin/python() [0x56d8e6]
frame #49: _PyObject_MakeTpCall + 0x23b (0x52f0ab in /workspace/venvs/py311_torch/bin/python)
frame #50: _PyEval_EvalFrameDefault + 0x6be (0x53d28e in /workspace/venvs/py311_torch/bin/python)
frame #51: _PyFunction_Vectorcall + 0x173 (0x565f23 in /workspace/venvs/py311_torch/bin/python)
frame #52: /workspace/venvs/py311_torch/bin/python() [0x56d8e6]
frame #53: _PyObject_MakeTpCall + 0x23b (0x52f0ab in /workspace/venvs/py311_torch/bin/python)
frame #54: _PyEval_EvalFrameDefault + 0x6be (0x53d28e in /workspace/venvs/py311_torch/bin/python)
frame #55: _PyFunction_Vectorcall + 0x173 (0x565f23 in /workspace/venvs/py311_torch/bin/python)
frame #56: PyObject_CallOneArg + 0x47 (0x56b797 in /workspace/venvs/py311_torch/bin/python)
frame #57: /workspace/venvs/py311_torch/bin/python() [0x655dfd]
frame #58: PyObject_GetIter + 0x18 (0x52bbb8 in /workspace/venvs/py311_torch/bin/python)
frame #59: /workspace/venvs/py311_torch/bin/python() [0x55badb]
frame #60: PyObject_Vectorcall + 0x35 (0x54ac95 in /workspace/venvs/py311_torch/bin/python)
frame #61: _PyEval_EvalFrameDefault + 0x6be (0x53d28e in /workspace/venvs/py311_torch/bin/python)
frame #62: _PyFunction_Vectorcall + 0x173 (0x565f23 in /workspace/venvs/py311_torch/bin/python)
frame #63: PyObject_CallOneArg + 0x47 (0x56b797 in /workspace/venvs/py311_torch/bin/python)
Traceback (most recent call last):
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1133, in _try_get_data
data = self._data_queue.get(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/queue.py", line 180, in get
self.not_empty.wait(remaining)
File "/usr/lib/python3.11/threading.py", line 331, in wait
gotit = waiter.acquire(True, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 246611) is killed by signal: Aborted.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/workspace/password_researches/password-models/test_ddp_dataloader.py", line 53, in <module>
main()
File "/workspace/password_researches/password-models/test_ddp_dataloader.py", line 50, in main
trainer.fit(model, train_loader, val_loader) # passing only the train_loader is fine
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 544, in fit
call._call_and_handle_interrupt(
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/call.py", line 43, in _call_and_handle_interrupt
return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 105, in launch
return function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 580, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 987, in _run
results = self._run_stage()
^^^^^^^^^^^^^^^^^
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 1033, in _run_stage
self.fit_loop.run()
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/loops/fit_loop.py", line 205, in run
self.advance()
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/loops/fit_loop.py", line 363, in advance
self.epoch_loop.run(self._data_fetcher)
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 140, in run
self.advance(data_fetcher)
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 212, in advance
batch, _, __ = next(data_fetcher)
^^^^^^^^^^^^^^^^^^
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/loops/fetchers.py", line 133, in __next__
batch = super().__next__()
^^^^^^^^^^^^^^^^^^
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/loops/fetchers.py", line 60, in __next__
batch = next(self.iterator)
^^^^^^^^^^^^^^^^^^^
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/utilities/combined_loader.py", line 341, in __next__
out = next(self._iterator)
^^^^^^^^^^^^^^^^^^^^
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/pytorch_lightning/utilities/combined_loader.py", line 78, in __next__
out[i] = next(self.iterators[i])
^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
data = self._next_data()
^^^^^^^^^^^^^^^^^
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1329, in _next_data
idx, data = self._get_data()
^^^^^^^^^^^^^^^^
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1285, in _get_data
success, data = self._try_get_data()
^^^^^^^^^^^^^^^^^^^^
File "/workspace/venvs/py311_torch/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1146, in _try_get_data
raise RuntimeError(f'DataLoader worker (pid(s) {pids_str}) exited unexpectedly') from e
RuntimeError: DataLoader worker (pid(s) 246611) exited unexpectedly
[rank: 1] Child process with PID 246225 terminated with code 1. Forcefully terminating all other processes to avoid zombies 🧟
[1] 246152 killed CUDA_LAUNCH_BLOCKING=1 CUDA_VISIBLE_DEVICES=2,3 python test_ddp_dataloader.py
Environment
<details>
<summary>Current environment</summary>
* CUDA:
- GPU:
- NVIDIA GeForce RTX 3090
- NVIDIA GeForce RTX 3090
- NVIDIA GeForce RTX 3090
- NVIDIA GeForce RTX 3090
- NVIDIA GeForce RTX 3090
- NVIDIA GeForce RTX 3090
- available: True
- version: 12.1
* Lightning:
- lightning-utilities: 0.8.0
- pytorch-lightning: 2.2.1
- torch: 2.2.1
- torch-tb-profiler: 0.4.3
- torchdata: 0.7.1
- torchmetrics: 0.11.4
- torchtext: 0.17.1
- torchvision: 0.17.1
* Packages:
- absl-py: 1.4.0
- aiohttp: 3.8.4
- aiosignal: 1.3.1
- alabaster: 0.7.13
- anyio: 3.6.2
- argon2-cffi: 21.3.0
- argon2-cffi-bindings: 21.2.0
- arrow: 1.2.3
- asttokens: 2.2.1
- async-timeout: 4.0.2
- attrs: 22.2.0
- babel: 2.13.0
- backcall: 0.2.0
- beautifulsoup4: 4.11.2
- bleach: 6.0.0
- bloom-filter2: 2.0.0
- bottleneck: 1.3.7
- cachetools: 5.3.0
- certifi: 2022.12.7
- cffi: 1.15.1
- charset-normalizer: 2.1.1
- cmake: 3.25.0
- comm: 0.1.2
- contourpy: 1.0.7
- cppimport: 22.8.2
- cycler: 0.11.0
- debugpy: 1.6.6
- decorator: 5.1.1
- defusedxml: 0.7.1
- docutils: 0.20.1
- executing: 1.2.0
- fastjsonschema: 2.16.3
- filelock: 3.9.0
- fonttools: 4.39.0
- fqdn: 1.5.1
- frozenlist: 1.3.3
- fsspec: 2023.12.2
- google-auth: 2.16.2
- google-auth-oauthlib: 1.0.0
- grpcio: 1.51.3
- huggingface-hub: 0.20.1
- idna: 3.4
- imagesize: 1.4.1
- ipykernel: 6.21.3
- ipython: 8.11.0
- ipython-genutils: 0.2.0
- ipywidgets: 8.0.4
- isoduration: 20.11.0
- jedi: 0.18.2
- jinja2: 3.1.2
- joblib: 1.3.2
- jsonpointer: 2.3
- jsonschema: 4.17.3
- jupyter: 1.0.0
- jupyter-client: 8.0.3
- jupyter-console: 6.6.3
- jupyter-core: 5.2.0
- jupyter-events: 0.6.3
- jupyter-server: 2.4.0
- jupyter-server-terminals: 0.4.4
- jupyterlab-pygments: 0.2.2
- jupyterlab-widgets: 3.0.5
- kiwisolver: 1.4.4
- lightning-utilities: 0.8.0
- lit: 15.0.7
- llvmlite: 0.41.0
- mako: 1.2.4
- markdown: 3.4.1
- markupsafe: 2.1.2
- matplotlib: 3.7.1
- matplotlib-inline: 0.1.6
- mistune: 2.0.5
- mpmath: 1.2.1
- multidict: 6.0.4
- nbclassic: 0.5.3
- nbclient: 0.7.2
- nbconvert: 7.2.10
- nbformat: 5.7.3
- nest-asyncio: 1.5.6
- networkx: 3.0
- notebook: 6.5.3
- notebook-shim: 0.2.2
- numba: 0.58.0
- numexpr: 2.8.7
- numpy: 1.25.2
- nvidia-cublas-cu11: 11.10.3.66
- nvidia-cublas-cu12: 12.1.3.1
- nvidia-cuda-cupti-cu11: 11.7.101
- nvidia-cuda-cupti-cu12: 12.1.105
- nvidia-cuda-nvrtc-cu11: 11.7.99
- nvidia-cuda-nvrtc-cu12: 12.1.105
- nvidia-cuda-runtime-cu11: 11.7.99
- nvidia-cuda-runtime-cu12: 12.1.105
- nvidia-cudnn-cu11: 8.5.0.96
- nvidia-cudnn-cu12: 8.9.2.26
- nvidia-cufft-cu11: 10.9.0.58
- nvidia-cufft-cu12: 11.0.2.54
- nvidia-curand-cu11: 10.2.10.91
- nvidia-curand-cu12: 10.3.2.106
- nvidia-cusolver-cu11: 11.4.0.1
- nvidia-cusolver-cu12: 11.4.5.107
- nvidia-cusparse-cu11: 11.7.4.91
- nvidia-cusparse-cu12: 12.1.0.106
- nvidia-nccl-cu11: 2.14.3
- nvidia-nccl-cu12: 2.19.3
- nvidia-nvjitlink-cu12: 12.2.140
- nvidia-nvtx-cu11: 11.7.91
- nvidia-nvtx-cu12: 12.1.105
- oauthlib: 3.2.2
- packaging: 23.0
- pandas: 2.2.1
- pandocfilters: 1.5.0
- parso: 0.8.3
- patsy: 0.5.5
- pexpect: 4.8.0
- pickleshare: 0.7.5
- pillow: 9.4.0
- pip: 24.0
- platformdirs: 3.1.1
- prometheus-client: 0.16.0
- prompt-toolkit: 3.0.38
- protobuf: 4.22.1
- psutil: 5.9.4
- ptyprocess: 0.7.0
- pure-eval: 0.2.2
- pyasn1: 0.4.8
- pyasn1-modules: 0.2.8
- pybind11: 2.11.1
- pybind11-stubgen: 2.5
- pycparser: 2.21
- pygments: 2.14.0
- pyparsing: 3.0.9
- pypinyin: 0.50.0
- pyrsistent: 0.19.3
- python-dateutil: 2.8.2
- python-json-logger: 2.0.7
- pytorch-lightning: 2.2.1
- pytz: 2022.7.1
- pyyaml: 6.0
- pyzmq: 25.0.1
- qtconsole: 5.4.1
- qtpy: 2.3.0
- regex: 2023.10.3
- requests: 2.28.1
- requests-oauthlib: 1.3.1
- rfc3339-validator: 0.1.4
- rfc3986-validator: 0.1.1
- rsa: 4.9
- safetensors: 0.4.2
- scikit-learn: 1.4.1.post1
- scipy: 1.11.3
- seaborn: 0.13.2
- send2trash: 1.8.0
- sentencepiece: 0.1.97
- setuptools: 65.5.0
- six: 1.16.0
- sniffio: 1.3.0
- snowballstemmer: 2.2.0
- soupsieve: 2.4
- sphinx: 7.2.6
- sphinxcontrib-applehelp: 1.0.7
- sphinxcontrib-devhelp: 1.0.5
- sphinxcontrib-htmlhelp: 2.0.4
- sphinxcontrib-jsmath: 1.0.1
- sphinxcontrib-qthelp: 1.0.6
- sphinxcontrib-serializinghtml: 1.1.9
- stack-data: 0.6.2
- statsmodels: 0.14.1
- sympy: 1.11.1
- tensorboard: 2.16.2
- tensorboard-data-server: 0.7.0
- tensorboard-plugin-wit: 1.8.1
- terminado: 0.17.1
- threadpoolctl: 3.2.0
- tinycss2: 1.2.1
- tokenizers: 0.15.0
- torch: 2.2.1
- torch-tb-profiler: 0.4.3
- torchdata: 0.7.1
- torchmetrics: 0.11.4
- torchtext: 0.17.1
- torchvision: 0.17.1
- tornado: 6.2
- tqdm: 4.66.2
- traitlets: 5.9.0
- transformers: 4.38.2
- triton: 2.2.0
- typing-extensions: 4.10.0
- tzdata: 2023.3
- uri-template: 1.2.0
- urllib3: 1.26.13
- wcwidth: 0.2.6
- webcolors: 1.12
- webencodings: 0.5.1
- websocket-client: 1.5.1
- werkzeug: 2.2.3
- wheel: 0.40.0
- widgetsnbextension: 4.0.5
- yarl: 1.8.2
- zhdate: 0.1
* System:
- OS: Linux
- architecture:
- 64bit
- ELF
- processor: x86_64
- python: 3.11.8
- release: 5.15.0-91-generic
- version: #101-Ubuntu SMP Tue Nov 14 13:30:08 UTC 2023
</details>
More info
No response