Description
I tried to run my Google Colab notebook in Azure Machine Learning Notebooks.
I created a GPU compute instance (also tried a small CPU one) and downloaded the model files locally from git.
Using the same Hugging Face transformers version as on Colab, I cannot load the model in the Azure ML notebook.
Is there any chance of supporting Hugging Face transformers models in ML Notebooks, or am I missing something?
Torch is 1.6 by default, so I upgraded that too.
**Azure Notebook: Python 3.6.9 :: Anaconda, Inc.**
import transformers
transformers.__version__
'4.4.2'
import torch
torch.__version__
## GPU T80 instance; also tried a CPU instance
'1.8.0'
#!sudo apt-get install git-lfs
!git clone https://huggingface.co/gorkemgoknar/gpt2-turkish-writer
#tokenizer loading is fine
from transformers import GPT2Tokenizer, AutoModelForCausalLM
import torch
tokenizer = GPT2Tokenizer.from_pretrained("content/chatbot/model_tr_writer/gpt2-turkish-writer",local_files_only=True)
model = AutoModelForCausalLM.from_pretrained("content/chatbot/model_tr_writer/gpt2-turkish-writer",local_files_only=True)
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
1061 try:
-> 1062 state_dict = torch.load(resolved_archive_file, map_location="cpu")
1063 except Exception:
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/torch/serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
584 orig_position = opened_file.tell()
--> 585 with _open_zipfile_reader(opened_file) as opened_zipfile:
586 if _is_torchscript_zip(opened_zipfile):
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/torch/serialization.py in __init__(self, name_or_buffer)
241 def __init__(self, name_or_buffer) -> None:
--> 242 super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
243
RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory
During handling of the above exception, another exception occurred:
OSError Traceback (most recent call last)
<ipython-input-6-3056aee5f19b> in <module>
----> 1 model = AutoModelForCausalLM.from_pretrained("content/chatbot/model_tr_writer/gpt2-turkish-writer",local_files_only=True)
2
3
4 # Get sequence length max of 1024
5 tokenizer.model_max_length=1024
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/transformers/models/auto/modeling_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
1111 if type(config) in MODEL_FOR_CAUSAL_LM_MAPPING.keys():
1112 return MODEL_FOR_CAUSAL_LM_MAPPING[type(config)].from_pretrained(
-> 1113 pretrained_model_name_or_path, *model_args, config=config, **kwargs
1114 )
1115 raise ValueError(
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
1063 except Exception:
1064 raise OSError(
-> 1065 f"Unable to load weights from pytorch checkpoint file for '{pretrained_model_name_or_path}' "
1066 f"at '{resolved_archive_file}'"
1067 "If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True. "
OSError: Unable to load weights from pytorch checkpoint file for 'content/chatbot/model_tr_writer/gpt2-turkish-writer' at 'content/chatbot/model_tr_writer/gpt2-turkish-writer/pytorch_model.bin'If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
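A quick check that might narrow this down: the inner error ("failed finding central directory") means `torch.load` could not read `pytorch_model.bin` as a zip archive. Since the `git-lfs` install line above is commented out, one possible cause (an assumption, not confirmed here) is that the clone left a small Git LFS pointer file in place of the real weights. The sketch below inspects the first bytes of the file; `describe_checkpoint` is a hypothetical helper name, not part of transformers or torch.

```python
import os

# Git LFS pointer files are small text files that start with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def describe_checkpoint(path):
    """Return a short diagnosis of the file at `path` (hypothetical helper)."""
    if not os.path.isfile(path):
        return "missing"
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    if head == LFS_POINTER_PREFIX:
        return "git-lfs pointer (%d bytes), weights were not downloaded" % size
    if head[:4] == b"PK\x03\x04":
        # torch.save writes a zip archive by default since torch 1.6
        return "zip archive (%d bytes)" % size
    return "other format (%d bytes)" % size


# Path taken from the error message above.
print(describe_checkpoint(
    "content/chatbot/model_tr_writer/gpt2-turkish-writer/pytorch_model.bin"
))
```

If this reports a git-lfs pointer, installing `git-lfs` and re-cloning (or running `git lfs pull` inside the clone) would be the thing to try before suspecting the torch or Python version.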
**Google Colab: Python 3.7.10**
import torch
torch.__version__
"1.8.0+cu101"
import transformers
transformers.__version__
"4.4.2"
The only differences seem to be the torch version ('1.8.0' vs '1.8.0+cu101', though I can also run the model on CPU on my macOS machine) and the Python version (3.6.9 vs 3.7.10).
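To make the comparison between the two environments easier to reproduce, the versions can be collected in one cell and pasted from both machines. This is a minimal sketch; `report_versions` is a hypothetical helper, and it only reads each package's `__version__` attribute.

```python
import importlib
import platform


def report_versions(packages=("torch", "transformers")):
    """Collect the Python version and the versions of the given packages."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = "not installed"
    return versions


print(report_versions())
```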