
Python demos requirement incompatibility #3205

Open
@RH-steve-grubb

Description


Describe the bug
There's still one more issue caused by the transformers upgrade aimed at the 2025.1 release. Running a test program designed to confirm compatibility between the transformers library and the Intel-optimized optimum.intel.openvino backend produces a traceback:

Traceback (most recent call last):
  File "//./smoke-2.py", line 34, in <module>
    output_ids = model.generate(input_ids, attention_mask=attention_mask, max_length=40)
  File "/usr/local/lib64/python3.9/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/transformers/generation/utils.py", line 2092, in generate
    self._prepare_cache_for_generation(
  File "/usr/local/lib/python3.9/site-packages/transformers/generation/utils.py", line 1714, in _prepare_cache_for_generation
    if not self._supports_default_dynamic_cache():
  File "/usr/local/lib/python3.9/site-packages/transformers/generation/utils.py", line 1665, in _supports_default_dynamic_cache
    self._supports_cache_class
AttributeError: 'OVModelForCausalLM' object has no attribute '_supports_cache_class'

The _supports_cache_class attribute was introduced recently (transformers 4.42.x), and the Optimum Intel OVModelForCausalLM class does not yet implement the newer caching API that transformers expects. Upstream noticed this and added support in the optimum 1.18.1 release.
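For illustration of the mechanism only (upgrading is the real fix): _supports_cache_class is a class-level flag that _supports_default_dynamic_cache() reads, so a hypothetical stop-gap in a demo script could define it on the model class before calling generate(). This sketch assumes the legacy cache path still works for OVModelForCausalLM:

from optimum.intel.openvino import OVModelForCausalLM

# Hypothetical stop-gap, not the upstream fix: define the flag that
# transformers >= 4.42 expects. False makes _supports_default_dynamic_cache()
# return False, so generate() leaves cache handling to the OpenVINO model.
if not hasattr(OVModelForCausalLM, "_supports_cache_class"):
    OVModelForCausalLM._supports_cache_class = False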

So, the requirements should be optimum[diffusers]==1.18.1. Would upgrading optimum cause any other problems?
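A minimal sketch of the proposed change in demos/python_demos/requirements.txt (the other entries in that file are unchanged and not shown here):

optimum[diffusers]==1.18.1

For a quick manual check inside the image, pip install "optimum[diffusers]==1.18.1" can be run before launching the reproducer below.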

To Reproduce
Run the following program in the image after installing the Python modules from demos/python_demos/requirements.txt.

from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoTokenizer

# Model name compatible with OpenVINO optimizations
model_name = "gpt2"

# Load tokenizer (Transformers API)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# Load optimized model (Optimum Intel API with OpenVINO backend)
model = OVModelForCausalLM.from_pretrained(model_name, export=True)

# Prepare input text
prompt = "Testing transformers and optimum.intel integration"
inputs = tokenizer(prompt, return_tensors="pt", padding=True)
input_ids = inputs.input_ids
attention_mask = inputs.attention_mask

# Generate output (testing both transformers tokenization & OpenVINO inference)
output_ids = model.generate(input_ids, attention_mask=attention_mask, max_length=40)
generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

print("Prompt:", prompt)
print("Generated text:", generated_text)

Expected behavior
The program should generate and print text without raising an exception, and it does with optimum==1.18.1.

Configuration
OVMS 2025.1

Labels

bug (Something isn't working)
