Performance degradation on certain vision models from v4.51.* #37748

Open · yuan-thomas opened this issue Apr 24, 2025 · 1 comment

yuan-thomas commented Apr 24, 2025

System Info

  • transformers version: 4.51.3
  • Platform: Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.39
  • Python version: 3.12.3
  • Huggingface_hub version: 0.30.2
  • Safetensors version: 0.5.3
  • Accelerate version: not installed
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (GPU?): 2.7.0a0+7c8ec84dab.nv25.03 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: No
  • Using GPU in script?: Yes
  • GPU type: NVIDIA GeForce RTX 3070 Laptop GPU

Who can help?

vision models: @amyeroberts, @qubvel

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Run this script:

from transformers import AutoImageProcessor, ConvNextV2Model
import torch
import torch.nn as nn
import time
from datasets import load_dataset

dataset = load_dataset("huggingface/cats-image")
image = dataset["test"]["image"][0]

image_processor = AutoImageProcessor.from_pretrained("facebook/convnextv2-large-1k-224")
model = ConvNextV2Model.from_pretrained("facebook/convnextv2-large-1k-224")

inputs = image_processor(image, return_tensors="pt")

start_time = time.time()

model.train()

logits = model(**inputs).last_hidden_state.mean(dim=1)  # average over channels: [batch_size, height, width]
criterion = nn.BCEWithLogitsLoss()

fake_logits = torch.randn_like(logits)

loss = criterion(logits, fake_logits)
loss.backward()

print(time.time() - start_time)

Expected behavior

There appears to be a performance degradation between versions 4.50.* and 4.51.*. I tried PyTorch versions 2.4 and 2.7.

In my testing, 4.51.* is about 4x slower than the previous version. Using the script above (a more careful timing variant is sketched below):

  • 4.50.* takes ~1.1 s
  • 4.51.* takes ~4.5 s
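
For reference, here is a minimal sketch of a less noise-sensitive measurement. It assumes a CUDA device is available; the warm-up pass, the averaging loop, and the torch.cuda.synchronize() calls are additions for illustration and are not part of the original report:

import time
import torch
import torch.nn as nn
from transformers import AutoImageProcessor, ConvNextV2Model
from datasets import load_dataset

device = torch.device("cuda")

dataset = load_dataset("huggingface/cats-image")
image = dataset["test"]["image"][0]

image_processor = AutoImageProcessor.from_pretrained("facebook/convnextv2-large-1k-224")
model = ConvNextV2Model.from_pretrained("facebook/convnextv2-large-1k-224").to(device)
model.train()

inputs = image_processor(image, return_tensors="pt").to(device)
criterion = nn.BCEWithLogitsLoss()

def step():
    # One forward + backward pass, mirroring the repro script.
    logits = model(**inputs).last_hidden_state.mean(dim=1)
    loss = criterion(logits, torch.randn_like(logits))
    loss.backward()
    model.zero_grad(set_to_none=True)

# Warm-up pass so one-time costs (CUDA context, lazy init) are excluded.
step()
torch.cuda.synchronize()

start = time.time()
for _ in range(5):
    step()
torch.cuda.synchronize()
print((time.time() - start) / 5)

Running this once per transformers version (for example, pip install "transformers==4.50.*" vs. "transformers==4.51.*" in the same environment) should isolate the regression from one-time start-up overhead.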

yuan-thomas (Author) commented:
While the run takes longer on 4.51.*, GPU utilization is also much higher while running this script: roughly 30% on average with 4.50.* versus roughly 90% with 4.51.*.
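
As a side note, utilization can be sampled while the repro script runs by polling nvidia-smi from a second process; this is an illustrative sketch, not the exact tool used to obtain the numbers above:

import subprocess
import time

# Print GPU utilization once per second; stop with Ctrl+C.
try:
    while True:
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=utilization.gpu", "--format=csv,noheader,nounits"],
            text=True,
        )
        print(f"GPU utilization: {out.strip()}%")
        time.sleep(1)
except KeyboardInterrupt:
    pass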
