
[Community contributions] Model cards #36979


Open
stevhliu opened this issue Mar 25, 2025 · 69 comments · Fixed by #37052, #37076, #37156, #37184 or #37261

Comments

@stevhliu (Member)

stevhliu commented Mar 25, 2025

Hey friends! 👋

We are currently in the process of improving the Transformers model cards by making them more directly useful for everyone. The main goal is to:

  1. Standardize all model cards with a consistent format so users know what to expect when moving between model cards or learning how to use a new model.
  2. Include a brief description of the model (what makes it unique/different) written in a way that's accessible to everyone.
  3. Provide ready-to-use code examples featuring the Pipeline, AutoModel, and transformers-cli with available optimizations included. For large models, provide a quantization example so it's easier for everyone to run the model.
  4. Include an attention mask visualizer for currently supported models to help users visualize what a model is seeing (refer to Add attention visualization tool #36630 for more details).

Compare the before and after model cards below:

[Image: before and after model card comparison]

With so many models in Transformers, we could really use a hand with standardizing the existing model cards. If you're interested in making a contribution, pick a model from the list below and get started!

Steps

Each model card should follow the format below. You can copy the text exactly as it is!

# add appropriate badges
<div style="float: right;">
    <div class="flex flex-wrap space-x-1">
           <img alt="" src="">
    </div>
</div>
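(Not part of the template to copy; just an illustration.) A filled-in badge block might look like the following. The badge image URL here is an assumption based on badges used elsewhere in the docs, so copy the exact badges from a recently converted model card:

```html
<!-- Hypothetical example: a PyTorch badge; verify the exact src against a merged model card -->
<div style="float: right;">
    <div class="flex flex-wrap space-x-1">
        <img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white">
    </div>
</div>
```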

# Model name

[Model name](https://huggingface.co/papers/...) ...

A brief description of the model and what makes it unique/different. Try to write this like you're talking to a friend. 

You can find all the original [Model name] checkpoints under the [Model name](link) collection.

> [!TIP]
> Click on the [Model name] models in the right sidebar for more examples of how to apply [Model name] to different [insert task types here] tasks.

The example below demonstrates how to generate text based on an image with [`Pipeline`] or the [`AutoModel`] class.

<hfoptions id="usage">
<hfoption id="Pipeline">

insert pipeline code here

</hfoption>
<hfoption id="AutoModel">

add AutoModel code here

</hfoption>
<hfoption id="transformers-cli">

add transformers-cli usage here if applicable/supported, otherwise close the hfoption block

</hfoption>
</hfoptions>
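(Not part of the template itself; just an illustrative sketch.) For the Pipeline option, a snippet for a text-generation model could look like the following. The checkpoint and task here are placeholder assumptions, so swap in whatever fits your model:

```python
# Illustrative sketch only: "openai-community/gpt2" and the "text-generation"
# task are placeholders; use the task and checkpoint for your model.
from transformers import pipeline

generator = pipeline(task="text-generation", model="openai-community/gpt2")
result = generator("Plants create energy through a process known as")
print(result[0]["generated_text"])
```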

Quantization reduces the memory burden of large models by representing the weights in a lower precision. Refer to the [Quantization](../quantization/overview) overview for more available quantization backends.

The example below uses [insert quantization method here](link to quantization method) to quantize only the weights to __.

# add if this is supported for your model
Use the [AttentionMaskVisualizer](https://github.com/huggingface/transformers/blob/beb9b5b02246b9b7ee81ddf938f93f44cfeaad19/src/transformers/utils/attention_visualizer.py#L139) to better understand what tokens the model can and cannot attend to.

```py
from transformers.utils.attention_visualizer import AttentionMaskVisualizer

visualizer = AttentionMaskVisualizer("google/gemma-3-4b-it")
visualizer("<img>What is shown in this image?")
```

# upload image to https://huggingface.co/datasets/huggingface/documentation-images/tree/main/transformers/model_doc and ping me to merge
<div class="flex justify-center">
    <img src=""/>
</div>

## Notes

- Any other model-specific notes should go here.

   ```py
   <insert relevant code snippet here related to the note if it's available>
   ```

For examples, take a look at #36469 or the BERT, Llama, Llama 2, Gemma 3, PaliGemma, ViT, and Whisper model cards on the main version of the docs.

Once you're done, or if you have any questions, feel free to ping @stevhliu for a review. Don't use a closing keyword like `fix` in your PR description so this issue isn't automatically closed.

I'll also be right there working alongside you and opening PRs to convert the model cards so we can complete this faster together! 🤗

Models

@devesh-2002 (Contributor)

Hi. I would like to work on the model card for gemma 2.

@NahieliV (Contributor)

Hi. I would like to work on the model card for mistral.

@NahieliV (Contributor)

Hi @stevhliu, this is my first contribution so I have a really basic question. Should I clone every repo under mistralai? I just cloned mistralai/Ministral-8B-Instruct-2410, but there are many other repos under mistralai. It's okay if I need to, but I just want to be sure.

@capnmav77

Hey, I would like to work on the model card for llama3.

@stevhliu (Member, Author)

Hey @NahieliV, welcome! You only need to modify the mistral.md file. This is just for the model cards in the Transformers docs rather than the Hub.

@arkhamHack (Contributor)

Hey @stevhliu I would like to work on the model card for qwen2_5_vl.

@hesamsheikh

@stevhliu Is it not possible to automate with an LLM?

@AbhishekRP2002 (Contributor)

Hi @stevhliu, I would be super grateful if you could let me work on the model card for code_llama.

@bimal-gajera (Contributor)

Hey @stevhliu, I would like to work on the cohere model card.

@ash-01xor (Contributor)

Hey @stevhliu, I would like to contribute to the gpt2 model card.

@saumanraaj

Hey @stevhliu , I would like to contribute to vitpose model card

@Wu-n0 (Contributor)

Wu-n0 commented Mar 28, 2025

Hey @stevhliu, I would like to work on the electra model card

@shubham0204 (Contributor)

shubham0204 commented Mar 28, 2025

@stevhliu I will update the model card for depth_anything.
PR: #37065

@darmasrmez

Hey @stevhliu , I would like to contribute to mixtral model card

@ash-01xor (Contributor)

ash-01xor commented Mar 29, 2025

To the folks who have been raising PRs so far, a quick question: did you need to install flax, tf-keras, sentencepiece, etc.?
Before making the changes, I'm trying to set up the environment following the steps here: https://github.com/huggingface/transformers/tree/main/docs.
Currently, I'm trying to build the documentation, but I repeatedly encounter errors such as `Unable to register cuDNN factory:` and library installation errors. I'd like to know if I'm missing any steps or if all these library installations are necessary for making the changes.

EDIT: Got it up and running; I had to install all the libraries to make it build successfully. I initially doubted the need to install libraries such as flax, but it seems they have to be installed too.

@arpitsinghgautam

Hey @stevhliu, I would like to work on the phi3 model card

@shubham0204 (Contributor)

> To the folks who have been raising PRs so far, a quick question: did you need to install flax, tf-keras, sentencepiece, etc.? Before making the changes, I'm trying to set up the environment following the steps here: https://github.com/huggingface/transformers/tree/main/docs. Currently, I'm trying to build the documentation, but I repeatedly encounter errors such as `Unable to register cuDNN factory:` and library installation errors. I'd like to know if I'm missing any steps or if all these library installations are necessary for making the changes.

As you're just going to edit the docs, you don't need a complete development setup. Fork the transformers repo, check out a new branch, and start updating the Markdown document of your choice in the docs/source/en/model_doc directory.
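To make that workflow concrete, here's a minimal sketch. The model and branch names are placeholders, and the clone/checkout commands are shown as comments since they touch the network:

```shell
# Placeholder model; substitute the model you picked from the list
MODEL=mistral
BRANCH="model-card-$MODEL"
DOC="docs/source/en/model_doc/$MODEL.md"

# Run these for real in your shell (commented here because they hit the network):
# git clone https://github.com/<your-username>/transformers.git
# cd transformers
# git checkout -b "$BRANCH"

echo "Edit $DOC on branch $BRANCH, then open a PR against huggingface/transformers"
```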

@Rishik00

Hey @stevhliu I would like to work on the model card for deberta. Hope that's alright

@souvikchand

hi @stevhliu I would like to add model card for ALBERT model

@stevhliu (Member, Author)

Hey @Rishik00, DeBERTa is already taken. Do you want to work on DeBERTav2?

@afafelwafi (Contributor)

afafelwafi commented Apr 18, 2025

Hello @stevhliu , I would like to add the model card for Gemma and Siglip2 if that's possible

@Nikil-D-Gr8

Hello @stevhliu !
Can I work on mllama?
Thanks

@Rishik00

@stevhliu Definitely! I'd love to! Is there a deadline?

@stevhliu (Member, Author)

> Is there a deadline?

There's no deadline, so feel free to work on it whenever you have the time :)

@souvikchand

@stevhliu for the attention visualization image, I opened a pull request at https://huggingface.co/datasets/huggingface/documentation-images/discussions/479. Please merge it so I can include it in the model card.

@saswatmeher (Contributor)

@stevhliu raised a PR for SigLIP2

@saswatmeher (Contributor)

> @stevhliu raised a PR for SigLIP2

@stevhliu I worked on SigLIP2 without realizing it was already assigned to someone else. Apologies for the oversight. Let me know if I should close this PR. Happy to adjust as needed based on your guidance.

@KishanPipariya

@stevhliu Can I work on Audio Spectrogram Transformers?

@SaiSanthosh1508

@stevhliu Can I work on qwen2_vl?

@stevhliu (Member, Author)

@afafelwafi, since @saswatmeher has already raised a PR for SigLIP2, would it be ok if you picked a different model? 🤗

@souvikchand

> @stevhliu for the attention visualization image, I opened a pull request at https://huggingface.co/datasets/huggingface/documentation-images/discussions/479. Please merge it so I can include it in the model card.

@stevhliu please merge this

@stevhliu (Member, Author)

Hey, I don't think the AttentionMaskVisualizer is integrated for ALBERT yet! Would you like to open a PR to add it?

@souvikchand

> Hey, I don't think the AttentionMaskVisualizer is integrated for ALBERT yet! Would you like to open a PR to add it?

@stevhliu Yeah, you're right, ALBERT is not compatible with AttentionMaskVisualizer. So I was trying to bypass the visualization using matplotlib. Apologies for not discussing this with you earlier. My code is below:

```py
from transformers import AlbertTokenizer, AlbertModel
import matplotlib.pyplot as plt
import torch

tokenizer = AlbertTokenizer.from_pretrained("albert/albert-base-v1")
model = AlbertModel.from_pretrained("albert/albert-base-v1", output_attentions=True)  # Enable attention output

text = "Plants create energy through a process known as"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

attentions = outputs.attentions  # Attention tensors per layer

# Visualize first layer, first head
plt.imshow(attentions[0][0, 0].numpy(), cmap="plasma")
plt.title("Layer 1 - Head 1 Attention")
plt.xlabel("Source Tokens")
plt.ylabel("Target Tokens")
plt.xticks(range(len(inputs.input_ids[0])), [tokenizer.decode(tok) for tok in inputs.input_ids[0]])
plt.yticks(range(len(inputs.input_ids[0])), [tokenizer.decode(tok) for tok in inputs.input_ids[0]])
plt.colorbar()
plt.show()
```

Please let me know if you think it's worth keeping or if it would be better to discard it.

About the task of integrating AttentionMaskVisualizer: I'd be happy to help, but I might need a few pointers to get started. Any guidance on where to start would be super helpful.

Another query about the model card: I need to update albert.md, right?

@stevhliu (Member, Author)

> please let me know if you think it's worth keeping or if it would be better to discard it

I think it'd be better to discard it and instead try to integrate it with the existing AttentionMaskVisualizer method. To get started, I would check out #36630 for more context and then the attention_visualizer.py file. If you need any additional guidance, feel free to comment on that PR and I'm sure someone would be happy to help you!

> i need to update the albert.md right?

Yeah this is the only file you need to update!

@Tanuj-rai

Hey! @stevhliu, can I work on granite?

@souvikchand

@stevhliu I have created a PR for ALBERT in #37753. Please review and let me know 😄

@Tanuj-rai mentioned this issue Apr 25, 2025
@stevhliu reopened this Apr 25, 2025