Improve model card: Add pipeline tag, paper link, abstract, code, and usage

by nielsr HF Staff - opened Aug 31

This PR significantly improves the model card for Qwen2.5-VL-3B-Instruct by:

Adding essential metadata:
- Adding pipeline_tag: image-text-to-text, ensuring the model is discoverable under the correct category on the Hub (https://huggingface.co/models?pipeline_tag=image-text-to-text).
- Adding vision-language-model to the existing tags.

Enriching content:
- Updating the main title to include the paper name for clarity.
- Adding the paper title and a direct link to its Hugging Face page: Self-Rewarding Vision-Language Model via Reasoning Decomposition.
- Including the full paper abstract for a detailed overview.
- Adding a clear link to the GitHub repository (https://github.com/zli12321/Vision-SR1).
- Populating the "Model description", "Intended uses & limitations", and "Training and evaluation data" sections with information extracted from the paper and the GitHub README.
- Adding a "Sample Usage" section with direct code snippets (bash commands) for setup, training, merging, and evaluation response generation, as found in the original GitHub README.
- Adding the training reward progression image.
- Adding the citation for the EasyR1 source code as specified in the GitHub README.

These changes make the model card more informative, discoverable, and user-friendly.
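
The metadata part of this change can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `merge_card_metadata` is a hypothetical helper, not part of any library, and the starting tag list is a placeholder because the card's existing tags are not listed in this description. Only the values `image-text-to-text` and `vision-language-model` come from the PR itself.

```python
from copy import deepcopy


def merge_card_metadata(existing, pipeline_tag, extra_tags):
    """Return a copy of `existing` with the pipeline tag set and new tags appended.

    Hypothetical helper for illustration; the input dict is left unmodified.
    """
    merged = deepcopy(existing)
    merged["pipeline_tag"] = pipeline_tag
    tags = list(merged.get("tags", []))
    for tag in extra_tags:
        if tag not in tags:  # keep the existing order, skip duplicates
            tags.append(tag)
    merged["tags"] = tags
    return merged


# Placeholder starting metadata (assumption): the real existing tags are not shown in this PR.
before = {"tags": ["example-existing-tag"]}

# Values taken from the PR description.
after = merge_card_metadata(before, "image-text-to-text", ["vision-language-model"])
print(after)
```

In the model card itself, the same change amounts to setting `pipeline_tag` and extending `tags` in the YAML front matter at the top of README.md.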

Ready to merge

This branch is ready to get merged automatically.
