Add model card metadata and links to paper, code and project page
This PR adds a model card with metadata for better discoverability, including the pipeline tag, library name, and license. It also includes links to the paper, code, and project page.
README.md
ADDED
@@ -0,0 +1,10 @@
+---
+license: cc-by-nc-4.0
+pipeline_tag: image-text-to-text
+library_name: transformers
+---
+
+VLAA-Thinker is a vision-language model that takes an image and text as input and outputs text, as described in [SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models](https://huggingface.co/papers/2504.11468).
+
+Project Page: https://ucsc-vlaa.github.io/VLAA-Thinking/
+Code: https://github.com/UCSC-VLAA/VLAA-Thinking
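
Since the card declares `library_name: transformers` and `pipeline_tag: image-text-to-text`, the model should be loadable through the corresponding `transformers` pipeline. Below is a minimal usage sketch based only on that metadata; the checkpoint id `UCSC-VLAA/VLAA-Thinker-Qwen2.5VL-3B` and the image URL are hypothetical placeholders, not taken from this PR.

```python
# Minimal sketch: querying the model via the image-text-to-text pipeline
# declared in the card metadata above.
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",
    model="UCSC-VLAA/VLAA-Thinker-Qwen2.5VL-3B",  # hypothetical checkpoint id
)

# Chat-style input: one image plus a text prompt, matching the pipeline tag.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/sample.jpg"},  # placeholder image
            {"type": "text", "text": "Describe this image."},
        ],
    }
]

result = pipe(text=messages, max_new_tokens=128)
print(result[0]["generated_text"])
```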