Add library name and pipeline tag
#1, opened by nielsr (HF Staff)
README.md CHANGED
@@ -1,7 +1,9 @@
 ---
-license: apache-2.0
 datasets:
 - mikewang/PVD-160K
+license: apache-2.0
+library_name: transformers
+pipeline_tag: image-to-text
 ---
 
 <h1 align="center"> Text-Based Reasoning About Vector Graphics </h1>
 
@@ -19,7 +21,6 @@ datasets:
 
 </p>
 
-
 We observe that current *large multimodal models (LMMs)* still struggle with seemingly straightforward reasoning tasks that require precise perception of low-level visual details, such as identifying spatial relations or solving simple mazes. In particular, this failure mode persists in question-answering tasks about vector graphics—images composed purely of 2D objects and shapes.
 
 
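For readers who want to make the same kind of metadata change locally, here is a minimal sketch of adding flat `key: value` fields to a README's YAML front matter. It is not how this PR was produced. The helper name `add_front_matter_fields` is hypothetical, it uses only the Python standard library, it assumes the file starts with a `---` delimited block, and it appends new keys without reordering existing ones (the PR itself also moved `license` below `datasets`).

```python
def add_front_matter_fields(readme: str, fields: dict[str, str]) -> str:
    """Append flat `key: value` lines to a README's YAML front matter.

    Hypothetical helper for illustration: it handles only top-level keys
    and does not parse YAML in general.
    """
    lines = readme.split("\n")
    if not lines or lines[0].strip() != "---":
        raise ValueError("README has no front matter block")

    # Locate the closing `---` delimiter of the front matter.
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("front matter block is not closed") from None

    # Collect top-level keys already present so they are not duplicated.
    existing = {
        line.split(":", 1)[0].strip()
        for line in lines[1:end]
        if ":" in line and not line.startswith((" ", "-"))
    }
    new_lines = [f"{key}: {value}" for key, value in fields.items() if key not in existing]

    # Insert the new fields just before the closing delimiter.
    return "\n".join(lines[:end] + new_lines + lines[end:])


# Example input modeled on the front matter shown in the diff.
readme = "---\nlicense: apache-2.0\ndatasets:\n- mikewang/PVD-160K\n---\n\n# Title\n"
updated = add_front_matter_fields(
    readme,
    {"library_name": "transformers", "pipeline_tag": "image-to-text"},
)
```

Running the helper a second time on `updated` leaves it unchanged, because keys that already exist are skipped.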