---
datasets:
- shuaishuaicdp/GUI-World
language:
- en
license: cc-by-4.0
metrics:
- bertscore
- LLM-as-a-Judge
tags:
- gui
- agent
pipeline_tag: video-text-to-text
---
GUI-Vid is the first VideoLLM with powerful GUI-oriented capabilities, retrained on [GUI-World](https://gui-world.github.io).
It was presented in [GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents](https://huggingface.co/papers/2406.10819).
See the [GitHub repository](https://github.com/Dongping-Chen/GUI-World) for how to use GUI-Vid for GUI understanding tasks.