ONE-Lab/GUI-Vid

@@ -13,4 +13,8 @@ tags:
 pipeline_tag: video-text-to-text
 ---
-This is the first VideoLLM with powerful GUI-oriented capabilities, retrained on [GUI-World](https://gui-world.github.io). See [Github](https://github.com/Dongping-Chen/GUI-World) for how to use GUI-Vid for GUI understanding tasks.

+This is the first VideoLLM with powerful GUI-oriented capabilities, retrained on [GUI-World](https://gui-world.github.io).
+It was presented in [GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents](https://huggingface.co/papers/2406.10819).
+See [Github](https://github.com/Dongping-Chen/GUI-World) for how to use GUI-Vid for GUI understanding tasks.