SiriusL committed · verified
Commit f521761 · Parent(s): 9a420d1

Update README.md

Files changed (1): README.md (+14 -2)
README.md CHANGED
@@ -15,7 +15,7 @@ library_name: transformers
 
 # InfiGUI-G1-7B
 
-This repository contains the InfiGUI-G1-7B model from the paper **[InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization](https://github.com/InfiXAI/InfiGUI-R1)**.
+This repository contains the InfiGUI-G1-7B model from the paper **[InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization](https://arxiv.org/abs/2508.05731)**.
 
 The model is based on `Qwen2.5-VL-7B-Instruct` and is fine-tuned using our proposed **Adaptive Exploration Policy Optimization (AEPO)** framework. AEPO is a novel reinforcement learning method designed to enhance the model's **semantic alignment** for GUI grounding tasks. It overcomes the exploration bottlenecks of standard RLVR methods by integrating a multi-answer generation strategy with a theoretically-grounded adaptive reward function, enabling more effective and efficient learning for complex GUI interactions.
 
@@ -164,12 +164,24 @@ if __name__ == "__main__":
 
 To reproduce the results in our paper, please refer to our repo for detailed instructions.
 
-For more details on the methodology and evaluation, please refer to our [paper](https://github.com/InfiXAI/InfiGUI-R1) and [repository](https://github.com/InfiXAI/InfiGUI-G1).
+For more details on the methodology and evaluation, please refer to our [paper](https://arxiv.org/abs/2508.05731) and [repository](https://github.com/InfiXAI/InfiGUI-G1).
 
 ## Citation Information
 
 If you find this work useful, we would be grateful if you consider citing the following papers:
 
+```bibtex
+@misc{liu2025infiguig1advancingguigrounding,
+  title={InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization},
+  author={Yuhang Liu and Zeyu Liu and Shuanghe Zhu and Pengxiang Li and Congkai Xie and Jiasheng Wang and Xueyu Hu and Xiaotian Han and Jianbo Yuan and Xinyao Wang and Shengyu Zhang and Hongxia Yang and Fei Wu},
+  year={2025},
+  eprint={2508.05731},
+  archivePrefix={arXiv},
+  primaryClass={cs.AI},
+  url={https://arxiv.org/abs/2508.05731},
+}
+```
+
 ```bibtex
 @article{liu2025infigui,
   title={InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners},
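The AEPO summary in the README text above is prose-only. As a rough intuition for how a multi-answer rollout could be scored against a verifiable target, here is a minimal toy sketch: it assumes point-style answers, a bounding-box hit test, and a simple 1/k scaling, all of which are illustrative stand-ins rather than the paper's actual adaptive reward function.

```python
"""Toy sketch of the multi-answer reward idea summarized in the README.

This is NOT the paper's AEPO objective: the candidate format, the hit test,
and the 1/k scaling below are illustrative assumptions only.
"""
import random
from typing import List, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def hit(point: Point, box: BBox) -> bool:
    """Verifiable binary check: does a predicted click land inside the target element?"""
    x, y = point
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def multi_answer_reward(candidates: List[Point], target: BBox) -> float:
    """Score one multi-answer rollout.

    With a single answer, RLVR gives reward only when that one guess is
    correct, so hard grounding queries yield almost no learning signal.
    Scoring k candidates jointly rewards any hit, while dividing by k
    (an assumed placeholder for the paper's adaptive scaling) keeps the
    model from trivially spraying many guesses.
    """
    k = len(candidates)
    if k == 0:
        return 0.0
    return 1.0 / k if any(hit(p, target) for p in candidates) else 0.0


if __name__ == "__main__":
    target_box = (100.0, 40.0, 180.0, 70.0)  # made-up bbox for a target element
    rollout = [(random.uniform(0, 300), random.uniform(0, 100)) for _ in range(4)]
    print(f"reward = {multi_answer_reward(rollout, target_box):.2f}")
```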