SiriusL committed · verified
Commit f521761 · Parent(s): 9a420d1

Update README.md

Files changed (1): README.md (+14 -2)
README.md CHANGED
@@ -15,7 +15,7 @@ library_name: transformers
 
 # InfiGUI-G1-7B
 
-This repository contains the InfiGUI-G1-7B model from the paper **[InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization](https://github.com/InfiXAI/InfiGUI-R1)**.
+This repository contains the InfiGUI-G1-7B model from the paper **[InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization](https://arxiv.org/abs/2508.05731)**.
 
 The model is based on `Qwen2.5-VL-7B-Instruct` and is fine-tuned using our proposed **Adaptive Exploration Policy Optimization (AEPO)** framework. AEPO is a novel reinforcement learning method designed to enhance the model's **semantic alignment** for GUI grounding tasks. It overcomes the exploration bottlenecks of standard RLVR methods by integrating a multi-answer generation strategy with a theoretically-grounded adaptive reward function, enabling more effective and efficient learning for complex GUI interactions.
 
@@ -164,12 +164,24 @@ if __name__ == "__main__":
 
 To reproduce the results in our paper, please refer to our repo for detailed instructions.
 
-For more details on the methodology and evaluation, please refer to our [paper](https://github.com/InfiXAI/InfiGUI-R1) and [repository](https://github.com/InfiXAI/InfiGUI-G1).
+For more details on the methodology and evaluation, please refer to our [paper](https://arxiv.org/abs/2508.05731) and [repository](https://github.com/InfiXAI/InfiGUI-G1).
 
 ## Citation Information
 
 If you find this work useful, we would be grateful if you consider citing the following papers:
 
+```bibtex
+@misc{liu2025infiguig1advancingguigrounding,
+  title={InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization},
+  author={Yuhang Liu and Zeyu Liu and Shuanghe Zhu and Pengxiang Li and Congkai Xie and Jiasheng Wang and Xueyu Hu and Xiaotian Han and Jianbo Yuan and Xinyao Wang and Shengyu Zhang and Hongxia Yang and Fei Wu},
+  year={2025},
+  eprint={2508.05731},
+  archivePrefix={arXiv},
+  primaryClass={cs.AI},
+  url={https://arxiv.org/abs/2508.05731},
+}
+```
+
 ```bibtex
 @article{liu2025infigui,
   title={InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners},
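The AEPO summary in the README text above is prose-only. As a rough intuition for how a multi-answer rollout could be scored against a verifiable target, here is a minimal toy sketch: it assumes point-style answers, a bounding-box hit test, and a simple 1/k scaling, all of which are illustrative stand-ins rather than the paper's actual adaptive reward function.

```python
"""Toy sketch of the multi-answer reward idea summarized in the README.

This is NOT the paper's AEPO objective: the candidate format, the hit test,
and the 1/k scaling below are illustrative assumptions only.
"""
import random
from typing import List, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def hit(point: Point, box: BBox) -> bool:
    """Verifiable binary check: does a predicted click land inside the target element?"""
    x, y = point
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def multi_answer_reward(candidates: List[Point], target: BBox) -> float:
    """Score one multi-answer rollout.

    With a single answer, RLVR gives reward only when that one guess is
    correct, so hard grounding queries yield almost no learning signal.
    Scoring k candidates jointly rewards any hit, while dividing by k
    (an assumed placeholder for the paper's adaptive scaling) keeps the
    model from trivially spraying many guesses.
    """
    k = len(candidates)
    if k == 0:
        return 0.0
    return 1.0 / k if any(hit(p, target) for p in candidates) else 0.0


if __name__ == "__main__":
    target_box = (100.0, 40.0, 180.0, 70.0)  # made-up bbox for a target element
    rollout = [(random.uniform(0, 300), random.uniform(0, 100)) for _ in range(4)]
    print(f"reward = {multi_answer_reward(rollout, target_box):.2f}")
```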