wangclnlp committed
Commit 21503fb · verified · 1 Parent(s): 3eb74f3

Update README.md

Files changed (1):
  1. README.md +10 -2
README.md CHANGED
@@ -220,6 +220,14 @@ print(f"The better response is response{max(set(res), key=res.count)} in {k} vot
  Tips: To accelerate inference, GRAM-R^2 can be run with [vLLM](https://github.com/vllm-project/vllm) using multiple processes and threads. We also provide this script as a reference implementation at [this](https://github.com/wangclnlp/GRAM/tree/main/extensions/GRAM-RR).
 
  ### Citation
- ```bash
- coming soon
+ ```
+ @misc{wang2025gramr2,
+ title={GRAM-R$^2$: Self-Training Generative Foundation Reward Models for Reward Reasoning},
+ author={Chenglong Wang and Yongyu Mu and Hang Zhou and Yifu Huo and Ziming Zhu and Jiali Zeng and Murun Yang and Bei Li and Tong Xiao and Xiaoyang Hao and Chunliang Zhang and Fandong Meng and Jingbo Zhu},
+ year={2025},
+ eprint={2509.02492},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2509.02492},
+ }
  ```
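The context line in the hunk above points to vLLM for accelerated inference and to the reference script in the GRAM repository. As a rough illustration only, here is a minimal sketch of that kind of setup; the model id, prompt format, and sampling settings below are assumptions, not the official script, which lives at https://github.com/wangclnlp/GRAM/tree/main/extensions/GRAM-RR.

```python
# Minimal vLLM inference sketch (illustrative only; see the GRAM repo for the
# actual reference implementation and prompt format).
from vllm import LLM, SamplingParams

# Placeholder: substitute the actual GRAM-R^2 checkpoint path or Hub id.
MODEL_ID = "path/or/hub-id/of/GRAM-RR"

# tensor_parallel_size splits the model across GPUs; adjust to your hardware.
llm = LLM(model=MODEL_ID, tensor_parallel_size=1)

# Sample k completions per prompt and take a majority vote over the judged
# winner, mirroring the k-vote pattern in the hunk header above.
sampling = SamplingParams(temperature=0.7, max_tokens=1024, n=8)

prompts = ["<judge prompt comparing response1 and response2>"]  # placeholder
outputs = llm.generate(prompts, sampling)

for out in outputs:
    votes = []
    for cand in out.outputs:
        # Toy vote extraction; the real script parses the model's
        # reward-reasoning output before voting.
        votes.append("response1" if "response1" in cand.text else "response2")
    print(f"The better response is {max(set(votes), key=votes.count)} in {len(votes)} votes.")
```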