Co-Reward is a self-supervised reinforcement learning method for LLM reasoning, which leverages contrastive agreement between original and rephrased q
AI & ML interests
Trustworthy Machine Learning and Reasoning