wenxueru

https://github.com/wenxueru

AI & ML interests

None yet

Recent Activity

upvoted a paper 17 days ago

Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering

upvoted a paper 17 days ago

SAISA: Towards Multimodal Large Language Models with Both Training and Inference Efficiency

upvoted a paper 17 days ago

Scalable Oversight for Superhuman AI via Recursive Self-Critiquing

Organizations

None yet

authored a paper 17 days ago

Coupled Variational Reinforcement Learning for Language Model General Reasoning

Paper • 2512.12576 • Published 22 days ago • 2

submitted a paper to Daily Papers 17 days ago

Coupled Variational Reinforcement Learning for Language Model General Reasoning

Paper • 2512.12576 • Published 22 days ago • 2

authored 10 papers 7 months ago

Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic

Paper • 2408.16326 • Published Aug 29, 2024 • 1

Scalable Oversight for Superhuman AI via Recursive Self-Critiquing

Paper • 2502.04675 • Published Feb 7, 2025 • 1

Cheems: A Practical Guidance for Building and Evaluating Chinese Reward Models from Scratch

Paper • 2502.17173 • Published Feb 24, 2025

On-Policy Self-Alignment with Fine-grained Knowledge Feedback for Hallucination Mitigation

Paper • 2406.12221 • Published Jun 18, 2024

Rethinking Reward Model Evaluation: Are We Barking up the Wrong Tree?

Paper • 2410.05584 • Published Oct 8, 2024

Offline Pseudo Relevance Feedback for Efficient and Effective Single-pass Dense Retrieval

Paper • 2308.10191 • Published Aug 20, 2023

The Devil Is in the Details: Tackling Unimodal Spurious Correlations for Generalizable Multimodal Reward Models

Paper • 2503.03122 • Published Mar 5, 2025 • 1

End-to-End Entity Detection with Proposer and Regressor

Paper • 2210.10260 • Published Oct 19, 2022

Type-supervised sequence labeling based on the heterogeneous star graph for named entity recognition

Paper • 2210.10240 • Published Oct 19, 2022

Transferable Post-training via Inverse Value Learning

Paper • 2410.21027 • Published Oct 28, 2024