GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published 12 days ago • 151
Multi-Agent Game Generation and Evaluation via Audio-Visual Recordings Paper • 2508.00632 • Published 20 days ago • 3
The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm Paper • 2507.18553 • Published 27 days ago • 39
Specification Self-Correction: Mitigating In-Context Reward Hacking Through Test-Time Refinement Paper • 2507.18742 • Published 27 days ago • 5
Article Automated Discovery of High-Performance GPU Kernels with OpenEvolve By codelion • Jun 27 • 21
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization Paper • 2507.06181 • Published Jul 8 • 41
Configurable Preference Tuning ⚙️📝 Collection CPT uses rubric-guided synthetic data and DPO to enable LLMs to dynamically adjust behavior (e.g., writing style) at inference with system prompts • 7 items • Updated Jun 17 • 1
Configurable Preference Tuning with Rubric-Guided Synthetic Data Paper • 2506.11702 • Published Jun 13 • 2
Training-Free Tokenizer Transplantation via Orthogonal Matching Pursuit Paper • 2506.06607 • Published Jun 7 • 2
Atropos Artifacts Collection A collection of experimental artifacts created with Atropos, Nous' RL Environments framework - https://github.com/NousResearch/Atropos • 9 items • Updated 29 days ago • 10
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published Apr 29 • 97
Perception Encoder: The best visual embeddings are not at the output of the network Paper • 2504.13181 • Published Apr 17 • 35
ReZero: Enhancing LLM search ability by trying one-more-time Paper • 2504.11001 • Published Apr 15 • 15