- Snowflake/Arctic-Text2SQL-R1-7B
  8B • Updated • 5.87k • 42
- Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
  Paper • 2505.24726 • Published • 271
- Reinforcement Pre-Training
  Paper • 2506.08007 • Published • 260
- Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
  Paper • 2506.16406 • Published • 126

Collections including paper arxiv:2508.18076

- Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
  Paper • 2506.06395 • Published • 130
- Magistral
  Paper • 2506.10910 • Published • 64
- Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs
  Paper • 2506.07240 • Published • 7
- Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
  Paper • 2506.09991 • Published • 56

- MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities
  Paper • 2408.00765 • Published • 14
- Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent
  Paper • 2407.21646 • Published • 18
- LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection
  Paper • 2408.04284 • Published • 26
- Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability
  Paper • 2408.07852 • Published • 16

- Deep Think with Confidence
  Paper • 2508.15260 • Published • 81
- Mind the Generation Process: Fine-Grained Confidence Estimation During LLM Generation
  Paper • 2508.12040 • Published • 14
- InternalInspector I^2: Robust Confidence Estimation in LLMs through Internal States
  Paper • 2406.12053 • Published
- Neither Valid nor Reliable? Investigating the Use of LLMs as Judges
  Paper • 2508.18076 • Published • 5

- ComfyUI-R1: Exploring Reasoning Models for Workflow Generation
  Paper • 2506.09790 • Published • 54
- Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance
  Paper • 2506.06444 • Published • 74
- DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
  Paper • 2506.11763 • Published • 70
- Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research
  Paper • 2502.04644 • Published • 3