M Saad Salman

MSS444

AI & ML interests

None yet

Recent Activity

upvoted a paper about 23 hours ago

Towards a Unified View of Large Language Model Post-Training

upvoted a paper 2 days ago

Benchmarking Optimizers for Large Language Model Pretraining

upvoted a paper 2 days ago

DCPO: Dynamic Clipping Policy Optimization

Organizations

None yet

upvoted a paper about 23 hours ago

Towards a Unified View of Large Language Model Post-Training

Paper • 2509.04419 • Published 1 day ago • 45

upvoted 5 papers 2 days ago

Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic

Paper • 2509.01363 • Published 5 days ago • 27

Jointly Reinforcing Diversity and Quality in Language Model Generations

Paper • 2509.02534 • Published 4 days ago • 22

upvoted 2 papers 3 days ago

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published 4 days ago • 76

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published 4 days ago • 141

upvoted 3 papers 4 days ago

Efficient Code Embeddings from Code Generation Models

Paper • 2508.21290 • Published 8 days ago • 15

Model-Task Alignment Drives Distinct RL Outcomes

Paper • 2508.21188 • Published 8 days ago • 8

PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning

Paper • 2508.21104 • Published 9 days ago • 27

upvoted 9 papers 5 days ago

StepWiser: Stepwise Generative Judges for Wiser Reasoning

Paper • 2508.19229 • Published 11 days ago • 19

Diffusion Language Models Know the Answer Before Decoding

Paper • 2508.19982 • Published 10 days ago • 22

Predicting the Order of Upcoming Tokens Improves Language Modeling

Paper • 2508.19228 • Published 11 days ago • 20

Beyond Transcription: Mechanistic Interpretability in ASR

Paper • 2508.15882 • Published 16 days ago • 83

Provable Benefits of In-Tool Learning for Large Language Models

Paper • 2508.20755 • Published 9 days ago • 9

TCIA: A Task-Centric Instruction Augmentation Method for Instruction Finetuning

Paper • 2508.20374 • Published 9 days ago • 21

AWorld: Orchestrating the Training Recipe for Agentic AI

Paper • 2508.20404 • Published 9 days ago • 37

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published 9 days ago • 56

USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning

Paper • 2508.18966 • Published 11 days ago • 55
