Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models • arXiv:2310.10639 • Published Oct 16, 2023
Vision-Language Models Provide Promptable Representations for Reinforcement Learning • arXiv:2402.02651 • Published Feb 5, 2024
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL • arXiv:2402.19446 • Published Feb 29, 2024
Unfamiliar Finetuning Examples Control How Language Models Hallucinate • arXiv:2403.05612 • Published Mar 8, 2024
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data • arXiv:2404.14367 • Published Apr 22, 2024
RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold • arXiv:2406.14532 • Published Jun 20, 2024
Recursive Introspection: Teaching Language Model Agents How to Self-Improve • arXiv:2407.18219 • Published Jul 25, 2024
Generative Verifiers: Reward Modeling as Next-Token Prediction • arXiv:2408.15240 • Published Aug 27, 2024
Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning • arXiv:2410.08146 • Published Oct 10, 2024
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance • arXiv:2410.13816 • Published Oct 17, 2024
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data • arXiv:2412.07762 • Published Dec 10, 2024
Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models • arXiv:2412.15287 • Published Dec 18, 2024