mphielipp's Collections
Agentic RL
RL for Autoregressive Tasks
CUDA Optimization
Real2Sim2Real
LLM Training
Light TTS models
Datasets for Robotic Learning
Diffusion and RL
VLM
Visual Reasoning and LLMs
Diffusion Transformers
Robot Learning
Conditional Diffusion
SSMs and Diffusion
Grokking
Self Predicting Learning in RL
LLMs Evaluation
CV
VLA

LLM Training

updated 27 days ago

  • LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

    Paper • 2403.13372 • Published Mar 20, 2024 • 135

  • On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

    Paper • 2508.05629 • Published 29 days ago • 170