- AdithyaSK/Qwen-0.5b-Code-Reasoning
  Text Generation • 0.5B • Updated • 5 • 1
- AdithyaSK/Qwen-1.5b-Code-Reasoning
  Text Generation • 2B • Updated • 4 • 1
- AdithyaSK/Qwen-0.5b-Code-Reasoning-v1
  Text Generation • 0.5B • Updated • 6 • 1
- AdithyaSK/Llama-3b-Code-Reasoning
  Text Generation • 3B • Updated • 2 • 1
Adithya S K
AI & ML interests
None yet
Recent Activity
updated a Space about 15 hours ago
AdithyaSK/jupyter-agent-openenv
published a Space about 15 hours ago
AdithyaSK/jupyter-agent-openenv
reacted to qgallouedec's post with 🔥 2 days ago
TRL v1.0 is out!
Hugging Face's TRL library is downloaded 3 million times a month. Over 130k models trained with it are public on the Hub, and major projects like @unsloth and @axolotl-ai-co build directly on top of it. v1.0 is the moment we acknowledged that responsibility explicitly, with a real stability contract.
The field hasn't settled. Building stable software in a domain that keeps invalidating its own assumptions is the actual problem we're solving. The answer is a design that can absorb the next shift without breaking what people rely on.
What's in v1.0:
Deep Hugging Face integration, low infrastructure burden
What's next: asynchronous GRPO, better scaling support, and making training legible enough that agents can inspect and steer it.
```
pip install --upgrade trl
```
Read more: hf.co/blog/trl-v1