Text Generation
Transformers
Safetensors
llama
code-generation
conversational
text-generation-inference

Improve model card with metadata, paper link, and usage example

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +64 -2
README.md CHANGED
@@ -1,3 +1,65 @@
1
- ## AutoTriton
2
 
3
- AutoTriton is an 8B parameter model for Triton programming, which is trained based on Seed-Coder-8B-Reasoning via supervised fine-tuning and reinforcement learning sequentially. We leverage TritonBench and KernelBench as our benchmarks. For more details, please see our repo at [https://github.com/AI9Stars/AutoTriton](https://github.com/AI9Stars/AutoTriton).
1
+ ---
2
+ pipeline_tag: text-generation
3
+ library_name: transformers
4
+ tags:
5
+ - code-generation
6
+ ---
7
 
8
+ # AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs
9
+
10
+ This repository contains **AutoTriton**, an 8B-parameter model for Triton programming. It is trained from Seed-Coder-8B-Reasoning, first with supervised fine-tuning and then with reinforcement learning.
11
+
12
+ The model was presented in the paper [AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs](https://huggingface.co/papers/2507.05687).
13
+
14
+ ## Model Overview
15
+
16
+ AutoTriton is the first model dedicated to Triton programming powered by reinforcement learning (RL). It addresses the complex challenges in deep learning kernel development by automating the optimization of computational units, memory management, parallelism, and hardware-specific parameters that typically require extensive manual tuning.
17
+
18
+ The model's training process involves two sequential stages:
19
+ 1. **Supervised Fine-Tuning (SFT)**: AutoTriton is first equipped with essential Triton programming expertise using a high-quality data gathering pipeline.
20
+ 2. **Reinforcement Learning (RL)**: It then undergoes RL with the Group Relative Policy Optimization (GRPO) algorithm, combining a rule-based reward and an execution-based reward to further enhance its Triton programming ability.
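The RL stage described above can be illustrated with a minimal sketch of the group-relative advantage used by GRPO-style training. The card does not specify the reward definitions, so everything concrete here is an assumption made for illustration: `rule_reward`, `combined_reward`, the `@triton.jit` string check, and the multiplicative way the two rewards are combined are hypothetical stand-ins, not the authors' implementation.

```python
from statistics import mean, pstdev


def rule_reward(code: str) -> float:
    """Hypothetical rule-based check: does the text look like a Triton kernel?"""
    return 1.0 if "@triton.jit" in code else 0.0


def combined_reward(code: str, passed_execution: bool) -> float:
    """Hypothetical combination: execution credit counts only if the rule check passes."""
    return rule_reward(code) * (1.0 if passed_execution else 0.0)


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and spread of its own sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations differ on this choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical completions sampled for the same prompt,
# paired with a made-up flag for whether the generated kernel passed its tests.
samples = [
    ("@triton.jit\ndef k(x_ptr): ...", True),
    ("@triton.jit\ndef k(x_ptr): ...", False),
    ("def k(x): return x", True),
    ("@triton.jit\ndef k(x_ptr): ...", True),
]

rewards = [combined_reward(code, ok) for code, ok in samples]
advantages = group_relative_advantages(rewards)

print(rewards)
print(advantages)
```

In this toy group, completions that both look like Triton kernels and pass execution receive positive advantages, while the rest receive negative ones. Because advantages are computed relative to the group, no separate value model is needed to provide a baseline.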
21
+
22
+ This approach underscores the promise of RL for automatically generating high-performance kernels, which are core components for building more efficient AI systems.
23
+
24
+ ## Evaluation
25
+
26
+ Experiments across five evaluation channels of TritonBench and KernelBench illustrate that the 8B AutoTriton model achieves performance comparable to mainstream large models, including Claude-4-Sonnet and DeepSeek-R1-0528. Further analysis highlights the crucial role of each module within AutoTriton, including the SFT stage, the RL stage, and the reward design strategy.
27
+
28
+ ## Usage
29
+
30
+ This model is compatible with the `transformers` library and can be loaded and used as follows:
31
+
32
+ ```python
33
+ from transformers import AutoModelForCausalLM, AutoTokenizer
34
+
35
+ model_name = "AI9Stars/AutoTriton" # Replace with the actual model ID if different
36
+
37
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
38
+ model = AutoModelForCausalLM.from_pretrained(model_name)
39
+
40
+ # Example: ask the model to continue a Triton kernel signature (raw completion prompt; no chat template is applied)
41
+ prompt = "def my_triton_kernel(x_ptr, Y_ptr, X, Y, BLOCK_SIZE: tl.constexpr):"
42
+
43
+ inputs = tokenizer(prompt, return_tensors="pt")
44
+ outputs = model.generate(**inputs, max_new_tokens=100)
45
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
46
+ ```
47
+
48
+ ## GitHub Repository
49
+
50
+ For more details on the project, including the full code, training scripts, and additional benchmarks, please refer to the official GitHub repository:
51
+ [https://github.com/AI9Stars/AutoTriton](https://github.com/AI9Stars/AutoTriton)
52
+
53
+ ## Citation
54
+
55
+ If you find this work useful, please consider citing our paper:
56
+
57
+ ```bibtex
58
+ @article{li2025autotriton,
59
+ title={AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs},
60
+ author={Li, Jiajun and Xiao, Wenbo and Deng, Xuan and Lu, Wenbin and Peng, Shiqi and Ma, Xiaokang and Gao, Fan},
61
+ journal={arXiv preprint arXiv:2507.05687},
62
+ year={2025},
63
+ url={https://arxiv.org/abs/2507.05687}
64
+ }
65
+ ```