---
license: apache-2.0
datasets:
- ScalingIntelligence/KernelBench
- LiShangZ/TritonBench
pipeline_tag: text-generation
library_name: transformers
tags:
- code-generation
---

# AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs

This repository contains **AutoTriton**, an 8B-parameter model for Triton programming, built on Seed-Coder-8B-Reasoning and trained with supervised fine-tuning followed by reinforcement learning.

The model was presented in the paper [AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs](https://huggingface.co/papers/2507.05687).

## Model Overview

AutoTriton is the first model dedicated to Triton programming powered by reinforcement learning (RL). It addresses the complex challenges in deep learning kernel development by automating the optimization of computational units, memory management, parallelism, and hardware-specific parameters that typically require extensive manual tuning.
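
To make concrete what those choices look like, below is a standard tutorial-style Triton vector-add kernel, written by hand here for illustration (it is not model output). The block size, launch grid, and bounds mask are exactly the hardware-facing parameters the model must get right.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                      # one program per block of elements
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                      # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n_elements = out.numel()
    grid = (triton.cdiv(n_elements, 1024),)          # 1D launch grid
    add_kernel[grid](x, y, out, n_elements, BLOCK_SIZE=1024)
    return out
```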

The model's training process involves two sequential stages:
1. **Supervised Fine-Tuning (SFT)**: AutoTriton is first equipped with essential Triton programming expertise using a high-quality data-gathering pipeline.
2. **Reinforcement Learning (RL)**: It then undergoes RL with the Group Relative Policy Optimization (GRPO) algorithm, combining a rule-based reward and an execution-based reward to further enhance its Triton programming ability; a toy sketch of this reward design follows the list.
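
The exact reward implementation lives in the paper and repository; the snippet below is only a rough, illustrative sketch of how a rule-based format check and an execution-based correctness check might be combined into the scalar reward GRPO optimizes. The format rules, the 0.2/0.8 weighting, the `add` entry point, and the random-input comparison are all assumptions for illustration, not the paper's implementation.

```python
import re
import torch

def rule_reward(response: str) -> float:
    """Rule-based format check: the response must define a @triton.jit kernel
    and a Python wrapper function (illustrative rules, not the paper's)."""
    has_kernel = re.search(r"@triton\.jit", response) is not None
    has_wrapper = re.search(r"\ndef \w+\(", response) is not None
    return 1.0 if (has_kernel and has_wrapper) else 0.0

def execution_reward(code: str, entry_point: str, reference) -> float:
    """Execution-based correctness check: run the generated wrapper on random
    inputs and compare with a PyTorch reference. Assumes `code` has already
    been stripped to raw Python and runs inside a sandboxed CUDA worker."""
    namespace = {}
    try:
        exec(code, namespace)                   # define the kernel and wrapper
        x = torch.randn(4096, device="cuda")
        y = torch.randn(4096, device="cuda")
        out = namespace[entry_point](x, y)      # call the generated wrapper
        return 1.0 if torch.allclose(out, reference(x, y), atol=1e-4) else 0.0
    except Exception:
        return 0.0                              # compile or runtime failures earn nothing

def reward(response: str) -> float:
    """Scalar reward fed to GRPO: a small format credit plus a large credit
    for a kernel that actually runs and matches the reference."""
    if rule_reward(response) == 0.0:
        return 0.0
    return 0.2 + 0.8 * execution_reward(response, "add", lambda a, b: a + b)
```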

This approach underscores the promise of RL for automatically generating high-performance kernels, which are core components for building more efficient AI systems.

## Evaluation

Experiments across five evaluation channels of TritonBench and KernelBench show that the 8B AutoTriton model achieves performance comparable to mainstream large models, including Claude-4-Sonnet and DeepSeek-R1-0528. Further analysis highlights the crucial role of each module within AutoTriton, including the SFT stage, the RL stage, and the reward design strategy.

## Usage

This model is compatible with the `transformers` library and can be loaded and used as follows:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ai9stars/AutoTriton"  # Replace with the actual model ID if different

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # 8B weights; half precision keeps memory manageable
    device_map="auto",
)

# Example usage for Triton kernel code generation
prompt = "Use triton language to write an add kernel for me."

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048)  # kernels plus reasoning traces can be long
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
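
Because the Seed-Coder-8B-Reasoning base is a chat-style reasoning model, the checkpoint will likely respond better to a chat-formatted prompt. Assuming the tokenizer ships a chat template (worth verifying for this checkpoint), the generation call above could be replaced with:

```python
# Continues from the snippet above; applicable only if the tokenizer
# provides a chat template.
messages = [{"role": "user", "content": prompt}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=2048)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```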

## GitHub Repository

For more details on the project, including the full code, training scripts, and additional benchmarks, please refer to the official GitHub repository:
[https://github.com/AI9Stars/AutoTriton](https://github.com/AI9Stars/AutoTriton)

## Citation

If you find this work useful, please consider citing our paper:

```bibtex
@article{li2025autotriton,
  title={AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs},
  author={Li, Shangzhan and Wang, Zefan and He, Ye and Li, Yuxuan and Shi, Qi and Li, Jianling and Hu, Yonggang and Che, Wanxiang and Han, Xu and Liu, Zhiyuan and others},
  journal={arXiv preprint arXiv:2507.05687},
  year={2025}
}
```