@@ -1,3 +1,65 @@
-## AutoTriton
-AutoTriton is an 8B parameter model for Triton programming, which is trained based on Seed-Coder-8B-Reasoning via supervised fine-tuning and reinforcement learning sequentially. We leverage TritonBench and KernelBench as our benchmarks. For more details, please see our repo at [https://github.com/AI9Stars/AutoTriton](https://github.com/AI9Stars/AutoTriton).

+---
+pipeline_tag: text-generation
+library_name: transformers
+tags:
+- code-generation
+---
+# AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs
+This repository contains the **AutoTriton** model, an 8B-parameter model for Triton programming. It is trained from Seed-Coder-8B-Reasoning with supervised fine-tuning followed by reinforcement learning.
+The model was presented in the paper [AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs](https://huggingface.co/papers/2507.05687).
+## Model Overview
+AutoTriton is the first model dedicated to Triton programming powered by reinforcement learning (RL). It addresses the complex challenges in deep learning kernel development by automating the optimization of computational units, memory management, parallelism, and hardware-specific parameters that typically require extensive manual tuning.
+The model's training process involves two sequential stages:
+1.  **Supervised Fine-Tuning (SFT)**: AutoTriton is first equipped with essential Triton programming expertise using a high-quality data gathering pipeline.
+2.  **Reinforcement Learning (RL)**: It then undergoes RL with the Group Relative Policy Optimization (GRPO) algorithm, combining a rule-based reward and an execution-based reward to further enhance its Triton programming ability.
+This approach underscores the promise of RL for automatically generating high-performance kernels, which are core components for building more efficient AI systems.
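The RL stage described above can be illustrated with a minimal sketch of the group-relative advantage that GRPO computes over a group of sampled completions. This is not the training code from the paper: the `combined_reward` rule (reward 1.0 only when both checks pass), the use of the population standard deviation, and the example group are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def combined_reward(rule_ok: bool, exec_ok: bool) -> float:
    """Hypothetical reward: 1.0 only if both checks pass, else 0.0.

    rule_ok stands in for a rule-based (static) check on the generated code,
    exec_ok for an execution-based check. How AutoTriton actually combines
    the two rewards is not specified here; this is an assumption.
    """
    return 1.0 if (rule_ok and exec_ok) else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own group's mean and std.

    Assumption: population standard deviation, with eps for stability
    when every sample in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group: four kernels sampled for the same prompt,
# each with (rule check passed, execution check passed).
checks = [(True, True), (True, False), (False, False), (True, True)]
rewards = [combined_reward(rule_ok, exec_ok) for rule_ok, exec_ok in checks]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f} advantage={advantage:+.3f}")
```

Samples that beat their group's average get a positive advantage and the rest a negative one, so no separate value model is needed to score a completion.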
+## Evaluation
+Experiments across five evaluation channels of TritonBench and KernelBench illustrate that the 8B AutoTriton model achieves performance comparable to mainstream large models, including Claude-4-Sonnet and DeepSeek-R1-0528. Further analysis highlights the crucial role of each module within AutoTriton, including the SFT stage, the RL stage, and the reward design strategy.
+## Usage
+This model is compatible with the `transformers` library and can be loaded and used as follows:
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_name = "AI9Stars/AutoTriton"  # Replace with the actual model ID if different
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name)
+
+# Example usage for Triton kernel code generation: the prompt is the start of
+# a kernel signature, and the model continues it.
+prompt = "def my_triton_kernel(x_ptr, y_ptr, n_elements, BLOCK_SIZE: tl.constexpr):"
+# Keep the input tensors on the same device as the model
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+outputs = model.generate(**inputs, max_new_tokens=100)
+# Decode the full sequence (prompt plus generated continuation)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
+## GitHub Repository
+For more details on the project, including the full code, training scripts, and additional benchmarks, please refer to the official GitHub repository:
+[https://github.com/AI9Stars/AutoTriton](https://github.com/AI9Stars/AutoTriton)
+## Citation
+If you find this work useful, please consider citing our paper:
+```bibtex
+@article{li2025autotriton,
+  title={AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs},
+  author={Li, Jiajun and Xiao, Wenbo and Deng, Xuan and Lu, Wenbin and Peng, Shiqi and Ma, Xiaokang and Gao, Fan},
+  journal={arXiv preprint arXiv:2507.05687},
+  year={2025},
+  url={https://arxiv.org/abs/2507.05687}
+}
+```