Update README.md
README.md
CHANGED
@@ -1,21 +1,186 @@
---
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3_moe
---

- **Developed by:** Daemontatox
- **License:** apache-2.0
- **Finetuned from model:** Qwen/Qwen3-Coder-30B-A3B-Instruct

This qwen3_moe model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3_moe
- rust
- code-generation
- instruction-tuning
- open-source
library_name: transformers
base_model: Qwen/Qwen3-Coder-30B-A3B-Instruct
model_name: Daemontatox/HydraCoder
trained_with:
- Unsloth
- Hugging Face TRL
---

# Daemontatox/HydraCoder

HydraCoder is a state-of-the-art, Rust-specialized coding model built on Qwen/Qwen3-Coder-30B-A3B-Instruct and designed for high-fidelity, idiomatic Rust code generation, completion, and repair.

It is the strongest pure-Rust model to date, fine-tuned specifically on real-world projects, crates, compiler patterns, and Rust best practices.

## 🦀 Key Features

- **Focused on Rust:** Trained on diverse, idiomatic Rust repositories, including tokio, serde, actix, clap, and the async ecosystem.
- **Instruction-tuned:** Accepts natural-language instructions such as "write a TCP server" or "convert this struct to JSON".
- **Zero-shot capable:** Performs well without examples and adapts to Rust-specific patterns such as lifetimes, `Result<T, E>`, traits, ownership, and borrow checking (see the sketch below).

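The short program below is written for this card as an illustration (it is not HydraCoder output) of the constructs listed above: an explicit lifetime, a custom error type behind `Result<T, E>`, trait implementations, and borrowed data whose ownership never moves.

```rust
use std::fmt;

// Custom error type surfaced through Result<T, E>.
#[derive(Debug)]
struct ParseError(String);

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "parse error: {}", self.0)
    }
}

impl std::error::Error for ParseError {}

// The explicit lifetime ties the returned &str to the borrowed input,
// so no ownership is transferred and no allocation is needed.
fn first_number<'a>(lines: &'a [&'a str]) -> Result<(&'a str, i64), ParseError> {
    for &line in lines {
        if let Ok(n) = line.trim().parse::<i64>() {
            return Ok((line, n));
        }
    }
    Err(ParseError("no numeric line found".to_string()))
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let lines = ["header", " 42 ", "footer"];
    let (raw, value) = first_number(&lines)?;
    println!("parsed {} from {:?}", value, raw);
    Ok(())
}
```
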
---

## 🧠 Intended Use

HydraCoder is ideal for:

- Rust code generation from natural-language instructions
- Auto-completion and snippet insertion in editors
- Static-analysis assistant tools
- Compiler plugin or LSP augmentation
- Refactoring and code-review suggestions
- Code repair and bugfix generation (an illustrative repair is sketched below)

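As a sketch of the code-repair use case, the hand-written example below (not model output; `add_bonus` is an invented function) shows the kind of borrow-checker fix such a request typically involves: the original version held an immutable borrow while mutating the vector, which rustc rejects with E0502.

```rust
// Hypothetical repair target: the original kept an immutable borrow of
// `scores` (through `best`) alive while pushing to it. Copying the value
// out first ends the borrow, so the mutation is allowed.
fn add_bonus(scores: &mut Vec<u32>) {
    // Rejected original:
    //     let best = scores.iter().max().unwrap();
    //     scores.push(best + 10); // error[E0502]: cannot borrow `*scores` as mutable
    let best = scores.iter().max().copied().unwrap_or(0);
    scores.push(best + 10);
}

fn main() {
    let mut scores = vec![3, 8, 5];
    add_bonus(&mut scores);
    println!("{:?}", scores); // [3, 8, 5, 18]
}
```
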
---

## 🔬 Model Details

| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3-Coder-30B-A3B-Instruct |
| Fine-tuned Model | Daemontatox/HydraCoder |
| Model Type | Mixture-of-Experts (8 of 128 experts active per token) |
| Parameters | ~30B total, ~3.3B active per token |
| Domain Specialization | Idiomatic Rust code |
| Training Tooling | Unsloth + Hugging Face TRL |
| License | Apache 2.0 |

---

## ✍️ Example Prompt

**Prompt:**

Write a simple multithreaded web server in Rust that serves "Hello, world!" to any GET request.

**HydraCoder Output:**

```rust
use std::net::{TcpListener, TcpStream};
use std::io::{Read, Write};
use std::thread;

fn handle_client(mut stream: TcpStream) {
    let mut buffer = [0; 1024];
    if let Ok(_) = stream.read(&mut buffer) {
        let response = "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nHello, world!";
        let _ = stream.write(response.as_bytes());
    }
}

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:7878")?;
    println!("Server listening on port 7878...");

    for stream in listener.incoming() {
        match stream {
            Ok(stream) => {
                thread::spawn(|| {
                    handle_client(stream);
                });
            }
            Err(e) => eprintln!("Connection failed: {}", e),
        }
    }

    Ok(())
}
```

---

## ⚡ Inference Code

You can run inference with the `transformers` library's `text-generation` pipeline:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "Daemontatox/HydraCoder"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", trust_remote_code=True)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = "Write a function in Rust that takes a list of integers and returns the sum of all even numbers."

# A low temperature keeps the sampled code close to the model's top choice.
output = pipe(prompt, max_new_tokens=200, do_sample=True, temperature=0.2)[0]["generated_text"]
print(output)
```

---

## 🧪 Benchmarks (Qualitative)

HydraCoder performs especially well on:

- Rust versions of HumanEval / MBPP: solutions that compile and read idiomatically
- LeetCode-style Rust tasks (an illustrative example follows)
- Crate-specific patterns: macros, derive attributes, and lifetimes
- Ownership-safe solutions

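For concreteness, the snippet below is a hand-written example (not model-generated) of the LeetCode-style, ownership-safe Rust these qualitative checks target: it borrows the input slice, allocates only an index map, and returns indices instead of cloning data.

```rust
use std::collections::HashMap;

// Classic LeetCode-style task: find indices of two numbers summing to `target`.
fn two_sum(nums: &[i32], target: i32) -> Option<(usize, usize)> {
    let mut seen: HashMap<i32, usize> = HashMap::new();
    for (i, &n) in nums.iter().enumerate() {
        if let Some(&j) = seen.get(&(target - n)) {
            return Some((j, i));
        }
        seen.insert(n, i);
    }
    None
}

fn main() {
    assert_eq!(two_sum(&[2, 7, 11, 15], 9), Some((0, 1)));
    assert_eq!(two_sum(&[1, 2, 3], 100), None);
    println!("two_sum checks passed");
}
```

Returning `Option` keeps the "no answer" case explicit instead of panicking.
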
---

## 🔍 Limitations

- Trained for Rust only: not suited for general-purpose, multi-language tasks.
- May hallucinate external crate names or imports that are not given in the prompt.
- Output is not guaranteed to compile unless the prompt includes full context (see the sketch below for one way to supply it).

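For example, when a completion must fit existing code, pasting the relevant types into the prompt together with the instruction usually gives enough context for compiling output. The snippet below is a hypothetical prompt payload with a reference solution written for this card (not HydraCoder output); `User` and `find_by_email` are invented names.

```rust
// Context included in the prompt: the existing type the completion must use.
#[derive(Debug, Clone)]
pub struct User {
    pub id: u64,
    pub email: String,
}

// Task given to the model:
// "Write a function `find_by_email` that returns a reference to the matching
//  user from a slice, or None if no user has that email."
pub fn find_by_email<'a>(users: &'a [User], email: &str) -> Option<&'a User> {
    users.iter().find(|u| u.email == email)
}

fn main() {
    let users = vec![User { id: 1, email: "a@example.com".into() }];
    assert!(find_by_email(&users, "a@example.com").is_some());
    println!("lookup ok");
}
```
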
---

## ✅ License

Released under the Apache 2.0 License. Free for research and commercial use with attribution.

---

## 👨‍💻 Author

- **Model Developer:** Daemontatox
- **Base Model Author:** Qwen Team
- **Fine-tuned with:** Unsloth + TRL

---