|
---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3_moe
- rust
- code-generation
- instruction-tuning
- open-source
library_name: transformers
base_model: Qwen/Qwen3-Coder-30B-A3B-Instruct
model_name: Daemontatox/HydraCoder
trained_with:
- Unsloth
- Hugging Face TRL
datasets:
- Tesslate/Rust_Dataset
- ysr/rust_instruction_dataset
- saurabh5/rlvr-code-data-Rust
---
|
 |
|
# Daemontatox/HydraCoder
|
|
|
HydraCoder is a Rust-specialized coding model built on Qwen/Qwen3-Coder-30B-A3B-Instruct, designed for high-fidelity, idiomatic Rust code generation, completion, and repair.

It is fine-tuned on real-world projects, crates, compiler patterns, and Rust best practices, with the goal of being one of the strongest open pure-Rust coding models available.
|
|
|
## 🦀 Key Features
|
|
|
- **Focused on Rust:** Trained on diverse idiomatic Rust repositories, including tokio, serde, actix, clap, and the async ecosystem.
- **Instruction-tuned:** Accepts natural instructions such as "write a TCP server" or "convert this struct to JSON".
- **Zero-shot capable:** Performs well without examples and adapts to Rust-specific patterns such as lifetimes, `Result<T, E>`, traits, ownership, and borrow checking.
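The patterns listed above are the kind of idiomatic Rust the model is tuned to produce. As a hand-written illustration (not model output), a fallible parsing task naturally combines borrowed `&str` input, iterators, and `Result<T, E>`:

```rust
use std::num::ParseIntError;

// Sum whitespace-separated integers; summing an iterator of Results
// short-circuits on the first Err, so the caller sees exactly one error.
fn sum_valid(input: &str) -> Result<i64, ParseIntError> {
    input
        .split_whitespace()
        .map(|tok| tok.parse::<i64>())
        .sum()
}

fn main() {
    assert_eq!(sum_valid("1 2 3"), Ok(6));
    assert!(sum_valid("1 two 3").is_err());
    println!("ok");
}
```

Note how the borrow of `input` and the error type are both explicit in the signature, which is the style the fine-tuning targets.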
|
|
|
|
|
|
|
--- |
|
# Training parameters |
|
|
|
- Max sequence length: 8192
- LoRA rank r = 32, alpha = 64
- bias: none
- LoRA dropout: 0.01
- Learning rate: 2e-4 or 2e-5, depending on your dataset
- Epochs: 2
- LR scheduler: cosine
- Weight decay: 0.05
- Warmup ratio: 0.02
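These hyperparameters map onto a PEFT `LoraConfig` roughly as follows. This is a sketch, not the exact training script; in particular, `target_modules` is an assumption (a common choice for Qwen-style attention and MLP layers):

```python
# Sketch only: mirrors the hyperparameters listed above.
# target_modules is an assumption, not taken from the original training run.
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.01,
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Trainer side (e.g. TRL's SFT settings): learning_rate=2e-4 (or 2e-5),
# num_train_epochs=2, lr_scheduler_type="cosine",
# weight_decay=0.05, warmup_ratio=0.02, max sequence length 8192.
```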
|
|
|
### System prompt
|
```
You are a reasoning-focused AI assistant with expertise in Rust and large language models (LLMs).
Your goal is to solve tasks by thinking step-by-step, applying principles of systems programming, memory safety, and performance-aware design.
Use logical deduction, structured thinking, and factual grounding rooted in the Rust ecosystem and machine learning best practices.
Ask for clarification if the input is ambiguous.
Keep your answers concise but well-justified, referencing relevant Rust constructs or ML paradigms when helpful.

Approach this like an intermediate-level Rust and LLM engineer.
Break down the problem into parts, such as data ownership, type safety, concurrency, or model architecture.
Identify assumptions, make inferences, and evaluate alternatives with a focus on correctness and efficiency.
Avoid overconfidence.
Explain your reasoning clearly, even if the final answer is simple.

prompt:
{}

Reasoning:
{}

response:
{}
```
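The trailing `{}` slots in the template above are plain `str.format` placeholders. One way to assemble a full prompt (a sketch; the `build_prompt` helper name is ours, and the template is abbreviated to its final three fields):

```python
# Fill the prompt / Reasoning / response slots of the template shown above.
# At inference time, reasoning and response are left empty for the model to fill.
TEMPLATE = """prompt:
{}

Reasoning:
{}

response:
{}"""

def build_prompt(prompt: str, reasoning: str = "", response: str = "") -> str:
    return TEMPLATE.format(prompt, reasoning, response)

text = build_prompt("Write a TCP echo server in Rust.")
print(text)
```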
|
|
|
## 🧠 Intended Use
|
|
|
HydraCoder is ideal for:

- Rust code generation from natural-language instructions
- Auto-completion and snippet insertion in editors
- Static-analysis assistant tools
- Compiler-plugin or LSP augmentation
- Refactoring and code-review suggestions
- Code repair and bugfix generation
|
|
|
|
|
|
|
--- |
|
|
|
## 🔬 Model Details
|
|
|
| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3-Coder-30B-A3B-Instruct |
| Fine-tuned Model | Daemontatox/HydraCoder |
| Model Type | Mixture-of-Experts (8 of 128 experts active per token) |
| Parameters | ~30B total, ~3B active per token (A3B) |
| Domain Specialization | Idiomatic Rust code |
| Training Tooling | Unsloth + Hugging Face TRL |
| License | Apache 2.0 |
|
|
|
|
|
|
|
--- |
|
|
|
## ✍️ Example Prompt
|
|
|
Prompt: |
|
|
|
Write a simple multithreaded web server in Rust that serves "Hello, world!" to any GET request. |
|
|
|
HydraCoder Output: |
|
```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

fn handle_client(mut stream: TcpStream) {
    let mut buffer = [0; 1024];
    if let Ok(_) = stream.read(&mut buffer) {
        let response = "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nHello, world!";
        let _ = stream.write(response.as_bytes());
    }
}

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:7878")?;
    println!("Server listening on port 7878...");

    for stream in listener.incoming() {
        match stream {
            Ok(stream) => {
                thread::spawn(|| {
                    handle_client(stream);
                });
            }
            Err(e) => eprintln!("Connection failed: {}", e),
        }
    }

    Ok(())
}
```
|
|
|
--- |
|
|
|
## ⚡ Inference Code
|
|
|
You can run inference with `transformers` and the `text-generation` pipeline:
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "Daemontatox/HydraCoder"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", trust_remote_code=True)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = "Write a function in Rust that takes a list of integers and returns the sum of all even numbers."

output = pipe(prompt, max_new_tokens=200, do_sample=True, temperature=0.2)[0]["generated_text"]
print(output)
```
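Note that by default the pipeline returns the prompt concatenated with the continuation. A small stdlib helper to isolate only the generated text (`strip_prompt` is our name, not a library function; alternatively, pass `return_full_text=False` to the pipeline call):

```python
def strip_prompt(generated_text: str, prompt: str) -> str:
    # The pipeline's default output repeats the prompt verbatim at the start;
    # remove it to keep only the model's continuation.
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    return generated_text

# Hypothetical pipeline output for a short prompt:
full = "Write a sum function.\nfn sum(v: &[i32]) -> i32 { v.iter().sum() }"
print(strip_prompt(full, "Write a sum function."))
```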
|
|
|
--- |
|
|
|
## 🧪 Benchmarks (Qualitative)
|
|
|
HydraCoder performs especially well on:

- Rust code benchmarks (HumanEval / MBPP translated to Rust), producing code that compiles and is idiomatic
- LeetCode-style Rust tasks
- Crate-specific patterns, including macros, derive attributes, and lifetimes
- Ownership-safe solutions
|
|
|
|
|
|
|
--- |
|
|
|
## 🔍 Limitations
|
|
|
- Trained for Rust only; not suited for general-purpose, multi-language tasks.
- May hallucinate external crate names or imports that are not in the prompt.
- Generated code is not guaranteed to compile unless the prompt includes full context.
|
|
|
|
|
|
|
--- |
|
|
|
## ✅ License
|
|
|
Released under the Apache 2.0 License. Free for research and commercial use with attribution. |
|
|
|
|
|
--- |
|
|
|
## 👨‍💻 Author
|
|
|
- **Model Developer:** Daemontatox
- **Base Model Author:** Qwen Team
- **Fine-tuned with:** Unsloth + Hugging Face TRL
|
|
|
|
|
|
|
--- |