MaxLSB committed
Commit 69ee8da · verified · 1 parent: 02e0524

Update README.md

Files changed (1):
  1. README.md: +4 -0
README.md CHANGED
@@ -19,12 +19,16 @@ pipeline_tag: text-generation

**Luth-0.6B-Instruct** is a French fine-tuned version of [Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B), trained on the [Luth-SFT](https://huggingface.co/datasets/kurakurai/luth-sft) dataset. It shows drastically improved French capabilities in instruction following, math, and general knowledge, while its English capabilities have remained stable and have even improved in some areas.

+ Our evaluation, training, and data scripts are available on [GitHub](https://github.com/kurakurai/Luth).
+
## Model Details

Luth was trained using full fine-tuning on the Luth-SFT dataset with [Axolotl](https://github.com/axolotl-ai-cloud/axolotl). The resulting model was then merged with the base Qwen3-0.6B model. This process retained the model's English capabilities while improving its performance on nearly all selected benchmarks in both French and English.

## Benchmark Results

+ We used LightEval for evaluation, with custom tasks for the French benchmarks. The models were evaluated with `temperature=0`.
+
**French Evaluation:**

![French Evaluation](media/french_evaluation.png)
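
The merge with the base Qwen3-0.6B model is only described at a high level in the README. As one hedged illustration (not the documented Luth recipe), a plain linear interpolation of the two checkpoints' weights could look like the sketch below; the ratio `alpha = 0.5` and the use of linear interpolation are assumptions for illustration.

```python
# Hypothetical sketch of a linear weight merge between the fine-tuned checkpoint and the base model.
# The actual merge method and ratio used for Luth are not documented here; alpha = 0.5 is arbitrary.
import torch
from transformers import AutoModelForCausalLM

alpha = 0.5  # weight given to the fine-tuned model (assumption)

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B", torch_dtype=torch.float32)
tuned = AutoModelForCausalLM.from_pretrained("kurakurai/Luth-0.6B-Instruct", torch_dtype=torch.float32)

base_state = base.state_dict()
merged_state = {}
for name, tuned_param in tuned.state_dict().items():
    # Parameter-wise interpolation; shapes match because both models share the Qwen3-0.6B architecture.
    merged_state[name] = alpha * tuned_param + (1.0 - alpha) * base_state[name]

base.load_state_dict(merged_state)
base.save_pretrained("luth-0.6b-merged-example")
```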
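
As a further illustrative aside (not part of the commit), a minimal usage sketch with `transformers` is shown below. The repo id `kurakurai/Luth-0.6B-Instruct` is assumed from the model name, and greedy decoding (`do_sample=False`) is used as the practical equivalent of the `temperature=0` evaluation setting mentioned above.

```python
# Minimal usage sketch. Assumptions: the repo id below, and that the Qwen3 chat template is bundled.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kurakurai/Luth-0.6B-Instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Build a chat-formatted prompt.
messages = [{"role": "user", "content": "Explique le théorème de Pythagore en une phrase."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# do_sample=False gives greedy decoding, the equivalent of temperature=0.
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```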