license: apache-2.0
language:
- en
- fr
- de
- es
- it
- pt
base_model:
- alamios/Qwenstral-Small-3.1-0.5B
datasets:
- alamios/Mistral-Small-24B-Instruct-2501-Conversations
pipeline_tag: text-generation
library_name: transformers
tags:
- qwen
- qwen2.5
- mistral
- mistral-small
- mistral-small-3.1
### exl2 quant (measurement.json in main branch)
---
### check revisions for quants
---
# Mistral-Small-3.1-DRAFT-0.5B
This model is intended for use as a draft model for speculative decoding with [mistralai/Mistral-Small-3.1-24B-Instruct-2503](https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503) or [mistralai/Mistral-Small-24B-Instruct-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501).
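
For readers new to speculative decoding, the following is a minimal, dependency-free sketch of the greedy draft-and-verify loop. It is a toy illustration, not this repository's inference code: `speculative_generate`, `toy_target` and `toy_draft` are hypothetical names, and the two "models" are plain functions over integer tokens standing in for the 24B target and the 0.5B draft. With real models, the same idea is available in `transformers` through the `assistant_model` argument of `generate`.

```python
from typing import Callable, List

# A "model" here is any function mapping a token context to its greedy next token.
NextToken = Callable[[List[int]], int]


def speculative_generate(
    target_next: NextToken,
    draft_next: NextToken,
    prompt: List[int],
    max_new_tokens: int,
    draft_len: int = 4,
) -> List[int]:
    """Greedy speculative decoding; output matches what the target alone would emit."""
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    while len(tokens) < goal:
        # 1. The cheap draft model proposes a short run of tokens.
        proposal: List[int] = []
        for _ in range(min(draft_len, goal - len(tokens))):
            proposal.append(draft_next(tokens + proposal))

        # 2. The target checks each proposed position. A real implementation
        #    scores all positions in one forward pass; this loop only mimics
        #    the accept/reject logic.
        accepted: List[int] = []
        for guess in proposal:
            expected = target_next(tokens + accepted)
            accepted.append(expected)  # equals `guess` whenever the draft was right
            if guess != expected:
                break  # first mismatch: keep the target's token, drop the rest
        tokens.extend(accepted)
    return tokens


# Toy stand-ins (assumptions for illustration only).
def toy_target(ctx: List[int]) -> int:
    return (ctx[-1] * 2 + 1) % 7


def toy_draft(ctx: List[int]) -> int:
    # Agrees with the target except when the context length is a multiple of 3.
    return 0 if len(ctx) % 3 == 0 else toy_target(ctx)


result = speculative_generate(toy_target, toy_draft, prompt=[1], max_new_tokens=6)
```

The draft only affects how many tokens are accepted per verification step, never which tokens are produced, which is why a small model trained to imitate the target's outputs is a good fit for this role.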
# Data info
The training data consists of Mistral's outputs and covers all kinds of tasks from various datasets in English, French, German, Spanish, Italian and Portuguese. The model was trained for 2 epochs on 20k unique examples, for a total of 12 million tokens per epoch.
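
As a quick sanity check on the scale of training, the figures above imply the following derived numbers. This is simple arithmetic on the rounded values stated in this card ("20k" and "12 million"), so the results are approximate; the variable names are illustrative.

```python
# Figures quoted in this card (rounded).
unique_examples = 20_000
tokens_per_epoch = 12_000_000
epochs = 2

# Derived quantities.
total_tokens_seen = tokens_per_epoch * epochs
avg_tokens_per_example = tokens_per_epoch / unique_examples

print(f"Total tokens seen in training: {total_tokens_seen:,}")
print(f"Average tokens per example: {avg_tokens_per_example:.0f}")
```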