Update README.md
README.md CHANGED
@@ -8,30 +8,39 @@ tags:
 - mlx
 ---

-#
-converted to MLX format from [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)
-using mlx-lm version **0.24.0**.
-from mlx_lm import load, generate
-messages = [{"role": "user", "content": prompt}]
-prompt = tokenizer.apply_chat_template(
-    messages, add_generation_prompt=True
-)
# nexaml/Qwen3-0.6B-8bit-MLX

## Quickstart

Run this model directly with [nexa-sdk](https://github.com/NexaAI/nexa-sdk) installed.

In the nexa-sdk CLI:

```bash
nexaml/Qwen3-0.6B-8bit-MLX
```
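The block above gives the repo id to pass to the nexa-sdk CLI; the exact subcommand varies between nexa-sdk releases, so consult the nexa-sdk README for the current invocation. Because this repository contains standard MLX weights, it can also be used with [mlx-lm](https://github.com/ml-explore/mlx-lm), as the earlier revision of this card did. A minimal sketch, assuming `mlx-lm` is installed (`pip install mlx-lm`) and the repo id resolves on the Hugging Face Hub; the prompt string is illustrative:

```python
from mlx_lm import load, generate

# Download (if needed) and load the 8-bit MLX weights plus the tokenizer.
model, tokenizer = load("nexaml/Qwen3-0.6B-8bit-MLX")

prompt = "Summarize what the MLX framework is in one sentence."

# Wrap the prompt in the model's chat template when one is defined.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

# Stream the completion to stdout and also return it as a string.
response = generate(model, tokenizer, prompt=prompt, verbose=True)
```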
## Overview

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support, with the following key features:
- **Unique support for seamless switching between thinking mode** (for complex logical reasoning, math, and coding) and **non-thinking mode** (for efficient, general-purpose dialogue) **within a single model**, ensuring optimal performance across various scenarios (see the sketch after this list).
- **Significantly enhanced reasoning capabilities**, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning.
- **Superior human preference alignment**, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, delivering a more natural, engaging, and immersive conversational experience.
- **Expertise in agent capabilities**, enabling precise integration with external tools in both thinking and non-thinking modes and achieving leading performance among open-source models in complex agent-based tasks.
- **Support for 100+ languages and dialects** with strong capabilities for **multilingual instruction following** and **translation**.
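To make the thinking / non-thinking switch from the first bullet concrete: the upstream Qwen3 model card exposes it through an `enable_thinking` argument of the chat template. A hedged sketch using mlx-lm, assuming this 8-bit conversion keeps the upstream chat template; the question is illustrative:

```python
from mlx_lm import load, generate

model, tokenizer = load("nexaml/Qwen3-0.6B-8bit-MLX")

messages = [{"role": "user", "content": "Is 3721 a prime number? Answer briefly."}]

# Thinking mode (the template's default): the reply starts with a <think>...</think> block.
thinking_prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, enable_thinking=True
)

# Non-thinking mode: skips the reasoning block for fast, general-purpose dialogue.
direct_prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, enable_thinking=False
)

for p in (thinking_prompt, direct_prompt):
    generate(model, tokenizer, prompt=p, verbose=True)
```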
31 |
|
32 |
+
#### Model Overview
|
|
|
|
|
|
|
|
|
33 |
|
34 |
+
**Qwen3-0.6B** has the following features:
- Type: Causal Language Model
- Training Stage: Pretraining & Post-training
- Number of Parameters: 0.6B
- Number of Parameters (Non-Embedding): 0.44B
- Number of Layers: 28
- Number of Attention Heads (GQA): 16 for Q and 8 for KV
- Context Length: 32,768 tokens
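These numbers can be cross-checked against the `config.json` shipped with the weights. A small sketch, assuming the repository keeps the standard Qwen3 config field names and that `huggingface_hub` is installed:

```python
import json
from huggingface_hub import hf_hub_download

# Fetch the checkpoint's config.json from the Hub (cached locally after the first call).
config_path = hf_hub_download("nexaml/Qwen3-0.6B-8bit-MLX", "config.json")
with open(config_path) as f:
    cfg = json.load(f)

# Expected values per the list above, assuming standard Qwen3 field names.
print(cfg["num_hidden_layers"])    # 28 layers
print(cfg["num_attention_heads"])  # 16 query heads
print(cfg["num_key_value_heads"])  # 8 KV heads (GQA)
```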
For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to the Qwen [blog](https://qwenlm.github.io/blog/qwen3/), [GitHub](https://github.com/QwenLM/Qwen3), and [Documentation](https://qwen.readthedocs.io/en/latest/).
## Reference

**Original model card**: [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)