nexaml committed
Commit b566345 · verified · 1 Parent(s): fc30e1b

Update README.md

Files changed (1):
  1. README.md +26 -17
README.md CHANGED
@@ -8,30 +8,39 @@ tags:
  - mlx
  ---

- # mlx-community/Qwen3-0.6B-8bit
-
- This model [mlx-community/Qwen3-0.6B-8bit](https://huggingface.co/mlx-community/Qwen3-0.6B-8bit) was
- converted to MLX format from [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)
- using mlx-lm version **0.24.0**.
-
- ## Use with mlx
-
  ```bash
- pip install mlx-lm
  ```

- ```python
- from mlx_lm import load, generate
-
- model, tokenizer = load("mlx-community/Qwen3-0.6B-8bit")
-
- prompt = "hello"
-
- if tokenizer.chat_template is not None:
-     messages = [{"role": "user", "content": prompt}]
-     prompt = tokenizer.apply_chat_template(
-         messages, add_generation_prompt=True
-     )
-
- response = generate(model, tokenizer, prompt=prompt, verbose=True)
- ```
+ # nexaml/Qwen3-0.6B-8bit-MLX
+
+ ## Quickstart
+
+ Run the model directly with [nexa-sdk](https://github.com/NexaAI/nexa-sdk) installed.
+ In the nexa-sdk CLI:
+
  ```bash
+ nexaml/Qwen3-0.6B-8bit-MLX
  ```
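
The block above gives only the repo id; a fuller invocation would pass it to the CLI's inference command. A minimal sketch, assuming recent nexa-sdk builds expose an `infer` subcommand (check `nexa --help` for the exact verb in your version):

```bash
# Assumption: `nexa infer` is the inference entry point in recent nexa-sdk builds.
nexa infer nexaml/Qwen3-0.6B-8bit-MLX
```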

+ ## Overview
+
+ Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction following, agent capabilities, and multilingual support, with the following key features:
+
+ - **Unique support for seamless switching between thinking mode** (for complex logical reasoning, math, and coding) **and non-thinking mode** (for efficient, general-purpose dialogue) **within a single model**, ensuring optimal performance across various scenarios (see the sketch after this list).
+ - **Significant enhancement in reasoning capabilities**, surpassing the previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning.
+ - **Superior human preference alignment**, excelling in creative writing, role-playing, multi-turn dialogue, and instruction following, delivering a more natural, engaging, and immersive conversational experience.
+ - **Expertise in agent capabilities**, enabling precise integration with external tools in both thinking and non-thinking modes, with leading performance among open-source models on complex agent-based tasks.
+ - **Support for 100+ languages and dialects**, with strong capabilities for **multilingual instruction following** and **translation**.
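
The mode switch lives in the chat template rather than in separate checkpoints. A minimal sketch of the toggle, using the Hugging Face tokenizer of the upstream Qwen/Qwen3-0.6B checkpoint and the `enable_thinking` flag described in the Qwen3 documentation (assumed to behave the same for this quantized conversion):

```python
# Sketch: toggling Qwen3's thinking mode via the chat template.
# Uses the upstream Qwen/Qwen3-0.6B tokenizer; `enable_thinking` defaults to True.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
messages = [{"role": "user", "content": "Is 9.11 larger than 9.9?"}]

# Thinking mode: the rendered prompt lets the model open a <think> block.
thinking = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)

# Non-thinking mode: the template suppresses the <think> block for fast dialogue.
direct = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
```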
+
+ #### Model Overview
+
+ **Qwen3-0.6B** has the following features:
+ - Type: Causal Language Model
+ - Training Stage: Pretraining & Post-training
+ - Number of Parameters: 0.6B
+ - Number of Parameters (Non-Embedding): 0.44B
+ - Number of Layers: 28
+ - Number of Attention Heads (GQA): 16 for Q and 8 for KV (see the back-of-envelope sketch after this list)
+ - Context Length: 32,768
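
As a rough illustration of what the GQA figures imply, the sketch below estimates the KV-cache footprint at full context. The per-head width (`head_dim = 128`) is an assumption taken from the upstream Qwen3-0.6B config, not stated in this card:

```python
# Back-of-envelope KV-cache size from the numbers above.
# Assumptions: head_dim = 128 (from the upstream Qwen3-0.6B config, not stated
# here) and fp16 cache entries (2 bytes). GQA caches only the 8 KV heads.
layers, kv_heads, head_dim, ctx, bytes_per_val = 28, 8, 128, 32_768, 2

per_token = 2 * layers * kv_heads * head_dim * bytes_per_val  # K and V per token
total = per_token * ctx
print(f"{per_token / 1024:.0f} KiB per token, {total / 2**30:.1f} GiB at 32K context")
# -> 112 KiB per token, 3.5 GiB at 32K context
```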
+
+ For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our [blog](https://qwenlm.github.io/blog/qwen3/), [GitHub](https://github.com/QwenLM/Qwen3), and [Documentation](https://qwen.readthedocs.io/en/latest/).
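
The mlx-lm route shown in the previous revision of this README should still apply to these MLX-format weights; a minimal sketch, assuming the `nexaml/Qwen3-0.6B-8bit-MLX` repo id loads like any other mlx-lm checkpoint:

```python
# Sketch adapted from the previous revision of this README (mlx-lm route).
# Assumption: the nexaml repo loads like any other MLX-format checkpoint.
from mlx_lm import load, generate

model, tokenizer = load("nexaml/Qwen3-0.6B-8bit-MLX")

prompt = "hello"
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```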
+
+ ## Reference
+
+ **Original model card**: [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)