Safetensors
qwen3
YHLLEO committed on
Commit
08ba82f
·
verified ·
1 Parent(s): b1bacc5

Update README.md

Files changed (1): README.md (+1 -0)
README.md CHANGED
@@ -16,6 +16,7 @@ In this study, we present a comprehensive and open-source pipeline for training
 <p align="center">
 <img width="100%" src="https://west-mask-4fa.notion.site/image/attachment%3A49aa5b9e-0fbc-49aa-b3e2-eddea14e6c47%3Abenchmark_comparison_panels.png?table=block&id=23bab5f9-55ec-80b0-a536-e347209ebde5&spaceId=ac3ab5f9-55ec-815c-b1fd-0003d8804c06&width=1420&userId=&cache=v2">
 </p>
+
 Performance comparison with SOTA models on AIME 24&25 and LiveCodeBench v5. Klear-SFT and Klear-Preview refer to Klear-Qwen3-Thinking-SFT and Klear-Qwen3-Thinking-Preview, respectively. Among 7B and 8B models, we outperform [AceReason-Nemotron-1.1-7B](https://arxiv.org/pdf/2506.13284) (AceReason) and [Qwen3-8B](https://arxiv.org/pdf/2505.09388). Although we do not use the [DeepSeek-R1-0528](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528) dataset, we achieve results comparable to [DeepSeek-R1-0528-Qwen3-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B). We also show significant advantages over larger models such as [Qwen3-32B](https://arxiv.org/pdf/2505.09388) and [DeepSeek-R1 (0120)](https://huggingface.co/deepseek-ai/DeepSeek-R1).