Update README.md
README.md CHANGED
@@ -45,8 +45,10 @@ This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing
 
 ---
 license: apache-2.0
-
+
+Datasets:
 - MasterControlAIML/JSON-Unstructured-Structured
+
 ---
 **DeepSeek R1 Strategy Replication on Qwen-2.5-1.5b on 8*H100 GPUS**
 