Update README.md
README.md
CHANGED
@@ -39,6 +39,9 @@ Data quality is another critical factor. Large language models require high-qual
 - **Dataset**: OpenO1-SFT (complete dataset)
 - **Training Duration**: 1 epoch
 
+<details>
+<summary>More details</summary>
+
 ## Model Specifications
 
 - **Architecture**: Transformer decoder (135M parameters)
@@ -57,6 +60,8 @@ Data quality is another critical factor. Large language models require high-qual
 
 - No RoPE scaling applied
 - No quantization used
+
+</details>
 
 ## Usage
 