Commit · c092dc8
1 Parent(s): 939ee59
Initial
README.md CHANGED
@@ -137,44 +137,42 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 | bnb_4bit_quant_type | `nf4` |
 | bnb_4bit_use_double_quant | `true` |
 
-Below I have created a separate table for each heading:
-
 #### Dataset
 
-| Parameter
-
-| Dataset Name
-| Split
-| Number of Rows
-| Max Token Length
-| Shuffle
-| Number of Processes
+| Parameter           | Value                      |
+|---------------------|----------------------------|
+| Dataset Name        | `nvidia/OpenCodeReasoning` |
+| Split               | `split_0`                  |
+| Number of Rows      | `8000`                     |
+| Max Token Length    | `8192`                     |
+| Shuffle             | `True`                     |
+| Number of Processes | `4`                        |
 
 #### Tokenizer
 
-| Parameter | Value
-
-| Truncation | Enabled (`max_length=8192`)
-| Masked Language Modeling (MLM) | `False`
+| Parameter                      | Value                       |
+|--------------------------------|-----------------------------|
+| Truncation                     | Enabled (`max_length=8192`) |
+| Masked Language Modeling (MLM) | `False`                     |
 
 #### Speeds, Sizes, Times
 
-| Parameter
-
-| Total Training Time
-| Checkpoint Frequency
-| Checkpoint Steps
+| Parameter            | Value                                                      |
+|----------------------|------------------------------------------------------------|
+| Total Training Time  | ~3.5 hours                                                 |
+| Checkpoint Frequency | every `10000` steps                                        |
+| Checkpoint Steps     | `checkpoint-10000`, `checkpoint-20000`, `checkpoint-24000` |
 
 #### Compute Infrastructure
 
-| Parameter | Value
-
-| GPU | 1 × NVIDIA H100 SXM (80 GB VRAM)
-| RAM | 125 GB
-| CPU | 16 vCPU
-| OS | Ubuntu 22.04
-| Frameworks | PyTorch 2.4.0
-| CUDA Version | 12.4.1
+| Parameter    | Value                            |
+|--------------|----------------------------------|
+| GPU          | 1 × NVIDIA H100 SXM (80 GB VRAM) |
+| RAM          | 125 GB                           |
+| CPU          | 16 vCPU                          |
+| OS           | Ubuntu 22.04                     |
+| Frameworks   | PyTorch 2.4.0                    |
+| CUDA Version | 12.4.1                           |
 
 ---
 
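The checkpoint rows in the "Speeds, Sizes, Times" table are consistent with saving every `10000` steps plus one extra save when training ends at step 24000. The sketch below only illustrates that arithmetic. `checkpoint_names` is a hypothetical helper, not part of the README or of any library, and the total of 24000 steps is an assumption inferred from the last checkpoint name rather than a figure stated in the diff.

```python
def checkpoint_names(total_steps: int, save_steps: int) -> list[str]:
    """Return checkpoint directory names for a run that saves every
    `save_steps` steps and once more at the final step."""
    if total_steps <= 0 or save_steps <= 0:
        raise ValueError("total_steps and save_steps must be positive")

    # Regular saves at each multiple of save_steps up to total_steps.
    steps = list(range(save_steps, total_steps + 1, save_steps))

    # Assumption: a final checkpoint is also written when training ends,
    # even if the last step is not a multiple of save_steps.
    if not steps or steps[-1] != total_steps:
        steps.append(total_steps)

    return [f"checkpoint-{step}" for step in steps]


# 24000 total steps is inferred from the last checkpoint name in the table.
print(checkpoint_names(total_steps=24000, save_steps=10000))
```

Under these assumptions the printed list matches the three checkpoint names in the table. A run whose length is an exact multiple of the save interval would get no extra final entry.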