Update README.md
README.md
CHANGED
@@ -121,7 +121,7 @@ print(tokenizer.decode(outputs[0]))
 ## Model
 
 - **Architecture:** GPT-2 model with Multi-Query Attention and Fill-in-the-Middle objective.
-- **
+- **Training steps:** 120K
 - **Context length:** 8K tokens
 - **Pretraining tokens:** 22 billion
 - **Precision:** bfloat16