Update README.md
README.md
CHANGED
@@ -22,6 +22,7 @@ Introducing **SaplingDream**, a compact GPT model with 0.5 billion parameters, b
To enhance generalization, we are fine-tuning the base model using Stochastic Gradient Descent (SGD) with a "Polynomial" learning rate scheduler, starting at a learning rate of 1e-4. Our goal is for the model not only to learn the tokens but also to develop the ability to reason through problems effectively.

For training, we use the [open-thoughts/OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) and [prithivMLmods/Deepthink-Reasoning-Ins](https://huggingface.co/datasets/prithivMLmods/Deepthink-Reasoning-Ins) datasets over one full epoch.
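
As a rough illustration of the optimizer setup described above, here is a minimal, dependency-free sketch of a polynomial learning rate decay combined with a plain SGD update. Only the initial learning rate of 1e-4 comes from this README; the decay power, the final learning rate, the total step count, and the function names `polynomial_lr` and `sgd_step` are assumptions made for the example and may not match the actual training configuration.

```python
def polynomial_lr(step: int, total_steps: int, lr_init: float = 1e-4,
                  lr_end: float = 0.0, power: float = 1.0) -> float:
    """Learning rate after `step` updates under polynomial decay.

    lr_end and power are assumed values; only lr_init is from the README.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so the rate stays at lr_end after the schedule ends.
    progress = min(max(step, 0), total_steps) / total_steps
    return (lr_init - lr_end) * (1.0 - progress) ** power + lr_end


def sgd_step(params: list[float], grads: list[float], lr: float) -> list[float]:
    """Plain SGD update without momentum: p <- p - lr * g."""
    return [p - lr * g for p, g in zip(params, grads)]


if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for step in (0, 250, 500, 1000):
        print(step, polynomial_lr(step, total))
```

With `power=1.0` the schedule reduces to linear decay; larger powers drop the rate faster early on and flatten out toward the end of training.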
+[You can find the GGUF version here.](https://huggingface.co/XeTute/SaplingDream_V0.5-0.5B-GGUF)

---
# Our Apps & Socials