XeTute committed · Commit e993757 · verified · 1 Parent(s): e754925

Update README.md

Files changed (1):
  1. README.md +1 -0
README.md CHANGED
@@ -22,6 +22,7 @@ Introducing **SaplingDream**, a compact GPT model with 0.5 billion parameters, b
  To enhance generalization, we are fine-tuning the base model using Stochastic Gradient Descent (SGD) alongside a "Polynomial" learning rate scheduler, starting with a learning rate of 1e-4. Our goal is to ensure that the model not only learns the tokens but also develops the ability to reason through problems effectively.

  For training, we are utilizing the [open-thoughts/OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) and [prithivMLmods/Deepthink-Reasoning-Ins](https://huggingface.co/datasets/prithivMLmods/Deepthink-Reasoning-Ins) datasets across the entire epoch.
+ [You can find the GGUF version here.](https://huggingface.co/XeTute/SaplingDream_V0.5-0.5B-GGUF)

  ---
  # Our Apps & Socials
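
For reference, below is a minimal sketch of the optimizer and scheduler setup the README describes: SGD with a polynomial learning-rate schedule starting at 1e-4. This is not the commit's actual training script; the base checkpoint path, warmup, and step count are placeholders.

```python
# Sketch only: SGD + polynomial LR decay starting at 1e-4, as stated in the README.
# The checkpoint path and step counts are placeholders, not taken from this commit.
import torch
from transformers import AutoModelForCausalLM, get_polynomial_decay_schedule_with_warmup

model = AutoModelForCausalLM.from_pretrained("path/to/0.5B-base")  # placeholder base model

optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)  # starting LR from the README
scheduler = get_polynomial_decay_schedule_with_warmup(
    optimizer,
    num_warmup_steps=0,          # assumption: the README does not mention warmup
    num_training_steps=10_000,   # placeholder: total steps for the single epoch
)

# Typical step order inside the training loop:
#   loss.backward(); optimizer.step(); scheduler.step(); optimizer.zero_grad()
```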
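
Similarly, a hedged sketch of pulling the two datasets named above with the `datasets` library; the split names and any column mapping are assumptions, since the commit does not include preprocessing code.

```python
# Sketch only: load the two training datasets referenced in the README.
from datasets import load_dataset

open_thoughts = load_dataset("open-thoughts/OpenThoughts-114k", split="train")
deepthink = load_dataset("prithivMLmods/Deepthink-Reasoning-Ins", split="train")

# The two datasets use different schemas, so they would need to be mapped to a
# shared text/chat column before being combined for the single-epoch pass.
print(open_thoughts.column_names)
print(deepthink.column_names)
```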
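
The added line links a GGUF build; one way to try it locally is llama-cpp-python, sketched below. The exact .gguf filename is an assumption, so pick one from the repo's file list.

```python
# Sketch only: run the linked GGUF repo with llama-cpp-python.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="XeTute/SaplingDream_V0.5-0.5B-GGUF",
    filename="*Q8_0.gguf",  # assumption: a Q8_0 quantization exists in the repo
    n_ctx=2048,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain gradient descent in two sentences."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```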