Tags: Transformers · GGUF · English · Chinese · Inference Endpoints · conversational
XeTute committed on
Commit 98f7db1 · verified · 1 Parent(s): bf301b5

Update README.md

Files changed (1):
  1. README.md +14 -41
README.md CHANGED
@@ -3,57 +3,30 @@ license: apache-2.0
  datasets:
  - open-thoughts/OpenThoughts-114k
  - prithivMLmods/Deepthink-Reasoning-Ins
- base_model: XeTute/SaplingDream_V0.5-0.5B
  language:
  - en
  - zh
  new_version: XeTute/SaplingDream_V1-0.5B
  library_name: transformers
- tags:
- - llama-cpp
- - gguf-my-repo
  ---
 
- # XeTute/SaplingDream_V0.5-0.5B-Q8_0-GGUF
- This model was converted to GGUF format from [`XeTute/SaplingDream_V0.5-0.5B`](https://huggingface.co/XeTute/SaplingDream_V0.5-0.5B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
- Refer to the [original model card](https://huggingface.co/XeTute/SaplingDream_V0.5-0.5B) for more details on the model.
- 
- ## Use with llama.cpp
- Install llama.cpp through brew (works on Mac and Linux):
- 
- ```bash
- brew install llama.cpp
- ```
- 
- Invoke the llama.cpp server or the CLI.
- 
- ### CLI:
- ```bash
- llama-cli --hf-repo XeTute/SaplingDream_V0.5-0.5B-Q8_0-GGUF --hf-file saplingdream_v0.5-0.5b-q8_0.gguf -p "The meaning to life and the universe is"
- ```
- 
- ### Server:
- ```bash
- llama-server --hf-repo XeTute/SaplingDream_V0.5-0.5B-Q8_0-GGUF --hf-file saplingdream_v0.5-0.5b-q8_0.gguf -c 2048
- ```
- 
- Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.
- 
- Step 1: Clone llama.cpp from GitHub.
- ```
- git clone https://github.com/ggerganov/llama.cpp
- ```
- 
- Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, along with other hardware-specific flags (e.g. `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).
- ```
- cd llama.cpp && LLAMA_CURL=1 make
- ```
- 
- Step 3: Run inference through the main binary.
- ```
- ./llama-cli --hf-repo XeTute/SaplingDream_V0.5-0.5B-Q8_0-GGUF --hf-file saplingdream_v0.5-0.5b-q8_0.gguf -p "The meaning to life and the universe is"
- ```
- or
- ```
- ./llama-server --hf-repo XeTute/SaplingDream_V0.5-0.5B-Q8_0-GGUF --hf-file saplingdream_v0.5-0.5b-q8_0.gguf -c 2048
- ```
 
  datasets:
  - open-thoughts/OpenThoughts-114k
  - prithivMLmods/Deepthink-Reasoning-Ins
+ base_model:
+ - Qwen/Qwen2.5-0.5B-Instruct
  language:
  - en
  - zh
  new_version: XeTute/SaplingDream_V1-0.5B
  library_name: transformers
  ---

+ # Sapling Dream V0.5
+ This is a fine-tune of the base model listed below, trained on the datasets listed below and captured 60% of the way through V1's training run. V1 will be released on **Feb. 22, 2025** and will have better performance than this one. Consider this a demo.
 
+ We are currently in the process of training our model, with an official release scheduled for **February 22, 2025, at 17:00 Pakistan Standard Time**.
 
+ Introducing **SaplingDream**, a compact GPT model with 0.5 billion parameters, based on the [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) architecture. The model has been fine-tuned on reasoning datasets with meticulous attention to detail, ensuring the highest quality, hence the name "SaplingDream." Think of it as advanced instruction tuning that teaches the base model to reason, efficiently making up for its small size.
 
+ To enhance generalization, we are fine-tuning the base model using Stochastic Gradient Descent (SGD) with a polynomial learning-rate scheduler, starting from a learning rate of 1e-4. Our goal is for the model not only to learn the tokens, but also to develop the ability to reason through problems effectively.
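For illustration, a polynomial decay schedule like the one described above can be sketched in plain Python. This is a minimal sketch, assuming the common formulation where the rate decays from the initial value toward an end value of 0; `polynomial_lr`, `total_steps`, and `power` are hypothetical names for this illustration, not part of the actual training code:

```python
def polynomial_lr(step: int, total_steps: int, lr_init: float = 1e-4,
                  lr_end: float = 0.0, power: float = 1.0) -> float:
    """Polynomial decay from lr_init down to lr_end over total_steps.

    Assumed formulation: lr = (lr_init - lr_end) * (1 - step/total_steps)**power + lr_end
    """
    if step >= total_steps:
        return lr_end
    remaining = 1.0 - step / total_steps
    return (lr_init - lr_end) * remaining ** power + lr_end

# The schedule starts at 1e-4 and decays toward zero:
print(polynomial_lr(0, 1000))     # 1e-4 at the first step
print(polynomial_lr(500, 1000))   # half of 1e-4 when power=1.0
print(polynomial_lr(1000, 1000))  # 0.0 at the final step
```

With `power=1.0` this reduces to linear decay; larger powers make the schedule drop more steeply early in training.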
 
+ For training, we are utilizing the [open-thoughts/OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) and [prithivMLmods/Deepthink-Reasoning-Ins](https://huggingface.co/datasets/prithivMLmods/Deepthink-Reasoning-Ins) datasets across the entire epoch.
+ [You can find the FP32 version here.](https://huggingface.co/XeTute/SaplingDream_V0.5-0.5B)
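As a rough back-of-envelope check (an illustration assuming roughly 0.5 billion parameters; `weights_gb` is a hypothetical helper), FP32 weights at 4 bytes per parameter come to about 2 GB, which is why quantized exports such as an 8-bit GGUF are much smaller:

```python
PARAMS = 0.5e9  # assumption: roughly 0.5 billion parameters

def weights_gb(bytes_per_param: float, params: float = PARAMS) -> float:
    """Approximate size of the raw weight tensors in gigabytes (10^9 bytes)."""
    return params * bytes_per_param / 1e9

fp32_gb = weights_gb(4.0)  # FP32: 4 bytes per parameter
q8_gb = weights_gb(1.0)    # ~8-bit quantization: roughly 1 byte per parameter

print(f"FP32 ~{fp32_gb:.1f} GB, 8-bit ~{q8_gb:.1f} GB")  # FP32 ~2.0 GB, 8-bit ~0.5 GB
```

Real quantized files run slightly larger than this estimate, since formats like Q8_0 store per-block scale factors alongside the 8-bit weights.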
 
 
+ ---
+ # Our Apps & Socials
+ [Chat with our Assistant](https://xetute.com/) | [Support us Financially](https://ko-fi.com/XeTute) | [Visit our GitHub](https://github.com/XeTute)

+ Long live the Islamic Republic of Pakistan; Glory to the Islamic Republic of Pakistan 🇵🇰
+ ![The Flag of the Islamic Republic of Pakistan](https://upload.wikimedia.org/wikipedia/commons/3/32/Flag_of_Pakistan.svg)