dumbequation committed on
Commit a74380d · verified · 1 Parent(s): b515bf7

Update README.md

Files changed (1)
  1. README.md +8 -3
README.md CHANGED
@@ -7,10 +7,17 @@ tags:
  - qwen2
  - trl
  - grpo
+ - deepseek
  license: apache-2.0
  language:
  - en
+ datasets:
+ - gretelai/symptom_to_diagnosis
  ---
+ A 3-billion-parameter Qwen2.5 model trained with GRPO to "think" like DeepSeek's R1, so that it can deduce a disease from a patient's complaints in one shot!
+
+ A tiny but really impressive model.
+ Training the model to think and reason has also resulted in a significant boost in its general Elo rating.
 
  # Uploaded model
 
@@ -18,6 +25,4 @@ language:
  - **License:** apache-2.0
  - **Finetuned from model :** unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit
 
- This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+ This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
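
The model card says the model was trained with GRPO but does not show how. As a rough illustration of the central idea only, the sketch below scores each sampled completion relative to the other completions for the same prompt. This is not the author's training code: the function name `group_relative_advantages`, the `eps` value, the use of the population standard deviation, and the example rewards are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and spread of its own group.

    `rewards` holds one scalar per completion sampled for a single prompt.
    `eps` (an assumed value) avoids division by zero when all rewards match.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; a choice made for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions of one symptom description:
# 1.0 if the predicted diagnosis matched the label, 0.0 otherwise.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that beat their group's average receive a positive advantage and the rest a negative one, so no separate value model is needed to produce a learning signal.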