lbourdois committed
Commit fea940d · verified · 1 Parent(s): 003b3fc

Improve language tag

Hi! Since the model is multilingual, this PR adds languages other than English to the `language` tag to improve discoverability. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13 languages.
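For anyone who wants to confirm the change after merge, below is a minimal sketch (not part of this PR) that reads the card metadata with the `huggingface_hub` client and prints the `language` field; the repo id is taken from the Quick start snippet in the diff, and the 13 codes it should return are the ones added below.

```python
# Minimal sketch, assuming the `huggingface_hub` client is installed.
# It loads the README front matter of the model referenced in this diff
# and prints the `language` field added by this PR.
from huggingface_hub import ModelCard

card = ModelCard.load(
    "MasterControlAIML/DeepSeek-R1-Strategy-Qwen-2.5-1.5b-Unstructured-To-Structured"
)

# After this PR is merged, this should list the 13 codes:
# zho, eng, fra, spa, por, deu, ita, rus, jpn, kor, vie, tha, ara.
print(card.data.language)
```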

Files changed (1)
  1. README.md +105 -92
README.md CHANGED
@@ -1,93 +1,106 @@
- ---
- base_model: Qwen/Qwen2.5-1.5B-Instruct
- library_name: transformers
- model_name: null
- tags:
- - generated_from_trainer
- - trl
- - grpo
- - deepseek
- - r1
- licence: license
- license: apache-2.0
- datasets:
- - bhaviktheslider/JSON-Unstructured-Structured
- ---
-
- # Model Card for DeepSeek-R1-Strategy-Qwen-2.5-1.5b-Unstructured-To-Structured
-
- This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct).
- It has been trained using [TRL](https://github.com/huggingface/trl).
-
- ## Quick start
-
- ```python
- from transformers import pipeline
-
- question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
- generator = pipeline("text-generation", model="MasterControlAIML/DeepSeek-R1-Strategy-Qwen-2.5-1.5b-Unstructured-To-Structured", device="cuda")
- output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
- print(output["generated_text"])
- ```
-
- ## Training procedure
-
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/bhavik18385-mastercontrol/grpo_training/runs/cnqeubat)
-
-
- This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).
-
- ### Framework versions
-
- - TRL: 0.14.0
- - Transformers: 4.48.1
- - Pytorch: 2.5.1
- - Datasets: 3.1.0
- - Tokenizers: 0.21.0
-
- ---
- license: apache-2.0
-
- Datasets:
- - MasterControlAIML/JSON-Unstructured-Structured
-
- ---
- **DeepSeek R1 Strategy Replication on Qwen-2.5-1.5b on 8*H100 GPUS**
-
- *Problem - Unstructured to Structured JSON Creation*
-
- *Desired Input - Unstructured Text Paragraphs and Blank Schema Rules*
-
- *Output - Filled Created JSON from Unstructured Text following Blank Schema Rules*
-
- *Dataset Link to Understand More - https://huggingface.co/datasets/MasterControlAIML/JSON-Unstructured-Structured*
-
- ## Updated Model with new reward modelling and prompts here: https://huggingface.co/MasterControlAIML/DeepSeek-R1-Qwen-2.5-1.5b-Latest-Unstructured-To-Structured
-
-
- ## Citations
-
- Cite GRPO as:
-
- ```bibtex
- @article{zhihong2024deepseekmath,
- title = {{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
- author = {Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and Mingchuan Zhang and Y. K. Li and Y. Wu and Daya Guo},
- year = 2024,
- eprint = {arXiv:2402.03300},
- }
-
- ```
-
- Cite TRL as:
-
- ```bibtex
- @misc{vonwerra2022trl,
- title = {{TRL: Transformer Reinforcement Learning}},
- author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
- year = 2020,
- journal = {GitHub repository},
- publisher = {GitHub},
- howpublished = {\url{https://github.com/huggingface/trl}}
- }
+ ---
+ base_model: Qwen/Qwen2.5-1.5B-Instruct
+ library_name: transformers
+ tags:
+ - generated_from_trainer
+ - trl
+ - grpo
+ - deepseek
+ - r1
+ licence: license
+ license: apache-2.0
+ datasets:
+ - bhaviktheslider/JSON-Unstructured-Structured
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ ---
+
+ # Model Card for DeepSeek-R1-Strategy-Qwen-2.5-1.5b-Unstructured-To-Structured
+
+ This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct).
+ It has been trained using [TRL](https://github.com/huggingface/trl).
+
+ ## Quick start
+
+ ```python
+ from transformers import pipeline
+
+ question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
+ generator = pipeline("text-generation", model="MasterControlAIML/DeepSeek-R1-Strategy-Qwen-2.5-1.5b-Unstructured-To-Structured", device="cuda")
+ output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
+ print(output["generated_text"])
+ ```
+
+ ## Training procedure
+
+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/bhavik18385-mastercontrol/grpo_training/runs/cnqeubat)
+
+
+ This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).
+
+ ### Framework versions
+
+ - TRL: 0.14.0
+ - Transformers: 4.48.1
+ - Pytorch: 2.5.1
+ - Datasets: 3.1.0
+ - Tokenizers: 0.21.0
+
+ ---
+ license: apache-2.0
+
+ Datasets:
+ - MasterControlAIML/JSON-Unstructured-Structured
+
+ ---
+ **DeepSeek R1 Strategy Replication on Qwen-2.5-1.5b on 8*H100 GPUS**
+
+ *Problem - Unstructured to Structured JSON Creation*
+
+ *Desired Input - Unstructured Text Paragraphs and Blank Schema Rules*
+
+ *Output - Filled Created JSON from Unstructured Text following Blank Schema Rules*
+
+ *Dataset Link to Understand More - https://huggingface.co/datasets/MasterControlAIML/JSON-Unstructured-Structured*
+
+ ## Updated Model with new reward modelling and prompts here: https://huggingface.co/MasterControlAIML/DeepSeek-R1-Qwen-2.5-1.5b-Latest-Unstructured-To-Structured
+
+
+ ## Citations
+
+ Cite GRPO as:
+
+ ```bibtex
+ @article{zhihong2024deepseekmath,
+ title = {{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
+ author = {Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and Mingchuan Zhang and Y. K. Li and Y. Wu and Daya Guo},
+ year = 2024,
+ eprint = {arXiv:2402.03300},
+ }
+
+ ```
+
+ Cite TRL as:
+
+ ```bibtex
+ @misc{vonwerra2022trl,
+ title = {{TRL: Transformer Reinforcement Learning}},
+ author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
+ year = 2020,
+ journal = {GitHub repository},
+ publisher = {GitHub},
+ howpublished = {\url{https://github.com/huggingface/trl}}
+ }
  ```