lbourdois committed (verified)
Commit a4bca36 · 1 Parent(s): 56a3daf

Improve language tag


Hi! Since the model is multilingual, this PR adds languages other than English to the language tag to improve discoverability. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13.
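For anyone auditing similar cards, the tag list can be read straight out of the README's YAML front matter. A minimal sketch, assuming the `language:` block uses the plain `- code` list style shown in this diff; the `extract_language_tags` helper is hypothetical, not part of any Hub library:

```python
import re

def extract_language_tags(readme_text: str) -> list[str]:
    # The YAML front matter sits between the first two `---` fences.
    match = re.match(r"---\n(.*?)\n---", readme_text, re.DOTALL)
    if not match:
        return []
    front_matter = match.group(1)
    # Grab the list items that follow the top-level `language:` key.
    block = re.search(r"^language:\n((?:- .+\n?)+)", front_matter, re.MULTILINE)
    if not block:
        return []
    return [line[2:].strip() for line in block.group(1).splitlines()]

card = """---
base_model: Qwen/Qwen2.5-7B-Instruct
language:
- zho
- eng
- fra
---
# Model Card
"""
print(extract_language_tags(card))  # → ['zho', 'eng', 'fra']
```

Run against the updated README, this should return the 13 codes added in this PR.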

Files changed (1)
  1. README.md +88 -74
README.md CHANGED
@@ -1,75 +1,89 @@
- ---
- base_model: Qwen/Qwen2.5-7B-Instruct
- datasets:
- - jerry128/hotpotqa-2-10-disjoint-1
- - jerry128/hotpotqa-2-10-disjoint-2
- - jerry128/hotpotqa-2-10-disjoint-3
- - jerry128/hotpotqa-2-10-disjoint-4
- - jerry128/hotpotqa-2-10-disjoint-6
- - jerry128/hotpotqa-2-10-disjoint-7
- - jerry128/hotpotqa-2-10-disjoint-8
- - jerry128/hotpotqa-2-10-disjoint-9
- library_name: transformers
- model_name: home/jerry8/axolotl-artifacts/hotpotqa-outputs-cl-2-10-unshuffled
- tags:
- - generated_from_trainer
- licence: license
- ---
-
- # Model Card for home/jerry8/axolotl-artifacts/hotpotqa-outputs-cl-2-10-unshuffled
-
- This model is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) on the [['jerry128/hotpotqa-2-10-disjoint-1', 'jerry128/hotpotqa-2-10-disjoint-2', 'jerry128/hotpotqa-2-10-disjoint-3', 'jerry128/hotpotqa-2-10-disjoint-4', 'jerry128/hotpotqa-2-10-disjoint-6', 'jerry128/hotpotqa-2-10-disjoint-7', 'jerry128/hotpotqa-2-10-disjoint-8', 'jerry128/hotpotqa-2-10-disjoint-9']](https://huggingface.co/datasets/['jerry128/hotpotqa-2-10-disjoint-1', 'jerry128/hotpotqa-2-10-disjoint-2', 'jerry128/hotpotqa-2-10-disjoint-3', 'jerry128/hotpotqa-2-10-disjoint-4', 'jerry128/hotpotqa-2-10-disjoint-6', 'jerry128/hotpotqa-2-10-disjoint-7', 'jerry128/hotpotqa-2-10-disjoint-8', 'jerry128/hotpotqa-2-10-disjoint-9']) dataset.
- It has been trained using [TRL](https://github.com/huggingface/trl).
-
- ## Quick start
-
- ```python
- from transformers import pipeline
-
- question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
- generator = pipeline("text-generation", model="None", device="cuda")
- output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
- print(output["generated_text"])
- ```
-
- ## Training procedure
-
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/jhuang192-university-of-illinois-urbana-champaign/hotpotqa-grpo/runs/4mua162m)
-
-
- This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).
-
- ### Framework versions
-
- - TRL: 0.15.1
- - Transformers: 4.49.0
- - Pytorch: 2.5.1
- - Datasets: 3.2.0
- - Tokenizers: 0.21.0
-
- ## Citations
-
- Cite GRPO as:
-
- ```bibtex
- @article{zhihong2024deepseekmath,
- title = {{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
- author = {Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and Mingchuan Zhang and Y. K. Li and Y. Wu and Daya Guo},
- year = 2024,
- eprint = {arXiv:2402.03300},
- }
-
- ```
-
- Cite TRL as:
-
- ```bibtex
- @misc{vonwerra2022trl,
- title = {{TRL: Transformer Reinforcement Learning}},
- author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
- year = 2020,
- journal = {GitHub repository},
- publisher = {GitHub},
- howpublished = {\url{https://github.com/huggingface/trl}}
- }
+ ---
+ base_model: Qwen/Qwen2.5-7B-Instruct
+ datasets:
+ - jerry128/hotpotqa-2-10-disjoint-1
+ - jerry128/hotpotqa-2-10-disjoint-2
+ - jerry128/hotpotqa-2-10-disjoint-3
+ - jerry128/hotpotqa-2-10-disjoint-4
+ - jerry128/hotpotqa-2-10-disjoint-6
+ - jerry128/hotpotqa-2-10-disjoint-7
+ - jerry128/hotpotqa-2-10-disjoint-8
+ - jerry128/hotpotqa-2-10-disjoint-9
+ library_name: transformers
+ model_name: home/jerry8/axolotl-artifacts/hotpotqa-outputs-cl-2-10-unshuffled
+ tags:
+ - generated_from_trainer
+ licence: license
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ ---
+
+ # Model Card for home/jerry8/axolotl-artifacts/hotpotqa-outputs-cl-2-10-unshuffled
+
+ This model is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) on the [['jerry128/hotpotqa-2-10-disjoint-1', 'jerry128/hotpotqa-2-10-disjoint-2', 'jerry128/hotpotqa-2-10-disjoint-3', 'jerry128/hotpotqa-2-10-disjoint-4', 'jerry128/hotpotqa-2-10-disjoint-6', 'jerry128/hotpotqa-2-10-disjoint-7', 'jerry128/hotpotqa-2-10-disjoint-8', 'jerry128/hotpotqa-2-10-disjoint-9']](https://huggingface.co/datasets/['jerry128/hotpotqa-2-10-disjoint-1', 'jerry128/hotpotqa-2-10-disjoint-2', 'jerry128/hotpotqa-2-10-disjoint-3', 'jerry128/hotpotqa-2-10-disjoint-4', 'jerry128/hotpotqa-2-10-disjoint-6', 'jerry128/hotpotqa-2-10-disjoint-7', 'jerry128/hotpotqa-2-10-disjoint-8', 'jerry128/hotpotqa-2-10-disjoint-9']) dataset.
+ It has been trained using [TRL](https://github.com/huggingface/trl).
+
+ ## Quick start
+
+ ```python
+ from transformers import pipeline
+
+ question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
+ generator = pipeline("text-generation", model="None", device="cuda")
+ output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
+ print(output["generated_text"])
+ ```
+
+ ## Training procedure
+
+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/jhuang192-university-of-illinois-urbana-champaign/hotpotqa-grpo/runs/4mua162m)
+
+
+ This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).
+
+ ### Framework versions
+
+ - TRL: 0.15.1
+ - Transformers: 4.49.0
+ - Pytorch: 2.5.1
+ - Datasets: 3.2.0
+ - Tokenizers: 0.21.0
+
+ ## Citations
+
+ Cite GRPO as:
+
+ ```bibtex
+ @article{zhihong2024deepseekmath,
+ title = {{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
+ author = {Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and Mingchuan Zhang and Y. K. Li and Y. Wu and Daya Guo},
+ year = 2024,
+ eprint = {arXiv:2402.03300},
+ }
+
+ ```
+
+ Cite TRL as:
+
+ ```bibtex
+ @misc{vonwerra2022trl,
+ title = {{TRL: Transformer Reinforcement Learning}},
+ author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
+ year = 2020,
+ journal = {GitHub repository},
+ publisher = {GitHub},
+ howpublished = {\url{https://github.com/huggingface/trl}}
+ }
  ```