Taishi-N324 committed
Commit e3dc340 · verified · Parent: 9786992

Upload README.md

Files changed (1)
  1. README.md +7 -79
README.md CHANGED
@@ -10,7 +10,7 @@ license: apache-2.0
  # Swallow-MS-7b-v0.1

- Our Swallow-MS-7b-v0.1 model has undergone continuous pre-training from the Mistral-7B-v0.1, primarily with the addition of Japanese language data.
+ Our Swallow-MS-7b-v0.1 model has undergone continual pre-training from Mistral-7B-v0.1, primarily with the addition of Japanese language data.

  # Model Release Updates

@@ -38,24 +38,9 @@ This repository provides large language models developed by [TokyoTech-LLM](http
  |---|---|---|---|---|---|---|---|---|---|
  | Swallow-MS-7b-instruct-v0.1 |0.3411|0.3770|0.4290|0.3454|0.1040|0.2400|0.3677|0.3907|0.4750|

- ## Base Model Performance
-

  ## Evaluation Benchmarks

- ### Japanese evaluation benchmarks
-
- We used llm-jp-eval (v1.0.0) and the JP Language Model Evaluation Harness (commit #9b42d41). The details are as follows:
-
- - Multiple-choice question answering (JCommonsenseQA [Kurihara+, 2022])
- - Open-ended question answering (JEMHopQA [Ishii+, 2023])
- - Open-ended question answering (NIILC [Sekine, 2003])
- - Machine reading comprehension (JSQuAD [Kurihara+, 2022])
- - Automatic summarization (XL-Sum [Hasan+, 2021])
- - Machine translation (WMT2020 ja-en [Barrault+, 2020])
- - Machine translation (WMT2020 en-ja [Barrault+, 2020])
- - Mathematical reasoning (MGSM [Shi+, 2023])
-
  ### MT-Bench JA

  We used [Japanese MT-Bench](https://wandb.ai/wandb-japan/llm-leaderboard/artifacts/dataset/mtbench_ja_question) to assess the instruction-following capabilities of models.
@@ -66,24 +51,6 @@ We utilized the following artifacts:
  - Reference Answer: [Nejumi LLM-Leaderboard NEO, mtbench_ja_referenceanswer_v1](https://wandb.ai/wandb-japan/llm-leaderboard/artifacts/dataset/mtbench_ja_referenceanswer/v1)
  - Prompt for Judge: [Nejumi LLM-Leaderboard NEO, mtbench_ja_prompt_v1](https://wandb.ai/wandb-japan/llm-leaderboard/artifacts/dataset/mtbench_ja_prompt/v1)

- ### English evaluation benchmarks
-
- We used the Language Model Evaluation Harness (v0.3.0). The details are as follows (a usage sketch follows the list):
-
- - Multiple-choice question answering (OpenBookQA [Mihaylov+, 2018])
- - Open-ended question answering (TriviaQA [Joshi+, 2017])
- - Machine reading comprehension (SQuAD 2.0 [Rajpurkar+, 2018])
- - Commonsense reasoning (XWINO [Tikhonov & Ryabinin, 2021])
- - Natural language inference (HellaSwag [Zellers+, 2019])
- - Mathematical reasoning (GSM8k [Cobbe+, 2021])
-
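As a hedged illustration (not part of the original README), one of the English benchmarks above could be run through the harness's Python API; `simple_evaluate` and the `hf-causal` model type reflect the v0.3.0-era interface, so treat the exact signature as an assumption:

```python
# Assumption: lm-eval v0.3.0-era API; verify against the pinned version.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal",
    model_args="pretrained=tokyotech-llm/Swallow-MS-7b-v0.1",
    tasks=["hellaswag", "triviaqa"],  # two of the benchmarks listed above
    num_fewshot=0,
)
print(results["results"])
```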
- ### Code evaluation benchmarks
-
- We utilized the Code Generation LM Evaluation Harness [Allal+, 2022] (commit #0261c52). The details are as follows:
-
- - Code generation (HumanEval [Chen+, 2021])
- - Code generation in Japanese (JHumanEval [Satoh+, 2024])
-

  ## Usage
@@ -93,7 +60,7 @@ First install additional dependencies in [requirements.txt](./requirements.txt):
  pip install -r requirements.txt
  ```

- ### Instruction format Ver1.0
+ ### Instruction format Ver0.1
  This format must be adhered to strictly, as deviations may result in less optimal outputs from the model.

  The template used to construct a prompt for the Instruct model is specified as follows:
@@ -102,15 +69,16 @@ The template used to construct a prompt for the Instruct model is specified as f
  ```
  <s>[INST] <<SYS>>\n{Instruction}\n<</SYS>>\n\n{USER_MESSAGE_1} [/INST] {BOT_MESSAGE_1} </s>[INST] {USER_MESSAGE_2}[/INST]
  ```

- Please be aware that ``<s> `` and ``</s> `` are special tokens used for the beginning of string (BOS) and end of string (EOS), respectively, while [INST] and [/INST] are considered regular strings.
+ Please be aware that ``<s>`` and ``</s>`` are special tokens used for the beginning of string (BOS) and end of string (EOS), respectively, while [INST] and [/INST] are considered regular strings.
+
+ For the "{Instruction}" part, we recommend using "あなたは誠実で優秀な日本人のアシスタントです。" ("You are a sincere and excellent Japanese assistant.")
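As an illustrative sketch not present in the original README, a single-turn prompt matching this template can be assembled with plain string formatting; the user message below is a made-up placeholder, and the BOS token `<s>` is left out of the string because the tokenizer adds it as a special token:

```python
# Sketch only: build a single-turn Ver0.1 prompt.
# The user message is a hypothetical placeholder, not from the README.
instruction = "あなたは誠実で優秀な日本人のアシスタントです。"  # recommended system prompt
user_message = "東京の観光名所を教えてください。"  # "Tell me about sightseeing spots in Tokyo."

prompt = f"[INST] <<SYS>>\n{instruction}\n<</SYS>>\n\n{user_message} [/INST]"
print(prompt)
```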
- ### Use the instruct model Ver1.0
+ ### Use the instruct model Ver0.1

  ```python
  import torch
  from transformers import AutoTokenizer, AutoModelForCausalLM

- model_name = "tokyotech-llm/Swallow-MS-7b-instruct-v1.0"
+ model_name = "tokyotech-llm/Swallow-MS-7b-instruct-v0.1"
  model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
  tokenizer = AutoTokenizer.from_pretrained(model_name)
@@ -131,49 +99,9 @@ decoded = tokenizer.batch_decode(generated_ids)
  print(decoded[0])
  ```

-
- ### Use the base model
-
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
- import torch
-
- model_name = "tokyotech-llm/Swallow-MS-7b-v0.1"
- tokenizer = AutoTokenizer.from_pretrained(model_name)
-
- model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
- prompt = "東京工業大学の主なキャンパスは、"  # "The main campuses of Tokyo Institute of Technology are ..."
- input_ids = tokenizer.encode(
-     prompt,
-     add_special_tokens=False,
-     return_tensors="pt"
- )
- tokens = model.generate(
-     input_ids.to(device=model.device),
-     max_new_tokens=128,
-     temperature=0.99,
-     top_p=0.95,
-     do_sample=True,
- )
-
- out = tokenizer.decode(tokens[0], skip_special_tokens=True)
- print(out)
- ```
-
  ## Training Datasets

- ### Continual Pre-Training
- The following datasets were used for continual pre-training (a loading sketch follows the list).
-
- - [Algebraic Stack](https://huggingface.co/datasets/EleutherAI/proof-pile-2)
- - [Japanese Wikipedia](https://dumps.wikimedia.org/other/cirrussearch)
- - [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb)
- - [Swallow Corpus](https://chokkan.org/temp/tokyotech-llm/swallow-corpus)
- - [The Pile](https://huggingface.co/datasets/EleutherAI/pile)
-
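As a sketch not present in the original README, the publicly hosted corpora above can be previewed with the Hugging Face `datasets` library; the dataset ID comes from the RefinedWeb link, and the `content` field name is an assumption about that dataset's schema:

```python
# Sketch only: stream a few RefinedWeb records without a full download.
from datasets import load_dataset

ds = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)
for i, example in enumerate(ds):
    print(example["content"][:200])  # assumed text field for this dataset
    if i == 2:
        break
```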
- ### Instruction Tuning
-
- #### Ver1.0
+ ### Instruction Tuning Ver0.1

  The following datasets were used for instruction tuning.
 
107