tokyotech-llm/Swallow-MS-7b-instruct-v0.1

@@ -10,7 +10,7 @@ license: apache-2.0
 # Swallow-MS-7b-v0.1
-Our Swallow-MS-7b-v0.1 model has undergone continuous pre-training from the Mistral-7B-v0.1, primarily with the addition of Japanese language data.
 # Model Release Updates
@@ -38,24 +38,9 @@ This repository provides large language models developed by [TokyoTech-LLM](http
 |---|---|---|---|---|---|---|---|---|---|
 | Swallow-MS-7b-instruct-v0.1 |0.3411|0.3770|0.4290|0.3454|0.1040|0.2400|0.3677|0.3907|0.4750|
-## Base Model Performance
 ## Evaluation Benchmarks
-### Japanese evaluation benchmarks
-We used llm-jp-eval(v1.0.0) and JP Language Model Evaluation Harness(commit #9b42d41). The details are as follows:
-- Multiple-choice question answering (JCommonsenseQA [Kurihara+, 2022])
-- Open-ended question answering (JEMHopQA [Ishii+, 2023])
-- Open-ended question answering (NIILC [Sekine, 2003])
-- Machine reading comprehension (JSQuAD [Kurihara+, 2022])
-- Automatic summarization (XL-Sum [Hasan+, 2021])
-- Machine translation (WMT2020 ja-en [Barrault+, 2020])
-- Machine translation (WMT2020 en-ja [Barrault+, 2020])
-- Mathematical reasoning (MGSM [Shi+, 2023])
 ### MT-Bench JA
 We used [Japanese MT-Bench](https://wandb.ai/wandb-japan/llm-leaderboard/artifacts/dataset/mtbench_ja_question) to assess the instruction-following capabilities of models.
@@ -66,24 +51,6 @@ We utilized the following artifacts:
 - Reference Answer: [Nejumi LLM-Leaderboard NEO, mtbench_ja_referenceanswer_v1](https://wandb.ai/wandb-japan/llm-leaderboard/artifacts/dataset/mtbench_ja_referenceanswer/v1)
 - Prompt for Judge: [Nejumi LLM-Leaderboard NEO, mtbench_ja_prompt_v1](https://wandb.ai/wandb-japan/llm-leaderboard/artifacts/dataset/mtbench_ja_prompt/v1)
-### English evaluation benchmarks
-We used the Language Model Evaluation Harness(v.0.3.0). The details are as follows:
-- Multiple-choice question answering (OpenBookQA [Mihaylov+, 2018])
-- Open-ended question answering (TriviaQA [Joshi+, 2017])
-- Machine reading comprehension (SQuAD 2.0 [Rajpurkar+, 2018])
-- Commonsense reasoning (XWINO [Tikhonov & Ryabinin, 2021])
-- Natural language inference (HellaSwag [Zellers+, 2019])
-- Mathematical reasoning (GSM8k [Cobbe+, 2021])
-### Code evaluation benchmarks
-We utilized the Code Generation LM Evaluation Harness [Allal+, 2022] (commit #0261c52). The details are as follows:
-- Code generation (HumanEval [Chen+, 2021])
-- Code generation in Japanese (JHumanEval [Satoh+, 2024])
 ## Usage
@@ -93,7 +60,7 @@ First install additional dependencies in [requirements.txt](./requirements.txt):
 pip install -r requirements.txt
 ```
-### Instruction format Ver1.0
 This format must be adhered to strictly, as deviations may result in less optimal outputs from the model.
 The template used to construct a prompt for the Instruct model is specified as follows:
@@ -102,15 +69,16 @@ The template used to construct a prompt for the Instruct model is specified as f
 <s>[INST] <<SYS>>\n{Instruction}\n<</SYS>>\n\n{USER_MESSAGE_1} [/INST] {BOT_MESSAGE_1} </s>[INST] {USER_MESSAGE_2}[/INST]
 ```
-Please be aware that  ``<s> `` and  ``</s> `` are special tokens used for the beginning of string (BOS) and end of string (EOS), respectively, while [INST] and [/INST] are considered regular strings.
-### Use the instruct model Ver1.0
 ```python
 import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM
-model_name = "tokyotech-llm/Swallow-MS-7b-instruct-v1.0"
 model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
 tokenizer = AutoTokenizer.from_pretrained(model_name)
@@ -131,49 +99,9 @@ decoded = tokenizer.batch_decode(generated_ids)
 print(decoded[0])
 ```
-### Use the base model
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-import torch
-model_name = "tokyotech-llm/Swallow-MS-7b-v0.1"
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
-prompt = "東京工業大学の主なキャンパスは、"
-input_ids = tokenizer.encode(
-    prompt,
-    add_special_tokens=False,
-    return_tensors="pt"
-)
-tokens = model.generate(
-    input_ids.to(device=model.device),
-    max_new_tokens=128,
-    temperature=0.99,
-    top_p=0.95,
-    do_sample=True,
-)
-out = tokenizer.decode(tokens[0], skip_special_tokens=True)
-print(out)
-```
 ## Training Datasets
-### Continual Pre-Training
-The following datasets were used for continual pre-training.
-- [Algebraic Stack](https://huggingface.co/datasets/EleutherAI/proof-pile-2)
-- [Japanese Wikipedia](https://dumps.wikimedia.org/other/cirrussearch)
-- [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb)
-- [Swallow Corpus](https://chokkan.org/temp/tokyotech-llm/swallow-corpus)
-- [The Pile](https://huggingface.co/datasets/EleutherAI/pile)
-### Instruction Tuning
-#### Ver1.0
 The following datasets were used for the instruction tuning.

 # Swallow-MS-7b-v0.1
+Our Swallow-MS-7b-v0.1 model has undergone continual pre-training from Mistral-7B-v0.1, primarily with the addition of Japanese language data.
 # Model Release Updates
 |---|---|---|---|---|---|---|---|---|---|
 | Swallow-MS-7b-instruct-v0.1 |0.3411|0.3770|0.4290|0.3454|0.1040|0.2400|0.3677|0.3907|0.4750|
 ## Evaluation Benchmarks
 ### MT-Bench JA
 We used [Japanese MT-Bench](https://wandb.ai/wandb-japan/llm-leaderboard/artifacts/dataset/mtbench_ja_question) to assess the instruction-following capabilities of models.
 - Reference Answer: [Nejumi LLM-Leaderboard NEO, mtbench_ja_referenceanswer_v1](https://wandb.ai/wandb-japan/llm-leaderboard/artifacts/dataset/mtbench_ja_referenceanswer/v1)
 - Prompt for Judge: [Nejumi LLM-Leaderboard NEO, mtbench_ja_prompt_v1](https://wandb.ai/wandb-japan/llm-leaderboard/artifacts/dataset/mtbench_ja_prompt/v1)
 ## Usage
 pip install -r requirements.txt
 ```
+### Instruction format Ver0.1
 This format must be adhered to strictly, as deviations may result in less optimal outputs from the model.
 The template used to construct a prompt for the Instruct model is specified as follows:
 <s>[INST] <<SYS>>\n{Instruction}\n<</SYS>>\n\n{USER_MESSAGE_1} [/INST] {BOT_MESSAGE_1} </s>[INST] {USER_MESSAGE_2}[/INST]
 ```
+Please be aware that ``<s>`` and ``</s>`` are special tokens used for the beginning of string (BOS) and end of string (EOS), respectively, while [INST] and [/INST] are considered regular strings.
+For the "{Instruction}" part, we recommend using "あなたは誠実で優秀な日本人のアシスタントです。" ("You are an honest and excellent Japanese assistant.").
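As an illustration of the template above, the following is a minimal sketch of a helper that assembles a prompt string by hand. The function name `build_prompt` and its `turns` argument are hypothetical and not part of this repository. The sketch also assumes that each user message is closed with `[/INST]` before the model reply, as in the Llama-2 style chat format.

```python
# Hypothetical helper (not part of the repository): assembles a prompt string
# that follows the instruction template described in this section.
DEFAULT_SYSTEM = "あなたは誠実で優秀な日本人のアシスタントです。"  # recommended {Instruction}


def build_prompt(turns, system=DEFAULT_SYSTEM):
    """Build a prompt from a list of (user_message, bot_message) pairs.

    Use None as the bot message of the final pair when the model is
    expected to generate that reply.
    """
    prompt = ""
    for i, (user, bot) in enumerate(turns):
        if i == 0:
            # The system block appears only once, inside the first [INST] span.
            prompt += f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"
        else:
            prompt += f"[INST] {user}[/INST]"
        if bot is not None:
            # A completed assistant turn is terminated with the EOS token.
            prompt += f" {bot} </s>"
    return prompt


example = build_prompt([("こんにちは", "こんにちは！"), ("東京の観光名所を教えて", None)])
print(example)
```

Because the string produced here already contains `<s>`, it is probably safest to tokenize it with `add_special_tokens=False` so that a second BOS token is not prepended; this is an assumption about the tokenizer configuration rather than documented behavior of this model.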
+### Use the instruct model Ver0.1
 ```python
 import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM
+model_name = "tokyotech-llm/Swallow-MS-7b-instruct-v0.1"
 model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 print(decoded[0])
 ```
 ## Training Datasets
+### Instruction Tuning Ver0.1
 The following datasets were used for the instruction tuning.