Linked to instruction datasets, updated metadata, linked to 70B Instruct v0.3.

README.md CHANGED
@@ -12,6 +12,9 @@ datasets:
 - lmsys/lmsys-chat-1m
 - tokyotech-llm/lmsys-chat-1m-synth
 - argilla/magpie-ultra-v0.1
+- tokyotech-llm/swallow-magpie-ultra-v0.1
+- tokyotech-llm/swallow-gemma-magpie-v0.1
 ---
 
 # Llama 3.1 Swallow - Built with Llama
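The metadata change above links the instruction-tuning corpora directly from the model card. As a rough orientation (not part of the commit), the sketch below shows how the newly linked datasets could be pulled with the `datasets` library; the `train` split name is an assumption, and some repositories may be gated and require accepting their terms plus an HF token.

```python
# Minimal sketch: loading the instruction datasets linked in the metadata above.
# The split name is an assumption; check each dataset card for the actual layout.
from datasets import load_dataset

repos = [
    "tokyotech-llm/lmsys-chat-1m-synth",
    "tokyotech-llm/swallow-magpie-ultra-v0.1",
    "tokyotech-llm/swallow-gemma-magpie-v0.1",
]

for repo in repos:
    ds = load_dataset(repo, split="train")  # assumed split name
    print(repo, len(ds), ds.column_names)
```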
@@ -36,7 +39,7 @@ See the Swallow Model Index section to find other model variants.
 |Model|Llama-3.1-Swallow v0.1|Llama-3.1-Swallow-Instruct v0.1|Llama-3.1-Swallow v0.2|Llama-3.1-Swallow-Instruct v0.2|Llama-3.1-Swallow-Instruct v0.3|
 |---|---|---|---|---|---|
 |8B| [Link](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-v0.1) | [Link](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.1) | [Link](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-v0.2) | [Link](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.2) | [Link](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3) |
-|70B| [Link](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-v0.1) | [Link](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.1) | | | |
+|70B| [Link](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-v0.1) | [Link](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.1) | | | [Link](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.3) |
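The row edited above adds the newly released Llama-3.1-Swallow-70B-Instruct-v0.3 to the model index. The unchanged section referenced in the next hunk header already demonstrates vLLM usage (`print(output[0].outputs[0].text)`); the snippet below is only a hedged sketch of the same pattern pointed at the newly linked model, not the card's verbatim example. The GPU count, sampling settings, and prompt are assumptions.

```python
# Sketch (not the model card's own example): running the newly linked
# Llama-3.1-Swallow-70B-Instruct-v0.3 with vLLM, mirroring the README's usage.
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_id = "tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
llm = LLM(model=model_id, tensor_parallel_size=4)  # GPU count is an assumption

messages = [{"role": "user", "content": "東京の観光名所を教えてください。"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

output = llm.generate(prompt, SamplingParams(temperature=0.6, top_p=0.9, max_tokens=512))
print(output[0].outputs[0].text)
```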
@@ -196,22 +199,19 @@ print(output[0].outputs[0].text)
 The following datasets were used for the instruction tuning.
 
 - Japanese
+  - [Llama-3.1-LMSYS-Chat-1M-Synth-Ja](https://huggingface.co/datasets/tokyotech-llm/lmsys-chat-1m-synth)
     - Single-turn Japanese instruction dataset synthesized and derived from [lmsys-chat-1m](https://huggingface.co/datasets/lmsys/lmsys-chat-1m) [\[Zhang+, ICLR24\]](https://openreview.net/forum?id=BOfDKxfwt0). First-turn user instructions were translated into Japanese via DeepL (machine translation), and assistant responses were generated using [Llama-3.1-405B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct). [Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) served as a judge for rejection sampling (n=6).
     Conversations containing personally identifiable information (PII) and template-based user instructions were removed. Duplicate instructions were removed.
-  - `filtered-magpie-ultra-ja`
+  - [Swallow-Magpie-Ultra-v0.1](https://huggingface.co/datasets/tokyotech-llm/swallow-magpie-ultra-v0.1)
     - A Japanese variant of the `filtered-magpie-ultra-en` dataset, translated into Japanese by [gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it).
+  - [Swallow-Gemma-Magpie-v0.1](https://huggingface.co/datasets/tokyotech-llm/swallow-gemma-magpie-v0.1)
+    - A Japanese synthetic instruction-tuning dataset generated from scratch by [gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it). User instructions were created with prompts specific to each topic, and assistant responses were generated for these instructions. The conversations were then heuristically filtered for quality and length.
 - English
-    - The dataset is available at [tokyotech-llm/lmsys-chat-1m-synth](https://huggingface.co/datasets/tokyotech-llm/lmsys-chat-1m-synth).
+  - [Llama-3.1-LMSYS-Chat-1M-Synth-En](https://huggingface.co/datasets/tokyotech-llm/lmsys-chat-1m-synth)
+    - The creation process is similar to `Llama-3.1-LMSYS-Chat-1M-Synth-Ja`, but this version uses the original English user instructions, and the assistant responses were also generated in English. Rejection sampling was not applied for this version.
   - `filtered-magpie-ultra-en`
     - A subset of the [magpie-ultra](https://huggingface.co/datasets/argilla/magpie-ultra-v0.1) dataset, developed following the MAGPIE recipe [\[Xu+, arXiv24\]](https://arxiv.org/abs/2406.08464) using [Llama-3.1-405B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct). This subset includes only samples rated as 'average,' 'good,' or 'excellent.'
 
 ## Risks and Limitations
 
 The models released here are still in the early stages of our research and development and have not been tuned to ensure outputs align with human intent and safety considerations.
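The `Llama-3.1-LMSYS-Chat-1M-Synth-Ja` entry in the hunk above describes generating assistant responses and keeping one of n=6 candidates according to an LLM judge. The commit does not include that pipeline; the following is only a generic sketch of rejection sampling with an LLM-as-judge, where the `generate` and `judge_score` callables (e.g. wrappers around Llama-3.1-405B-Instruct and Llama-3.1-70B-Instruct) and the scoring scale are assumptions.

```python
# Illustrative rejection-sampling loop (n candidates, judge picks the best one).
# This is NOT the authors' pipeline; the helper callables are assumptions.
from typing import Callable

def rejection_sample(
    instruction: str,
    generate: Callable[[str], str],            # assumed wrapper around the response model
    judge_score: Callable[[str, str], float],  # assumed wrapper around the judge model
    n: int = 6,
) -> str:
    """Generate n candidate responses and return the one the judge rates highest."""
    candidates = [generate(instruction) for _ in range(n)]
    return max(candidates, key=lambda resp: judge_score(instruction, resp))
```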
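The `filtered-magpie-ultra-en` description above says only samples rated 'average,' 'good,' or 'excellent' were kept. A minimal sketch of that kind of rating filter follows; the `quality` column name is an assumption about the magpie-ultra schema, and the released subset was produced by the Swallow team, not by this snippet.

```python
# Sketch of the quality filter described for filtered-magpie-ultra-en.
# The "quality" column name is an assumption about the magpie-ultra schema.
from datasets import load_dataset

magpie = load_dataset("argilla/magpie-ultra-v0.1", split="train")
keep = {"average", "good", "excellent"}
filtered = magpie.filter(lambda row: row.get("quality") in keep)
print(f"kept {len(filtered)} of {len(magpie)} samples")
```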