beomi committed on
Commit b5ef04e · 1 Parent(s): b3d5578

Fix typo on tokenize example

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -43,7 +43,7 @@ Llama-2-Ko is an auto-regressive language model that uses an optimized transform
  - New vocab and merges, trained with Korean Corpus
  - Tokenizer Examples: Llama-2 vs **Llama-2-Ko**
  - Use the same tokenization for English, but a shorter/merged tokenization for Korean.
- - Tokenize "안녕하세요, 오늘은 날씨가 좋네요."
+ - Tokenize "안녕하세요, 오늘은 날씨가 좋네요."
  - Llama-2:
  ```
  ['▁', '안', '<0xEB>', '<0x85>', '<0x95>', '하', '세', '요', ',', '▁', '오', '<0xEB>', '<0x8A>', '<0x98>', '은', '▁', '<0xEB>', '<0x82>', '<0xA0>', '씨', '가', '▁', '<0xEC>', '<0xA2>', '<0x8B>', '<0xEB>', '<0x84>', '<0xA4>', '요']
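
For readers unfamiliar with the `<0xNN>` entries in the Llama-2 output: they appear to be byte-fallback tokens, where a syllable missing from the vocabulary is emitted as its raw UTF-8 bytes (three tokens per Hangul syllable here). The sketch below is only an illustration of that reading, not part of this commit or of any tokenizer library; `decode_byte_fallback` and `LLAMA2_TOKENS` are hypothetical names, and the token list is copied from the README line in the hunk.

```python
import re

# Llama-2 tokens copied verbatim from the README line in this hunk.
LLAMA2_TOKENS = [
    '▁', '안', '<0xEB>', '<0x85>', '<0x95>', '하', '세', '요', ',',
    '▁', '오', '<0xEB>', '<0x8A>', '<0x98>', '은',
    '▁', '<0xEB>', '<0x82>', '<0xA0>', '씨', '가',
    '▁', '<0xEC>', '<0xA2>', '<0x8B>', '<0xEB>', '<0x84>', '<0xA4>', '요',
]

# Matches a single byte-fallback token such as '<0xEB>'.
BYTE_TOKEN = re.compile(r"<0x([0-9A-Fa-f]{2})>")


def decode_byte_fallback(tokens):
    """Hypothetical helper: rebuild text from SentencePiece-style tokens."""
    buf = bytearray()
    for tok in tokens:
        match = BYTE_TOKEN.fullmatch(tok)
        if match:
            # Byte-fallback token: append the raw byte value.
            buf.append(int(match.group(1), 16))
        else:
            # Ordinary piece: '▁' marks a word boundary (a space).
            buf.extend(tok.replace("▁", " ").encode("utf-8"))
    return buf.decode("utf-8")


text = decode_byte_fallback(LLAMA2_TOKENS)
byte_tokens = sum(1 for t in LLAMA2_TOKENS if BYTE_TOKEN.fullmatch(t))

print(repr(text))
print(len(LLAMA2_TOKENS), "tokens,", byte_tokens, "of them raw bytes")
```

Under this reading, 15 of the 29 listed tokens are single bytes covering just five syllables (녕, 늘, 날, 좋, 네), which is the overhead a Korean-trained vocabulary is meant to remove. The listed tokens stop at the final '요', so the decoded string lacks the sentence's closing period.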