---
base_model:
- HighCWu/Embformer-MiniMind-RLHF-0.1B
pipeline_tag: text-generation
library_name: transformers
---

# Embformer-MiniMind-R1-0.1B

A 0.1B distilled reasoning model accompanying the research note [Embformer: An Embedding-Weight-Only Transformer Architecture](https://doi.org/10.5281/zenodo.15736957), trained on [jingyaogong/minimind_dataset](https://huggingface.co/datasets/jingyaogong/minimind_dataset) with a sequence length of 512.

Install the pinned `transformers` commit in the terminal:
```sh
pip install "transformers @ git+https://github.com/huggingface/transformers.git@cb0f604"
```
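
To verify the install, you can print the installed version (a quick sanity check; the exact version string depends on the pinned commit):

```python
import transformers

# A dev version string here indicates the git-pinned build is active
print(transformers.__version__)
```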

The following code snippet illustrates how to use the model to generate content from a given prompt.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "HighCWu/Embformer-MiniMind-R1-0.1B"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    trust_remote_code=True,
    cache_dir=".cache"
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
    cache_dir=".cache"
)

# prepare the model input
prompt = "请为我讲解“大语言模型”这个概念。"  # "Please explain the concept of 'large language model' to me."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    input_ids=model_inputs['input_ids'],
    attention_mask=model_inputs['attention_mask'],
    max_new_tokens=8192
)
# strip the prompt tokens and decode only the newly generated ones
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

print(tokenizer.decode(output_ids, skip_special_tokens=True))
```
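
For interactive use, you can stream tokens as they are generated instead of waiting for the full completion. A minimal sketch using `TextStreamer` from `transformers`, reusing the `model`, `tokenizer`, and `model_inputs` objects from the snippet above:

```python
from transformers import TextStreamer

# prints decoded tokens to stdout as they are generated,
# skipping the echoed prompt and special tokens
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    input_ids=model_inputs['input_ids'],
    attention_mask=model_inputs['attention_mask'],
    max_new_tokens=8192,
    streamer=streamer
)
```

Since this is a distilled reasoning model, the output may include an explicit reasoning trace before the final answer; decode without `skip_special_tokens` if you want to see how the chat template delimits it.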