hypothetical committed on
Commit 0b195df · verified · 1 Parent(s): bdc874a

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -56,7 +56,8 @@ model = AutoModelForCausalLM.from_pretrained(
     token=hf_token,
     cache_dir=hf_cache_dir,
     torch_dtype=torch.bfloat16,
-    attn_implementation="sdpa"
+    attn_implementation="sdpa",
+    mode='s'
 ).to(device)
 model.generation_config.pad_token_id = tokenizer.eos_token_id
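For context on the `attn_implementation="sdpa"` argument kept in this hunk: it routes the model's attention through PyTorch's fused `torch.nn.functional.scaled_dot_product_attention` kernel. A minimal standalone sketch of that op (the shapes are arbitrary placeholders, not from this README):

```python
import torch
import torch.nn.functional as F

# Arbitrary illustrative shapes: (batch, heads, seq_len, head_dim)
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 8, 16, 64)
v = torch.randn(1, 8, 16, 64)

# The fused kernel that attn_implementation="sdpa" selects under the hood
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([1, 8, 16, 64])
```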
 
@@ -65,7 +66,9 @@ prompt = "Describe basics of DNNs quantization."
 inputs = tokenizer(prompt, return_tensors="pt")
 inputs.to(device)
 
-generate_ids = model.generate(**inputs, max_length=500)
+with torch.inference_mode():
+    generate_ids = model.generate(**inputs, max_length=500)
+
 input_len = inputs['input_ids'].shape[1]
 generate_ids = generate_ids[:, input_len:]
 output = tokenizer.batch_decode(
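The `torch.inference_mode()` context added above disables autograd tracking for everything inside it (a stricter, slightly faster variant of `torch.no_grad()`), which is the standard way to wrap pure-inference calls like `generate`. A self-contained sketch of the effect:

```python
import torch

x = torch.randn(4, requires_grad=True)
with torch.inference_mode():
    y = x * 2  # no autograd graph is recorded inside this block
print(y.requires_grad)  # False: y is an inference tensor
```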
@@ -126,13 +129,10 @@ For quality evaluation we have used: #TODO link to github
 | Winogrande | 0 | 0 | 0 | 0 | 0 | 0 |
 
 
-> __MMLU__: Evaluates/shows {MMLU}
-
-> __MMLU__: Evaluates/shows ...
-
-> __Arc Challenge__: Evaluates/shows ...
-
-> __PIQA__: Evaluates/shows ...
+* **MMLU**: Evaluates general knowledge across 57 subjects including science, humanities, engineering, and more. Shows the model's ability to handle diverse academic topics.
+* **PIQA**: Evaluates physical commonsense reasoning through questions about everyday physical interactions. Shows the model's understanding of real-world physics concepts.
+* **Arc Challenge**: Evaluates grade-school-level multiple-choice science questions requiring reasoning. Shows the model's ability to solve complex reasoning tasks.
+* **Winogrande**: Evaluates commonsense reasoning through sentence-completion tasks. Shows the model's capability to understand context and resolve ambiguity.
 
 ### Latency benchmarks
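The evaluation harness is still a #TODO above; assuming something like EleutherAI's lm-evaluation-harness (`pip install lm-eval`), the four tasks in the table could be scored along these lines. The model id and batch size below are placeholders:

```python
import lm_eval

# Hypothetical run; replace the placeholder model id with the real checkpoint.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=<model-id>,dtype=bfloat16",
    tasks=["mmlu", "piqa", "arc_challenge", "winogrande"],
    batch_size=8,
)
for task, metrics in results["results"].items():
    print(task, metrics)
```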
 
 