Adding Evaluation Results #1
opened by leaderboard-pr-bot

README.md CHANGED
@@ -1,13 +1,116 @@
 ---
+language:
+- en
 license: apache-2.0
+tags:
+- text-generation-inference
 datasets:
 - Open-Orca/SlimOrca
-language:
-- en
 pipeline_tag: text-generation
 inference: false
-
-
+model-index:
+- name: falcon-rw-1b-instruct-openorca
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge (25-Shot)
+      type: ai2_arc
+      config: ARC-Challenge
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: acc_norm
+      value: 34.56
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HellaSwag (10-Shot)
+      type: hellaswag
+      split: validation
+      args:
+        num_few_shot: 10
+    metrics:
+    - type: acc_norm
+      value: 60.93
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU (5-Shot)
+      type: cais/mmlu
+      config: all
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 28.77
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: TruthfulQA (0-shot)
+      type: truthful_qa
+      config: multiple_choice
+      split: validation
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: mc2
+      value: 37.42
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Winogrande (5-shot)
+      type: winogrande
+      config: winogrande_xl
+      split: validation
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 60.69
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GSM8k (5-shot)
+      type: gsm8k
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 3.41
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ericzzz/falcon-rw-1b-instruct-openorca
+      name: Open LLM Leaderboard
 ---
 
 # 🌟 Falcon-RW-1B-Instruct-OpenOrca
@@ -76,4 +179,17 @@ This model may generate inaccurate or misleading information and is prone to hal
 The model is provided 'as is' without any warranties, and the creators are not liable for any damages arising from its use. Users are responsible for their interactions with the model.
 
 ## 📬 Contact
-For further inquiries or feedback, please contact at [email protected].
+For further inquiries or feedback, please contact at [email protected].
+
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ericzzz__falcon-rw-1b-instruct-openorca)
+
+| Metric                          |Value|
+|---------------------------------|----:|
+|Avg.                             |37.63|
+|AI2 Reasoning Challenge (25-Shot)|34.56|
+|HellaSwag (10-Shot)              |60.93|
+|MMLU (5-Shot)                    |28.77|
+|TruthfulQA (0-shot)              |37.42|
+|Winogrande (5-shot)              |60.69|
+|GSM8k (5-shot)                   | 3.41|
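The `model-index` block the bot adds is machine-readable YAML front matter, so the scores can be read back out programmatically. A minimal sketch, assuming PyYAML is installed (the excerpt below is trimmed to one benchmark for brevity):

```python
import yaml  # PyYAML; assumed available in the environment

# Trimmed excerpt of the model-index front matter added in this PR.
front_matter = """
model-index:
- name: falcon-rw-1b-instruct-openorca
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
    metrics:
    - type: acc_norm
      value: 34.56
      name: normalized accuracy
"""

card = yaml.safe_load(front_matter)
# Walk down to the first result's first metric.
result = card["model-index"][0]["results"][0]
print(result["dataset"]["name"], result["metrics"][0]["value"])
```

The nested `results` / `dataset` / `metrics` layout is what lets the leaderboard and the Hub render the scores without scraping the markdown table.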
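The `Avg.` row in the table is just the arithmetic mean of the six benchmark scores; a quick sketch to verify it:

```python
# Recompute the leaderboard "Avg." from the six benchmark scores in the table.
scores = [34.56, 60.93, 28.77, 37.42, 60.69, 3.41]
average = round(sum(scores) / len(scores), 2)
print(average)  # 37.63
```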