chameleon-lizard committed on
Commit f7575ff · verified · 1 Parent(s): 0f9a599

Update README.md

Files changed (1)
  1. README.md +47 -3
README.md CHANGED
@@ -1,3 +1,47 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ language:
+ - en
+ - es
+ - de
+ - ru
+ - fr
+ base_model:
+ - FacebookAI/xlm-roberta-base
+ pipeline_tag: text-classification
+ ---
+
+ # Model Card for Model ID
+
+ FacebookAI/xlm-roberta-base, fine-tuned for the refusal classification task.
+
+ ## Model Details
+
+ ### Model Description
+
+ I needed a classifier to remove refusals from my synthetic dataset. To train this model, I took inputs from the lmsys/lmsys-chat-1m dataset and generated both responses and refusals for these inputs, using both Gemini Flash 1.5 and LLaMA 3.3 70b to increase refusal diversity. The resulting synthetic dataset was used to train this classifier.
+
+ ### Evaluation results:
+
+ ```
+ eval_loss: 0.023618729785084724
+ eval_accuracy: 0.993004372267333
+ eval_f1: 0.9912854030501089
+ eval_precision: 0.9879032258064516
+ eval_recall: 0.9946908182386008
+ eval_runtime: 29.3129
+ eval_samples_per_second: 273.088
+ eval_steps_per_second: 2.149
+ epoch: 1.0
+ ```
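A note on the metrics above: they can be cross-checked against each other. The short sketch below recomputes F1 as the harmonic mean of the reported precision and recall, and estimates the evaluation set size from the reported runtime and throughput. The variable names are illustrative only, and the size estimate assumes `eval_samples_per_second` is simply the number of samples divided by `eval_runtime`.

```python
# Values copied from the evaluation results in the README.
precision = 0.9879032258064516
recall = 0.9946908182386008
reported_f1 = 0.9912854030501089
runtime_s = 29.3129
samples_per_second = 273.088

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# Assumption: throughput = samples / runtime, so samples ~= runtime * throughput.
approx_eval_samples = round(runtime_s * samples_per_second)

print(f"F1 from precision/recall: {f1:.6f} (reported: {reported_f1:.6f})")
print(f"Approximate evaluation set size: {approx_eval_samples}")
```

The recomputed F1 agrees with the reported value, which suggests the three numbers come from the same confusion matrix.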
+
+ ### How to use:
+
+ ```python
+ import transformers
+
+ pipe = transformers.pipeline('text-classification', model='chameleon-lizard/xlmr-base-refusal-classifier')
+
+ print(pipe('Why is the grass green?')) # [{'label': 'NO_REFUSAL', 'score': 0.9981207251548767}]
+ print(pipe('Простите, я не могу предоставить рецепт шаурмы с ананасами, поскольку это является преступлением против человечества.')) # [{'label': 'REFUSAL', 'score': 0.9995238780975342}]; the input is Russian for "Sorry, I cannot provide a recipe for shawarma with pineapple, because that is a crime against humanity."
+ ```
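The model description says the classifier was built to clean refusals out of a synthetic dataset. A minimal sketch of that filtering step might look like the following. The helper names `build_classifier` and `drop_refusals` and the `threshold` parameter are hypothetical and not part of the README; only the model ID and the `REFUSAL` / `NO_REFUSAL` labels come from the usage snippet above.

```python
from typing import Callable, Dict, List, Sequence

Prediction = Dict[str, object]
Classifier = Callable[[List[str]], List[Prediction]]


def build_classifier() -> Classifier:
    """Load the refusal classifier (downloads the model on first use)."""
    # Imported lazily so drop_refusals can be used and tested without transformers.
    import transformers

    pipe = transformers.pipeline(
        'text-classification',
        model='chameleon-lizard/xlmr-base-refusal-classifier',
    )
    # truncation=True keeps long responses within the model's input limit.
    return lambda texts: pipe(texts, truncation=True)


def drop_refusals(
    responses: Sequence[str],
    classify: Classifier,
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the README
) -> List[str]:
    """Return only the responses that are not classified as refusals."""
    predictions = classify(list(responses))
    kept = []
    for text, pred in zip(responses, predictions):
        is_refusal = pred['label'] == 'REFUSAL' and pred['score'] >= threshold
        if not is_refusal:
            kept.append(text)
    return kept
```

With the real model, this would be called as `drop_refusals(texts, build_classifier())`. Raising `threshold` keeps more borderline samples, while lowering it filters more aggressively.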