monkeypostulate committed (verified) · Commit 4543671 · 1 parent: 3be466d

Update README.md

Files changed (1):
1. README.md (+19, -0)
README.md CHANGED
@@ -1,3 +1,22 @@
+This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B), optimized with the ORPO (Odds Ratio Preference Optimization) trainer. Fine-tuning was performed on a subset of the [mlabonne/orpo-dpo-mix-40k](https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k) dataset, with only 100 samples selected to enable rapid training with ORPO's efficient approach.
+
+**Fine-tuning Method:** ORPO
+**Dataset:** mlabonne/orpo-dpo-mix-40k
+
+
+**Evaluation**
+
+The model was evaluated on the HellaSwag benchmark, with the following results:
+
+
+| Tasks   |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
+|---------|------:|------|-----:|--------|---|-----:|---|-----:|
+|hellaswag|      1|none  |     0|acc     |↑  |0.4772|±  |0.0050|
+|         |       |none  |     0|acc_norm|↑  |0.6366|±  |0.0048|
+
+
+
+
 ---
 datasets:
 - mlabonne/orpo-dpo-mix-40k
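
For readers who want to reproduce a run like the one described in this card, the sketch below shows how a 100-sample ORPO fine-tune of meta-llama/Llama-3.2-1B on mlabonne/orpo-dpo-mix-40k might look with TRL's `ORPOTrainer` and a PEFT LoRA adapter. Only the base model, the dataset, and the 100-sample subset come from this card; the LoRA settings, hyperparameters, output path, and the exact trainer keyword arguments (which vary across `trl` versions) are illustrative assumptions, not the author's actual training script.

```python
# Illustrative sketch only: hyperparameters and LoRA settings are assumptions,
# not values taken from this model card. Requires transformers, trl, peft, datasets.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_model = "meta-llama/Llama-3.2-1B"

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base_model)

# 100-sample subset of the preference dataset, as described in the card.
train_dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train").select(range(100))

# LoRA adapter config (values are illustrative; the card only indicates PEFT was used).
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# ORPO training arguments; beta weights the odds-ratio preference penalty.
training_args = ORPOConfig(
    output_dir="llama-3.2-1b-orpo",  # placeholder path
    beta=0.1,
    learning_rate=8e-6,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    max_length=1024,
    max_prompt_length=512,
    logging_steps=10,
)

trainer = ORPOTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # older trl versions take tokenizer= instead
    peft_config=peft_config,
)
trainer.train()
trainer.save_model("llama-3.2-1b-orpo")
```

The HellaSwag numbers above are in the output format of EleutherAI's lm-evaluation-harness, so a zero-shot run over the saved model could be reproduced along these lines (the model path and batch size are placeholders carried over from the sketch above, not from the card):

```python
import lm_eval

# Zero-shot HellaSwag with lm-evaluation-harness (pip install lm-eval).
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=llama-3.2-1b-orpo",
    tasks=["hellaswag"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"]["hellaswag"])  # acc / acc_norm with their stderr
```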