---
license: mit
datasets:
- mlabonne/orpo-dpo-mix-40k
language:
- en
base_model:
- meta-llama/Llama-3.2-1B-Instruct
pipeline_tag: text-generation
---
## Model Details

### Model Description

- **Developed by:** Chintan Shah
- **Model type:** Causal language model (Llama 3.2 architecture, 1B parameters, instruction-tuned)
- **Finetuned from model:** meta-llama/Llama-3.2-1B-Instruct
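
For reference, a minimal inference sketch using the Transformers chat template; the checkpoint id below is a placeholder, since the card does not state where the finetuned weights are hosted:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id: substitute the repo id of this finetuned checkpoint.
model_id = "meta-llama/Llama-3.2-1B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

messages = [{"role": "user", "content": "Explain ORPO in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```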


## Training Details

### Training Data

[mlabonne/orpo-dpo-mix-40k](https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k), a roughly 40k-sample mix of preference datasets with `chosen`/`rejected` response pairs.


### Training Procedure

ORPO (Odds Ratio Preference Optimization), which folds supervised fine-tuning and preference alignment into a single training stage and requires no separate reference model.
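
For reference, the ORPO objective from Hong et al. (2024) adds an odds-ratio penalty over chosen responses $y_w$ and rejected responses $y_l$ to the usual SFT loss:

$$
\mathcal{L}_{\text{ORPO}} = \mathcal{L}_{\text{SFT}} + \lambda \cdot \mathcal{L}_{\text{OR}},
\qquad
\mathcal{L}_{\text{OR}} = -\log \sigma\!\left( \log \frac{\text{odds}_\theta(y_w \mid x)}{\text{odds}_\theta(y_l \mid x)} \right),
\qquad
\text{odds}_\theta(y \mid x) = \frac{P_\theta(y \mid x)}{1 - P_\theta(y \mid x)}.
$$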

### Training Parameters

#### Training Arguments

- Learning Rate: 1e-5
- Batch Size: 1
- Max Steps: 1
- Block Size: 512
- Warmup Ratio: 0.1
- Weight Decay: 0.01
- Gradient Accumulation: 4
- Mixed Precision: bf16
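
As a hedged sketch, these arguments map onto TRL's `ORPOConfig` roughly as follows; the `output_dir` is an assumed placeholder, and "Block Size" is interpreted as the maximum sequence length:

```python
from trl import ORPOConfig

training_args = ORPOConfig(
    output_dir="llama-3.2-1b-orpo",   # assumed; not stated in the card
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    max_steps=1,
    max_length=512,                   # "Block Size" above
    warmup_ratio=0.1,
    weight_decay=0.01,
    gradient_accumulation_steps=4,
    bf16=True,
)
```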


#### Training Hyperparameters

- **Training regime:** bf16 mixed precision


### LoRA Configuration

- Rank (r): 16
- Alpha: 32
- Dropout: 0.05
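
Continuing the sketch above, the LoRA settings would be expressed as a PEFT `LoraConfig` and handed to the trainer; the target modules are an assumption (the card does not list them), as is the exact `ORPOTrainer` wiring:

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOTrainer

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")

trainer = ORPOTrainer(
    model=model,
    args=training_args,          # the ORPOConfig from the sketch above
    train_dataset=dataset,
    processing_class=tokenizer,  # `tokenizer=` in older TRL releases
    peft_config=peft_config,
)
trainer.train()
```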


## Evaluation

Zero-shot HellaSwag results, as reported by the EleutherAI lm-evaluation-harness:

|  Tasks  |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|---------|------:|------|-----:|--------|---|-----:|---|-----:|
|hellaswag|      1|none  |     0|acc     |↑  |0.4408|±  |0.0050|
|         |       |none  |     0|acc_norm|↑  |0.5922|±  |0.0049|

### Testing Data, Factors & Metrics

#### Testing Data

HellaSwag (zero-shot), evaluated with the EleutherAI lm-evaluation-harness:
https://github.com/EleutherAI/lm-evaluation-harness
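
A hedged sketch of reproducing these numbers with the harness's Python API (v0.4+; the checkpoint id is again a placeholder):

```python
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=meta-llama/Llama-3.2-1B-Instruct",  # substitute the finetuned checkpoint
    tasks=["hellaswag"],
    num_fewshot=0,
)
print(results["results"]["hellaswag"])
```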