Update README.md
README.md
CHANGED
@@ -21,7 +21,7 @@ This repository contains our comprehensive AI safety evaluation PEFT adapter mod
 
 ## Model Performance
 
-
+The GroundedAI Phi-4-Mini-Judge model achieves strong performance across all three evaluation dimensions on a balanced test set of 105 samples (35 per task):
 
 ### Overall Performance
 - **Total Accuracy: 81.90%** (86/105 correct predictions)
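
As a quick sanity check on the figures quoted in this hunk, the minimal Python sketch below recomputes the total accuracy from the raw counts and confirms that 105 samples split evenly across three tasks. The helper name `accuracy_percent` is hypothetical and is not part of the repository; only the counts 86, 105, and 3 come from the README text.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return accuracy as a percentage of correct predictions."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


# Counts taken from the README diff; the variable names are illustrative.
correct_predictions = 86
total_samples = 105
num_tasks = 3

total_accuracy = accuracy_percent(correct_predictions, total_samples)
samples_per_task, remainder = divmod(total_samples, num_tasks)

print(f"Total accuracy: {total_accuracy:.2f}%")
print(f"Samples per task: {samples_per_task} (remainder {remainder})")
```

Rounded to two decimals, the result matches the reported 81.90%, and the zero remainder is consistent with the stated 35 samples per task.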