Update README.md
README.md
CHANGED
@@ -116,14 +116,14 @@ The fine-tuning dataset was compiled from the following sources:
 * **Name:** LUXELLA (Luxembourgish Excellence Language Learning Assessment)
 * **Description:** A custom benchmark designed to evaluate Luxembourgish language proficiency using synthetically generated questions.
 * **Generation:** Questions generated using a Gemini-based LLM, prompted across 15 categories (vocabulary, grammar, translation, comprehension, culture, idioms, etc.), 4 difficulty levels (beginner, intermediate, advanced, native), and randomized topics. Output in structured JSON.
-* **Evaluation Method:** LLM-based judgment. A separate LLM acts as a judge, scoring responses
+* **Evaluation Method:** LLM-based judgment. A separate LLM acts as a judge, scoring responses and providing a brief explanation.
 * **Link:** Under-Progress
 
 ### Evaluation Results
 
 **LuxLlama Performance on LUXELLA:**
 
-* **Overall Score:**
+* **Overall Score:** 74.6
 
 **Scores by Category:**
 
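
For readers of this hunk, the **Generation** line can be illustrated with a minimal sketch of how a category × difficulty prompt grid might be built. This is not the authors' pipeline. The category list holds only the six names the README gives (the remaining categories are not listed), and the topic pool, the function names `build_question_specs` and `render_prompt`, the prompt wording, and the JSON keys are all hypothetical.

```python
import itertools
import json
import random

# Only the categories named in the README; the full set of 15 is not listed there.
CATEGORIES = ["vocabulary", "grammar", "translation", "comprehension", "culture", "idioms"]
DIFFICULTIES = ["beginner", "intermediate", "advanced", "native"]
# Hypothetical topic pool; the README only says that topics are randomized.
TOPICS = ["food", "travel", "history"]


def build_question_specs(seed=0):
    """Return one spec per (category, difficulty) pair, each with a random topic."""
    rng = random.Random(seed)  # seeded so the grid is reproducible
    return [
        {"category": category, "difficulty": difficulty, "topic": rng.choice(TOPICS)}
        for category, difficulty in itertools.product(CATEGORIES, DIFFICULTIES)
    ]


def render_prompt(spec):
    """Render a hypothetical generation prompt that asks for structured JSON."""
    return (
        "Write one Luxembourgish language question.\n"
        f"Spec: {json.dumps(spec)}\n"
        'Reply as JSON with the keys "question" and "reference_answer".'
    )


specs = build_question_specs()
print(len(specs))  # 6 categories x 4 difficulties = 24
print(render_prompt(specs[0]))
```

Each rendered prompt would then be sent to the generator model. That call is omitted here because the README does not say which client or model version was used.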