* `save_steps`: 1000
* **Optimization Kernel:** Liger kernel enabled (`use_liger=True`) for increased throughput and reduced memory usage via optimized Triton kernels for common LLM operations.
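For reference, these settings map onto a TRL `SFTConfig` roughly as in the sketch below. This is a minimal, hypothetical fragment, not the exact configuration used for this run: the `output_dir` is invented for illustration, and the Liger flag is named `use_liger` in older TRL releases but `use_liger_kernel` in newer TRL/transformers versions.

```python
# Hypothetical sketch of the relevant SFTConfig fields; not the exact
# configuration used for this run.
from trl import SFTConfig

config = SFTConfig(
    output_dir="./luxllama-sft",  # assumed path, for illustration only
    save_steps=1000,              # checkpoint every 1000 optimizer steps
    use_liger_kernel=True,        # enable Liger's fused Triton kernels
)
```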

## Inference - vLLM

### Installation

```bash
pip -q install vllm
pip -q install "bitsandbytes>=0.45.3"
```

### vLLM Inference

```python
import os

import torch
from vllm import LLM, SamplingParams

# Use the vLLM V0 engine.
os.environ['VLLM_USE_V1'] = '0'

model_id = "aiplanet/LuxLlama"
llm = LLM(model=model_id, dtype=torch.bfloat16, trust_remote_code=True, max_model_len=8192)

prompts = [
    """
<system>
You are a highly capable assistant trained to handle a variety of tasks. Below is an instruction that describes a task, paired with an input that provides further context. Your goal is to provide an accurate, clear, and contextually appropriate response that fulfills the user's request.
</system>
<user>
### Instruction:
Berechent d'Resultat vun dëser Rechnung.

### Input:
Wéi vill ass (12 + 8) × 2 - 10?
</user>

<assistant>
""",
    """
<system>
You are a highly capable assistant trained to handle a variety of tasks. Below is an instruction that describes a task, paired with an input that provides further context. Your goal is to provide an accurate, clear, and contextually appropriate response that fulfills the user's request.
</system>
<user>
### Instruction:
Iwwersetz dësen Saz vum Engleschen an d’Lëtzebuergesch.

### Input:
'We are going to the market tomorrow morning.'
</user>

<assistant>
""",
    """
<system>
You are a highly capable assistant trained to handle a variety of tasks. Below is an instruction that describes a task, paired with an input that provides further context. Your goal is to provide an accurate, clear, and contextually appropriate response that fulfills the user's request.
</system>
<user>
### Instruction:
Beäntwert d’Fro mat gesonde Mënscheverstand.

### Input:
Kann een engem Bësch eng Email schécken? Firwat oder firwat net?
</user>

<assistant>
""",
]

# Create a sampling params object; with top_k=1 decoding is effectively greedy.
sampling_params = SamplingParams(temperature=0.1, top_p=0.1, top_k=1, max_tokens=1024)


def main(llm):
    # Generate texts from the prompts. The output is a list of RequestOutput
    # objects that contain the prompt, generated text, and other information.
    outputs = llm.generate(prompts, sampling_params)
    # Print the outputs.
    print("\nGenerated Outputs:\n" + "-" * 60)
    for output in outputs:
        prompt = output.prompt
        generated_text = output.outputs[0].text
        print(f"Prompt: {prompt!r}")
        print(f"Output: {generated_text!r}")
        print("-" * 60)


if __name__ == "__main__":
    main(llm)
```

#### Output

```bash
Generated Outputs:
------------------------------------------------------------
Prompt: "\n<system> \nYou are a highly capable assistant trained to handle a variety of tasks. Below is an instruction that describes a task, paired with an input that provides further context. Your goal is to provide an accurate, clear, and contextually appropriate response that fulfills the user's request. \n</system> \n<user> \n### Instruction: \nBerechent d'Resultat vun dëser Rechnung. \n\n### Input: \nWéi vill ass (12 + 8) × 2 - 10?\n</user> \n\n<assistant>\n"
Output: "### Response:\n D'Resultat vun der Rechnung ass 30. Hei ass d'Schrëtt-fir-Schrëtt Berechnung: 1. Füügt 12 an 8: 12 + 8 = 20 2. Multiplizéiert d'Resultat mat 2: 20 × 2 = 40 3. Subtrahéiert 10 vum Resultat: 40 - 10 = 30\n</assistant> \n"
------------------------------------------------------------
Prompt: "\n<system> \nYou are a highly capable assistant trained to handle a variety of tasks. Below is an instruction that describes a task, paired with an input that provides further context. Your goal is to provide an accurate, clear, and contextually appropriate response that fulfills the user's request. \n</system> \n<user> \n### Instruction: \nIwwersetz dësen Saz vum Engleschen an d’Lëtzebuergesch. \n\n### Input: \n'We are going to the market tomorrow morning.'\n</user> \n\n<assistant>\n"
Output: "### Response:\n 'Mir ginn muer fréi op de Maart.'\n</assistant> \n"
------------------------------------------------------------
Prompt: "\n<system> \nYou are a highly capable assistant trained to handle a variety of tasks. Below is an instruction that describes a task, paired with an input that provides further context. Your goal is to provide an accurate, clear, and contextually appropriate response that fulfills the user's request. \n</system> \n<user> \n### Instruction: \nBeäntwert d’Fro mat gesonde Mënscheverstand. \n\n### Input: \nKann een engem Bësch eng Email schécken? Firwat oder firwat net?\n</user> \n\n<assistant>\n"
Output: "### Response:\n Et ass net méiglech eng E-Mail un e Bësch ze schécken. E-Mail ass eng Form vu Kommunikatioun déi fir d'Mënschen entwéckelt gouf, an et erfuerdert eng digital Infrastruktur fir ze funktionéieren, wéi Internet a Computeren. Bëscher, op der anerer Säit, sinn natierlech Liewensraim, déi aus Planzen, Déieren an aner Elementer besteet. Si hunn keng Fäegkeet fir digital Kommunikatioun oder Informatioun ze verstoen, also kann keng E-Mail un e Bësch geschéckt ginn.\n</assistant> \n"
------------------------------------------------------------
```
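All of the prompts above share the same `<system>`/`<user>`/`<assistant>` layout, so it can be convenient to assemble them with a small helper instead of writing each one by hand. The function below is a hypothetical convenience sketch (not part of the model repository) that reproduces that layout:

```python
# The shared system prompt used by all examples above.
SYSTEM_PROMPT = (
    "You are a highly capable assistant trained to handle a variety of tasks. "
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Your goal is to provide an accurate, clear, and "
    "contextually appropriate response that fulfills the user's request."
)


def build_prompt(instruction: str, input_text: str = "") -> str:
    """Assemble a prompt in the <system>/<user>/<assistant> format used above."""
    user_block = f"### Instruction:\n{instruction}"
    if input_text:
        user_block += f"\n\n### Input:\n{input_text}"
    return (
        f"\n<system>\n{SYSTEM_PROMPT}\n</system>\n"
        f"<user>\n{user_block}\n</user>\n\n<assistant>\n"
    )


prompt = build_prompt(
    "Iwwersetz dësen Saz vum Engleschen an d’Lëtzebuergesch.",
    "'We are going to the market tomorrow morning.'",
)
```

The resulting string can be passed to `llm.generate` exactly like the hand-written prompts above.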

## Evaluation

### Benchmarking Dataset - LUXELLA

**Summary:** LuxLlama demonstrates strong performance on the LUXELLA benchmark, significantly outperforming the other models tested. It excels in translation, comprehension, and verb conjugation. Areas such as vocabulary, spelling, and idioms show relatively lower scores, indicating room for improvement in capturing finer linguistic nuances. The model handles beginner-level tasks very well, with a gradual decrease in performance as difficulty increases, validating the benchmark's sensitivity. High-performing sample questions show correct handling of cultural knowledge, spelling, and advanced verb conjugation, while low-performing samples highlight challenges with specific grammar rules (Konjunktiv II usage), subtle distinctions in vocabulary (Niess vs. Kusinn), and standard word-order conventions.

## Learnings and Observations

*(This section should be updated after further analysis and usage.)*