Commit 177279f · Parent(s): 2fd8d40 · Update README.md

README.md CHANGED
# ML4SE23_G1_WizardCoder-SCoT-1B-V1.0

IN4334 ML4SE

Group1 WizardCoder

This model is the result of fine-tuning the WizardCoder-1B-V1.0 model on Structured Chain-of-Thought (S-CoT) enhanced instructions.
S-CoT is used to enhance a sample of about 1200 entries from the Evol-Instruct 80k dataset.
The resulting dataset is then used for fine-tuning.
The original WizardCoder model and the new S-CoT fine-tuned one are compared on both versions of HumanEval and MBPP (S-CoT-enhanced and original) using the pass@1 metric.
Enhancing the evaluation datasets with S-CoT makes it possible to study its effect when used purely as a prompting technique, independently of the S-CoT fine-tuning of the model.
## Fine-tuning Details

| Hyperparameter | [WizardCoder-1B-V1.0](https://huggingface.co/WizardLM/WizardCoder-1B-V1.0) |
|----------------|---------------------|
| Batch size     | 16 |
| Learning rate  | 2e-5 |
| Epochs         | 3 |
| Max length     | 2048 |
| Warmup steps   | 30 |
| LR scheduler   | cosine |
| Dataset        | [ML4SE23_G1_EvolInstruct-SCoT-1k](https://huggingface.co/datasets/ML4SE2023-G1-WizardCoder/ML4SE23_G1_EvolInstruct-SCoT-1k) |
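
These hyperparameters correspond to a fairly standard Hugging Face Trainer setup. A minimal configuration sketch, assuming the transformers and datasets APIs and a train split; the authors' actual training script is not part of this card, and output_dir is illustrative:

```python
# Hypothetical configuration mirroring the hyperparameter table above;
# not the authors' actual training script.
from datasets import load_dataset
from transformers import TrainingArguments

# Dataset id taken from the table; a "train" split is assumed.
train_ds = load_dataset(
    "ML4SE2023-G1-WizardCoder/ML4SE23_G1_EvolInstruct-SCoT-1k", split="train"
)

training_args = TrainingArguments(
    output_dir="wizardcoder-scot-1b",  # illustrative
    per_device_train_batch_size=8,     # 8 per GPU x 2 GPUs = batch size 16
    learning_rate=2e-5,
    num_train_epochs=3,
    warmup_steps=30,
    lr_scheduler_type="cosine",
)
# Inputs would be tokenized and truncated to the max length of 2048.
```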

The hardware consisted of a GPU instance rented from [DataCrunch](https://datacrunch.io/) with the following specifications:

| NVidia RTX A6000 48GB 1A6000.10V |
|----------------------------------|
| 2 GPUs |
| 48 GB VRAM per GPU |
| 60 GB RAM |
| 10 CPUs |
| 100 GB SSD storage |
| Ubuntu 20.04 |
| CUDA 11.6 |

## Results

Results of pass@1 (%) on HumanEval and MBPP and on their S-CoT-enhanced counterparts, HumanEval-SCoT and MBPP-SCoT, for WizardCoder-1B, WizardCoder-SCoT-1B, and WizardCoder-15B.

| **Dataset**    | **WizardCoder-1B-V1.0** | **WizardCoder-SCoT-1B-V1.0** | **WizardCoder-15B-V1.0** |
|----------------|-------------------------|------------------------------|--------------------------|
| HumanEval      | 23.78 | **17.68** | 57.3 |
| HumanEval-SCoT | **44.51** | **27.44** | **57.3** |
| MBPP           | 23.4 | **19.4** | 51.8 |
| MBPP-SCoT      | **40** | **28** | **45.6** |
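
For reference, pass@1 is conventionally computed with the unbiased pass@k estimator of Chen et al. (2021); whether this card uses that exact estimator is an assumption. A minimal sketch, where n samples are generated per problem and c of them pass the unit tests:

```python
# Unbiased pass@k estimator from Chen et al. (2021).
# n: samples generated per problem, c: samples that pass, k: budget.
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Example: 1 passing sample out of 4 generated, evaluated at k=1.
print(round(pass_at_k(4, 1, 1), 4))  # 0.25
```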