Update README.md
README.md
CHANGED
@@ -61,7 +61,7 @@ Model evaluation metrics and results.
 ## Model Training Details
 
 This model was obtained by sparse-tranfer of the sparse foundational model [Llama-2-7b-pruned70-retrained](https://huggingface.co/neuralmagic/Llama-2-7b-pruned70-retrained) on the [ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) dataset.
-Training was
+Training was performed for 2 epochs and used the [SquareHead](https://arxiv.org/abs/2310.06927) knowledge distillation with [Llama-2-7b-ultrachat](https://huggingface.co/neuralmagic/Llama-2-7b-ultrachat) as teacher.
 
 ## Help
 
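
The added line mentions SquareHead knowledge distillation without saying what it computes. As a rough sketch, assuming the formulation in the linked paper (a per-layer mean squared error between student and teacher features, normalized by the magnitude of the teacher features and summed over layers), the feature loss could look like the following. The function names, the plain-list feature format, and the `eps` stabilizer are illustrative assumptions, not code from this repository, and the full training objective may combine this term with others.

```python
from typing import Sequence

Vector = Sequence[float]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def squarehead_loss(
    student_feats: Sequence[Vector],
    teacher_feats: Sequence[Vector],
    eps: float = 1e-6,  # assumed stabilizer to avoid division by zero
) -> float:
    """Sum over layers of MSE(teacher, student) / MSE(teacher, 0).

    Each element of student_feats / teacher_feats is one layer's
    flattened feature vector (hypothetical input format).
    """
    if len(student_feats) != len(teacher_feats):
        raise ValueError("student and teacher must have the same number of layers")
    total = 0.0
    for s, t in zip(student_feats, teacher_feats):
        zeros = [0.0] * len(t)
        # Normalizing by the teacher's own magnitude keeps layers with
        # large activations from dominating the sum.
        total += mse(t, s) / (mse(t, zeros) + eps)
    return total
```

In this sketch the loss is zero when the student reproduces the teacher's features exactly, and each layer contributes about 1 when the student's features are all zero.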