Update README.md
README.md CHANGED
````diff
@@ -31,13 +31,7 @@ model = TextGeneration(model_path="hf:nm-testing/TinyLlama-1.1B-Chat-v1.0-pruned
 print(model(formatted_prompt, max_new_tokens=200).generations[0].text)
 
 """
-
-2. In a mixing bowl, add 1 cup of all-purpose flour, 1 cup of melted coconut oil, 1/2 cup of sugar, 1/2 cup of banana, 1/2 cup of melted coconut oil, 1/2 cup of salt, 1/2 cup of vanilla extract, and 1/2 cup of baking powder.
-3. Mix the ingredients together until they are well combined.
-4. Add 1/2 cup of melted coconut oil to the mixture.
-5. Add 1/2 cup of melted coconut oil to the mixture.
-6. Mix the ingredients together until they are well combined.
-7. Add 1/2 cup of melted
+
 
 """
 ```
````
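For context, this hunk trims the truncated sample output of the README's `TextGeneration` example. Below is a minimal sketch of the snippet being patched; only the `model_path` prefix in the hunk header (which is truncated in the diff and kept as-is) and the `print` line come from the commit, while the import path and `formatted_prompt` value are assumptions:

```python
from sparseml import TextGeneration  # assumed import path for the pipeline

# Model id is taken from the hunk header; the suffix after "pruned" is
# truncated in the diff, so the exact id here is an assumption.
model = TextGeneration(model_path="hf:nm-testing/TinyLlama-1.1B-Chat-v1.0-pruned")

# Assumed chat-template-formatted prompt for the TinyLlama chat model.
formatted_prompt = "<|user|>\nWrite a banana bread recipe.\n<|assistant|>\n"

# This is the line shown in the hunk: generate up to 200 new tokens
# and print the text of the first generation.
print(model(formatted_prompt, max_new_tokens=200).generations[0].text)
```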
````diff
@@ -88,6 +82,11 @@ run_train(
 splits = splits
 )
 ```
+## Export Model
+Export the model while injecting the KV Cache
+```bash
+sparseml.export --task text-generation output_finetune/
+```
 Follow the instructions on our [One Shot With SparseML](https://github.com/neuralmagic/sparseml/tree/main/src/sparseml/transformers/sparsification/obcq) page for a step-by-step guide for performing one-shot quantization of large language models.
 ## Slack
 
````
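The added "Export Model" section documents the `sparseml.export` CLI step. A short usage sketch follows; only the command itself appears in the diff, and the `deployment/` output location is an assumption based on SparseML's usual export layout:

```bash
# Export the fine-tuned checkpoint to an ONNX graph with KV-cache
# inputs/outputs injected for autoregressive inference.
# output_finetune/ is the directory produced by the run_train step above.
sparseml.export --task text-generation output_finetune/

# Assumed default layout: the exported model plus tokenizer/config files
# land in a deployment/ subdirectory ready to serve.
ls output_finetune/deployment
```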