mwitiderrick committed · verified
Commit 43acc24 · 1 parent: 5d04de5

Update README.md

Files changed (1):
  1. README.md +6 -7

README.md CHANGED
@@ -31,13 +31,7 @@ model = TextGeneration(model_path="hf:nm-testing/TinyLlama-1.1B-Chat-v1.0-pruned
 print(model(formatted_prompt, max_new_tokens=200).generations[0].text)
 
 """
-1. Preheat the oven to 375°F (178°C).
-2. In a mixing bowl, add 1 cup of all-purpose flour, 1 cup of melted coconut oil, 1/2 cup of sugar, 1/2 cup of banana, 1/2 cup of melted coconut oil, 1/2 cup of salt, 1/2 cup of vanilla extract, and 1/2 cup of baking powder.
-3. Mix the ingredients together until they are well combined.
-4. Add 1/2 cup of melted coconut oil to the mixture.
-5. Add 1/2 cup of melted coconut oil to the mixture.
-6. Mix the ingredients together until they are well combined.
-7. Add 1/2 cup of melted
+
 
 """
 ```
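For context, the snippet this hunk edits runs the pruned-quantized TinyLlama chat model through DeepSparse's `TextGeneration` pipeline. A minimal, self-contained sketch of that step follows; the model stub is cut off in the hunk header, so the placeholder below must be replaced with the full stub from the README, and the chat-style prompt is an assumption standing in for the README's actual `formatted_prompt`.

```python
# Minimal sketch of the inference call shown in the hunk above.
# Assumptions: the full model stub (truncated in the diff header) and the
# TinyLlama chat prompt format; the pipeline call itself mirrors the README.
from deepsparse import TextGeneration

# Replace with the full stub from the README; the diff header cuts it off.
MODEL_STUB = "hf:nm-testing/TinyLlama-1.1B-Chat-v1.0-pruned..."

model = TextGeneration(model_path=MODEL_STUB)

# Hypothetical chat-formatted prompt; the README builds `formatted_prompt`
# earlier in the file.
formatted_prompt = "<|user|>\nWrite a banana bread recipe.</s>\n<|assistant|>\n"

print(model(formatted_prompt, max_new_tokens=200).generations[0].text)
```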
@@ -88,6 +82,11 @@ run_train(
   splits = splits
 )
 ```
+## Export Model
+Export the model while injecting the KV Cache
+```bash
+sparseml.export --task text-generation output_finetune/
+```
 Follow the instructions on our [One Shot With SparseML](https://github.com/neuralmagic/sparseml/tree/main/src/sparseml/transformers/sparsification/obcq) page for a step-by-step guide for performing one-shot quantization of large language models.
 ## Slack
 
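The new Export Model step produces an ONNX artifact that DeepSparse can serve. Below is a hedged sketch of loading the exported model, assuming `sparseml.export` writes a `deployment/` directory under `output_finetune/` as in other SparseML flows; check the exporter's output for the actual path.

```python
# Hedged sketch: load the artifact produced by `sparseml.export` with
# DeepSparse. The "deployment" subdirectory name is an assumption about the
# exporter's output layout, not something confirmed by this diff.
from deepsparse import TextGeneration

model = TextGeneration(model_path="output_finetune/deployment")
print(model("Write a haiku about sparsity.", max_new_tokens=64).generations[0].text)
```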