kyrylokumar committed
Commit c4f0001 · verified · 1 Parent(s): 8398dfc

Update README.md

Files changed (1): README.md (+9, -1)
README.md CHANGED
@@ -1,3 +1,11 @@
  # Quantizing gpt2: Analysis of Time and Memory Predictions

  This document outlines various quantization techniques applied to the gpt2 model and analyzes their impact on memory usage, loss, and execution time, focusing on explaining the observed trends in time and memory usage.
@@ -99,4 +107,4 @@ By understanding these factors, one can choose the appropriate quantization stra
  ## Part 3 - Quantization using llama.cpp

  * The PyTorch model (`pytorch_model.bin`) is converted to a quantized gguf file (`gpt2.ggml`) using llama.cpp.
- * The quantized model is uploaded to Hugging Face: [gpt2-quantized-gguf](https://huggingface.co/kyrylokumar/gpt2-quantzed-gguf)
 
+ ---
+ datasets:
+ - Salesforce/wikitext
+ metrics:
+ - perplexity
+ base_model:
+ - openai-community/gpt2
+ ---
  # Quantizing gpt2: Analysis of Time and Memory Predictions

  This document outlines various quantization techniques applied to the gpt2 model and analyzes their impact on memory usage, loss, and execution time, focusing on explaining the observed trends in time and memory usage.
 
  ## Part 3 - Quantization using llama.cpp

  * The PyTorch model (`pytorch_model.bin`) is converted to a quantized gguf file (`gpt2.ggml`) using llama.cpp.
+ * The quantized model is uploaded to Hugging Face: [gpt2-quantized-gguf](https://huggingface.co/kyrylokumar/gpt2-quantzed-gguf)