Quantization Reading-List (Ahmad Khan · ahkhan)

- LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
  Paper · arXiv:2208.07339
- GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
  Paper · arXiv:2210.17323
- SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
  Paper · arXiv:2211.10438
- QLoRA: Efficient Finetuning of Quantized LLMs
  Paper · arXiv:2305.14314
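
For quick context while working through these papers, here is a minimal sketch of symmetric absmax int8 quantization, the basic building block that LLM.int8() extends with vector-wise scaling and outlier decomposition, and that the post-training methods above refine further. The function names and the per-row scaling choice are illustrative assumptions, not code from any of the papers.

```python
import numpy as np

def absmax_quantize_int8(w: np.ndarray):
    """Symmetric per-row absmax quantization of a weight matrix to int8.

    Each row is scaled so its largest-magnitude entry maps to 127, then
    rounded to the nearest integer. Illustrative sketch only.
    """
    # Per-row scale: 127 / max(|w|); guard against all-zero rows.
    absmax = np.max(np.abs(w), axis=1, keepdims=True)
    scale = 127.0 / np.maximum(absmax, 1e-8)
    q = np.clip(np.round(w * scale), -127, 127).astype(np.int8)
    return q, scale

def absmax_dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Recover an approximate float matrix from int8 codes and per-row scales."""
    return q.astype(np.float32) / scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(4, 8)).astype(np.float32)
    q, scale = absmax_quantize_int8(w)
    w_hat = absmax_dequantize(q, scale)
    print("max abs reconstruction error:", np.max(np.abs(w - w_hat)))
```

The quantization error of this naive scheme grows with the magnitude of outlier activations, which is exactly the problem LLM.int8() and SmoothQuant address from different directions.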