inference-optimization/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Head (8B parameters)
Part of a collection on FP8 quantization of weights, activations, and KV cache.
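As a rough illustration of the technique the collection covers, here is a minimal sketch of per-head FP8 (E4M3) quantization of a KV cache, simulated in NumPy. The function names, tensor shapes, and the simplified E4M3 rounding (3 mantissa bits, subnormals ignored) are assumptions for demonstration, not this repository's actual implementation:

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3


def round_to_e4m3(x):
    """Round to the nearest value on an E4M3-like grid (3 mantissa bits).

    Simplification: subnormals and the exponent range limit are ignored;
    only mantissa rounding is simulated.
    """
    sign = np.sign(x)
    mag = np.abs(x)
    exp = np.floor(np.log2(np.maximum(mag, 1e-30)))  # power-of-two bucket
    step = 2.0 ** (exp - 3)                          # spacing with 3 mantissa bits
    return sign * np.round(mag / step) * step


def quantize_kv_per_head(kv, fp8_max=FP8_E4M3_MAX):
    """Fake-quantize a KV cache tensor with one FP8 scale per attention head.

    kv: float array of shape (num_heads, seq_len, head_dim).
    Returns (quantized values in FP8 range, per-head scales of shape (H, 1, 1)).
    Dequantize with `q * scales`.
    """
    # Per-head absolute max sets the scale, mapping each head onto [-448, 448]
    amax = np.abs(kv).max(axis=(1, 2), keepdims=True)
    scales = np.where(amax == 0, 1.0, amax / fp8_max)
    q = round_to_e4m3(np.clip(kv / scales, -fp8_max, fp8_max))
    return q, scales
```

Per-head scales keep an outlier in one head from degrading the quantization resolution of every other head, at the cost of storing one scale per head rather than one per tensor.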