Mixed Precision MoE Models Collection Collection of Quantized Models for MoE • 13 items • Updated 3 days ago
LLaMA-3.2-11B-Vision-Instruct LangVision-LoRA-NAS Collection Collection of Base LoRA Models and LoRA-NAS Models • 1 item • Updated Jan 14
A Survey of Techniques for Optimizing Transformer Inference Paper • 2307.07982 • Published Jul 16, 2023