Apertus-8B-Instruct-2509-NVFP4

NVFP4-quantized version of swiss-ai/Apertus-8B-Instruct-2509 produced with llmcompressor.

Notes

  • Quantization scheme: NVFP4 (linear layers, lm_head excluded)
  • Calibration samples: 512
  • Max sequence length during calibration: 2048
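
As a rough illustration of what the NVFP4 scheme above does to a weight tensor, here is a minimal pure-Python sketch of block-wise 4-bit quantization. It assumes the usual NVFP4 layout: FP4 E2M1 element values and one shared scale per block of 16 elements. It is a simplification, not llmcompressor's implementation: real NVFP4 stores each block scale in FP8 E4M3 and applies an additional per-tensor FP32 scale, whereas this sketch keeps the scale as an ordinary Python float. The names `quantize_block` and `fake_quantize` are hypothetical helpers introduced only for this example.

```python
# Magnitudes representable by FP4 E2M1 (the sign is handled separately).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 16  # assumption: one shared scale per 16 consecutive elements


def quantize_block(block):
    """Return (scale, codes) for one block; codes are signed grid values."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        # All-zero block: any scale works, so use 1.0.
        return 1.0, [0.0] * len(block)
    # Map the largest magnitude in the block onto the largest grid value.
    # Simplification: the scale stays a Python float instead of FP8 E4M3.
    scale = amax / E2M1_GRID[-1]
    codes = []
    for x in block:
        # Round the scaled magnitude to the nearest representable value.
        mag = min(E2M1_GRID, key=lambda g: abs(abs(x) / scale - g))
        codes.append(-mag if x < 0 else mag)
    return scale, codes


def fake_quantize(values):
    """Quantize and immediately dequantize a flat list, block by block."""
    out = []
    for i in range(0, len(values), BLOCK_SIZE):
        scale, codes = quantize_block(values[i:i + BLOCK_SIZE])
        out.extend(scale * c for c in codes)
    return out


# Toy "weights": 32 values, i.e. two blocks of 16.
weights = [0.01 * (i - 16) for i in range(32)]
restored = fake_quantize(weights)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"max abs round-trip error: {max_err:.4f}")
```

Because each block gets its own scale, a block of small weights is not forced onto the coarse grid spacing implied by a large value elsewhere in the tensor, which is the main reason for the small block size.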

Safetensors

  • Model size: 5B params
  • Tensor types: BF16, F8_E4M3, F32, U8