MXFP4 utilization over NVFP4

#121

by pprovins - opened 7 days ago

Discussion

pprovins

7 days ago

Howdy! Quick question: why was MXFP4 used over NVFP4?

Recent publications such as https://arxiv.org/abs/2505.19115 propose an FP4 format with an E4M3 scaling factor plus an FP32 super scale, which tends to result in higher-quality quantization.
Was this quantization approach explored when the quantized model was authored?
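
For context, here is a rough NumPy sketch of the two scaling schemes as I understand them, not the code used to produce this model. It assumes MXFP4 uses blocks of 32 E2M1 values with a power-of-two (E8M0) scale, and that NVFP4 uses blocks of 16 with an E4M3 block scale plus one FP32 per-tensor scale. The helper names (`fake_quant_mxfp4`, `fake_quant_nvfp4`, etc.), the scale-selection rules, and the software E4M3 rounding are my own illustrative choices.

```python
import numpy as np

# Non-negative magnitudes representable in FP4 E2M1.
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])


def floor_log2(x):
    """Exact floor(log2(x)) for positive floats, via frexp."""
    _, e = np.frexp(x)
    return e - 1


def quantize_e2m1(x):
    """Round each value to the nearest E2M1 magnitude (saturating at 6)."""
    idx = np.abs(np.abs(x)[..., None] - E2M1_GRID).argmin(axis=-1)
    return np.sign(x) * E2M1_GRID[idx]


def round_e4m3(x):
    """Simulate rounding a non-negative value to FP8 E4M3 (max 448)."""
    x = np.clip(x, 0.0, 448.0)
    # Binade exponent, clamped to the minimum normal exponent (-6);
    # below that, values fall on the subnormal grid with step 2**-9.
    e = np.maximum(floor_log2(np.maximum(x, 2.0 ** -9)), -6)
    step = np.exp2(e - 3.0)  # 3 mantissa bits -> 8 steps per binade
    return np.minimum(np.round(x / step) * step, 448.0)


def fake_quant_mxfp4(x, block=32):
    """Quantize-dequantize with one power-of-two scale per block of 32."""
    x = np.asarray(x, dtype=np.float64)
    xb = x.reshape(-1, block)
    amax = np.abs(xb).max(axis=1, keepdims=True)
    # Assumed rule: shared exponent = floor(log2(amax)) - 2, where 2 is the
    # largest E2M1 exponent. Values above 6 * scale saturate.
    e = np.clip(floor_log2(np.maximum(amax, 2.0 ** -125)) - 2, -127, 127)
    scale = np.exp2(e.astype(np.float64))
    return (quantize_e2m1(xb / scale) * scale).reshape(x.shape)


def fake_quant_nvfp4(x, block=16):
    """Quantize-dequantize with an E4M3 block scale and a per-tensor scale."""
    x = np.asarray(x, dtype=np.float64)
    xb = x.reshape(-1, block)
    # Assumed rule: pick the per-tensor scale so the largest block scale
    # lands at the E4M3 maximum (448). Kept in float64 here for simplicity.
    s_global = max(np.abs(xb).max() / (448.0 * 6.0), 1e-30)
    amax = np.abs(xb).max(axis=1, keepdims=True)
    s_block = round_e4m3(amax / 6.0 / s_global)
    safe = np.where(s_block > 0, s_block, 1.0)  # avoid division by zero
    q = quantize_e2m1(xb / (safe * s_global))
    return (q * s_block * s_global).reshape(x.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal(4096)
    for name, fn in [("MXFP4", fake_quant_mxfp4), ("NVFP4", fake_quant_nvfp4)]:
        print(name, "MSE:", np.mean((fn(w) - w) ** 2))
```

The only differences between the two functions are the block size and how the scale is chosen: a power of two per block versus a finer-grained E4M3 value per block on top of a tensor-wide scale. The printed errors depend on the input distribution, so treat them as illustrative only.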

Thanks in advance!
