MXFP4 utilization over NVFP4
#121
by
pprovins
- opened
Howdy! Quick question: why was MXFP4 chosen over NVFP4?
Recent publications such as https://arxiv.org/abs/2505.19115 propose an FP4 format with an E4M3 scaling factor plus an FP32 super scale, which tends to result in higher-quality quantization.
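To make the comparison concrete, here is a rough pure-Python sketch of the two scaling schemes as I understand them. It is not the code used to produce this model. The helper names (`round_minifloat`, `mxfp4_roundtrip`, `nvfp4_roundtrip`), the amax-based choice of scales, and the `amax / (6 * 448)` tensor scale are my own assumptions for illustration. Nominal block sizes are 32 values for MXFP4 and 16 for NVFP4, but the functions accept any block length.

```python
import math

def round_minifloat(v, mant_bits, min_exp, max_val):
    """Round v to the nearest value of a small float format, saturating at max_val."""
    if v == 0.0:
        return 0.0
    sign = -1.0 if v < 0 else 1.0
    a = abs(v)
    # floor(log2(a)), clamped so small values fall on the subnormal grid
    e = max(math.frexp(a)[1] - 1, min_exp)
    step = 2.0 ** (e - mant_bits)
    return sign * min(round(a / step) * step, max_val)

def fp4(v):
    # E2M1 magnitudes: 0, 0.5, 1, 1.5, 2, 3, 4, 6
    return round_minifloat(v, mant_bits=1, min_exp=0, max_val=6.0)

def e4m3(v):
    # FP8 E4M3: 3 mantissa bits, smallest normal exponent -6, max 448
    return round_minifloat(v, mant_bits=3, min_exp=-6, max_val=448.0)

def mxfp4_roundtrip(block):
    """Quantize and dequantize one block with a power-of-two (E8M0-style) scale."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return [0.0] * len(block)
    # Assumed rule: floor(log2(amax)) minus the largest E2M1 exponent (2)
    scale = 2.0 ** (math.frexp(amax)[1] - 1 - 2)
    return [fp4(x / scale) * scale for x in block]

def nvfp4_roundtrip(block, tensor_scale):
    """Quantize and dequantize one block with an E4M3 block scale and an FP32 tensor scale."""
    amax = max(abs(x) for x in block)
    block_scale = e4m3(amax / 6.0 / tensor_scale)
    if block_scale == 0.0:
        return [0.0] * len(block)
    s = block_scale * tensor_scale
    return [fp4(x / s) * s for x in block]

block = [5.0, 1.0, 0.3, -2.2]
tensor_scale = 5.0 / (6.0 * 448.0)  # assumed: tensor amax mapped to the top of the FP4 x E4M3 range
print(mxfp4_roundtrip(block))
print(nvfp4_roundtrip(block, tensor_scale))
```

In this toy block the power-of-two scale cannot place the largest value (5.0) on the FP4 grid, so it rounds to 4.0, while the finer E4M3 scale lets it map to the top FP4 value and dequantize back to roughly 5.0. That is only one hand-picked example, not a quality measurement.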
Was this quantization approach explored when the quantized model was authored?
Thanks in advance!