Can this w4a16-quantized version run on a V100?

#2
by underkongkong - opened

I tried quantizing with the llm-compressor and compressed-tensors libraries and chose the w4a16 scheme. However, the quantized model cannot run, since the V100's compute capability is only 7.0.
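For reference, a minimal sketch of the capability check at issue, assuming (this threshold is not stated in the post) that the w4a16 kernels used for compressed-tensors models, such as vLLM's Marlin kernel, require compute capability 8.0 (Ampere) or newer:

```python
# Assumed minimum compute capability for w4a16 kernels (Ampere); the V100 is (7, 0).
W4A16_MIN_CAPABILITY = (8, 0)

def supports_w4a16(capability):
    """Return True if a (major, minor) compute capability meets the assumed minimum."""
    return capability >= W4A16_MIN_CAPABILITY

print(supports_w4a16((7, 0)))  # V100 -> False
print(supports_w4a16((8, 0)))  # A100 -> True
```

On a machine with PyTorch and a CUDA GPU, the actual capability can be read with `torch.cuda.get_device_capability()`.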
