Multi-GPU and quantisation

#7
by TheBigBlockPC - opened

I run the model in NF4. Is there a way to use pipeline parallelism with this model so that I can run it in 8-bit quantisation?
