Multi-GPU and quantisation
#7, opened by TheBigBlockPC
I run the model in NF4. Is there a way to use pipeline parallelism with this model so that I can run it with 8-bit quantisation?