Running 1776 with a draft model

#1
by ernestr - opened

Thanks for the quant, @matatonic! I'm currently running it with Llama 3.2 as a draft model. It's working well, but I'm curious whether using a draft model degrades the thinking.

ernestr changed discussion status to closed

It doesn't degrade the thinking. I also use the Llama 3.2 1B as a draft model. In the worst case it just doesn't make generation any faster.
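For anyone wondering why the output isn't degraded, here is a minimal toy sketch of greedy speculative decoding. It is not the code of any particular inference server; `target`, `draft`, `bad_draft`, and `speculative_decode` are hypothetical names made up for illustration, and the "models" are plain functions rather than real LLMs. The idea it shows: the draft only proposes tokens, and the main model checks every one, so each token that ends up in the output is the one the main model would have picked anyway.

```python
from typing import Callable, List, Tuple

Token = int
Model = Callable[[List[Token]], Token]  # toy "model": context -> next token


def greedy_decode(model: Model, prompt: List[Token], n_new: int) -> List[Token]:
    """Plain greedy decoding: one model call per generated token."""
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(model(seq))
    return seq


def speculative_decode(
    target: Model, draft: Model, prompt: List[Token], n_new: int, k: int = 4
) -> Tuple[List[Token], int]:
    """Greedy speculative decoding. Returns (sequence, verification passes)."""
    seq = list(prompt)
    goal = len(prompt) + n_new
    passes = 0
    while len(seq) < goal:
        # 1) The cheap draft model proposes up to k tokens.
        proposal: List[Token] = []
        for _ in range(min(k, goal - len(seq))):
            proposal.append(draft(seq + proposal))
        # 2) The target checks the proposals. A real engine scores all of
        #    them in one batched forward pass; this toy calls it per token.
        passes += 1
        for tok in proposal:
            expected = target(seq)
            seq.append(expected)  # always the target's own choice
            if expected != tok:   # first mismatch: discard the rest
                break
    return seq, passes


def target(seq: List[Token]) -> Token:
    # Toy deterministic "big model"; outputs are always in 0..49.
    return (sum(seq) * 31 + len(seq)) % 50


def bad_draft(seq: List[Token]) -> Token:
    # Toy draft that is always wrong (the target never emits 50).
    return 50


if __name__ == "__main__":
    prompt = [1, 2, 3]
    reference = greedy_decode(target, prompt, 20)
    same, fast = speculative_decode(target, target, prompt, 20, k=4)
    also_same, slow = speculative_decode(target, bad_draft, prompt, 20, k=4)
    print(same == reference, also_same == reference, fast, slow)
```

With a draft that always agrees, 20 tokens take 5 verification passes. With a draft that is always wrong, they take 20 passes, the same number of main-model steps as plain decoding, plus the wasted draft calls. In both cases the sequence matches plain greedy decoding. With sampling instead of greedy decoding, the usual accept/reject scheme is designed to preserve the main model's output distribution, though the details depend on the implementation you run.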
