mmproj-F32.gguf

#2
by sulpher - opened

Hi. For 3.1 you provided a mmproj-F32.gguf. Could you please provide F32 for 3.2 as well?

Unsloth AI org

We removed it because when you load it in LM Studio it uses so much more memory, so we erased it

But as I mentioned in another thread, some models, such as this one, cannot work with a BF16 mmproj, and using F16 might cause clipping issues since the original weights appear to be in BF16. So the only way to maximize accuracy is to use F32, and simply move the mmproj out of the folder in LM Studio when more context is needed but vision is not.
Anyway, is the mmproj file any different from the 3.1 release?
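For context on the clipping concern: BF16 keeps float32's 8-bit exponent, while F16 has only 5 exponent bits and tops out around 65504, so any BF16 weight larger than that overflows when converted to F16. A minimal sketch with NumPy (which has no native bfloat16 type, so a float32 value stands in for the BF16 range):

```python
import numpy as np

# A value well within BF16/float32 range but beyond float16's maximum.
w = np.float32(1e5)

# Converting to F16 overflows to inf -- the "clipping" issue described above.
print(np.float16(w))             # inf
print(np.finfo(np.float16).max)  # 65504.0

# An F32 copy preserves the value exactly, which is why F32 is the safe choice.
print(np.float32(w) == w)        # True
```

This is why converting BF16 weights to F16 is lossy in a way that converting to F32 is not.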

sha256 differs, but that might come from metadata or whatever. Anyway, the F32 from the 3.1 release seems to work for 3.2 as well.

Unsloth AI org

I added it in!

Are you running this in Ollama? Is it as simple as making a new Modelfile with two FROM statements (one for the mmproj and another for the rest of the model)? Thanks!
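(Not from the thread, just a hedged sketch of what such a Modelfile might look like, assuming Ollama accepts a second FROM line pointing at the projector GGUF; filenames below are illustrative, not the actual release filenames.)

```
# Modelfile (sketch; paths are placeholders)
FROM ./model-Q4_K_M.gguf
FROM ./mmproj-F32.gguf
```

You would then build it with something like `ollama create my-vision-model -f Modelfile`; check the Ollama import docs for your version before relying on this.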

(not OP) I'd check LM Studio if you haven't; it runs an API just like Ollama and also keeps up to date with everything, not to mention it comes with a simple but gratifying chat interface.
I've heard vLLM is also good but have yet to try it.
