GGUFs PLUS:

Q8 and Q6 GGUFs with critical parts of the model in F16 / Full precision.

File sizes are slightly larger than the standard quants, but should yield higher-quality results across tasks and conditions.
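To get a feel for what "slightly larger" means, here is a rough back-of-the-envelope sketch. It uses the 10.7B parameter count listed on this card and the standard bits-per-weight of the Q8_0 and Q6_K block formats. The share of weights kept in F16 is not stated on this card, so the `f16_fraction` value is a purely hypothetical placeholder, and `estimate_size_gb` is an illustrative helper, not part of any library. Real files also contain metadata and a few tensors stored in other types, so treat the numbers as estimates only.

```python
# Bits per weight for each storage format (ggml block layouts).
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,     # 34 bytes per block of 32 weights
    "Q6_K": 6.5625,  # 210 bytes per block of 256 weights
}


def estimate_size_gb(n_params, base_quant, f16_fraction=0.0):
    """Rough file size in GB (10^9 bytes), ignoring metadata.

    f16_fraction is the share of weights stored in F16 instead of the
    base quant. It is an assumption for illustration; the card does not
    say which tensors, or how many, are kept in F16.
    """
    if not 0.0 <= f16_fraction <= 1.0:
        raise ValueError("f16_fraction must be between 0 and 1")
    avg_bits = (
        (1.0 - f16_fraction) * BITS_PER_WEIGHT[base_quant]
        + f16_fraction * BITS_PER_WEIGHT["F16"]
    )
    return n_params * avg_bits / 8 / 1e9


if __name__ == "__main__":
    n_params = 10.7e9  # parameter count from this card
    for quant in ("Q8_0", "Q6_K"):
        standard = estimate_size_gb(n_params, quant)
        plus = estimate_size_gb(n_params, quant, f16_fraction=0.05)  # hypothetical 5%
        print(f"{quant}: standard ~{standard:.2f} GB, with 5% in F16 ~{plus:.2f} GB")
```

The takeaway is only directional: the lower the base quant, the more each F16 tensor adds relative to the standard file, so the Q6 variant grows proportionally more than the Q8 one.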

Format: GGUF

Model size: 10.7B params

Architecture: llama

Available quantizations: 6-bit, 8-bit

Collection including DavidAU/LemonadeRP-4.5.3-11B-GGUF-Plus