GGUF IQ3_M quant of cognitivecomputations/dolphin-2.7-mixtral-8x7b (both non-imatrix and imatrix variants).
It fits into 24 GiB of VRAM with a 32768-token context (with 8-bit KV cache quantization).
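For reference, here is a minimal llama-cpp-python sketch for loading the quant with the full 32768-token context and an 8-bit quantized KV cache. The GGUF filename glob and the prompt are placeholders, and llama.cpp requires flash attention for V-cache quantization; treat this as an illustration under those assumptions, not the card's prescribed setup:

```python
# A minimal sketch using llama-cpp-python; the filename glob below is a
# placeholder -- substitute the actual IQ3_M file from this repo.
import llama_cpp
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="NeoChen1024/dolphin-2.7-mixtral-8x7b-GGUF-IQ3_M",
    filename="*IQ3_M*.gguf",          # glob matching the IQ3_M quant file
    n_ctx=32768,                      # full 32768-token context
    n_gpu_layers=-1,                  # offload all layers to the GPU
    flash_attn=True,                  # required for a quantized V cache
    type_k=llama_cpp.GGML_TYPE_Q8_0,  # 8-bit K cache
    type_v=llama_cpp.GGML_TYPE_Q8_0,  # 8-bit V cache
)

out = llm("Why is the sky blue?", max_tokens=128)
print(out["choices"][0]["text"])
```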

Model size: 46.7B params
Architecture: llama