cogito-v1-preview-llama-3B-f32-GGUF

Cogito v1 preview - 3B is a 3-billion-parameter hybrid reasoning language model built on the Llama 3.2 architecture by DeepCogito. Trained with Iterated Distillation and Amplification (IDA), it can operate in both a standard mode and a deep "self-reflective" reasoning mode. The model supports a 128k context window and more than 30 languages, is optimized for coding, STEM, multilingual tasks, tool calling, and instruction following, and is reported to outperform other models of similar size on industry benchmarks. It is released under an open license that permits commercial use.
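The deep reasoning mode is toggled through the system prompt. A minimal sketch of single-turn prompt construction, assuming this model follows the standard Llama 3 chat template (it is Llama 3.2 based) and DeepCogito's documented trigger phrase `Enable deep thinking subroutine.`:

```python
# Sketch: build a Llama 3-style chat prompt that enables Cogito's
# deep "self-reflective" reasoning mode via the system prompt.
# Assumptions: the trigger phrase below and the standard Llama 3
# chat template; verify both against the upstream model card.

DEEP_THINKING_SYSTEM_PROMPT = "Enable deep thinking subroutine."

def build_llama3_prompt(user_message: str, deep_thinking: bool = False) -> str:
    """Format a single-turn prompt in the Llama 3 chat template."""
    parts = ["<|begin_of_text|>"]
    if deep_thinking:
        # The system message is what switches the model into reasoning mode.
        parts.append(
            "<|start_header_id|>system<|end_header_id|>\n\n"
            f"{DEEP_THINKING_SYSTEM_PROMPT}<|eot_id|>"
        )
    parts.append(
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
    )
    # Leave the assistant header open so the model generates the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt("What is 17 * 24?", deep_thinking=True)
```

In practice, most GGUF runtimes (e.g. llama.cpp chat modes) apply the chat template for you; then only the system message needs to be set.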

Model Files

| File name | Size | Quant type |
|---|---|---|
| cogito-v1-preview-llama-3B.F32.gguf | 12.9 GB | F32 |
| cogito-v1-preview-llama-3B.BF16.gguf | 6.43 GB | BF16 |
| cogito-v1-preview-llama-3B.F16.gguf | 6.43 GB | F16 |
| cogito-v1-preview-llama-3B.Q8_0.gguf | 3.42 GB | Q8_0 |
| cogito-v1-preview-llama-3B.Q6_K.gguf | 2.64 GB | Q6_K |
| cogito-v1-preview-llama-3B.Q5_K_M.gguf | 2.32 GB | Q5_K_M |
| cogito-v1-preview-llama-3B.Q5_K_S.gguf | 2.27 GB | Q5_K_S |
| cogito-v1-preview-llama-3B.Q4_K_M.gguf | 2.02 GB | Q4_K_M |
| cogito-v1-preview-llama-3B.Q4_K_S.gguf | 1.93 GB | Q4_K_S |
| cogito-v1-preview-llama-3B.Q3_K_L.gguf | 1.82 GB | Q3_K_L |
| cogito-v1-preview-llama-3B.Q3_K_M.gguf | 1.69 GB | Q3_K_M |
| cogito-v1-preview-llama-3B.Q3_K_S.gguf | 1.54 GB | Q3_K_S |
| cogito-v1-preview-llama-3B.Q2_K.gguf | 1.36 GB | Q2_K |

Quants Usage

(Sorted by size, which does not necessarily indicate quality. IQ-quants are often preferable to non-IQ quants of similar size.)
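A common rule of thumb is to pick the largest quant that fits your memory budget with headroom left for the KV cache and runtime. A minimal sketch using the file sizes from the table above (the 1.0 GB overhead figure is an assumption for illustration; real overhead grows with context length):

```python
# Sketch: choose the largest quant from the table above that fits a
# given memory budget. The 1.0 GB overhead is an assumed allowance
# for the KV cache and runtime, not a measured value.

QUANT_SIZES_GB = {
    "F32": 12.9, "BF16": 6.43, "F16": 6.43,
    "Q8_0": 3.42, "Q6_K": 2.64,
    "Q5_K_M": 2.32, "Q5_K_S": 2.27,
    "Q4_K_M": 2.02, "Q4_K_S": 1.93,
    "Q3_K_L": 1.82, "Q3_K_M": 1.69, "Q3_K_S": 1.54,
    "Q2_K": 1.36,
}

def pick_quant(available_ram_gb: float, overhead_gb: float = 1.0):
    """Return the largest quant whose file fits in the remaining budget,
    or None if even Q2_K does not fit."""
    budget = available_ram_gb - overhead_gb
    fitting = {q: s for q, s in QUANT_SIZES_GB.items() if s <= budget}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)
```

For example, a machine with 4 GB free would land on Q6_K under these assumptions, since Q8_0 (3.42 GB) no longer fits once the overhead is subtracted.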

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):


Model size: 3.21B params
Architecture: llama


Model tree: prithivMLmods/cogito-v1-preview-llama-3B-f32-GGUF (Quantized, 20 variants including this model)