II-Search-4B-GGUF
II-Search-4B is a 4-billion-parameter language model fine-tuned from Qwen3-4B for advanced information seeking and web-integrated reasoning. It handles multi-hop information retrieval, fact verification, and comprehensive report generation, and it excels on factual QA benchmarks compared to peer models. The model features sophisticated tool use for search and web visits, supports distributed inference with vLLM or SGLang (including a 131,072-token context window with custom RoPE scaling), and runs on Apple Silicon via MLX. It is well suited to factual question answering, research assistance, and educational applications; integration examples and full resources are available in the Hugging Face repository.
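For a quick local test of one of the GGUF files listed below, here is a minimal sketch using llama-cpp-python and huggingface_hub. The repository id is a placeholder, and the context size, sampling settings, and prompt are illustrative assumptions rather than official defaults.

```python
# Minimal local-inference sketch (assumes `pip install llama-cpp-python huggingface_hub`).
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download one quant from the table below.
model_path = hf_hub_download(
    repo_id="II-Search-4B-GGUF",              # placeholder: replace with the actual <org>/<repo> id
    filename="II-Search-4B-GGUF.Q4_K_M.gguf",
)

llm = Llama(
    model_path=model_path,
    n_ctx=32768,      # assumption: raise toward 131072 only if you have the memory for the KV cache
    n_gpu_layers=-1,  # offload all layers to GPU when a compatible backend is available
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a careful research assistant."},
        {"role": "user", "content": "Who won the 2022 Nobel Prize in Physics, and for what work?"},
    ],
    max_tokens=512,
    temperature=0.6,
)
print(response["choices"][0]["message"]["content"])
```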
Model Files
| File Name | Size | Quant Type |
|---|---|---|
| II-Search-4B-GGUF.BF16.gguf | 8.05 GB | BF16 |
| II-Search-4B-GGUF.F16.gguf | 8.05 GB | F16 |
| II-Search-4B-GGUF.F32.gguf | 16.1 GB | F32 |
| II-Search-4B-GGUF.Q2_K.gguf | 1.67 GB | Q2_K |
| II-Search-4B-GGUF.Q3_K_L.gguf | 2.24 GB | Q3_K_L |
| II-Search-4B-GGUF.Q3_K_M.gguf | 2.08 GB | Q3_K_M |
| II-Search-4B-GGUF.Q3_K_S.gguf | 1.89 GB | Q3_K_S |
| II-Search-4B-GGUF.Q4_K_M.gguf | 2.5 GB | Q4_K_M |
| II-Search-4B-GGUF.Q4_K_S.gguf | 2.38 GB | Q4_K_S |
| II-Search-4B-GGUF.Q5_K_M.gguf | 2.89 GB | Q5_K_M |
| II-Search-4B-GGUF.Q5_K_S.gguf | 2.82 GB | Q5_K_S |
| II-Search-4B-GGUF.Q6_K.gguf | 3.31 GB | Q6_K |
| II-Search-4B-GGUF.Q8_0.gguf | 4.28 GB | Q8_0 |
Quants Usage
(Sorted by size, not necessarily by quality; IQ quants are often preferable to similarly sized non-IQ quants.)
Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):
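As a rough illustration of the size trade-off, the sketch below picks the largest quant from the table above that fits a given memory budget. The file sizes are copied from the table; the headroom factor for KV cache and runtime overhead is an assumption, not a recommendation from the model authors.

```python
# Rough quant-selection sketch. Sizes (GB) come from the file table above;
# the 1.3x headroom factor is an assumption to leave room for the KV cache.
QUANT_SIZES_GB = {
    "Q2_K": 1.67, "Q3_K_S": 1.89, "Q3_K_M": 2.08, "Q3_K_L": 2.24,
    "Q4_K_S": 2.38, "Q4_K_M": 2.5, "Q5_K_S": 2.82, "Q5_K_M": 2.89,
    "Q6_K": 3.31, "Q8_0": 4.28, "BF16": 8.05, "F16": 8.05, "F32": 16.1,
}

def pick_quant(memory_budget_gb: float, headroom: float = 1.3) -> str | None:
    """Return the largest quant whose file (plus headroom) fits the budget."""
    candidates = [
        (size, name)
        for name, size in QUANT_SIZES_GB.items()
        if size * headroom <= memory_budget_gb
    ]
    return max(candidates)[1] if candidates else None

print(pick_quant(8.0))  # -> Q8_0 on an 8 GB budget
print(pick_quant(4.0))  # -> Q5_K_M on a 4 GB budget
```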