Models: 78

Active filters: Quantization

VPTQ-community/Meta-Llama-3.1-8B-Instruct-v12-k65536-4096-woft

Updated Jan 13 • 231 • 3

VPTQ-community/Mistral-Large-Instruct-2407-v8-k65536-65536-woft

Updated Nov 18, 2024 • 22 • 2

VPTQ-community/Llama-3.1-Nemotron-70B-Instruct-HF-v8-k65536-65536-woft

Updated Nov 18, 2024 • 11 • 5

mit-han-lab/svdquant-models

Text-to-Image • Updated 16 days ago • 897 • 65

mit-han-lab/svdq-int4-flux.1-schnell

Text-to-Image • Updated 10 days ago • 3.41k • 3

mit-han-lab/svdq-int4-flux.1-dev

Text-to-Image • Updated 10 days ago • 1.97k • 13

mit-han-lab/svdq-int4-flux.1-depth-dev

Image-to-Image • Updated 12 days ago • 454 • 1

thephimart/tinyllama-4x1.1b-moe.Q5_K_M.gguf

Updated Jan 24, 2024 • 11 • 2

Irathernotsay/qwen2-1.5B-medical_qa-Finetune

Text Generation • Updated Jul 17, 2024 • 18

Riyuechang/Breeze-7B-PTT-Chat-v2_AWQ

Text Generation • Updated Sep 18, 2024 • 9

169Pi/NeuroBit_1.0

Text Generation • Updated 3 days ago • 226

VPTQ-community/Meta-Llama-3.1-70B-Instruct-v16-k65536-32768-woft

Updated Nov 18, 2024 • 15

VPTQ-community/Meta-Llama-3.1-8B-Instruct-v8-k65536-65536-woft

Updated Nov 18, 2024 • 45

VPTQ-community/Meta-Llama-3.1-8B-Instruct-v8-k65536-4096-woft

Updated Nov 18, 2024 • 28

VPTQ-community/Meta-Llama-3.1-8B-Instruct-v8-k65536-256-woft

Updated Nov 18, 2024 • 172

VPTQ-community/Qwen2.5-72B-Instruct-v16-k65536-65536-woft

Updated Nov 18, 2024 • 11 • 4

VPTQ-community/Meta-Llama-3.1-70B-Instruct-v16-k65536-65536-woft

Updated Nov 18, 2024 • 19

VPTQ-community/Meta-Llama-3.1-70B-Instruct-v8-k65536-256-woft

Updated Nov 18, 2024 • 57 • 1

VPTQ-community/Qwen2.5-7B-Instruct-v8-k65536-256-woft

Updated Nov 18, 2024 • 50

VPTQ-community/Qwen2.5-72B-Instruct-v16-k65536-32768-woft

Updated Nov 18, 2024 • 25 • 3

VPTQ-community/Meta-Llama-3.1-70B-Instruct-v8-k32768-0-woft

Updated Nov 18, 2024 • 22 • 1

VPTQ-community/Meta-Llama-3.1-70B-Instruct-v8-k65536-65536-woft

Updated Nov 18, 2024 • 10 • 2

VPTQ-community/Meta-Llama-3.1-70B-Instruct-v8-k16384-0-woft

Updated Nov 18, 2024 • 4 • 2

VPTQ-community/Meta-Llama-3.1-70B-Instruct-v8-k65536-0-woft

Updated Nov 18, 2024 • 58 • 2

VPTQ-community/Qwen2.5-72B-Instruct-v8-k65536-4-woft-duplicated

Updated Nov 18, 2024 • 12 • 1

VPTQ-community/Meta-Llama-3.1-405B-Instruct-v16-k65536-1024-woft

Updated Nov 18, 2024 • 8 • 1

VPTQ-community/Meta-Llama-3.1-405B-Instruct-v8-k4096-0-woft

Updated Nov 18, 2024 • 9 • 1

VPTQ-community/Meta-Llama-3.1-405B-Instruct-v16-k65536-64-woft

Updated Nov 18, 2024 • 18 • 3

VPTQ-community/Meta-Llama-3.1-405B-Instruct-v16-k32768-32768-woft

Updated Nov 18, 2024 • 9 • 1

VPTQ-community/Meta-Llama-3.1-405B-Instruct-v16-k65536-128-woft

Updated Nov 18, 2024 • 7 • 1