Models (909)

Active filters: compressed-tensors

vllmd/deepseek-r1-distill-qwen-32b-w8a8-dynamic

Updated 3 days ago • 29

reinforce20001/SakuraLLM.Sakura-14B-Qwen2.5-v1.0-FP8-V2

Updated 1 day ago • 5

reinforce20001/SakuraLLM.Sakura-14B-Qwen2.5-v1.0-FP8-Dynamic-V2

Updated 1 day ago • 4

reinforce20001/SakuraLLM.Sakura-14B-Qwen2.5-v1.0-GPTQ-Int4-V2

Updated 1 day ago • 6

reinforce20001/SakuraLLM.Sakura-14B-Qwen2.5-v1.0-GPTQ-Int8-V2

Updated 1 day ago • 5

context-labs/neuralmagic-llama-3.1-8b-instruct-FP8

Text Generation • Updated about 16 hours ago

thisnick/DeepSeek-R1-Distill-Llama-8B-abliterated-FP8-Dynamic

Updated about 12 hours ago

noneUsername/Cydonia-24B-v2-W8A8

Updated about 5 hours ago

Yehor/whisper-large-v2-quantized-uk

Updated about 4 hours ago