Models: 20

Active filters: quark

fxmarty/llama-tiny-testing-quark-indev

Updated Oct 3, 2024 • 4

fxmarty/llama-tiny-int4-per-group-sym

Updated Oct 25, 2024 • 23

fxmarty/llama-tiny-w-fp8-a-fp8

Updated Oct 22, 2024 • 9

fxmarty/llama-tiny-w-fp8-a-fp8-o-fp8

Updated Oct 22, 2024 • 7

fxmarty/llama-tiny-w-int8-per-tensor

Updated Oct 22, 2024 • 7

fxmarty/llama-small-int4-per-group-sym-awq

Updated Oct 29, 2024 • 8

fxmarty/quark-legacy-int8

Updated Oct 10, 2024 • 17

fxmarty/llama-tiny-w-int8-b-int8-per-tensor

Updated Oct 22, 2024 • 15

fxmarty/llama-small-int4-per-group-sym-awq-old

Updated Oct 25, 2024 • 15

amd-quark/llama-tiny-w-int8-per-tensor

Updated Dec 18, 2024 • 270

amd-quark/llama-tiny-w-int8-b-int8-per-tensor

Updated Dec 18, 2024 • 264

amd-quark/llama-tiny-w-fp8-a-fp8

Updated Dec 18, 2024 • 264

amd-quark/llama-tiny-w-fp8-a-fp8-o-fp8

Updated Dec 18, 2024 • 263

amd-quark/llama-tiny-int4-per-group-sym

Updated Dec 18, 2024 • 264

amd-quark/llama-small-int4-per-group-sym-awq

Updated Dec 18, 2024 • 270

amd-quark/quark-legacy-int8

Updated Dec 18, 2024 • 103

amd/Llama-3.1-8B-Instruct-FP8-KV-Quark-test

Updated Jan 7 • 1.32k

amd/Llama-3.1-8B-Instruct-w-int8-a-int8-sym-test

Updated Jan 7 • 42

EmbeddedLLM/Llama-3.1-8B-Instruct-w_fp8_per_channel_sym

Text Generation • Updated Jan 22 • 28

amd-quark/llama-tiny-fp8-quark-quant-method

Updated 16 days ago • 80