Models 31

Active filters: arxiv: 2405.03594

neuralmagic/Llama-2-7b-pruned50-retrained

Text Generation • Updated May 7, 2024 • 152

neuralmagic/Llama-2-7b-pruned70-retrained

Text Generation • Updated May 7, 2024 • 84

neuralmagic/Llama-2-7b-ultrachat200k

Text Generation • Updated May 7, 2024 • 1.61k

neuralmagic/Llama-2-7b-ultrachat200k-pruned_50

Text Generation • Updated May 15, 2024 • 21

neuralmagic/Llama-2-7b-ultrachat200k-pruned_70

Text Generation • Updated May 15, 2024 • 28

neuralmagic/Llama-2-7b-ultrachat200k-pruned_50-quantized-deepsparse

Text Generation • Updated May 7, 2024 • 21

neuralmagic/Llama-2-7b-ultrachat200k-pruned_70-quantized-deepsparse

Text Generation • Updated May 15, 2024 • 18

neuralmagic/Llama-2-7b-evolcodealpaca

Text Generation • Updated May 7, 2024 • 31 • 1

neuralmagic/Llama-2-7b-evol-code-alpaca-pruned_50

Text Generation • Updated May 15, 2024 • 27

neuralmagic/Llama-2-7b-evol-code-alpaca-pruned_70

Text Generation • Updated May 15, 2024 • 23

neuralmagic/Llama-2-7b-evol-code-alpaca-pruned_50-quantized-deepsparse

Text Generation • Updated May 15, 2024 • 21

neuralmagic/Llama-2-7b-evol-code-alpaca-pruned_70-quantized-deepsparse

Text Generation • Updated May 15, 2024 • 18

neuralmagic/Llama-2-7b-dolphin-open_platypus

Text Generation • Updated May 15, 2024 • 12

neuralmagic/Llama-2-7b-dolphin-open_platypus-pruned_50

Text Generation • Updated May 15, 2024 • 22

neuralmagic/Llama-2-7b-dolphin-open_platypus-pruned_70

Text Generation • Updated May 15, 2024 • 23

neuralmagic/Llama-2-7b-dolphin-open_platypus-pruned_50-quantized-deepsparse

Text Generation • Updated May 16, 2024 • 18

neuralmagic/Llama-2-7b-dolphin-open_platypus-pruned_70-quantized-deepsparse

Text Generation • Updated May 16, 2024 • 16 • 1

RichardErkhov/neuralmagic_-_Llama-2-7b-evolcodealpaca-4bits

Text Generation • Updated May 10, 2024 • 78

RichardErkhov/neuralmagic_-_Llama-2-7b-evolcodealpaca-8bits

Text Generation • Updated May 10, 2024 • 8

RichardErkhov/neuralmagic_-_Llama-2-7b-evolcodealpaca-gguf

Updated May 10, 2024 • 70

neuralmagic/Llama-2-7b-gsm8k-pruned_50

Text Generation • Updated Jun 20, 2024 • 16 • 1

neuralmagic/Llama-2-7b-gsm8k-pruned_70

Text Generation • Updated Jun 20, 2024 • 10

neuralmagic/Llama-2-7b-gsm8k

Text Generation • Updated Jun 20, 2024 • 163 • 3

RichardErkhov/neuralmagic_-_Llama-2-7b-dolphin-open_platypus-pruned_70-gguf

Updated Jul 16, 2024 • 40

RichardErkhov/neuralmagic_-_Llama-2-7b-pruned50-retrained-gguf

Updated Sep 13, 2024 • 62

RichardErkhov/neuralmagic_-_Llama-2-7b-ultrachat200k-gguf

Updated Sep 13, 2024 • 29

RichardErkhov/neuralmagic_-_Llama-2-7b-pruned70-retrained-gguf

Updated Nov 17, 2024 • 8

neuralmagic/Sparse-Llama-3.1-8B-ultrachat_200k-2of4-FP8-dynamic

Text Generation • Updated Dec 19, 2024 • 41 • 1

neuralmagic/Sparse-Llama-3.1-8B-ultrachat_200k-2of4-quantized.w4a16

Text Generation • Updated Dec 19, 2024 • 122 • 3

neuralmagic/Sparse-Llama-3.1-8B-ultrachat_200k-2of4

Text Generation • Updated Nov 21, 2024 • 17 • 1