Llamacpp imatrix Quantizations of Meta-Llama-3.1-8B-Instruct (Fork by NeurochainAI)

This repository is a fork of the original Meta-Llama-3.1-8B-Instruct GGUF quantizations, tailored for NeurochainAI's inference network. The models provided here are part of the foundation for NeurochainAI's state-of-the-art AI inference solutions.

NeurochainAI uses these models to run optimized inference on its distributed network, enabling efficient and robust processing of language models across a variety of platforms and devices.

While many technical aspects of the original repository are preserved, only three quantizations from the original set have been integrated into this fork, as they best fit the specific performance and inference requirements of our network (a loading sketch follows the list):

  • Meta-Llama-3.1-8B-Instruct-Q6_K.gguf
    Size: 6.60GB
    Description: Very high-quality quantization. Near-perfect accuracy for high-performance inference.

  • Meta-Llama-3.1-8B-Instruct-Q6_K_L.gguf
    Size: 6.85GB
    Description: Uses Q8_0 for embedding and output weights, providing near-perfect inference quality. Highly recommended for demanding applications.

  • Meta-Llama-3.1-8B-Instruct-Q8_0.gguf
    Size: 8.54GB
    Description: The highest-quality quantization available. Typically unnecessary, but useful when maximum inference accuracy is required.
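
As a minimal sketch of how one of these files can be fetched and served locally, the snippet below uses the huggingface_hub and llama-cpp-python packages. The repo id is a hypothetical placeholder (this fork's actual Hugging Face id is not stated here), and the context size and GPU settings are illustrative assumptions, not requirements of this repository.

```python
# Minimal sketch: download one of the quantizations and run a chat completion
# with the llama-cpp-python bindings. The repo id below is a placeholder --
# substitute the actual Hugging Face repo id for this fork.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="NeurochainAI/Meta-Llama-3.1-8B-Instruct-GGUF",  # hypothetical repo id
    filename="Meta-Llama-3.1-8B-Instruct-Q6_K.gguf",
)

# n_gpu_layers=-1 offloads all layers to the GPU when one is available;
# set it to 0 for CPU-only inference.
llm = Llama(model_path=model_path, n_ctx=8192, n_gpu_layers=-1)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what GGUF quantization is."}]
)
print(response["choices"][0]["message"]["content"])
```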

The quantization process was conducted using llama.cpp release b3472, leveraging the imatrix option to optimize performance for our inference pipeline. The imatrix calibration data was sourced from bartowski's dataset.
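
For reference, the imatrix workflow in llama.cpp has two steps: compute an importance matrix from a full-precision model and a calibration text file, then quantize with that matrix. The sketch below drives the llama-imatrix and llama-quantize binaries (as shipped around release b3472) from Python; all file names are illustrative placeholders, not files from this repository.

```python
# Sketch of the imatrix quantization workflow using the llama.cpp build tools.
# File names below are illustrative placeholders.
import subprocess

# 1. Compute an importance matrix from the full-precision model using a
#    calibration text file.
subprocess.run([
    "./llama-imatrix",
    "-m", "Meta-Llama-3.1-8B-Instruct-f16.gguf",
    "-f", "calibration_data.txt",
    "-o", "imatrix.dat",
], check=True)

# 2. Quantize to Q6_K, guided by the importance matrix.
subprocess.run([
    "./llama-quantize",
    "--imatrix", "imatrix.dat",
    "Meta-Llama-3.1-8B-Instruct-f16.gguf",
    "Meta-Llama-3.1-8B-Instruct-Q6_K.gguf",
    "Q6_K",
], check=True)
```

The Q6_K_L variant differs only in that llama-quantize is additionally passed --token-embedding-type Q8_0 and --output-tensor-type Q8_0, keeping the embedding and output weights at Q8_0 while the rest of the tensors are Q6_K.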

License

The models and content here are licensed under the Llama 3.1 Community License, as provided by Meta. Please make sure to comply with the terms outlined in the Llama 3.1 license agreement.
