
Mistral-Small-3.2-24B-Instruct-2506 (Quantized)

This is a quantized version of togethercomputer/mistral-3.2-instruct-2506, optimized for reduced memory usage while maintaining performance.

Mistral-Small-3.2-24B-Instruct-2506 is a minor update of Mistral-Small-3.1-24B-Instruct-2503.

Quantization Details

This model has been quantized to reduce memory requirements while preserving model quality. The quantized weights are significantly smaller than the original fp16/bf16 checkpoint.
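
As a minimal loading sketch: the card does not state the quantization method, so the bitsandbytes-style 4-bit configuration below is an assumption for illustration; a checkpoint that ships pre-quantized weights (e.g. GPTQ or AWQ) would load without an explicit config.

```python
# Minimal loading sketch. The quantization method is not stated on this card,
# so the bitsandbytes 4-bit config here is an illustrative assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "vschandramourya/mistral-3.2-instruct-2506-quantized"

# Hypothetical 4-bit config; drop this if the checkpoint already contains
# pre-quantized weights (e.g. GPTQ/AWQ), which load without a config.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```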

Base Model Improvements

Small-3.2 improves in the following categories:

  • Instruction following: Small-3.2 is better at following precise instructions
  • Repetition errors: Small-3.2 produces fewer infinite generations and repetitive answers
  • Function calling: Small-3.2's function calling template is more robust (see the sketch after this list)

In all other categories Small-3.2 should match or slightly improve compared to Mistral-Small-3.1-24B-Instruct-2503.
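
Since tool use runs through the chat template, the hedged sketch below shows what function calling looks like in practice, assuming the checkpoint ships a Mistral-style chat template with tool support (transformers >= 4.42); the get_weather tool is illustrative, not part of this card.

```python
# Hedged function-calling sketch: the tool schema is rendered into the prompt
# via the chat template, and the model is expected to emit a structured call.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "vschandramourya/mistral-3.2-instruct-2506-quantized"
)

def get_weather(city: str) -> str:
    """Return the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    ...  # illustrative stub; a real tool would query a weather API

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Paris?"},
]

# transformers parses the function signature and docstring into a JSON schema
# and inserts it into the prompt according to the model's chat template.
prompt = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    tokenize=False,
)
```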


Usage

The quantized model can be used with common LLM inference frameworks.

Note 1: We recommend using a relatively low temperature, such as temperature=0.15.

Note 2: Make sure to add a system prompt to the model to best tailor it to your needs.
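
A minimal generation sketch tying both notes together, assuming the model and tokenizer were loaded as in the quantization sketch above; the system prompt and question are placeholders.

```python
# Generation sketch: low temperature (0.15, as recommended) plus an explicit
# system prompt. Assumes `model` and `tokenizer` from the loading sketch.
messages = [
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "Summarize what quantization does."},
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.15,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```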

Memory Requirements

This quantized version requires significantly less GPU memory than the original model:

  • Original: ~55 GB of GPU RAM in bf16 or fp16
  • Quantized: Reduced memory footprint (exact requirements depend on quantization method used)
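
As a rough back-of-the-envelope check (weights only; the KV cache and activations add overhead), the arithmetic below shows why ~55 GB is plausible for bf16 and what 8-bit or 4-bit weights would imply:

```python
# Approximate weight memory for a 24B-parameter model at common precisions.
# These are rough estimates, not measured figures for this checkpoint.
params = 24e9

for name, bits in [("bf16/fp16", 16), ("int8", 8), ("int4", 4)]:
    gb = params * bits / 8 / 1e9
    print(f"{name:>10}: ~{gb:.0f} GB of weights")

# bf16/fp16: ~48 GB  (consistent with the ~55 GB figure once overhead is added)
#      int8: ~24 GB
#      int4: ~12 GB
```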

License

This model inherits the same license as the base model: Apache-2.0

Original Model

For benchmark results and detailed usage examples, please refer to the original model: togethercomputer/mistral-3.2-instruct-2506
