Quantization Details

This quantized model was created using AutoAWQ version 0.2.8 with quant_config:

{
  "zero_point": True,
  "q_group_size": 128,
  "w_bit": 4,
  "version": "GEMM"
}

pipeline_tag: text-generation inference: true license: apache-2.0 datasets: - simplescaling/s1K language: - en base_model: - simplescaling/s1-32B library_name: transformers

Model Summary

s1 is a reasoning model finetuned from Qwen2.5-32B-Instruct on just 1,000 examples. It matches o1-preview & exhibits test-time scaling via budget forcing.

Use

The model usage is documented here.

Citation

@misc{muennighoff2025s1simpletesttimescaling,
      title={s1: Simple test-time scaling}, 
      author={Niklas Muennighoff and Zitong Yang and Weijia Shi and Xiang Lisa Li and Li Fei-Fei and Hannaneh Hajishirzi and Luke Zettlemoyer and Percy Liang and Emmanuel Cand猫s and Tatsunori Hashimoto},
      year={2025},
      eprint={2501.19393},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.19393}, 
}
Downloads last month
110
Safetensors
Model size
5.73B params
Tensor type
F32
I32
FP16
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API: The model has no library tag.

Model tree for mhdaw/s1-32B-awq

Quantized
(11)
this model