OpenThinker-7B

This model is a fine-tuned version of Qwen/Qwen2.5-7B-Instruct on the OpenThoughts-114k dataset.

The dataset was derived by distilling DeepSeek-R1 using the data pipeline available on GitHub. More information about the dataset can be found on the OpenThoughts-114k dataset card.
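As a minimal sketch of how to pull the data, the dataset can be loaded from the Hugging Face Hub with the datasets library (the repo id open-thoughts/OpenThoughts-114k is assumed here from the dataset name):

```python
from datasets import load_dataset

# Load the distilled reasoning dataset from the Hugging Face Hub.
# The repo id is an assumption based on the dataset name above.
ds = load_dataset("open-thoughts/OpenThoughts-114k", split="train")
print(ds)  # inspect columns and row count
```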

This model improves upon the Bespoke-Stratos-7B model, which was trained on 17k examples (the Bespoke-Stratos-17k dataset). The numbers reported in the table below were produced with our open-source evaluation tool, Evalchemy.

| Model | AIME24 | MATH500 | GPQA-Diamond | LCBv2 Easy | LCBv2 Medium | LCBv2 Hard | LCBv2 All |
|---|---|---|---|---|---|---|---|
| OpenThinker-7B | 31.3 | 83.0 | 42.4 | 75.3 | 28.6 | 6.5 | 39.9 |
| Bespoke-Stratos-7B | 22.7 | 79.6 | 38.9 | 71.4 | 25.2 | 0.8 | 35.8 |
| DeepSeek-R1-Distill-Qwen-7B | 60.0 | 88.2 | 46.9 | 79.7 | 45.1 | 14.6 | 50.1 |
| gpt-4o-0513 | 8.7 | 75.8 | 46.5 | 87.4 | 42.7 | 8.9 | 50.5 |
| o1-mini | 64.0 | 85.6 | 60.0 | 92.8 | 74.7 | 39.8 | 72.8 |
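For reference, here is a minimal inference sketch using Hugging Face transformers. The model id open-thoughts/OpenThinker-7B follows the model card, but the prompt and generation settings are illustrative and are not the configuration used for the evaluations above.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-thoughts/OpenThinker-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Reasoning models tend to produce long chains of thought,
# so leave generous room in max_new_tokens.
messages = [{"role": "user",
             "content": "What is the sum of the first 100 positive integers?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```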

We are fully open-source. Our model weights, datasets, data generation code, evaluation code, and training code are all publicly available.

| Model | Open Weights | Open Data | Open Code |
|---|---|---|---|
| OpenThinker-7B | βœ… | βœ… | βœ… |
| Bespoke-Stratos-7B | βœ… | βœ… | βœ… |
| DeepSeek-R1-Distill-Qwen-7B | βœ… | ❌ | ❌ |
| gpt-4o-0513 | ❌ | ❌ | ❌ |
| o1-mini | ❌ | ❌ | ❌ |

Intended uses & limitations

This model is released under the Apache 2.0 license.

Training procedure

We used four 8xH100 nodes (32 GPUs in total) to train the model for 20 hours.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 32
  • gradient_accumulation_steps: 3
  • total_train_batch_size: 96
  • total_eval_batch_size: 256
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3.0
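As a quick sanity check, the reported total_train_batch_size follows from the per-device batch size, the gradient accumulation steps, and the device count listed above:

```python
# Effective (total) train batch size, from the hyperparameters above.
train_batch_size = 1             # per device
gradient_accumulation_steps = 3
num_devices = 32                 # four 8xH100 nodes
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
assert total_train_batch_size == 96
```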

Framework versions

  • Transformers 4.46.1
  • PyTorch 2.3.0
  • Datasets 3.1.0
  • Tokenizers 0.20.3
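If you want to match this environment when reproducing results, a small check of the installed versions (assuming the standard package names) might look like:

```python
import transformers, torch, datasets, tokenizers

# Versions reported above; your installed versions may differ.
expected = {"transformers": "4.46.1", "torch": "2.3.0",
            "datasets": "3.1.0", "tokenizers": "0.20.3"}
for name, mod in [("transformers", transformers), ("torch", torch),
                  ("datasets", datasets), ("tokenizers", tokenizers)]:
    print(f"{name}: installed {mod.__version__}, trained with {expected[name]}")
```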

More info can be found in our repository: https://github.com/open-thoughts/open-thoughts.

Citation

@misc{openthoughts,
  author = {Team, OpenThoughts},
  month = jan,
  title = {{Open Thoughts}},
  howpublished = {https://open-thoughts.ai},
  year = {2025}
}
