Chirp-3b

Overview

Chirp-3b is a 3B-parameter language model from the Ozone Research team. Fine-tuned from a strong base model (Qwen2.5 3B Instruct) on 50 million tokens of data distilled from GPT-4o, this compact model delivers strong results for its size on benchmarks such as MMLU Pro and IFEval.

Chirp-3b is an open-source effort to push the limits of what small-scale LLMs can achieve, making it a valuable tool for researchers and enthusiasts alike.

Key Features

  • Parameters: 3 billion
  • Training Data: 50M tokens distilled from GPT-4o

Benchmarks

Chirp-3b excels on rigorous evaluation datasets, showcasing its strength for a 3B model.

MMLU Pro

Subject             Average Accuracy
Biology             0.6234
Business            0.5032
Chemistry           0.3701
Computer Science    0.4268
Economics           0.5284
Engineering         0.3013
Health              0.3900
History             0.3885
Law                 0.2252
Math                0.5736
Other               0.4145
Philosophy          0.3687
Physics             0.3995
Psychology          0.5589
Overall Average     0.4320
  • Improvement: 9 percentage points above the base model's overall average.

IFEval

  • Score: 72%
  • Improvement: 14% better than the base model.

More benchmarks are in the works and will be shared soon!
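
If you want to sanity-check these numbers yourself, tasks named mmlu_pro and ifeval ship with EleutherAI's lm-evaluation-harness. The snippet below is a minimal sketch assuming that harness (pip install lm-eval); exact scores depend on the harness version and prompt format, so they may not match the table above exactly.

import lm_eval

# Evaluate Chirp-3b on MMLU Pro and IFEval via lm-evaluation-harness.
# The task names and the simple_evaluate API are assumptions about the
# harness, not part of this release.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=ozone-research/Chirp-01,dtype=float16",
    tasks=["mmlu_pro", "ifeval"],
    batch_size=8,
)
for task, metrics in results["results"].items():
    print(task, metrics)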

Download

Access Chirp-3b here:
https://huggingface.co/ozone-research/Chirp-01
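
If you prefer to fetch the weights ahead of time (for offline use, for example), the huggingface_hub client can download the full repository to your local cache. A minimal sketch, assuming huggingface_hub is installed:

from huggingface_hub import snapshot_download

# Download every file in the repo to the local Hugging Face cache
# and return the local directory path.
local_dir = snapshot_download("ozone-research/Chirp-01")
print(f"Model files cached at: {local_dir}")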

Usage

Requirements

  • GPU: at least 8 GB of VRAM recommended (see the rough estimate below)
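
As a rough sanity check on that figure, the FP16 weights alone come to about 6 GB, which leaves some headroom on an 8 GB card for activations and the KV cache. A quick back-of-the-envelope sketch (the 3.09B parameter count comes from the model's safetensors metadata):

# Estimate VRAM needed for the FP16 weights alone; activation and
# KV-cache overhead are workload-dependent and not counted here.
params = 3.09e9        # total parameter count (from safetensors metadata)
bytes_per_param = 2    # FP16 stores each parameter in 2 bytes
weights_gb = params * bytes_per_param / 1e9
print(f"FP16 weights: ~{weights_gb:.1f} GB")  # ~6.2 GB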

Example

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ozone-research/Chirp-01"

# Load the tokenizer and model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tokenize the prompt and generate up to 50 new tokens.
# (max_new_tokens counts only generated tokens; max_length would also
# count the prompt and could cut the reply short.)
input_text = "What’s the future of AI?"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
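
Because Chirp-3b is fine-tuned from an instruct model, prompts generally behave better when wrapped in the tokenizer's chat template. The sketch below reuses the tokenizer and model loaded above and assumes the fine-tune kept a chat template inherited from Qwen2.5 Instruct:

# Build a chat-formatted prompt and append the assistant turn marker.
messages = [{"role": "user", "content": "What’s the future of AI?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))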

Future Work

The Ozone Research team is exploring additional models, including 2B and larger variants. Keep an eye out for upcoming releases!

Feedback

We’re eager for your input! Try Chirp-3b and let us know your thoughts, use cases, or ideas for improvement. Open an issue on the Hugging Face model page or contact us via [contact method—update as needed].

Acknowledgments

A big thanks to the open-source community for driving projects like this forward. Chirp-3b is our contribution to making AI research more accessible.
