|
--- |
|
license: apache-2.0 |
|
library_name: transformers |
|
language: |
|
- en |
|
base_model: |
|
- Qwen/Qwen2.5-14B |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
# Datarus-R1-14B-preview |
|
|
|
<div align="center"> |
|
<img src="https://i.postimg.cc/7hsStNgm/logo-icon-2-1.png" alt="Datarus Logo" width="150"/> |
|
|
|
[Model](https://huggingface.co/DatarusAI/Datarus-R1-14B-preview)

[License](LICENSE)

[Website](https://datarus.ai)

[Demo](https://chat.datarus.ai)

[Paper](https://arxiv.org/abs/2508.13382)
|
</div> |
|
|
|
## 🚀 Overview |
|
|
|
**Datarus-R1-14B-Preview** is a 14B-parameter open-weights language model fine-tuned from Qwen2.5-14B-Instruct, designed to act as a virtual data analyst and graduate-level problem solver. Unlike traditional models trained on isolated Q&A pairs, Datarus learns from complete analytical trajectories—including reasoning steps, code execution, error traces, self-corrections, and final conclusions—all captured in a ReAct-style notebook format. |
|
|
|
### Key Highlights |
|
|
|
- **🎯 State-of-the-art efficiency**: Surpasses similar-sized models and competes with 32B+ models while using 18-49% fewer tokens |
|
- **🔄 Dual reasoning interfaces**: Supports both Agentic (ReAct) mode for interactive analysis and Reflection (CoT) mode for concise documentation |
|
- **📊 Superior performance**: Achieves up to 30% higher accuracy on AIME 2024/2025 and LiveCodeBench |
|
- **💡 "AHA-moment" pattern**: Exhibits efficient hypothesis refinement in 1-2 iterations, avoiding circular reasoning loops |
|
|
|
## 🔗 Quick Links |
|
|
|
- 🌐 **Website**: [https://datarus.ai](https://datarus.ai) |
|
- 💬 **Try the Demo**: [https://chat.datarus.ai](https://chat.datarus.ai) |
|
- 🛠️ **Jupyter Agent**: [GitHub Repository](https://github.com/DatarusAI/Datarus-JupyterAgent) |
|
- 📄 **Paper**: [Datarus-R1: An Adaptive Multi-Step Reasoning LLM](https://arxiv.org/abs/2508.13382) |
|
|
|
## 📊 Performance |
|
|
|
### Benchmark Results |
|
|
|
| Benchmark | Datarus-R1-14B-Preview | QwQ-32B | Phi-4-reasoning | DeepSeek-R1-Distill-14B |
|-----------|------------------------|---------|-----------------|-------------------------|
| **LiveCodeBench v6** | 57.7 | 56.6 | 52.6 | 48.6 |
| **AIME 2024** | 70.1 | 76.2 | 74.6* | - |
| **AIME 2025** | 66.2 | 66.2 | 63.1* | - |
| **GPQA Diamond** | 62.1 | 60.1 | 55.0 | 58.6 |
|
|
|
*Values marked with an asterisk are taken from the corresponding official papers
|
|
|
### Token Efficiency and Performance |
|
|
|
<div align="center"> |
|
<img src="https://i.postimg.cc/NMSppNM4/perf-efficiency.png" alt="LCB-Efficiency" width="600"/> |
|
<img src="https://i.postimg.cc/nV341Ssf/efficiency.png" alt="Efficiency" width="600" /> |
|
</div> |
|
|
|
## 🎯 Model Card |
|
|
|
### Model Details |
|
|
|
- **Model Type**: Language Model for Reasoning and Data Analysis |
|
- **Parameters**: 14.8B |
|
- **Training Data**: 144,000 synthetic analytical trajectories spanning finance, medicine, numerical analysis, and other quantitative domains, plus a curated collection of reasoning datasets
|
- **Language**: English |
|
- **License**: Apache 2.0 |
|
|
|
### Intended Use |
|
|
|
#### Primary Use Cases |
|
- **Data Analysis**: Automated data exploration, statistical analysis, and visualization |
|
- **Mathematical Problem Solving**: Graduate-level mathematics including AIME-level problems |
|
- **Code Generation**: Creating analytical scripts and solving programming challenges |
|
- **Scientific Reasoning**: Complex problem-solving in physics, chemistry, and other sciences |
|
- **Interactive Notebooks**: Building complete analysis notebooks with iterative refinement |
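
For these use cases, the model follows the standard 🤗 Transformers text-generation workflow. The snippet below is a minimal sketch assuming the repository id `DatarusAI/Datarus-R1-14B-preview` and the default chat template; the sampling settings are illustrative, not officially recommended values.

```python
# Minimal sketch: load Datarus-R1-14B-preview and run a single analysis prompt.
# The generation settings below are illustrative, not official recommendations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DatarusAI/Datarus-R1-14B-preview"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Load a CSV of daily returns and summarize its volatility."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```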
|
|
|
### Dual Mode Usage |
|
|
|
#### Agentic Mode (for interactive analysis) |
|
- Use `<step>`, `<thought>`, `<action>`, `<action_input>`, `<observation>` tags |
|
- Enables iterative code execution and refinement |
|
- Best for data analysis, simulations, and exploratory tasks |
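
The actual system prompt and execution loop live in the [Datarus Jupyter Agent](https://github.com/DatarusAI/Datarus-JupyterAgent) repository; the sketch below only illustrates the tag layout and how a driver might extract the code to execute. The example completion and the regex-based parsing are assumptions, not the agent's implementation.

```python
# Illustrative sketch of one Agentic (ReAct) step and a minimal parser for it.
# The completion text is fabricated; see Datarus-JupyterAgent for the real loop.
import re

step = """<step>1</step>
<thought>Inspect the dataset before computing any statistics.</thought>
<action>python</action>
<action_input>
import pandas as pd
df = pd.read_csv("returns.csv")
print(df.describe())
</action_input>"""

# Extract the code the model wants to run. After executing it, the captured stdout
# would be appended back to the prompt inside <observation>...</observation> so the
# model can refine its next step.
match = re.search(r"<action_input>\s*(.*?)\s*</action_input>", step, re.DOTALL)
if match:
    code_to_run = match.group(1)
    print(code_to_run)
```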
|
|
|
#### Reflection Mode (for documentation) |
|
- Use `<think>` and `<answer>` tags |
|
- Produces compact, self-contained reasoning chains |
|
- Best for mathematical proofs, explanations, and reports |
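
A minimal sketch of consuming a Reflection-mode completion, assuming the reasoning arrives inside `<think>` and the final result inside `<answer>` as listed above (the example completion and the parsing helper are illustrative):

```python
# Illustrative parsing of a Reflection (CoT) completion: keep the <answer>,
# optionally log the <think> trace. The completion text is fabricated.
import re

completion = (
    "<think>The sum of the first n odd numbers is n**2, so for n = 12 it is 144.</think>\n"
    "<answer>144</answer>"
)

answer = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
print(answer.group(1).strip() if answer else completion.strip())
```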
|
|
|
## 📚 Citation |
|
|
|
```bibtex |
|
@article{benchaliah2025datarus, |
|
title={Datarus-R1: An Adaptive Multi-Step Reasoning LLM for Automated Data Analysis}, |
|
author={Ben Chaliah, Ayoub and Dellagi, Hela}, |
|
journal={arXiv preprint arXiv:2508.13382}, |
|
year={2025} |
|
} |
|
``` |
|
|
|
## 🤝 Contributing |
|
|
|
We welcome contributions! Please see our [GitHub repository](https://github.com/DatarusAI/Datarus-JupyterAgent) for: |
|
- Bug reports and feature requests |
|
- Pull requests |
|
- Discussion forums |
|
|
|
## 📄 License |
|
|
|
This model is released under the Apache 2.0 License. |
|
|
|
## 🙏 Acknowledgments |
|
|
|
We thank the Qwen team for the excellent base model and the open-source community for their valuable contributions. |
|
|
|
## 📧 Contact |
|
|
|
- **Email**: [email protected], [email protected] |
|
- **Website**: [https://datarus.ai](https://datarus.ai) |
|
- **Demo**: [https://chat.datarus.ai](https://chat.datarus.ai) |
|
|
|
--- |
|
|
|
<div align="center"> |
|
<strong>Experience the future of AI-powered data analysis with Datarus-R1</strong> |
|
|
|
[Try Demo](https://chat.datarus.ai) | [View Code](https://github.com/DatarusAI/Datarus-JupyterAgent) | [Read Paper](https://arxiv.org/abs/2508.13382) |
|
</div> |
|
|
|
## ⭐ Support |
|
|
|
If you find this model and the agent pipeline useful, please consider giving them a __Like__ here and a __Star__ on GitHub! Your support helps us continue improving the project.
|
|
|
Found a bug or have a feature request? Please open an issue on GitHub. |
|
|
|
--- |
|
|
|
<p align="center">Made with ❤️ by the Datarus Team from Paris</p> |