|
--- |
|
license: apache-2.0 |
|
library_name: transformers |
|
language: |
|
- en |
|
base_model: |
|
- Qwen/Qwen2.5-14B |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
# Datarus-R1-14B-preview |
|
|
|
<div align="center"> |
|
<img src="https://i.postimg.cc/7hsStNgm/logo-icon-2-1.png" alt="Datarus Logo" width="150"/> |
|
|
|
[Model](https://huggingface.co/DatarusAI/Datarus-R1-14B-preview)

[License](LICENSE)

[Website](https://datarus.ai)

[Demo](https://chat.datarus.ai)

[Paper](https://arxiv.org/abs/2508.13382)
|
</div> |
|
|
|
## 🚀 Overview |
|
|
|
**Datarus-R1-14B-Preview** is a 14B-parameter open-weights language model fine-tuned from Qwen2.5-14B-Instruct, designed to act as a virtual data analyst and graduate-level problem solver. Unlike traditional models trained on isolated Q&A pairs, Datarus learns from complete analytical trajectories—including reasoning steps, code execution, error traces, self-corrections, and final conclusions—all captured in a ReAct-style notebook format. |
|
|
|
### Key Highlights |
|
|
|
- **🎯 State-of-the-art efficiency**: Surpasses similar-sized models and competes with 32B+ models while using 18-49% fewer tokens |
|
- **🔄 Dual reasoning interfaces**: Supports both Agentic (ReAct) mode for interactive analysis and Reflection (CoT) mode for concise documentation |
|
- **📊 Superior performance**: Achieves up to 30% higher accuracy on AIME 2024/2025 and LiveCodeBench |
|
- **💡 "AHA-moment" pattern**: Exhibits efficient hypothesis refinement in 1-2 iterations, avoiding circular reasoning loops |
|
|
|
## 🔗 Quick Links |
|
|
|
- 🌐 **Website**: [https://datarus.ai](https://datarus.ai) |
|
- 💬 **Try the Demo**: [https://chat.datarus.ai](https://chat.datarus.ai) |
|
- 🛠️ **Jupyter Agent**: [GitHub Repository](https://github.com/DatarusAI/Datarus-JupyterAgent) |
|
- 📄 **Paper**: [Datarus-R1: An Adaptive Multi-Step Reasoning LLM](https://arxiv.org/abs/2508.13382) |
|
|
|
## 📊 Performance |
|
|
|
### Benchmark Results |
|
|
|
| Benchmark | Datarus-R1-14B-Preview | QwQ-32B | Phi-4-reasoning | DeepSeek-R1-Distill-14B |
|-----------|------------------------|---------|-----------------|-------------------------|
| **LiveCodeBench v6** | 57.7 | 56.6 | 52.6 | 48.6 |
| **AIME 2024** | 70.1 | 76.2 | 74.6* | - |
| **AIME 2025** | 66.2 | 66.2 | 63.1* | - |
| **GPQA Diamond** | 62.1 | 60.1 | 55.0 | 58.6 |
|
|
|
*Values marked with an asterisk are taken from the corresponding official papers
|
|
|
### Token Efficiency and Performance |
|
|
|
<div align="center"> |
|
<img src="https://i.postimg.cc/NMSppNM4/perf-efficiency.png" alt="LCB-Efficiency" width="600"/> |
|
<img src="https://i.postimg.cc/nV341Ssf/efficiency.png" alt="Efficiency" width="600" /> |
|
</div> |
|
|
|
## 🎯 Model Card |
|
|
|
### Model Details |
|
|
|
- **Model Type**: Language Model for Reasoning and Data Analysis |
|
- **Parameters**: 14.8B |
|
- **Training Data**: 144,000 synthetic analytical trajectories spanning finance, medicine, numerical analysis, and other quantitative domains, plus a curated collection of reasoning datasets
|
- **Language**: English |
|
- **License**: Apache 2.0 |
|
|
|
### Intended Use |
|
|
|
#### Primary Use Cases |
|
- **Data Analysis**: Automated data exploration, statistical analysis, and visualization |
|
- **Mathematical Problem Solving**: Graduate-level mathematics including AIME-level problems |
|
- **Code Generation**: Creating analytical scripts and solving programming challenges |
|
- **Scientific Reasoning**: Complex problem-solving in physics, chemistry, and other sciences |
|
- **Interactive Notebooks**: Building complete analysis notebooks with iterative refinement |
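
For these use cases, the model follows the standard 🤗 Transformers text-generation workflow. The snippet below is a minimal sketch assuming the repository id `DatarusAI/Datarus-R1-14B-preview` and the default chat template; the sampling settings are illustrative, not officially recommended values.

```python
# Minimal sketch: load Datarus-R1-14B-preview and run a single analysis prompt.
# The generation settings below are illustrative, not official recommendations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DatarusAI/Datarus-R1-14B-preview"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Load a CSV of daily returns and summarize its volatility."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```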
|
|
|
### Dual Mode Usage |
|
|
|
#### Agentic Mode (for interactive analysis) |
|
- Use `<step>`, `<thought>`, `<action>`, `<action_input>`, `<observation>` tags |
|
- Enables iterative code execution and refinement |
|
- Best for data analysis, simulations, and exploratory tasks |
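
The actual system prompt and execution loop live in the [Datarus Jupyter Agent](https://github.com/DatarusAI/Datarus-JupyterAgent) repository; the sketch below only illustrates the tag layout and how a driver might extract the code to execute. The example completion and the regex-based parsing are assumptions, not the agent's implementation.

```python
# Illustrative sketch of one Agentic (ReAct) step and a minimal parser for it.
# The completion text is fabricated; see Datarus-JupyterAgent for the real loop.
import re

step = """<step>1</step>
<thought>Inspect the dataset before computing any statistics.</thought>
<action>python</action>
<action_input>
import pandas as pd
df = pd.read_csv("returns.csv")
print(df.describe())
</action_input>"""

# Extract the code the model wants to run. After executing it, the captured stdout
# would be appended back to the prompt inside <observation>...</observation> so the
# model can refine its next step.
match = re.search(r"<action_input>\s*(.*?)\s*</action_input>", step, re.DOTALL)
if match:
    code_to_run = match.group(1)
    print(code_to_run)
```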
|
|
|
#### Reflection Mode (for documentation) |
|
- Use `<think>` and `<answer>` tags |
|
- Produces compact, self-contained reasoning chains |
|
- Best for mathematical proofs, explanations, and reports |
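
A minimal sketch of consuming a Reflection-mode completion, assuming the reasoning arrives inside `<think>` and the final result inside `<answer>` as listed above (the example completion and the parsing helper are illustrative):

```python
# Illustrative parsing of a Reflection (CoT) completion: keep the <answer>,
# optionally log the <think> trace. The completion text is fabricated.
import re

completion = (
    "<think>The sum of the first n odd numbers is n**2, so for n = 12 it is 144.</think>\n"
    "<answer>144</answer>"
)

answer = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
print(answer.group(1).strip() if answer else completion.strip())
```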
|
|
|
## 📚 Citation |
|
|
|
```bibtex |
|
@article{benchaliah2025datarus, |
|
title={Datarus-R1: An Adaptive Multi-Step Reasoning LLM for Automated Data Analysis}, |
|
author={Ben Chaliah, Ayoub and Dellagi, Hela}, |
|
journal={arXiv preprint arXiv:2508.13382}, |
|
year={2025} |
|
} |
|
``` |
|
|
|
## 🤝 Contributing |
|
|
|
We welcome contributions! Please see our [GitHub repository](https://github.com/DatarusAI/Datarus-JupyterAgent) for: |
|
- Bug reports and feature requests |
|
- Pull requests |
|
- Discussion forums |
|
|
|
## 📄 License |
|
|
|
This model is released under the Apache 2.0 License. |
|
|
|
## 🙏 Acknowledgments |
|
|
|
We thank the Qwen team for the excellent base model and the open-source community for their valuable contributions. |
|
|
|
## 📧 Contact |
|
|
|
- **Email**: [email protected], [email protected] |
|
- **Website**: [https://datarus.ai](https://datarus.ai) |
|
- **Demo**: [https://chat.datarus.ai](https://chat.datarus.ai) |
|
|
|
--- |
|
|
|
<div align="center"> |
|
<strong>Experience the future of AI-powered data analysis with Datarus-R1</strong> |
|
|
|
[Try Demo](https://chat.datarus.ai) | [View Code](https://github.com/DatarusAI/Datarus-JupyterAgent) | [Read Paper](https://arxiv.org/abs/2508.13382) |
|
</div> |
|
|
|
## ⭐ Support |
|
|
|
If you find this model and the agent pipeline useful, please consider giving them a __Like__ here and a __Star__ on GitHub! Your support helps us continue improving the project.
|
|
|
Found a bug or have a feature request? Please open an issue on GitHub. |
|
|
|
--- |
|
|
|
<p align="center">Made with ❤️ by the Datarus Team from Paris</p> |