---
license: mit
base_model:
- deepseek-ai/DeepSeek-Prover-V1.5-SFT
---
# MA-LoT: Model-Collaboration Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving
* Paper link: [https://arxiv.org/abs/2503.03205](https://arxiv.org/abs/2503.03205)
* LoT-Solver (Based on DeepSeek-Prover-v1.5-SFT): [https://huggingface.co/RickyDeSkywalker/LoT-Solver](https://huggingface.co/RickyDeSkywalker/LoT-Solver)
* LoT-Solver (Based on Godel-Prover-SFT): [https://huggingface.co/RickyDeSkywalker/LoT-Solver-Godel](https://huggingface.co/RickyDeSkywalker/LoT-Solver-Godel)
* Website: [https://ma-lot.github.io/](https://ma-lot.github.io/)
* LoT-CorrectionData: [https://huggingface.co/datasets/RickyDeSkywalker/LoT-CorrectionData/viewer/default/train](https://huggingface.co/datasets/RickyDeSkywalker/LoT-CorrectionData/viewer/default/train)

This model is trained with the training pipeline presented in the MA-LoT paper, on top of DeepSeek-Prover-V1.5-SFT. For usage instructions, please refer to our GitHub page.
## Introduction
Solving mathematical problems using computer-verifiable languages like Lean has significantly impacted the mathematical
and computer science communities. State-of-the-art methods utilize a single Large Language Model (LLM)
to generate complete proofs or perform tree search, but they fail to balance these tasks.
We propose **MA-LoT**: *Model-CollAboration Lean-based Long Chain-of-Thought*,
a comprehensive framework for Lean4 theorem proving that addresses this issue.
It separates the cognitive tasks of natural-language whole-proof generation and error analysis for proof correction
through a model-collaboration method, realized by structured interaction between the LLM and the Lean4 verifier within Long CoT.
To implement the framework, we propose the novel *LoT-Transfer Learning* training-inference pipeline,
which endows LLMs with Long CoT thinking capability without special data annotation.
Extensive experiments show that our framework achieves a **61.07%** accuracy rate on the Lean4 version of the MiniF2F-Test dataset,
largely outperforming the DeepSeek-V3 (33.61%), single-model tree search (InternLM-Step-Prover, 50.70%),
and whole-proof generation (Godel-Prover, 55.33%) baselines. Furthermore, our findings highlight the potential
of combining Long CoT with formal verification for more insightful generation from a broader perspective.
## Evaluation Results
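Pass rates on the Lean4 version of the MiniF2F-Test benchmark, as reported in the paper:

| Method | MiniF2F-Test pass rate |
| --- | --- |
| DeepSeek-V3 | 33.61% |
| InternLM-Step-Prover (single-model tree search) | 50.70% |
| Godel-Prover (whole-proof generation) | 55.33% |
| **LoT-Solver (MA-LoT, ours)** | **61.07%** |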
## Usage example
### General setup
Load the model and tokenizer:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "RickyDeSkywalker/LoT-Solver"

# Load the tokenizer and the model in bfloat16, sharded across available devices
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```
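For whole-proof sampling you may prefer stochastic decoding over greedy generation. The sketch below is illustrative only; the temperature and top-p values are assumptions, not the exact settings used in the paper (see our GitHub page for those):

```python
from transformers import GenerationConfig

# Illustrative sampling configuration; tune to your compute budget.
gen_config = GenerationConfig(
    do_sample=True,
    temperature=1.0,   # assumption: not the paper's verified setting
    top_p=0.95,        # assumption: not the paper's verified setting
    max_new_tokens=2048,
)
# Pass it to generation, e.g.: model.generate(input_tensor, generation_config=gen_config)
```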
### Prover Model
#### Code-completion usage
The model can be used as a general code-completion prover as follows:
```python
# formulate prompt
prompt = """Complete the following Lean 4 code with explanatory comments preceding each line of code:
```lean4
import Mathlib
import Aesop
set_option maxHeartbeats 0
open BigOperators Real Nat Topology Rat
/--Suppose $a, b, c$ are the sides of a triangle. Prove that
$a^2(b+c-a)+b^2(c+a-b)+c^2(a+b-c)\le{3abc}.$-/
theorem imo_1964_p2
(a b c : ℝ)
(h₀ : 0 < a ∧ 0 < b ∧ 0 < c)
(h₁ : c < a + b)
(h₂ : b < a + c)
(h₃ : a < b + c) :
a^2 * (b + c - a) + b^2 * (c + a - b) + c^2 * (a + b - c) ≤ 3 * a * b * c := by"""
# tokenize the prompt and generate a completion
input_tensor = tokenizer.encode(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
outputs = model.generate(input_tensor, max_new_tokens=2048)

# decode and print the full result (prompt followed by the generated proof)
result = tokenizer.decode(outputs[0])
print(result)
```
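To keep only the newly generated proof text rather than the echoed prompt, slice off the prompt tokens before decoding. A minimal sketch, assuming the model terminates the open `lean4` code block with a closing fence:

```python
# Decode only the tokens generated after the prompt.
completion = tokenizer.decode(outputs[0][input_tensor.shape[1]:], skip_special_tokens=True)

# Assumption: the model closes the Lean code block with a ``` fence;
# keep everything before it as the proof body.
proof_body = completion.split("```")[0]
print(proof_body)
```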
#### Lean-based Long CoT Prover usage
The model can also be used as a LoT (Lean-based Long CoT) prover as follows:
```python
# formulate prompt
prompt = """<|begin▁of▁sentence|>You are an expert in Lean4 theorem proving with exceptional strategic reasoning abilities. When solving problems, strictly follow this process:
1. First, create a natural language proof draft explaining key insights and logical steps
2. Then analyze required Lean4 tactics, specifying exact syntax and their logical purpose
### Instruction: You will receive several Lean4 problems. For each:
- Prove the theorem **WITH** Long Chain-of-Thought in the block.
- **Do not** reveal your chain of thought outside the block.
- **Ensure** the final Lean4 code or final result is placed **only** in