TreePO - a m-a-p Collection

TreePO

updated 2 days ago

TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

Paper • 2508.17445 • Published 13 days ago • 78

m-a-p/TreePO-Qwen2.5-7B
8B • Updated 7 days ago • 13 • 2

m-a-p/TreePO_data
Viewer • Updated 7 days ago • 49.3k • 103

m-a-p/TreePO-Qwen2.5-7B_fixed-div
8B • Updated 7 days ago • 15

m-a-p/TreePO-Qwen2.5-7B_GRPO-TreePO-Sampling
8B • Updated 2 days ago • 9

m-a-p/TreePO-Qwen2.5-7B_Low_Prob_Encourage
8B • Updated 2 days ago • 8

m-a-p/TreePO-Qwen2.5-7B_Naive2Low_Scheduler
8B • Updated 2 days ago • 7