Seungyoun/qwen2.5-3b-it_searchR1-like-multiturn

Seungyoun committed on Jun 3

Commit 2b48277 · verified · 1 Parent(s): b3d0c6e

Update README.md

Files changed (1)

README.md +2 -2

@@ -10,8 +10,8 @@ base_model:
 # Qwen2.5‑3B Search‑R1‑Multiturn (reproduce)
 > **Author · Seungyoun Shin**
-> 🤗 Model Hub: [https://huggingface.co/Seungyoun/qwen2.5-3b-it\_searchR1-like-multiturn](https://huggingface.co/Seungyoun/qwen2.5-3b-it_searchR1-like-multiturn)
-> 📈 W\&B Report: [https://wandb.ai/yoon1001/search\_r1\_like\_async\_rl/reports/Qwen2-5-3b-it-search-r1-reproduce--VmlldzoxMzA2NzA2NA](https://wandb.ai/yoon1001/search_r1_like_async_rl/reports/Qwen2-5-3b-it-search-r1-reproduce--VmlldzoxMzA2NzA2NA)
+> 🤗 Model Hub: [hf](https://huggingface.co/Seungyoun/qwen2.5-3b-it_searchR1-like-multiturn)
+> 📈 W\&B Report: [wandb](https://wandb.ai/yoon1001/search_r1_like_async_rl/reports/Qwen2-5-3b-it-search-r1-reproduce--VmlldzoxMzA2NzA2NA)
 A faithful re‑implementation of the *Search‑R1* retrieval‑augmented QA agent on **Qwen 2.5‑3B**, trained purely on the Wikipedia‑based Search‑R1 corpus with GRPO via the open‑source [VERL](https://github.com/volcengine/verl) framework.
 During inference we replace the original SGLang runtime with a compact DuckDuckGo‑powered tool loop implemented in a single script.
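
The "tool loop" mentioned in the README text above can be pictured roughly as follows. This is a minimal sketch, not the author's script: the `<search>`, `<information>`, and `<answer>` tag format is an assumption borrowed from the general Search-R1 style, and `run_tool_loop`, `fake_generate`, and `fake_search` are hypothetical names. A real version would call the model in place of `fake_generate` and a DuckDuckGo client in place of `fake_search`.

```python
import re
from typing import Callable, List, Optional

# Assumed tag format; the actual model may use a different tool-call syntax.
SEARCH_RE = re.compile(r"<search>(.*?)</search>", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def run_tool_loop(
    question: str,
    generate: Callable[[str], str],
    search: Callable[[str], List[str]],
    max_turns: int = 4,
) -> Optional[str]:
    """Alternate between model generation and search until an answer appears."""
    transcript = question
    for _ in range(max_turns):
        output = generate(transcript)
        transcript += output

        # A final answer ends the loop.
        answer = ANSWER_RE.search(output)
        if answer:
            return answer.group(1).strip()

        # Otherwise look for a search request; stop if there is none.
        query = SEARCH_RE.search(output)
        if query is None:
            break

        # Feed the retrieved snippets back to the model for the next turn.
        results = search(query.group(1).strip())
        transcript += "\n<information>\n" + "\n".join(results) + "\n</information>\n"
    return None


def fake_generate(transcript: str) -> str:
    """Hypothetical stand-in for the model: search once, then answer."""
    if "<information>" in transcript:
        return "<answer>Paris</answer>"
    return "<search>capital of France</search>"


def fake_search(query: str) -> List[str]:
    """Hypothetical stand-in for a web search client."""
    return [f"Stub result for: {query}"]


if __name__ == "__main__":
    print(run_tool_loop("What is the capital of France?", fake_generate, fake_search))
```

The `max_turns` cap keeps the loop from running indefinitely when the model keeps issuing searches without ever producing an answer.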