Seungyoun committed on
Commit 2b48277 · verified · 1 Parent(s): b3d0c6e

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -10,8 +10,8 @@ base_model:
  # Qwen2.5‑3B Search‑R1‑Multiturn (reproduce)
 
  > **Author · Seungyoun Shin**
- > 🤗 Model Hub: [https://huggingface.co/Seungyoun/qwen2.5-3b-it\_searchR1-like-multiturn](https://huggingface.co/Seungyoun/qwen2.5-3b-it_searchR1-like-multiturn)
- > 📈 W\&B Report: [https://wandb.ai/yoon1001/search\_r1\_like\_async\_rl/reports/Qwen2-5-3b-it-search-r1-reproduce--VmlldzoxMzA2NzA2NA](https://wandb.ai/yoon1001/search_r1_like_async_rl/reports/Qwen2-5-3b-it-search-r1-reproduce--VmlldzoxMzA2NzA2NA)
+ > 🤗 Model Hub: [hf](https://huggingface.co/Seungyoun/qwen2.5-3b-it_searchR1-like-multiturn)
+ > 📈 W\&B Report: [wandb](https://wandb.ai/yoon1001/search_r1_like_async_rl/reports/Qwen2-5-3b-it-search-r1-reproduce--VmlldzoxMzA2NzA2NA)
 
  A faithful re‑implementation of the *Search‑R1* retrieval‑augmented QA agent on **Qwen 2.5‑3B**, trained purely on the Wikipedia‑based Search‑R1 corpus with GRPO via the open‑source [VERL](https://github.com/volcengine/verl) framework.
  During inference we replace the original SGLang runtime with a compact DuckDuckGo‑powered tool loop implemented in a single script.