Update README.md
Browse files
README.md
CHANGED
@@ -10,8 +10,8 @@ base_model:
|
|
10 |
# Qwen2.5‑3B Search‑R1‑Multiturn (reproduce)
|
11 |
|
12 |
> **Author · Seungyoun Shin**
|
13 |
-
> 🤗 Model Hub: [
|
14 |
-
> 📈 W\&B Report: [
|
15 |
|
16 |
A faithful re‑implementation of the *Search‑R1* retrieval‑augmented QA agent on **Qwen 2.5‑3B**, trained purely on the Wikipedia‑based Search‑R1 corpus with GRPO via the open‑source [VERL](https://github.com/volcengine/verl) framework.
|
17 |
During inference we replace the original SGLang runtime with a compact DuckDuckGo‑powered tool loop implemented in a single script.
|
|
|
10 |
# Qwen2.5‑3B Search‑R1‑Multiturn (reproduce)
|
11 |
|
12 |
> **Author · Seungyoun Shin**
|
13 |
+
> 🤗 Model Hub: [hf](https://huggingface.co/Seungyoun/qwen2.5-3b-it_searchR1-like-multiturn)
|
14 |
+
> 📈 W\&B Report: [wandb](https://wandb.ai/yoon1001/search_r1_like_async_rl/reports/Qwen2-5-3b-it-search-r1-reproduce--VmlldzoxMzA2NzA2NA)
|
15 |
|
16 |
A faithful re‑implementation of the *Search‑R1* retrieval‑augmented QA agent on **Qwen 2.5‑3B**, trained purely on the Wikipedia‑based Search‑R1 corpus with GRPO via the open‑source [VERL](https://github.com/volcengine/verl) framework.
|
17 |
During inference we replace the original SGLang runtime with a compact DuckDuckGo‑powered tool loop implemented in a single script.
|