Seungyoun committed on
Commit
8352556
·
verified ·
1 Parent(s): 7fab937

Update README.md

  1. README.md +2 -3
README.md CHANGED
@@ -15,8 +15,7 @@ base_model:
  >
  > 📈 W\&B Report: [wandb](https://wandb.ai/yoon1001/search_r1_like_async_rl/reports/Qwen2-5-3b-it-search-r1-reproduce--VmlldzoxMzA2NzA2NA)

- A faithful re‑implementation of the *Search‑R1* retrieval‑augmented QA agent on **Qwen 2.5‑3B**, trained purely on the Wikipedia‑based Search‑R1 corpus with GRPO via the open‑source [VERL](https://github.com/volcengine/verl) framework.
- During inference we replace the original SGLang runtime with a compact DuckDuckGo‑powered tool loop implemented in a single script.
+ A faithful re‑implementation of the *Search‑R1* on **Qwen 2.5‑3B-instruct**, trained purely on `nq-hotpotqa-train` with GRPO via the open‑source [VERL](https://github.com/volcengine/verl) framework.

  ---

@@ -26,7 +25,7 @@ During inference we replace the original SGLang runtime with a compact DuckDuckG
  pip install "transformers>=4.41" torch duckduckgo_search>=6.3.5 accelerate

  # one‑command demo
- python search_r1_infer.py "Who is the current president of South Korea?"
+ python search_r1_infer.py "how's the weather in seoul?"
  ```

  ### Full inference script