rl-rag/qwen3-8b-base-combined-sft-training-data-v20250824_MiroSystemPrompt • Text Generation • 8B • Updated 6 days ago • 12
rl-rag/qwen3-8b-combined-sft-training-data-v20250824_MiroSystemPrompt • Text Generation • 8B • Updated 6 days ago • 13
rl-rag/qwen3-4b-it-combined-sft-training-data-v20250824_MiroSystemPrompt • Text Generation • 4B • Updated 6 days ago • 11
rl-rag/qwen2.5-7b-combined-sft-training-data-v20250824_MiroSystemPrompt • Text Generation • 8B • Updated 6 days ago • 12
rl-rag/rl_rag_sqa_searcharena_rubrics_web_augmented_rubrics_only_call_tool • Viewer • Updated 4 days ago • 2.94k • 82
rl-rag/rl_rag_sqa_searcharena_rubrics_web_augmented_rubrics_only_with_new_mcp_system_prompt • Viewer • Updated 7 days ago • 2.94k • 121
rl-rag/rl_rag_sqa_searcharena_rubrics_web_augmented_longform_averaged_outcome_with_system_prompt • Viewer • Updated 7 days ago • 2.94k • 94
rl-rag/rl_rag_sqa_searcharena_rubrics_web_augmented_outcome_with_new_mcp_system_prompt • Viewer • Updated 7 days ago • 2.94k • 55