Jiang Jiwen

jjw0126

AI & ML interests

RL, LLM

Recent Activity

upvoted an article 5 days ago

Open-R1: a fully open reproduction of DeepSeek-R1

upvoted an article 5 days ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

liked a dataset 7 days ago

Congliu/Chinese-DeepSeek-R1-Distill-data-110k

jjw0126's activity

upvoted 2 articles 5 days ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

27 days ago

• 771

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

17 days ago

• 42

upvoted 2 collections 17 days ago

🧠 Reasoning datasets

Datasets with reasoning traces for math and code released by the community • 12 items • Updated 4 days ago • 79

Reasoning Datasets

Distilled synthetic Reasoning datasets • 7 items • Updated 22 days ago • 55

upvoted a collection 25 days ago

DeepSeek R1 (All Versions)

DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 16 days ago • 195

upvoted 8 collections about 1 month ago

Thinking/Reasoning Datasets

16 items • Updated 25 days ago • 2

gemini-2.0-flash-thinking-exp-1219 Datasets

Existing datasets with responses regenerated using gemini-2.0-flash-thinking-exp-1219. Currently only single-turn. • 15 items • Updated Jan 16 • 4

gemini-exp-1206 Datasets

Existing datasets with responses regenerated using gemini-exp-1206. Currently only single-turn. • 3 items • Updated Jan 16 • 1

story writing favourites

Models I personally liked for generating stories in the past. Not a recommendation, many of these are outdated. • 19 items • Updated 11 days ago • 42

long-cot-dataset

16 items • Updated Dec 22, 2024 • 7

LLMs - Best of 2025

Most interesting LLMs to play around with in 2025! (will be updated throughout the year) • 19 items • Updated 9 days ago • 2

Reasoning Models

If this really helps, please upvote for the researchers' hard work • 14 items • Updated Jan 21 • 1

CoT Datasets

If this really helps, please upvote for the researchers' hard work • 15 items • Updated Jan 20 • 1

upvoted 2 collections 3 months ago

DCLM

DCLM Models + Datasets • 6 items • Updated Oct 4, 2024 • 25

OLMo 2

Artifacts for the second set of OLMo models. • 22 items • Updated 13 days ago • 83

upvoted a collection 6 months ago

small language models

under 7b 🐁 • 62 items • Updated 28 days ago • 28

upvoted a collection 8 months ago

Model Merging

Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 231