Jerry Huang
PRO
jerry128
1 follower · 5 following
AI & ML interests
None yet
Recent Activity
Upvoted a paper 1 day ago: Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training
Updated a dataset 2 months ago: jerry128/rag-rl-sft-linear
Updated a dataset 2 months ago: jerry128/rag-rl-sft-min-max
Organizations
jerry128's models (9)
jerry128/ToolACE-axolotl • Updated Jun 5
jerry128/Qwen2.5-7B-Instruct-Setwise-Reranker • Text Generation • 8B • Updated Mar 30 • 8
jerry128/Qwen2.5-7B-Instruct-MUSIQUE-GRPO-CL-Sorted-by_Hops • Text Generation • 8B • Updated Mar 7 • 5
jerry128/Qwen2.5-7B-Instruct-MUSIQUE-GRPO-STEP-CL • Text Generation • 8B • Updated Mar 5
jerry128/Qwen2.5-7B-Instruct-MUSIQUE-GRPO-CL-Shuffled • Text Generation • 8B • Updated Mar 5 • 7
jerry128/Qwen2.5-7B-Instruct-MUSIQUE-GRPO-Baseline • Text Generation • 8B • Updated Mar 5 • 7
jerry128/Qwen2.5-7B-Instruct-MUSIQUE-GRPO-CL • Text Generation • 8B • Updated Mar 4 • 7
jerry128/Qwen2.5-7B-Instruct-HOTPOTQA-GRPO-STEP-CL • Text Generation • 8B • Updated Mar 3 • 7
jerry128/Qwen2.5-7B-Instruct-HOTPOTQA-GRPO-CL • Text Generation • 8B • Updated Mar 3 • 7