hdong0/deepseek-Qwen-1.5B-batch-mix-GRPO_deepscaler_acc_seq_end_mask_thin_mu_8 Text Generation • 2B • Updated 8 days ago • 42
hdong0/deepseek-Qwen-1.5B-batch-mix-GRPO_deepscaler_seq_end_mask_thin_mu_8_warmed Text Generation • 2B • Updated 6 days ago • 40
hdong0/deepseek-Qwen-1.5B-batch-mix-GRPO_deepscaler_seq_end_mask_scale_thin_mu_8_warmed Text Generation • 2B • Updated 6 days ago • 35
hdong0/deepseek-Qwen-1.5B-batch-mix-GRPO_deepscaler_acc_seq_end_mask_thin_mu_8_abf Text Generation • 2B • Updated 3 days ago • 31
hdong0/deepseek-Qwen-1.5B-batch-mix-GRPO_deepscaler_acc_seq_end_mask_thin_mu_8_warmed_abf Text Generation • 2B • Updated 3 days ago • 33