ajagota71/gemma-3-270m-detox-checkpoint-epoch-20 • Reinforcement Learning • 0.3B • Updated 6 days ago • 6
ajagota71/Qwen2.5-0.5B-detox-checkpoint-epoch-40 • Reinforcement Learning • 0.5B • Updated 6 days ago • 5
ajagota71/Qwen2.5-0.5B-detox-checkpoint-epoch-60 • Reinforcement Learning • 0.5B • Updated 6 days ago • 5
ajagota71/gemma-3-270m-detox-checkpoint-epoch-40 • Reinforcement Learning • 0.3B • Updated 6 days ago • 5
ajagota71/Qwen2.5-0.5B-detox-checkpoint-epoch-80 • Reinforcement Learning • 0.5B • Updated 6 days ago • 4
ajagota71/gemma-3-270m-detox-checkpoint-epoch-60 • Reinforcement Learning • 0.3B • Updated 6 days ago • 5
ajagota71/Qwen2.5-0.5B-detox-checkpoint-epoch-100 • Reinforcement Learning • 0.5B • Updated 6 days ago • 4
ajagota71/gemma-3-270m-detox-checkpoint-epoch-80 • Reinforcement Learning • 0.3B • Updated 6 days ago • 5
ajagota71/gemma-3-270m-detox-checkpoint-epoch-100 • Reinforcement Learning • 0.3B • Updated 6 days ago • 5
MattBou00/smolLM-360m-detox_try_3_stable_retry-ckpt-ep20-2025-08-18_18-34-45 • Reinforcement Learning • 0.4B • Updated 3 days ago • 3
MattBou00/smolLM-360m-detox_try_3_stable_retry-ckpt-ep40-2025-08-18_18-34-45 • Reinforcement Learning • 0.4B • Updated 3 days ago • 3
MattBou00/smolLM-360m-detox_try_3_stable_retry • Reinforcement Learning • 0.4B • Updated 3 days ago • 4
MattBou00/smolLM-360m-detox_try_4_closekl-ckpt-ep20-2025-08-18_18-50-03 • Reinforcement Learning • 0.4B • Updated 3 days ago • 3
MattBou00/smolLM-360m-detox_try_4_closekl-ckpt-ep40-2025-08-18_18-50-03 • Reinforcement Learning • 0.4B • Updated 3 days ago • 3
MattBou00/llama-3-2-1b-detox_v1-checkpoint-epoch-20 • Reinforcement Learning • 1B • Updated 2 days ago • 2
MattBou00/llama-3-2-1b-detox_v1b-checkpoint-epoch-20 • Reinforcement Learning • 1B • Updated 2 days ago • 1
MattBou00/llama-3-2-1b-detox_v1b-checkpoint-epoch-40 • Reinforcement Learning • 1B • Updated 1 day ago • 1
MattBou00/llama-3-2-1b-detox_v1b-checkpoint-epoch-60 • Reinforcement Learning • 1B • Updated 1 day ago • 1
MattBou00/llama-3-2-1b-detox_v1c-checkpoint-epoch-20 • Reinforcement Learning • 1B • Updated 1 day ago • 3
MattBou00/llama-3-2-1b-detox_v1c-checkpoint-epoch-40 • Reinforcement Learning • 1B • Updated 1 day ago • 1
MattBou00/llama-3-2-1b-detox_v1c-checkpoint-epoch-60 • Reinforcement Learning • 1B • Updated 1 day ago • 1