HINT-lab's Collections
Reward-Calibration
updated
HINT-lab/llama3-8b-final-ppo-c-v0.3 • Text Generation • Updated • 12
HINT-lab/mistral-7b-hermes-crm-skywork
HINT-lab/mistral-7b-hermes-cdpo-v0.2 • Text Generation • Updated • 19
HINT-lab/mistral-7b-ppo-clean-hermes • Text Generation • Updated • 8
HINT-lab/mistral-7b-ppo-hermes-v0.3 • Text Generation • Updated • 10 • 1
HINT-lab/mistral-7b-ppo-m-hermes • Text Generation • Updated • 8
HINT-lab/llama3-8b-cdpo-v0.2 • Text Generation • Updated • 6
HINT-lab/llama3-8b-final-ppo-v0.3 • Text Generation • Updated • 3
HINT-lab/mistral-7b-hermes-rm-skywork • Updated • 17
HINT-lab/llama3-8b-final-ppo-m-v0.3 • Text Generation • Updated • 23
HINT-lab/llama3-8b-crm-final-v0.1
HINT-lab/llama3-8b-final-ppo-clean-v0.1 • Text Generation • Updated • 3
HINT-lab/mistral-7b-hermes-dpo-v0.2 • Text Generation • Updated • 4
HINT-lab/mistral-7b-ppo-c-hermes • Text Generation • Updated • 11
HINT-lab/llama3-8b-dpo-v0.2 • Text Generation • Updated • 8