RTO-RL/Llama3-8B-RDPO
Reinforced Token Optimization
Tags: Safetensors, llama
Dataset: HuggingFaceH4/ultrafeedback_binarized
Branch: main
1 contributor · History: 3 commits
Latest commit: zkshan2002, "Create README.md" (bb678dc, verified), 13 days ago
| File | Status | Size | LFS | Last commit | Age |
|---|---|---|---|---|---|
| .gitattributes | Safe | 1.57 kB | | initial commit | 17 days ago |
| README.md | Safe | 337 Bytes | | Create README.md | 13 days ago |
| config.json | Safe | 744 Bytes | | initial commit | 17 days ago |
| generation_config.json | Safe | 164 Bytes | | initial commit | 17 days ago |
| model-00001-of-00004.safetensors | Safe | 4.98 GB | LFS | initial commit | 17 days ago |
| model-00002-of-00004.safetensors | Safe | 5 GB | LFS | initial commit | 17 days ago |
| model-00003-of-00004.safetensors | Safe | 4.92 GB | LFS | initial commit | 17 days ago |
| model-00004-of-00004.safetensors | Safe | 1.17 GB | LFS | initial commit | 17 days ago |
| model.safetensors.index.json | Safe | 24 kB | | initial commit | 17 days ago |
| special_tokens_map.json | Safe | 444 Bytes | | initial commit | 17 days ago |
| tokenizer.json | Safe | 17.2 MB | LFS | initial commit | 17 days ago |
| tokenizer_config.json | Safe | 51 kB | | initial commit | 17 days ago |