Ariel Kwiatkowski
RedTachyon
AI & ML interests
RL, MARL, Crowd Simulation
Recent Activity
upvoted a paper 16 days ago
PILAF: Optimal Human Preference Sampling for Reward Modeling
authored a paper 16 days ago
PILAF: Optimal Human Preference Sampling for Reward Modeling
liked a dataset about 2 months ago
AI-MO/NuminaMath-CoT