Peng

pennlio

pennlio111

AI & ML interests

None yet

Recent Activity

upvoted a paper 7 days ago

Think in Games: Learning to Reason in Games via Reinforcement Learning with Large Language Models

upvoted a paper 3 months ago

LLark: A Multimodal Foundation Model for Music

upvoted a paper 3 months ago

TALKPLAY: Multimodal Music Recommendation with Large Language Models


Organizations

upvoted a paper 7 days ago

Think in Games: Learning to Reason in Games via Reinforcement Learning with Large Language Models

Paper • 2508.21365 • Published 10 days ago • 23

upvoted 2 papers 3 months ago

LLark: A Multimodal Foundation Model for Music

Paper • 2310.07160 • Published Oct 11, 2023 • 2

TALKPLAY: Multimodal Music Recommendation with Large Language Models

Paper • 2502.13713 • Published Feb 19 • 3

liked 2 models over 1 year ago

xai-org/grok-1

Text Generation • Updated Mar 28, 2024 • 457 • 2.36k

gradientai/Llama-3-8B-Instruct-Gradient-1048k

Text Generation • 8B • Updated Oct 29, 2024 • 54.1k • 679

liked a dataset over 1 year ago

m-a-p/COIG-CQIA

Viewer • Updated Apr 18, 2024 • 44.7k • 5.98k • 648

upvoted an article over 1 year ago

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Article • and 3 others • Dec 9, 2022 • 336

liked 2 models over 1 year ago

meta-llama/Meta-Llama-3-8B

Text Generation • 8B • Updated Sep 27, 2024 • 1.79M • 6.3k

unsloth/llama-3-8b-bnb-4bit

Text Generation • 5B • Updated Jan 7 • 40.6k • 199

liked 2 models almost 2 years ago

stabilityai/stable-diffusion-x4-upscaler

Updated Jul 5, 2023 • 20.9k • 710

stabilityai/stable-diffusion-xl-base-1.0

Text-to-Image • Updated Oct 30, 2023 • 2.14M • 6.92k

liked a model over 2 years ago

Vision-CAIR/MiniGPT-4

Updated Apr 19, 2023 • 425

liked a dataset over 2 years ago

fka/awesome-chatgpt-prompts

Viewer • Updated Jan 6 • 203 • 42.8k • 9k

updated a model over 2 years ago

pennlio/test

Updated May 22, 2023

