Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published 9 days ago • 49
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution Paper • 2409.12191 • Published Sep 18, 2024 • 76
Moshi v0.1 Release Collection: MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 227
Article: Train Custom Models on Hugging Face Spaces with AutoTrain SpaceRunner By abhishek • May 9, 2024 • 16
UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling Paper • 2408.04810 • Published Aug 9, 2024 • 24
VITA: Towards Open-Source Interactive Omni Multimodal LLM Paper • 2408.05211 • Published Aug 9, 2024 • 47