Haoning Wu, Teo PRO

teowu

https://teowu.github.io

AI & ML interests

Lead of Q-Future: https://github.com/Q-Future. I love MLLMs/LMMs/LVLMs (whatever name you call them). Core contributor to two great MoE VLMs: Kimi-VL and Aria. Living and cooking in Singapore now.

Recent Activity

new activity 20 days ago

moonshotai/Kimi-VL-A3B-Thinking-2506: Updates Transformers Inference code in README.md

upvoted a paper 30 days ago

Generative Frame Sampler for Long Video Understanding

reacted to fdaudens's post with 👍 about 1 month ago

You might not have heard of Moonshot AI, but within 24 hours, their new model Kimi K2 shot to the top of Hugging Face’s trending leaderboard. So… who are they, and why does it matter?

Had a lot of fun co-writing this blog post with @xianbao, with key insights translated from Chinese, to unpack how this startup built a model that outperforms GPT-4.1, Claude Opus, and DeepSeek V3 on several major benchmarks.

🧵 A few standout facts:

1. From zero to $3.3B in 18 months: Founded in March 2023, Moonshot is now backed by Alibaba, Tencent, Meituan, and HongShan.
2. A CEO who thinks from the end: Yang Zhilin (31) previously worked at Meta AI, Google Brain, and Carnegie Mellon. His vision? Nothing less than AGI, still a rare ambition among Chinese AI labs.
3. A trillion-parameter model that’s surprisingly efficient: Kimi K2 uses a mixture-of-experts architecture (32B active params per inference) and dominates on coding/math benchmarks.
4. The secret weapon, the Muon optimizer: A new training method that doubles efficiency, cuts memory in half, and ran 15.5T tokens with zero failures. Big implications.

Most importantly, their move from closed to open source signals a broader shift in China’s AI scene, following Baidu’s pivot. But as Yang puts it: “Users are the only real leaderboard.”

👇 Check out the full post to explore what Kimi K2 can do, how to try it, and why it matters for the future of open-source LLMs: https://huggingface.co/blog/fdaudens/moonshot-ai-kimi-k2-explained

upvoted a paper 30 days ago

Generative Frame Sampler for Long Video Understanding

Paper • 2503.09146 • Published Mar 12 • 1

upvoted a collection 2 months ago

Kimi-VL-A3B

Collection

Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 7 items • Updated Jul 1 • 74

upvoted an article 2 months ago

Article

🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation

and 1 other • Jun 21 • 66

upvoted 2 papers 2 months ago

ProBench: Judging Multimodal Foundation Models on Open-ended Multi-domain Expert Tasks

Paper • 2503.06885 • Published Mar 10 • 4

MiMo-VL Technical Report

Paper • 2506.03569 • Published Jun 4 • 79

upvoted a paper 3 months ago

VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?

Paper • 2505.23359 • Published May 29 • 40

upvoted a collection 4 months ago

Kimi-VL Thinking

Collection

3 items • Updated Apr 17 • 1

upvoted 3 papers 4 months ago

VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning

Paper • 2504.08837 • Published Apr 10 • 43

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published Apr 14 • 280

Kimi-VL Technical Report

Paper • 2504.07491 • Published Apr 10 • 134

upvoted 4 papers 6 months ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20 • 146

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 200

VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation

Paper • 2411.13281 • Published Nov 20, 2024 • 22

Redundancy Principles for MLLMs Benchmarks

Paper • 2501.13953 • Published Jan 20 • 30

upvoted 2 papers 7 months ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 416

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22 • 123

upvoted a paper 8 months ago

MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Paper • 2412.05237 • Published Dec 6, 2024 • 48

upvoted 3 papers 9 months ago

VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation

Paper • 2412.00927 • Published Dec 1, 2024 • 29

Data Engineering for Scaling Language Models to 128K Context

Paper • 2402.10171 • Published Feb 15, 2024 • 26

AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark

Paper • 2410.03051 • Published Oct 4, 2024 • 6
