Zili Wang

MarkWang

MarkXCloud

AI & ML interests

Multi-modality learning and inference acceleration

Recent Activity

upvoted a paper 6 days ago

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Paper • 2508.21113 • Published 10 days ago • 103

upvoted a paper 23 days ago

NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale

Paper • 2508.10711 • Published 24 days ago • 141

liked a model 27 days ago

YannQi/R-4B

Image-Text-to-Text • 5B • Updated 3 days ago • 41.3k • 135

upvoted a paper 3 months ago

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

Paper • 2505.22618 • Published May 28 • 42

authored a paper 3 months ago

Faster and Better LLMs via Latency-Aware Test-Time Scaling

Paper • 2505.19634 • Published May 26

upvoted an article 3 months ago

Vision Language Models (Better, Faster, Stronger)

Article • and 4 others • May 12 • 522

authored a paper 10 months ago

Continuous Speculative Decoding for Autoregressive Image Generation

Paper • 2411.11925 • Published Nov 18, 2024 • 16

upvoted a paper 10 months ago

Continuous Speculative Decoding for Autoregressive Image Generation

Paper • 2411.11925 • Published Nov 18, 2024 • 16

commented on a paper 10 months ago

Continuous Speculative Decoding for Autoregressive Image Generation

Paper • 2411.11925 • Published Nov 18, 2024 • 16

authored a paper 12 months ago

Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis

Paper • 2409.06135 • Published Sep 10, 2024 • 16

authored 2 papers about 1 year ago

Layerwise Recurrent Router for Mixture-of-Experts

Paper • 2408.06793 • Published Aug 13, 2024 • 33

AVESFormer: Efficient Transformer Design for Real-Time Audio-Visual Segmentation

Paper • 2408.01708 • Published Aug 3, 2024 • 4

commented on a paper about 1 year ago

AVESFormer: Efficient Transformer Design for Real-Time Audio-Visual Segmentation

Paper • 2408.01708 • Published Aug 3, 2024 • 4

authored a paper about 1 year ago

A Closer Look into Mixture-of-Experts in Large Language Models

Paper • 2406.18219 • Published Jun 26, 2024 • 16

authored 2 papers over 1 year ago

Beyond Language Models: Byte Models are Digital World Simulators

Paper • 2402.19155 • Published Feb 29, 2024 • 54

ChatMusician: Understanding and Generating Music Intrinsically with LLM

Paper • 2402.16153 • Published Feb 25, 2024 • 61

liked 3 Spaces about 2 years ago

Detection Metrics

📈

168

Open Object Detection Leaderboard

🏆

Request model evaluation on COCO val 2017 dataset

236

FastSAM

🐠

Segment images using texts, points, or everything mode

upvoted a paper about 2 years ago

Fast Segment Anything

Paper • 2306.12156 • Published Jun 21, 2023 • 34
