weixuchen

KageXu

AI & ML interests

computer vision

Organizations

None yet

KageXu's activity

upvoted 2 papers 7 days ago

Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation

Paper • 2502.08690 • Published 11 days ago • 39

The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

Paper • 2502.08946 • Published 11 days ago • 181

liked 6 models 8 days ago

AIDC-AI/Ovis2-34B

Image-Text-to-Text • Updated 5 days ago • 1.16k • 110

meta-llama/Llama-3.3-70B-Instruct

Text Generation • Updated Dec 21, 2024 • 453k • 2.02k

deepseek-ai/DeepSeek-R1

Text Generation • Updated 15 days ago • 4.43M • 10k

tomg-group-umd/huginn-0125

Text Generation • Updated about 8 hours ago • 9.36k • 225

microsoft/OmniParser-v2.0

Image-Text-to-Text • Updated 6 days ago • 4.54k • 894

deepseek-ai/Janus-Pro-7B

Any-to-Any • Updated 23 days ago • 476k • 3.1k

upvoted 12 papers 8 days ago

TVBench: Redesigning Video-Language Evaluation

Paper • 2410.07752 • Published Oct 10, 2024 • 6

ReLU's Revival: On the Entropic Overload in Normalization-Free Large Language Models

Paper • 2410.09637 • Published Oct 12, 2024 • 4

Tree of Problems: Improving structured problem solving with compositionality

Paper • 2410.06634 • Published Oct 9, 2024 • 9

Rethinking Data Selection at Scale: Random Selection is Almost All You Need

Paper • 2410.09335 • Published Oct 12, 2024 • 17

LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory

Paper • 2410.10813 • Published Oct 14, 2024 • 10

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Paper • 2410.10819 • Published Oct 14, 2024 • 7

MMCOMPOSITION: Revisiting the Compositionality of Pre-trained Vision-Language Models

Paper • 2410.09733 • Published Oct 13, 2024 • 9

Generalizable Humanoid Manipulation with Improved 3D Diffusion Policies

Paper • 2410.10803 • Published Oct 14, 2024 • 7

Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention

Paper • 2410.10774 • Published Oct 14, 2024 • 26

Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations

Paper • 2410.10792 • Published Oct 14, 2024 • 30

LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content

Paper • 2410.10783 • Published Oct 14, 2024 • 27

Thinking LLMs: General Instruction Following with Thought Generation

Paper • 2410.10630 • Published Oct 14, 2024 • 19