ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection • Paper 2601.09195 • Published 6 days ago
Revisiting Model Interpolation for Efficient Reasoning • Paper 2510.10977 • Published Oct 13, 2025
Timber: Training-free Instruct Model Refining with Base via Effective Rank • Paper 2509.23595 • Published Sep 28, 2025
LiT: Delving into a Simplified Linear Diffusion Transformer for Image Generation • Paper 2501.12976 • Published Jan 22, 2025
PhyX: Does Your Model Have the "Wits" for Physical Reasoning? • Paper 2505.15929 • Published May 21, 2025
LLM-Neo: Parameter Efficient Knowledge Distillation for Large Language Models • Paper 2411.06839 • Published Nov 11, 2024
TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities • Paper 2212.06385 • Published Dec 13, 2022
RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer • Paper 2304.05659 • Published Apr 12, 2023
Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast • Paper 2405.14507 • Published May 23, 2024
Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models • Paper 2404.02657 • Published Apr 3, 2024
Weight-Inherited Distillation for Task-Agnostic BERT Compression • Paper 2305.09098 • Published May 16, 2023