MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression • arXiv:2406.14909 • Published Jun 21, 2024
MPCFormer: fast, performant and private Transformer inference with MPC • arXiv:2211.01452 • Published Nov 2, 2022
Maestro: Uncovering Low-Rank Structures via Trainable Decomposition • arXiv:2308.14929 • Published Aug 28, 2023
Cuttlefish: Low-Rank Model Training without All the Tuning • arXiv:2305.02538 • Published May 4, 2023
Redco: A Lightweight Tool to Automate Distributed Training of LLMs on Any GPU/TPUs • arXiv:2310.16355 • Published Oct 25, 2023