Phantom: Subject-consistent video generation via cross-modal alignment Paper • 2502.11079 • Published 7 days ago • 49
Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking Paper • 2501.00244 • Published Dec 31, 2024 • 1
Training language models to follow instructions with human feedback Paper • 2203.02155 • Published Mar 4, 2022 • 17
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 19 days ago • 190
ReLearn: Unlearning via Learning for Large Language Models Paper • 2502.11190 • Published 7 days ago • 28
SWE-bench: Can Language Models Resolve Real-World GitHub Issues? Paper • 2310.06770 • Published Oct 10, 2023 • 5
ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models Paper • 2502.09696 • Published 10 days ago • 38
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks Paper • 2502.08235 • Published 11 days ago • 53
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models Paper • 2502.06608 • Published 13 days ago • 32
ToMoE: Converting Dense Large Language Models to Mixture-of-Experts through Dynamic Structural Pruning Paper • 2501.15316 • Published 29 days ago • 1
DarwinLM: Evolutionary Structured Pruning of Large Language Models Paper • 2502.07780 • Published 12 days ago • 17
Shortened LLaMA: A Simple Depth Pruning for Large Language Models Paper • 2402.02834 • Published Feb 5, 2024 • 16
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect Paper • 2403.03853 • Published Mar 6, 2024 • 63
The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models Paper • 2203.07259 • Published Mar 14, 2022 • 4