- MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts · 5 authors
- Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM · 5 authors
- Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon · 6 authors
- GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation · 8 authors
- DiarizationLM: Speaker Diarization Post-Processing with Large Language Models · 6 authors
- AST-T5: Structure-Aware Pretraining for Code Generation and Understanding · 3 authors
- CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution · 6 authors
- Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach · 11 authors