Submitted by akhaliq 36 NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models · 19 authors 3
Submitted by akhaliq 18 Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters · 7 authors 1
Submitted by akhaliq 17 MathScale: Scaling Instruction Tuning for Mathematical Reasoning · 4 authors 2
Submitted by akhaliq 14 MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets · 10 authors 1
Submitted by akhaliq 13 EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs · 6 authors 3
Submitted by akhaliq 11 Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use · 13 authors 1
Submitted by akhaliq 11 Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models · 6 authors 1
Submitted by akhaliq 9 RT-Sketch: Goal-Conditioned Imitation Learning from Hand-Drawn Sketches · 13 authors 1