- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 63
- Learning To Teach Large Language Models Logical Reasoning
  Paper • 2310.09158 • Published • 1
- ChipNeMo: Domain-Adapted LLMs for Chip Design
  Paper • 2311.00176 • Published • 9
- WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
  Paper • 2308.09583 • Published • 7
Collections including paper arxiv:2410.02724
- DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
  Paper • 2309.14509 • Published • 18
- LLM Augmented LLMs: Expanding Capabilities through Composition
  Paper • 2401.02412 • Published • 37
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 49
- Tuning Language Models by Proxy
  Paper • 2401.08565 • Published • 23