Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 5 days ago • 87
Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • Jan 23 • 63
Article Yay! Organizations can now publish blog Articles By huggingface and 3 others • Jan 20 • 34
Jan 17 Releases Collection Models and datasets of the second week of Jan 2025. • 23 items • Updated Jan 17 • 11
Article Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference Jan 16 • 68
Article Announcing NVIDIA Cosmos World Foundation Models By mingyuliutw and 1 other • Jan 7 • 24
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 227
Article Memory-efficient Diffusion Transformers with Quanto and Diffusers Jul 30, 2024 • 63
Writing in the Margins: Better Inference Pattern for Long Context Retrieval Paper • 2408.14906 • Published Aug 27, 2024 • 141