InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU • Paper • 2502.08910
Mixture Of Diffusers SDXL Tiling • Mixture of Diffusers implementation for XL Stable Diffusion
QR Code AI Art Generator • Blend QR codes with AI Art
New Research Alert: Making Language Models Smaller & Smarter!
Thrilled to share the latest technical report demonstrating how to reduce language model parameters by 77% while maintaining performance. The secret? Grouped pointwise convolutions. Yes, we brought a method from computer vision to the transformer arena.

Key Findings:
• 77% parameter reduction.
• Maintained model capabilities.
• Improved generalization.

Paper: https://www.researchgate.net/publication/388835829_SAVING_77_OF_THE_PARAMETERS_IN_LARGE_LANGUAGE_MODELS_TECHNICAL_REPORT
Code: https://github.com/joaopauloschuler/less-parameters-llm
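The linked report and repository contain the authors' actual architecture; the sketch below is only a minimal PyTorch illustration of the general idea, assuming a grouped 1x1 convolution standing in for a dense linear projection. The module name, dimensions, and group count are hypothetical, and the parameter saving shown is for this toy configuration, not the 77% figure from the report.

```python
import torch
import torch.nn as nn

class GroupedPointwiseLinear(nn.Module):
    """Stand-in for nn.Linear built from a grouped 1x1 convolution.

    A dense Linear(d_in, d_out) stores d_in * d_out weights; splitting
    the channels into `groups` independent blocks cuts that to
    d_in * d_out / groups (plus the bias).
    """
    def __init__(self, d_in: int, d_out: int, groups: int):
        super().__init__()
        assert d_in % groups == 0 and d_out % groups == 0
        self.conv = nn.Conv1d(d_in, d_out, kernel_size=1, groups=groups)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_in); Conv1d expects (batch, channels, seq_len)
        return self.conv(x.transpose(1, 2)).transpose(1, 2)

# Hypothetical sizes for comparison only -- not the report's configuration.
dense = nn.Linear(1024, 1024)
grouped = GroupedPointwiseLinear(1024, 1024, groups=4)
n_dense = sum(p.numel() for p in dense.parameters())      # 1,049,600
n_grouped = sum(p.numel() for p in grouped.parameters())  # 263,168
print(f"saved {1 - n_grouped / n_dense:.0%} of the parameters")  # ~75%
```

With 4 groups this toy layer already saves roughly 75% of the projection's parameters, in the same ballpark as the reported figure. In practice, grouped stages are usually paired with an interleaving or channel-shuffle step so information can still flow across groups; the report describes its own grouping scheme.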
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model • Paper • 2502.02737