🔥 Level up your model training w/ GaLore + Transformers for SOTA results on consumer-grade hardware!
⬇️ 82.5% less optimizer-state memory footprint, with no performance degradation, by projecting the weight gradients into a low-rank subspace (quick sketch below).
👩🏿💻 Install via `pip install transformers>=4.39.0 galore-torch`. #ProudlyGpuPoor
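For intuition, here's a back-of-the-envelope sketch of where the savings come from (my own illustration with made-up shapes, not numbers from the post — the exact figure depends on the rank and on which layers are projected):

```python
import torch

# Illustrative shapes only: a 4096x4096 weight matrix, GaLore rank 128.
m, n, r = 4096, 4096, 128
G = torch.randn(m, n)  # full gradient of a weight matrix W with shape (m, n)

# GaLore periodically recomputes a projection basis from the gradient's
# top-r left singular vectors, runs the optimizer in that subspace, and
# projects the update back to full shape.
P = torch.linalg.svd(G, full_matrices=False).U[:, :r]  # (m, r)
G_low = P.T @ G                                        # projected gradient, (r, n)

# Adam keeps two moment tensors per parameter; GaLore keeps them at (r, n)
# instead of (m, n), plus the projection matrix itself.
full_state = 2 * G.numel()                    # 2·m·n values
galore_state = 2 * G_low.numel() + P.numel()  # 2·r·n + m·r values
print(f"optimizer-state reduction: {1 - galore_state / full_state:.1%}")  # ~95% here
```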
Integrating GaLore into large language model (LLM) training is a significant step for memory-efficient deep learning and for democratizing AI research. By enabling billion-parameter models to be trained on consumer-grade hardware, shrinking the optimizer-state memory footprint, and leveraging gradient projection techniques, GaLore opens new horizons for researchers and practitioners with limited access to high-end compute.
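To make that concrete, here's a minimal training sketch. The model, dataset, and hyperparameters are placeholders I picked for illustration (OPT-125m, a 1% slice of IMDB), not a recommended recipe — see the blog for the canonical example:

```python
import datasets
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# Placeholder model and dataset — substitute your own.
model_id = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

train_dataset = datasets.load_dataset("imdb", split="train[:1%]")
train_dataset = train_dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True,
    remove_columns=train_dataset.column_names,
)

args = TrainingArguments(
    output_dir="./galore-test",
    max_steps=100,
    per_device_train_batch_size=2,
    # These two flags switch on GaLore (transformers >= 4.39):
    optim="galore_adamw",
    # Name patterns selecting the linear layers whose gradients are
    # projected to low rank; "attn"/"fc" fit OPT's module names —
    # adjust them for your architecture.
    optim_target_modules=["attn", "fc"],
    # Optionally tune GaLore's hyperparameters, e.g.:
    # optim_args="rank=128, update_proj_gap=200, scale=0.8",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The nice design point: GaLore is enabled entirely through `TrainingArguments` — `optim` picks a GaLore optimizer (`galore_adamw`, `galore_adamw_8bit`, ...) and `optim_target_modules` picks which layers get the low-rank treatment.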
🔬 Find out more about GaLore and dig into all the juicy technical details: https://huggingface.co/blog/galore
🤗 Huge thanks to everyone involved ❤️:
• authors: @jiaweizhao @Kyriection @beidic Zhangyang Wang @animakumar @tydsh
• community contributors: @hiyouga @mdouglas and others!
• @ybelkada for taking swift action composing and coordinating the necessary PRs to get this live at ⚡ speed!
🏗️📈 Super rewarding to see @timdettmers' work on optimizers being built upon to reach even greater heights!
🚧 There's also ongoing work to integrate GaLore into bitsandbytes and push memory efficiency even further 💪. We'll keep you posted!