The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities Paper • 2501.13921 • Published Jan 23 • 3
sentence-transformers/static-similarity-mrl-multilingual-v1 Sentence Similarity • Updated Jan 17 • 51
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens Paper • 2411.17691 • Published Nov 26, 2024 • 13 • 5
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 138
Balancing Continuous Pre-Training and Instruction Fine-Tuning: Optimizing Instruction-Following in LLMs Paper • 2410.10739 • Published Oct 14, 2024 • 2 • 1
Instruction Following without Instruction Tuning Paper • 2409.14254 • Published Sep 21, 2024 • 29 • 4
A Comprehensive Evaluation of Quantized Instruction-Tuned Large Language Models: An Experimental Analysis up to 405B Paper • 2409.11055 • Published Sep 17, 2024 • 17 • 3