Jun Young Baek

jupiterbjy

AI & ML interests

None yet

Recent Activity

liked a model about 1 month ago

bartowski/DeepSeek-R1-Distill-Qwen-14B-GGUF

liked a model about 2 months ago

PowerInfer/SmallThinker-3B-Preview

replied to bartowski's post 2 months ago

Looks like Q4_0_N_M file types are going away.

Before you panic, there's a new "preferred" method, which is online (I prefer the term on-the-fly) repacking: if you download Q4_0 and your setup can benefit from repacking the weights into interleaved rows (what Q4_0_4_4 was doing), it will do that automatically and give you similar performance (minor losses, I think, due to using intrinsics instead of assembly, but intrinsics are more maintainable).

You can see the reference PR here: https://github.com/ggerganov/llama.cpp/pull/10446

So if you update your llama.cpp past that point, you won't be able to run Q4_0_4_4 (unless they add backwards compatibility back), but Q4_0 should run at the same speeds (though it may currently be bugged on some platforms).

As such, I'll stop making those newer model formats soon, probably by the end of this week unless something changes, but you should be safe to download Q4_0 quants and use those!

Also, IQ4_NL supports repacking, though not in as many shapes yet, but it should get a respectable speedup on ARM chips. The PR for that can be found here: https://github.com/ggerganov/llama.cpp/pull/10541

Remember, these are not meant for Apple silicon, since those use the GPU and don't benefit from the repacking of weights.

Organizations

None yet

jupiterbjy's activity

liked a model about 1 month ago

bartowski/DeepSeek-R1-Distill-Qwen-14B-GGUF

Text Generation • Updated Jan 22 • 1.82M • 159

liked a model about 2 months ago

PowerInfer/SmallThinker-3B-Preview

Text Generation • Updated Jan 16 • 109k • 387

liked 6 models 2 months ago

liked a Space 2 months ago

189

GPU Poor LLM Arena

🏆

Compact LLM Battle Arena: Frugal AI Face-Off!

liked 2 models 2 months ago

meta-llama/Llama-Guard-3-8B

Text Generation • Updated Oct 11, 2024 • 351k • 169

NX-AI/xLSTM-7b

Updated 10 days ago • 848 • 77

liked 9 models 3 months ago

WhiteRabbitNeo/WhiteRabbitNeo-2.5-Qwen-2.5-Coder-7B

Text Generation • Updated Oct 9, 2024 • 725 • 49

bartowski/QwQ-32B-Preview-GGUF

Text Generation • Updated Nov 27, 2024 • 6k • 99

kromeurus/L3.1-Aglow-Vulca-v0.1-8B

Text Generation • Updated Sep 28, 2024 • 33 • 8

tohur/natsumura-storytelling-rp-1.0-llama-3.1-8b-GGUF

Updated Aug 2, 2024 • 826 • 6

EricB/Llama-3.2-11B-Vision-Instruct-UQFF

Updated Dec 15, 2024 • 43 • 17

bartowski/LLaMA-Mesh-GGUF

Text-to-3D • Updated Dec 6, 2024 • 4.25k • 28

OuteAI/OuteTTS-0.2-500M-GGUF

Text-to-Speech • Updated Dec 3, 2024 • 1.9k • 73

bartowski/Qwen2.5-Coder-32B-Instruct-exl2

Text Generation • Updated Nov 13, 2024 • 172 • 11

bartowski/Qwen2.5-Coder-32B-Instruct-GGUF

Text Generation • Updated Nov 10, 2024 • 21.4k • 62