urtuuuu

AI & ML interests: None yet
Organizations: None yet

urtuuuu's activity

New activity in arcee-ai/Arcee-Maestro-7B-Preview 3 days ago:
"Good for coding?" (1 comment), #1 opened 3 days ago by urtuuuu
New activity in deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B 19 days ago:
"Using the Model" (1 comment), #14 opened 24 days ago by joel1610-hon
New activity in Qwen/Qwen2.5-14B-Instruct-1M 22 days ago
New activity in deepseek-ai/DeepSeek-R1-Distill-Qwen-32B 28 days ago:
"Poor performance in the leaderboard?" (7 comments), #17 opened about 1 month ago by L29Ah
New activity in deepseek-ai/DeepSeek-R1-Distill-Qwen-7B about 1 month ago:
"System Prompt" (13 comments), #2 opened about 1 month ago by Wanfq
New activity in deepseek-ai/DeepSeek-R1-Distill-Qwen-14B about 1 month ago:
"System Prompt" (8 comments), #2 opened about 1 month ago by Wanfq
New activity in bartowski/DeepSeek-R1-Distill-Qwen-14B-GGUF about 1 month ago:
"Not impressed?" (4 comments), #2 opened about 1 month ago by urtuuuu
New activity in deepseek-ai/DeepSeek-V3 about 2 months ago
New activity in huihui-ai/Falcon3-10B-Instruct-abliterated about 2 months ago:
"GGUFs eventually?" (6 comments), #1 opened 2 months ago by HMasaki
replied to bartowski's post 2 months ago:

A bit annoying, isn't it? Some time ago I asked you for an ARM version of gemma-2-9b-it-abliterated, so now it won't work again. I guess there is no Q4_0?

reacted to bartowski's post with 👍 2 months ago:

Looks like Q4_0_N_M file types are going away

Before you panic: there's a new "preferred" method, online (I prefer the term on-the-fly) repacking. If you download Q4_0 and your setup can benefit from repacking the weights into interleaved rows (what Q4_0_4_4 was doing), it will do that automatically and give you similar performance (minor losses, I think, due to using intrinsics instead of assembly, but intrinsics are more maintainable).
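The "interleaved rows" idea the post refers to can be sketched roughly as follows. This is a toy illustration only, not llama.cpp's actual Q4_0 data layout: the block size of 8 values, the group of 4 rows, and the function name are all assumptions made for the example (llama.cpp's real Q4_0 blocks hold 32 quantized weights plus a scale).

```python
import numpy as np

# Toy sketch of "repacking weights into interleaved rows" (what Q4_0_4_4 did):
# group 4 consecutive rows of a weight matrix and interleave their blocks so
# that the matching block from each row sits contiguously in memory, letting
# a SIMD matmul kernel feed 4 output rows from one sequential read.
def repack_interleaved(weights: np.ndarray, rows_per_group: int = 4,
                       block: int = 8) -> np.ndarray:
    n_rows, n_cols = weights.shape
    assert n_rows % rows_per_group == 0 and n_cols % block == 0
    groups = []
    for g in range(0, n_rows, rows_per_group):
        rows = weights[g:g + rows_per_group]                  # (4, n_cols)
        blocks = rows.reshape(rows_per_group, n_cols // block, block)
        # swap axes so the flat layout becomes: block0-row0, block0-row1,
        # block0-row2, block0-row3, block1-row0, ... instead of row-major
        groups.append(blocks.transpose(1, 0, 2).reshape(-1))
    return np.concatenate(groups)

w = np.arange(64).reshape(4, 16)    # 4 rows, 2 blocks of 8 per row
packed = repack_interleaved(w)
print(packed[:8])                   # first block of row 0: [0 1 2 3 4 5 6 7]
print(packed[8:16])                 # first block of row 1, not row 0's second block
```

The point of the "on-the-fly" change described in the post is that this kind of reordering now happens at load time from a plain Q4_0 file, rather than being baked into a separate Q4_0_4_4 file on disk.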

You can see the reference PR here:

https://github.com/ggerganov/llama.cpp/pull/10446

So if you update your llama.cpp past that point, you won't be able to run Q4_0_4_4 (unless they add backwards compatibility), but Q4_0 should run at the same speeds (though it may currently be bugged on some platforms)

As such, I'll stop making those newer model formats soon, probably by the end of this week unless something changes, but you should be safe to download Q4_0 quants and use those!

Also, IQ4_NL supports repacking, though not in as many shapes yet, but it should get a respectable speedup on ARM chips; the PR for that can be found here: https://github.com/ggerganov/llama.cpp/pull/10541

Remember, these are not meant for Apple silicon since those use the GPU and don't benefit from the repacking of weights
New activity in bartowski/EXAONE-3.5-7.8B-Instruct-GGUF 3 months ago:
"llama.cpp..." (1 comment), #1 opened 3 months ago by urtuuuu