Paul Sharratt

paulsharratt
paulsharratt's activity

commented on Grok 3 ai : Best AI model now! 4 days ago

@Eligy you really can't. You'll need somewhere between 4 and 8 H200s to host the full unquantised model, and a heavily quantised model still needs 151 GB of VRAM at a minimum, a level at which you start to see quality issues. No end user is realistically going to have that much VRAM, even with Strix Halo. The cost of a system that performs half-decently at that model size is somewhere in the region of hundreds of thousands of dollars. You can go down to the 70b model, which will need either 2x 5090s or an A100, roughly a $10k investment in GPUs alone.
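The back-of-envelope arithmetic behind those numbers looks roughly like this (a sketch only: the ~671B parameter count for full R1 is public, but the overhead fraction and quantisation levels here are illustrative assumptions, not measured figures):

```python
def estimate_vram_gb(params_billions: float, bits_per_weight: float,
                     overhead_fraction: float = 0.1) -> float:
    """Rough VRAM needed just to hold the weights, plus an assumed
    ~10% overhead for KV cache and activations (illustrative only)."""
    weight_gb = params_billions * bits_per_weight / 8  # 1e9 params * bits -> GB
    return weight_gb * (1 + overhead_fraction)

# Full R1 (~671B params) at FP8: hundreds of GB, i.e. several 141 GB H200s.
print(f"671B @ 8-bit:    {estimate_vram_gb(671, 8):.0f} GB")

# Aggressive ~1.58-bit quantisation lands near the 151 GB floor mentioned above.
print(f"671B @ 1.58-bit: {estimate_vram_gb(671, 1.58):.0f} GB")

# A 70B distill at 4-bit fits in 2x 5090 (32 GB each) or a single 80 GB A100.
print(f"70B @ 4-bit:     {estimate_vram_gb(70, 4):.0f} GB")
```

The point of the sketch is that even the most aggressive quantisation of the full model is far beyond any consumer GPU, while the 70b distill only just fits on ~$10k of hardware.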

So you've spent $10k on GPUs, plus extra on RAM; you've invested significant effort into learning how to run a local model, gotten it running, and connected up your local interface. And now you're pumping out 10 tokens per second with occasional crashes, for a model that performs NEARLY as well as the aging GPT-4o, or Gemini 2.0 Flash Thinking (which is free).

OR you could spend a few dollars and get access to models that will compete with or beat the full R1 model, output 120 tokens per second, require no local setup or effort, and are significantly more reliable.

It isn't even a comparison. You cannot realistically run Deepseek-R1 locally. You can maybe run the 14b model, which is inferior to everything online by a substantial margin and has very little practical value, or you can use an online API endpoint, which is faster, cheaper, and better.

commented on Grok 3 ai : Best AI model now! 4 days ago

Then you don't understand how the market works. The company with the best AI model (at the same price) would eventually take over the whole market if things remained static. Being 90% as good doesn't get you 90% of the money; it gets you 0%. Obviously the companies fight for the leadership position, and customers have preferences on API integration etc., so it's not as simple as being the "best", but R1 is dead in the water. It's slow and expensive to host, the Deepseek-hosted solution is absolutely gobbling up everyone's data, and no one with serious $$$ to spend is spending it on Deepseek.