The quality of local models has increased significantly since this time last yea...

icedchai · 2026-02-07T21:48:07 1770500887

The quality of local models is still abysmal compared to commercial SOTA models. You're not going to run something like Gemini or Claude locally. I have some "serious" hardware with 128G of VRAM and the results are still laughable. If I moved up to 512G, it still wouldn't be enough. You need serious hardware to get both quality and speed. If I can only get "quality" at a couple of tokens a second, it's not worth bothering.

They are getting better, but that doesn't mean they're good.

_aavaa_ · 2026-02-07T22:07:27 1770502047

Good by what standard? Compared to SOTA today? No, they're not. But they are better than the SOTA in 2020, and likely 2023.

We have a magical pseudo-thinking machine that we can run locally, completely under our control, and instead the goalposts have moved to "but it's not as fast as the proprietary cloud".

icedchai · 2026-02-07T22:27:58 1770503278

My comparison was today's local AI to today's SOTA commercial AI. Both have improved, no argument.

It's more cost-effective for someone to pay $20 to $100 a month for a Claude subscription than to buy a 512 gig Mac Studio for $10K. We won't discuss the cost of the NVIDIA rig.

I mess around with local AI all the time. It's a fun hobby, but the quality is still night and day.
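
To put rough numbers on that, here's a back-of-the-envelope sketch using only the figures above ($10K of hardware versus $20 to $100 a month). The function name is made up for illustration, and it assumes flat pricing and ignores electricity, resale value, and hardware aging, so treat it as a ballpark rather than a real cost model.

```python
def breakeven_months(hardware_cost: float, monthly_fee: float) -> float:
    """Months of subscription fees needed to equal the up-front hardware price.

    Simplifying assumptions: flat monthly fee, no electricity cost,
    no resale value, no depreciation.
    """
    if monthly_fee <= 0:
        raise ValueError("monthly_fee must be positive")
    return hardware_cost / monthly_fee


HARDWARE_COST = 10_000  # USD, the Mac Studio figure quoted in the comment

for fee in (20, 100):  # USD per month, the subscription range quoted above
    months = breakeven_months(HARDWARE_COST, fee)
    print(f"${fee}/month: {months:.0f} months (about {months / 12:.1f} years)")
```

Under those assumptions, the hardware only pays for itself after 100 months at the top subscription tier and 500 months at the bottom one, and that's before accounting for the quality gap.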

_aavaa_ · 2026-02-08T15:41:34 1770565294

The original pithy comment I was replying to was arguing that we’ll become dependent on a service run by another company. I don’t see that being true for two reasons:

1. You are not forced to use the AI in the first place.

2. If you want to use one, you can self-host one of the open models.

That at any moment in time the open models are not equivalent in capabilities to the SOTA paid models is beside the point.

icedchai · 2026-02-08T18:24:41 1770575081

Ok. I don’t think hosting a capable open model is seriously a realistic option for the vast majority of consumers.

_aavaa_ · 2026-02-08T18:48:52 1770576532

Full LLM, no. Not yet.

But there are new things like sweep [0] that you can now run locally.

And 2-3 years ago capable open models weren’t even a thing. Now we’ve made progress on that front. And I believe they’ll keep improving (in both accessibility and competency).

[0]: https://news.ycombinator.com/item?id=46713106

IhateAI_2 · 2026-02-07T21:42:01 1770500521

These takes are terrible.

1. It costs $100k in hardware to run Kimi 2.5 for a single session at a decent tok/s, and it's still not capable of anything serious.

2. I want whatever you're smoking if you think anyone is going to spend billions training models that can outcompete them and are affordable to run, and then open source them.

_aavaa_ · 2026-02-08T14:24:53 1770560693

Quantize it and you can drop a zero from that price.

How much serious work can it do versus chatgpt3 (SOTA only a few years ago)?