Hacker News
by rvz 12 days ago
That's why Microsoft and all of the OEMs announced localized AI models that run on the new NVIDIA CPU on laptops.

I don't think these small models are really that powerful yet, and I don't really like the direction of per-device localized models baked into the OS. Too wimpy and untrustworthy.

I want Claude power, in a box, at my house, for my entire family, completely compartmentalized from my operating system.

> I want Claude power, in a box, at my house

This is still a six-figure commitment, possibly high five figures.

How do you know this and what does it really cost? Those numbers make no sense to me.

> How do you know this and what does it really cost?

The cost of RAM and the size of the models. For Kimi K2.6 you need 2TB of RAM. That’s $40k with DDR5. If you want it to run at the speeds you’re accustomed to with Claude, you need HBM, which costs more.

Practically speaking, you need to sink $250k+ into an 8x B200 node. So yeah, six figures to run properly. High five figures if you’re okay with really slow responses.
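
For anyone who wants to sanity-check the arithmetic, here is a rough sketch that takes the figures quoted above at face value. The inputs are the commenter's estimates, not verified prices. The decimal reading of "2TB" (2,000 GB) and the helper name `implied_price_per_gb` are assumptions made for illustration.

```python
# Back-of-the-envelope check of the figures quoted in the comment above.
# Every input is the commenter's estimate, not a verified market price.

RAM_NEEDED_GB = 2_000      # "2TB RAM", read as decimal terabytes (assumption)
DDR5_TOTAL_USD = 40_000    # "$40k with DDR5"
GPU_NODE_USD = 250_000     # "$250k+" for an 8x B200 node (lower bound)
GPUS_PER_NODE = 8


def implied_price_per_gb(total_usd: float, capacity_gb: float) -> float:
    """Price per GB implied by a total cost and a capacity (hypothetical helper)."""
    return total_usd / capacity_gb


# What the $40k figure implies about DDR5 pricing.
ddr5_usd_per_gb = implied_price_per_gb(DDR5_TOTAL_USD, RAM_NEEDED_GB)

# How much more the GPU node costs than the DDR5-only memory budget.
node_vs_ddr5 = GPU_NODE_USD / DDR5_TOTAL_USD

# Each GPU's share of the node price (includes chassis, CPUs, networking).
usd_per_gpu_share = GPU_NODE_USD / GPUS_PER_NODE

print(f"Implied DDR5 price: ${ddr5_usd_per_gb:.2f}/GB")
print(f"GPU node vs DDR5 memory budget: {node_vs_ddr5:.2f}x")
print(f"Node cost per GPU share: ${usd_per_gpu_share:,.0f}")
```

So the "high five figures" case is mostly the memory bill, and the "six figures" case is at least about 6x that before power and cooling.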

Wow, these economics really make zero sense long term if they don't get more efficient.