by christkv 1139 days ago
Feels like it's still the era of wait and see as the space shakes out. It would be great to be able to run our own models in the near future for applications, but the amount of hardware needed to deliver service to a significant audience is pretty crazy. Right now I don't see any way but to re-bill the cost with a markup to end customers unless you have a giant pile of VC money that you can light on fire.
2 comments

I found running the model on rented hardware much more expensive than ChatGPT. Might work OK on local, sunk-cost hardware for those who game and don't crypto mine.
WebGPU would be a way to shift that cost back to each device. RedPajama 3B could become useful for some tasks and run quite fast on most hardware available. Then as users have better computers, they can get access to better models?
You'd have to have each user download a 3GB payload first. For comparison, that's roughly an hour or more of Netflix at 1080p, depending on the bitrate.
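For anyone who wants to sanity-check the payload arithmetic, here is a rough back-of-the-envelope sketch. The numbers are assumptions, not measurements: 8 bits per weight is just the figure that makes a 3B-parameter model come out at 3GB, the 5 Mbps video bitrate and 50 Mbps connection speed are illustrative guesses, and the function names are made up for this example.

```python
def payload_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


def streaming_hours(size_gb: float, video_mbps: float) -> float:
    """Hours of video at the given bitrate that add up to the same amount of data."""
    return size_gb * 8e9 / (video_mbps * 1e6) / 3600


def download_minutes(size_gb: float, link_mbps: float) -> float:
    """Minutes to fetch the payload over a link of the given speed, ignoring overhead."""
    return size_gb * 8e9 / (link_mbps * 1e6) / 60


if __name__ == "__main__":
    size = payload_gb(3e9, 8)  # assumed: 3B parameters at 8 bits each
    print(f"payload: {size:.1f} GB")
    print(f"same data as {streaming_hours(size, 5):.2f} h of video at an assumed 5 Mbps")
    print(f"download: {download_minutes(size, 50):.0f} min on an assumed 50 Mbps link")
```

The streaming comparison moves a lot with the bitrate you plug in, and a 4-bit quantization would halve the payload, so treat the output as an order-of-magnitude estimate only.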
Browsers and operating systems will eventually include them.

But for now, you would need a good privacy reason to go this route.