Hacker News
by postalrat 65 days ago
I'm predicting that, now that there is a clear use case for this tech, work on specialized hardware, software, models, etc. will accelerate (and already has), and that these will run much more efficiently in 10 years. So the real token costs will be a fraction of what they are now.
1 comments

You can run models on FPGAs and get massive cost, speed, and throughput gains (like 10x). The reason people don’t do it is that other (algorithmic) improvements mean nobody really thinks locking into a model makes sense…yet. Would I want to use GPT-4o for anything today at 1/10th the price? That would be $0.40 per input, $1.50 per output. Gemma-4 31b is much more capable and cheaper. So an FPGA version of the model is just not worth it today.
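To make the arithmetic concrete, here is a rough sketch using only the figures above: a 10x gain and the quoted $0.40 input / $1.50 output prices. The comment doesn't state the pricing unit, so the code just says "unit", and the 3-input, 1-output workload is a made-up example, not real usage data.

```python
GAIN = 10  # the "like 10x" gain claimed above


def job_cost(input_units: float, output_units: float,
             input_price: float, output_price: float) -> float:
    """Total cost of a job given per-unit input and output prices."""
    return input_units * input_price + output_units * output_price


# Prices quoted in the comment for the 1/10th-price scenario.
# The pricing unit is not stated there, so "unit" is left abstract.
fpga_in, fpga_out = 0.40, 1.50

# Baseline prices implied by those figures at a 10x gain.
base_in, base_out = fpga_in * GAIN, fpga_out * GAIN

# Hypothetical workload: 3 input units and 1 output unit.
baseline_cost = job_cost(3, 1, base_in, base_out)
fpga_cost = job_cost(3, 1, fpga_in, fpga_out)

print(f"baseline: ${baseline_cost:.2f}, at 1/10th: ${fpga_cost:.2f}")
```

The 10x saving only matters if the frozen model is still the one you'd pick. If a newer, more capable model is already cheaper than the discounted price, the hardware lock-in buys nothing.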

But if progress begins to slow down, then the economics work. Maybe Gemma 4 is a good example. It feels really generally useful. Getting it at 1/10th the cost feels like it could be competitive in 2 years.

The FPGA would be for prototyping. The real progress comes from ASICs ... exactly as we saw with Bitcoin mining. This GPU-based approach will eventually give way to bespoke circuits once everyone picks a favorite model.