by mchusma 64 days ago
You can run models on FPGAs and get massive cost, speed, and throughput gains (like 10x). The reason people don’t do it is that other (algorithmic) improvements keep coming, so nobody really thinks locking into a model makes sense… yet. Would I want to use GPT-4o for anything today at 1/10th the price? That would be $0.40 for input and $1.50 for output. Gemma-4 31b is much more capable and cheaper. So an FPGA version of the model is just not worth it today.
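For anyone who wants to play with the numbers, here is a rough sketch of the arithmetic. It assumes the $0.40 / $1.50 figures are per million tokens (the usual pricing unit, but not stated above), and the workload size is made up purely for illustration.

```python
# Discounted prices quoted in the comment. The per-million-token unit
# is an assumption, not something the comment states.
DISCOUNTED = {"input": 0.40, "output": 1.50}
DISCOUNT_FACTOR = 10  # the "1/10th the price" claim


def workload_cost(prices, input_tokens, output_tokens):
    """Dollar cost of a workload, with prices given per million tokens."""
    total = prices["input"] * input_tokens + prices["output"] * output_tokens
    return total / 1_000_000


# Undiscounted price implied by scaling the quoted figures back up by 10x.
implied_full = {kind: price * DISCOUNT_FACTOR for kind, price in DISCOUNTED.items()}

# Hypothetical monthly workload: 50M input tokens, 10M output tokens.
INPUT_TOKENS = 50_000_000
OUTPUT_TOKENS = 10_000_000

cheap = workload_cost(DISCOUNTED, INPUT_TOKENS, OUTPUT_TOKENS)
full = workload_cost(implied_full, INPUT_TOKENS, OUTPUT_TOKENS)

print(f"at 1/10th price: ${cheap:.2f}, at implied full price: ${full:.2f}")
```

The saving only matters if the cheaper option isn't also the weaker model, so a fair comparison would put a newer model's actual price into the same function.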

But if progress begins to slow down, then the economics work. Maybe Gemma 4 is a good example. It feels really generally useful. Getting it at 1/10th the cost feels like it could be competitive in 2 years.

1 comment

The FPGA would be for prototyping. The real progress comes from ASICs, exactly as we saw with Bitcoin mining. This GPU-based approach will eventually give way to bespoke circuits once everyone picks a favorite model.