by mchusma
64 days ago
You can run models on FPGAs and get massive cost, speed, and throughput gains (like 10x). The reason people don’t do it is that other (algorithmic) improvements mean nobody really thinks locking into a model makes sense…yet. Would I want to use GPT-4o for anything today at 1/10th the price? That would be $0.40 per million input tokens and $1.50 per million output tokens. Gemma-4 31b is much more capable and cheaper. So an FPGA version of the model is just not worth it today. But if progress begins to slow down, then the economics work. Maybe Gemma 4 is a good example. It feels really generally useful. Getting it at 1/10th the cost feels like it could be competitive in 2 years.