Hacker News
by mechagodzilla 694 days ago
Well each new generation of model costs like 10x the previous one to train, and its value (and thus ability to generate a return) diminishes extremely rapidly. The only source of improved economics is the rapidly evaporating Moore's Law (and any opex savings are swamped by the crazy high capex if you're using chips from Nvidia).

> rapidly evaporating Moore's Law

On the algorithm side (no, I don't mean Mamba etc.; you can still use decoder-only transformers with some special attention layers) and the engineering side, there's still at least a 10x improvement possible compared to what TensorRT-LLM is able to achieve now.

My concern is that this is only possible because of scale, so local LLMs are going to be dead in the water.