|
|
|
|
|
by rjknight
416 days ago
|
|
I think it depends on whether you think there's low-hanging fruit in making the ML stack more efficient, or not. LLMs are still somewhat experimental, with various parts of the stack being new-ish, and therefore relatively un-optimised compared to where they could be. Let's say we took 10% of the training compute budget, and spent it on an army of AI coders whose job is to make the training process 12% more efficient. Could they do it? Given the relatively immature state of the stack, it sounds plausible to me (but it would depend a lot on having the right infrastructure and practices to make this work, and those things are also immature). The bull case would be the assumption that there's some order-of-magnitude speedup available, or possibly multiple such, but that finding it requires a lot of experimentation of the kind that tireless AI engineers might excel at. The bear case is that efficiency gains will be small, hard-earned, or specific to some rapidly-obsoleting architecture. Or, that efficiency gains will look good until the low-hanging fruit is gone, at which point they become weak again. |
|