| >Even with that, there are obvious limitations described by Amdahl's law, which states there is a logarithmic maximum potential improvement by increasing hardware provisions. I don't know why so many people are obsessed with Amdahl's law as some universal argument. The quoted section is not only 100% incorrect, it sweeps the blatantly obvious energy problem under the rug. Imagine going to a local forest and pointing at a crow and shouting "penguin!", while there are squirrels running around. What Amdahl's law says is that given a fixed problem size and infinite processors, the parallel section will cease to be a bottleneck. This is irrelevant for AI, because people throw more hardware at bigger problems. It's also irrelevant for a whole bunch of other problems. Self-driving cars aren't all connected to a supercomputer. They have local processors that don't even communicate with each other. >The latest innovations go far beyond logarithmic gains: there is now GPT-based software which replaces much of the work of CAD Designers, Illustrators, Video Editors, Electrical Engineers, Software Engineers, Financial Analysts, and Radiologists, to name a few. >And yet these perinatal automatons are totally eviscerating all knowledge based work as the relaxation of the original hysterics arrives. These two sentences contradict each other. You can't eviscerate something and only mostly "replace" it. This is a very disappointing blog post that focuses on wankery over substance. |
> This is irrelevant for AI, because people throw more hardware at bigger problems
GAI is a fixed problem: Solomonoff induction. Further, Amdahl's law is not a limitation on software alone, nor on a supercomputer alone.
Both inference and training rely on parallelization, and LLM inference has multiple serialization points per layer. Vegh et al. (2019) quantify how Amdahl's law limits success in neural networks [1]. The paper further states:
"A general misconception (introduced by successors of Amdahl) is to assume that Amdahl’s law is valid for software only". It applies to a neural network just as it does to the problem of self-driving cars.
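For anyone who wants to check the numbers behind this disagreement, here is a minimal sketch of the textbook form of Amdahl's law (fixed problem size) next to the scaled-speedup form usually attributed to Gustafson, which is the "more hardware for bigger problems" case. The function names and the parallel fraction of 0.95 are illustrative assumptions, not figures from the blog post or from [1].

```python
def amdahl_speedup(parallel_fraction: float, processors: float) -> float:
    """Fixed problem size: speedup = 1 / ((1 - p) + p / n)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def amdahl_limit(parallel_fraction: float) -> float:
    """Upper bound as the processor count grows without limit: 1 / (1 - p)."""
    return 1.0 / (1.0 - parallel_fraction)


def gustafson_speedup(parallel_fraction: float, processors: float) -> float:
    """Problem size grows with n: scaled speedup = (1 - p) + p * n."""
    return (1.0 - parallel_fraction) + parallel_fraction * processors


if __name__ == "__main__":
    p = 0.95  # assumed parallel fraction, for illustration only
    for n in (8, 64, 512, 4096):
        print(
            f"n={n:5d}  amdahl={amdahl_speedup(p, n):6.2f}  "
            f"gustafson={gustafson_speedup(p, n):8.2f}"
        )
    print(f"amdahl limit: {amdahl_limit(p):.2f}")
```

Under the fixed-size form the speedup flattens toward a constant ceiling of 1 / (1 - p) set by the serial fraction, while the scaled form keeps growing with n. Which of the two describes a given workload depends on whether the problem size is held fixed.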
> These two sentences contradict each other
There is no contradiction, only a misunderstanding of what "eviscerates" means. Even under the incorrect definition behind your threshold test, the word still applies.
1. https://pmc.ncbi.nlm.nih.gov/articles/PMC6458202/
Further reading on Amdahl's law w.r.t. LLMs:
2. https://medium.com/@TitanML/harmonizing-multi-gpus-efficient...
3. https://pages.cs.wisc.edu/~sinclair/papers/spati-iiswc23-tot...