There are far too many personal definitions of what "Moore's Law" even is to have a discussion without agreeing on a shared definition beforehand.
But Goodhart's law, "When a measure becomes a target, it ceases to be a good measure", directly applies here: Moore's Law was used to set long-term plans at semiconductor companies, and Moore didn't have empirical evidence that it was even going to continue.
If you arbitrarily pick CPU performance or, worse, single-core performance as your measurement, it hasn't held for well over a decade.
If you hold minimum feature size without regard to cost, it is still holding.
What you want to prove usually dictates what interpretation you make.
That said, the scaling law is still unknown, but you can game it as much as you want in similar ways.
GPT-4 was already hinting at an asymptote on MMLU, but the question is whether that holds for real work.
Time will tell. I am seeing far less optimism from my sources, but that is just anecdotal.
You are missing the economic component. It isn't just how small a transistor can be; it was really about how many transistors you can get for your money. So even when we reach terminal density, we probably won't have reached terminal economics.
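To make the density-versus-economics distinction concrete, here is a rough sketch. The function name and every number in it are made up for illustration, not real fab data; the only point is that cost per transistor can keep falling after density stops improving.

```python
def cost_per_transistor(wafer_cost: float, good_dies_per_wafer: int,
                        transistors_per_die: float) -> float:
    """Cost of one working transistor, in the same currency as wafer_cost."""
    if good_dies_per_wafer <= 0 or transistors_per_die <= 0:
        raise ValueError("dies and transistors per die must be positive")
    return wafer_cost / (good_dies_per_wafer * transistors_per_die)


# Hypothetical inputs: density (transistors per die) is frozen,
# but the process matures, so wafers get cheaper and yield improves.
early = cost_per_transistor(wafer_cost=10_000.0, good_dies_per_wafer=100,
                            transistors_per_die=1e9)
mature = cost_per_transistor(wafer_cost=6_000.0, good_dies_per_wafer=150,
                             transistors_per_die=1e9)

print(f"early:  {early:.2e} per transistor")
print(f"mature: {mature:.2e} per transistor")
```

With density held fixed, the second figure is still lower, which is the "terminal density is not terminal economics" argument in miniature.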
I didn't say we have currently reached a limit. I am saying that there obviously is a limit (at some point). So, scaling cannot go on forever. This is a counterpoint to the dubious analogy with deep learning.
The limits are engineering, not physics. Atoms need not be a barrier for a long time if you can go fully 3D, for example, but manufacturing challenges, power and heat get in the way long before that.
Then you can go ultra-wide in terms of cores, dispatchers and vectors (essentially building bigger and bigger chips), but an algorithm which can't exploit that will be little faster on today's chips than on a 4790K from ten years ago.
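The point about algorithms that can't exploit width is essentially Amdahl's law, which the comment does not name. A minimal sketch, assuming the workload splits cleanly into a serial part and a perfectly parallel part (the function name and the example fractions are illustrative):

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of the work parallelizes."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Compare a narrow chip with a very wide one for a few workloads.
for fraction in (0.0, 0.5, 0.95):
    for workers in (4, 64):
        speedup = amdahl_speedup(fraction, workers)
        print(f"parallel={fraction:.2f} workers={workers:>2} speedup={speedup:.2f}x")
```

A fully serial algorithm gets nothing from extra cores, and even a 95% parallel one is capped below 20x no matter how wide the chip gets.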