IMO we are either limited by data or reaching the limits of what's possible with a transformer architecture. Hardware will get us efficiency, but I am not sure it will lead to smarter models.
My doubts about the architecture come from how different it is from human intelligence. These models need an inordinate amount of training data and lack any sort of generational architectural intelligence.