I think it is too risky to build a company around the premise that someone won't soon solve the quadratic scaling issue. Especially, when that company involves creating ASICs.
Attention is not the primary inference bottleneck. For each token you have to load all of the weights (or activated weights) from memory. This is why Cerebras is fast: they have huge memory bandwidth.