Then you'll probably saturate the processor without using a larger LMUL, but I think many algorithms can work with LMUL=2, without running out of registers.