I believe the possibility of "making it fast" is taken into account in the existing design, precisely to avoid ending up with a format that can't be cheaply optimised or implemented in hardware.
Right, the encoder is given a lot more options, which leads to a combinatorial explosion when searching for an optimal encoding. Once the options that actually pay off are identified, encoders will be able to tune their heuristics and narrow down the search space.
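To make the search-space point concrete, here is a toy sketch (not how any real encoder works). The tool names, the cost model and the switch penalty are all made up for illustration. It just shows that an exhaustive search over k tools and n blocks evaluates k^n candidates, and that restricting the tool set shrinks that count at the risk of a slightly worse result.

```python
from itertools import product

# Hypothetical coding tools; names and costs are assumptions for illustration only.
TOOLS = ("raw", "rle", "delta", "dict")
SWITCH_PENALTY = 1  # assumed cost of signalling a tool change between blocks


def block_cost(block: bytes, tool: str) -> int:
    """Toy cost model: pretend size in bytes of one block under one tool."""
    changes = sum(a != b for a, b in zip(block, block[1:]))
    if tool == "raw":
        return len(block)
    if tool == "rle":
        return 2 * (1 + changes)          # (value, length) pair per run
    if tool == "delta":
        return 1 + changes                # first byte plus non-zero deltas
    if tool == "dict":
        return len(set(block)) + (len(block) + 1) // 2
    raise ValueError(tool)


def total_cost(blocks, choice) -> int:
    """Sum of per-block costs plus a penalty each time the tool changes."""
    size = sum(block_cost(b, t) for b, t in zip(blocks, choice))
    switches = sum(a != b for a, b in zip(choice, choice[1:]))
    return size + SWITCH_PENALTY * switches


def search(blocks, tools):
    """Brute-force every assignment of tools to blocks.

    Returns (best_cost, best_choice, candidates_evaluated).
    """
    best_cost, best_choice, evaluated = None, None, 0
    for choice in product(tools, repeat=len(blocks)):
        evaluated += 1
        cost = total_cost(blocks, choice)
        if best_cost is None or cost < best_cost:
            best_cost, best_choice = cost, choice
    return best_cost, best_choice, evaluated


blocks = [b"aaaaaaaa", b"abcdefgh", b"aabbccdd", b"abababab", b"zzzzzzzy"]

full = search(blocks, TOOLS)                # 4 tools, 5 blocks: 4**5 candidates
pruned = search(blocks, ("raw", "delta"))   # assumed "tools that pay off": 2**5

print("full search:  ", full[2], "candidates, cost", full[0])
print("pruned search:", pruned[2], "candidates, cost", pruned[0])
```

The pruned search can never beat the full one, since it only looks at a subset of the same candidates. The bet is that once the useful tools are known, the loss is small compared to the time saved.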