Different architectures, different RL training loops, maybe memory modules [1][2] as part of the architecture, a focus on efficiency, the giant troves of data we're generating by using Claude Code/gemini-cli/opencode: there's lots of research to be done.