Now do a programming task that requires more than 32k of context and see who’s “better”. If you don’t benchmark that, you can’t get the overall picture. GitHub Copilot, for example, could benefit a lot from the increased context.
Obviously it's a drawback, but the silver lining of the small context window is that it forces me to decouple everything and keep very sensible, strict APIs, where I just write the docs and it writes the code.
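To make that concrete, here's a rough sketch of what I mean by "write the docs, it writes the code". Everything in it is made up for illustration (`RateLimiter`, `SlidingWindowLimiter`, the `limit` and `window` parameters); the point is only that the contract is small enough to paste into a prompt by itself, with no other file needed in context.

```python
from typing import Protocol


class RateLimiter(Protocol):
    """Hand-written contract (hypothetical example). This is the only part I write."""

    def allow(self, key: str, now: float) -> bool:
        """Return True if `key` may act at time `now` (in seconds).

        Each key gets at most `limit` allowed calls per `window` seconds.
        Denied calls do not count toward the limit.
        """
        ...


class SlidingWindowLimiter:
    """The kind of implementation a model could fill in from the docs above."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self._hits = {}  # key -> list of timestamps of allowed calls

    def allow(self, key: str, now: float) -> bool:
        # Keep only the timestamps that are still inside the window.
        recent = [t for t in self._hits.get(key, []) if now - t < self.window]
        allowed = len(recent) < self.limit
        if allowed:
            recent.append(now)
        self._hits[key] = recent
        return allowed


# Callers depend on the contract, not on the concrete class.
limiter: RateLimiter = SlidingWindowLimiter(limit=2, window=10.0)
```

Because the caller only sees `RateLimiter`, each piece can be generated and reviewed in isolation, which is what keeps the prompt under the context limit.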