And that's exactly why llama.cpp is not usable by casual users. They follow the "move fast and break things" model. With ollama, you just have to make sure you're getting/building the latest version.
It's not possible to run the latest model architectures without 'moving fast'. The only thing broken here is that they are trying to use an old version with a new model.
I'm a bit unsure what that has to do with someone running an outdated version of the program while trying to use a model that is supported in the latest release.