The problem is that swapping LLMs can require reworking all your prompts, and you may be relying on OpenAI-specific features. If you avoid those features, you are at a disadvantage, or at least slowing down your work.
I have a hierarchy of templates, where I can automatically swap out parts of the prompt based on which LLM I am using. I also have a set of benchmark tests to compare relative performance. I treat LLMs like a commodity and keep switching between them to compare performance.
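For anyone wondering what that kind of setup might look like, here is a minimal sketch. It is an illustration under assumptions, not the actual code described above: the template parts, the model names `model-a` and `model-b`, the helpers `build_prompt` and `benchmark`, and the fake model functions are all hypothetical stand-ins, and a real version would call actual LLM clients instead of stubs.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical template hierarchy: "default" holds the shared parts,
# and each model entry overrides only the parts that differ for it.
TEMPLATES: Dict[str, Dict[str, str]] = {
    "default": {
        "system": "You are a helpful assistant.",
        "format": "Answer in one short sentence.",
        "task": "Question: {question}",
    },
    "model-a": {"format": "Reply with a single word."},
    "model-b": {"system": "### Instruction: be concise."},
}

# Order in which the parts are joined into the final prompt.
PART_ORDER = ("system", "format", "task")


def build_prompt(model: str, **fields: str) -> str:
    """Merge the default parts with the model's overrides, then fill in fields."""
    parts = {**TEMPLATES["default"], **TEMPLATES.get(model, {})}
    return "\n".join(parts[name].format(**fields) for name in PART_ORDER)


def benchmark(
    models: Dict[str, Callable[[str], str]],
    cases: List[Tuple[str, str]],
) -> Dict[str, float]:
    """Return, per model, the fraction of cases whose expected text appears in the reply."""
    scores: Dict[str, float] = {}
    for name, call in models.items():
        hits = sum(
            expected.lower() in call(build_prompt(name, question=question)).lower()
            for question, expected in cases
        )
        scores[name] = hits / len(cases)
    return scores


# Stub "models" so the sketch runs offline; swap in real API calls here.
def fake_model_a(prompt: str) -> str:
    return "Paris" if "France" in prompt else "unknown"


def fake_model_b(prompt: str) -> str:
    return "unknown"


if __name__ == "__main__":
    cases = [("What is the capital of France?", "Paris"), ("What is 2 + 2?", "4")]
    models = {"model-a": fake_model_a, "model-b": fake_model_b}
    print(build_prompt("model-b", question="What is 2 + 2?"))
    print(benchmark(models, cases))
```

The point of the layering is that switching models only touches the override entry for that model, while the benchmark runs the same cases through every model with its own prompt variant. Substring matching is a crude scoring choice made here for brevity; a real comparison would likely need task-specific scoring.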
Language modelling, token prediction. It's not much different from generating code in a particular programming language; given examples, learn the patterns and repeat them. There's no self-awareness or consciousness or understanding or even the concept of capabilities, just predicting text.
Sure, but that kind of sounds like it is building a theory of mind of itself.
If its training data includes a considerable number of prompts and responses from people interacting with it, then I suppose it isn't that surprising.
That does sound like self-awareness, in the non-magical sense. It is aware of its own behaviour because it has been trained on it.
Isn’t the expectation that “prompt engineering” will become unnecessary as models continue to improve? Other models may be lagging behind GPT-4, but not by much.