Here's a shocking suggestion: maybe wait until these services can be implemented on-device, and then implement them on-device, instead of shipping something half-baked? Apple seems to be the perfect company to make it happen: they produce both the hardware and the software, tightly integrated with each other. No one else is this good at it.
They implemented way more on-device than anyone else has, and I don't see how sometimes needing an online service makes it "half-baked". Your suggestion is essentially to not ship the product until some unspecified future time. That offers no utility to anyone.
Or we might not. LLMs are remarkably dumb and incapable of reasoning or abstract thinking. No amount of iterative improvement on that will lead to AGI. If we are ever to get an actual AGI, it would need a vastly different architecture, at minimum one that allows the parameters/weights to be updated at runtime by the model itself.
Right. But there's so much effort, money and reputation invested in various configurations, experimental architectures, etc. that I feel something is likely going to pan out in the coming months, enabling models with more capabilities for less compute.