Hacker News
by cube2222 743 days ago
If I understand correctly there are three things here:

- on-device models, which will handle whatever tasks they're able to, including summarisation and conversation with Siri

- private compute models (still controlled by Apple), for bigger tasks that require more compute

- external LLM APIs (only ChatGPT for now), for when the above decide that an external model would handle the given prompt better, though the user is always asked for confirmation first
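
As a rough sketch, the three-tier routing described in that list could look something like the Python below. Every name in it (`Tier`, `Request`, `route`, the two boolean flags, the `ask_user` callback) is hypothetical and only there for illustration; none of it is Apple's API. It also assumes that a request falls back to Apple's own models when the user declines the confirmation, which the list above does not say.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Tier(Enum):
    ON_DEVICE = "on-device"
    PRIVATE_CLOUD = "private cloud"
    EXTERNAL = "external LLM"


@dataclass
class Request:
    prompt: str
    # Hypothetical flag: the local model is able to handle this task.
    fits_on_device: bool
    # Hypothetical flag: the Apple-side models judged that an external
    # LLM would handle this prompt better.
    better_served_externally: bool


def route(request: Request, ask_user: Callable[[str], bool]) -> Tier:
    """Pick a tier for a request; ask_user returns True if the user consents."""
    if request.better_served_externally:
        # The external API is only used after explicit user confirmation.
        if ask_user(f"Send this prompt to an external LLM? {request.prompt!r}"):
            return Tier.EXTERNAL
        # Assumption: a declined request falls back to Apple's own models.
    if request.fits_on_device:
        return Tier.ON_DEVICE
    # Anything too big for the device goes to the private compute models.
    return Tier.PRIVATE_CLOUD
```

With this sketch, `route(Request("Summarise this email", True, False), ask_user)` never calls `ask_user`, because the confirmation step only applies to the external tier.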

1 comment

The second point makes sense. It gives Apple the option to cut off the external LLMs at a later date if they want to. I wonder what % of requests will be handled by the private cloud models vs. local. I would imagine TTS and ASR are local for latency reasons. Natural language classifiers would certainly run on-device. I wonder whether summarization and rewriting will, though: those are more complex and definitely benefit from larger models.