Apple seems to be gearing up for significant advances in on-device LLM inference with this paper:
https://arxiv.org/abs/2312.11514