by pstuart 8 days ago
The work on LLM in a Flash will probably help, and Apple's NVMe architecture, which is well suited to maximizing throughput, could allow their devices to handle larger models better than other vendors' devices.