Hacker News
by bigyabai 19 days ago
Unlike the M5 Max, it should have usable context prefill. It's feasible to run 256k-token workflows that would take the better part of an hour for TTFT on the M5.
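To put a rough number on that: if TTFT is dominated by prefill, it is approximately prompt tokens divided by prefill throughput. The sketch below works backwards from the figure in the comment, assuming "256k" means 262,144 tokens and taking 30 minutes as a lower bound on the wait. The function names are made up for illustration, and model load time and the first decode step are ignored.

```python
# Back-of-the-envelope TTFT arithmetic. Assumes TTFT is roughly prefill time only.

def ttft_seconds(prompt_tokens: int, prefill_tokens_per_s: float) -> float:
    """Estimated time to first token for a prompt at a given prefill rate."""
    if prefill_tokens_per_s <= 0:
        raise ValueError("prefill rate must be positive")
    return prompt_tokens / prefill_tokens_per_s


def implied_prefill_rate(prompt_tokens: int, ttft_s: float) -> float:
    """Prefill throughput (tokens/s) implied by an observed TTFT."""
    if ttft_s <= 0:
        raise ValueError("TTFT must be positive")
    return prompt_tokens / ttft_s


PROMPT_TOKENS = 256 * 1024   # assumption: "256k" = 262,144 tokens
TTFT_LOWER_BOUND_S = 30 * 60  # assumption: at least half an hour

rate = implied_prefill_rate(PROMPT_TOKENS, TTFT_LOWER_BOUND_S)
print(f"implied prefill rate: at most ~{rate:.0f} tokens/s")
```

Under those assumptions the implied prefill rate is at most about 146 tokens/s. Plugging a different rate into `ttft_seconds` gives the wait for any other machine, so a prefill rate ten times higher would bring the same prompt down to about three minutes.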