by larrysalibra 1033 days ago
Strange. I'm running Llama2 70b on Chrome Canary on a 64GB MacBook M1 Max...~1.5 years older...and seeing better performance.

It's slow but usable!

prefill: 2.1963 tokens/sec, decoding: 3.4708 tokens/sec