Hacker News
by zyl1n, 1023 days ago
I got prefill: 26.9719 tokens/sec and decoding: 18.8827 tokens/sec on an M1 Max 32GB laptop for Llama 2 7B Chat (f32). Not bad.
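For anyone who finds per-token latency easier to read than throughput, here is a rough conversion of the two numbers above. This is only a sketch: the helper names and the 100-token prompt / 200-token reply are made up for illustration, and it assumes both rates stay constant over the whole run, which real runs won't do exactly.

```python
# Throughput figures reported in the comment above.
PREFILL_TPS = 26.9719  # tokens/sec while processing the prompt
DECODE_TPS = 18.8827   # tokens/sec while generating output


def ms_per_token(tokens_per_sec: float) -> float:
    """Convert throughput (tokens/sec) into average latency per token (ms)."""
    return 1000.0 / tokens_per_sec


def generation_time_sec(prompt_tokens: int, output_tokens: int,
                        prefill_tps: float, decode_tps: float) -> float:
    """Estimate wall-clock time, assuming constant prefill and decode rates."""
    return prompt_tokens / prefill_tps + output_tokens / decode_tps


prefill_ms = ms_per_token(PREFILL_TPS)
decode_ms = ms_per_token(DECODE_TPS)
print(f"prefill: {prefill_ms:.1f} ms/token")
print(f"decode:  {decode_ms:.1f} ms/token")

# Hypothetical workload: 100-token prompt, 200-token reply.
total = generation_time_sec(100, 200, PREFILL_TPS, DECODE_TPS)
print(f"estimated total: {total:.1f} s")
```

So decoding works out to roughly 53 ms per generated token at the reported rate.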