by rfoo 483 days ago
Thanks for your advice.

> different opinions

I wouldn't argue with you so hard if these were just your "opinions". What you described is not an opinion. It is a factual claim, and factual claims can be wrong. Plainly wrong.

> Maybe it's because it is not off?

Yeah, as I said earlier, your number might be correct as an estimate for prefilling 1000 tokens on Llama 3 8B. That's not what everybody here calls "decode". Your number shows that prefill is compute-bound. So what?
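
To make the prefill vs. decode distinction concrete, here is a rough back-of-envelope sketch. It assumes about 2 FLOPs per parameter per token, 2-byte weights that are read from memory once per forward pass, and it ignores attention and KV-cache traffic. The hardware numbers (`PEAK_FLOPS`, `MEM_BW`) and the function names are hypothetical placeholders for illustration, not measurements of any particular GPU.

```python
# Rough roofline check: is one forward pass compute-bound or memory-bound?
# Assumptions (not from the thread): ~2 FLOPs per parameter per token,
# weights read once per pass, attention and KV-cache traffic ignored.

PEAK_FLOPS = 300e12  # hypothetical accelerator: 300 TFLOP/s
MEM_BW = 2e12        # hypothetical accelerator: 2 TB/s memory bandwidth


def arithmetic_intensity(tokens_per_pass: int, bytes_per_param: float = 2.0) -> float:
    """FLOPs per byte of weights read; the parameter count cancels out."""
    return 2.0 * tokens_per_pass / bytes_per_param


def bound(tokens_per_pass: int, peak_flops: float, mem_bw: float,
          bytes_per_param: float = 2.0) -> str:
    """Compare arithmetic intensity against the hardware ridge point."""
    ridge = peak_flops / mem_bw  # FLOPs per byte where the two limits meet
    intensity = arithmetic_intensity(tokens_per_pass, bytes_per_param)
    return "compute-bound" if intensity > ridge else "memory-bound"


if __name__ == "__main__":
    # Prefill: 1000 prompt tokens share a single read of the weights.
    print("prefill, 1000 tokens:", bound(1000, PEAK_FLOPS, MEM_BW))
    # Decode at batch size 1: every new token needs a full read of the weights.
    print("decode, 1 token:", bound(1, PEAK_FLOPS, MEM_BW))
```

Under these assumptions the same model lands on opposite sides of the ridge point depending on how many tokens share one pass over the weights, which is why a prefill estimate says little about decode.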