Hacker News new | ask | show | jobs
by tome 908 days ago
nomel gave a good answer in a different thread

> This is not about the model, it’s about the relative speed improvement from the hardware, with this model as a demo.

To compare apples to apples look at the tokens per second of other systems running Llama 2 70B 4096. We're by far the fastest!

https://news.ycombinator.com/item?id=38742466