by mastadoum 1255 days ago
I did not expect that. When iterating with smaller models like nanoGPT, even though the output comes one token at a time, it did not feel like it would take half a second between each of them, but I guess that's what happens with billion-parameter models.