by recursivecaveat 906 days ago
Yeah, I think our desire for tokens/s and lower latency is likely insatiable. Same reason you have a terminal that can print out more than 300 words per minute. Life is way easier when you don't have to be super parsimonious with your output. You suggest a bug fix and regenerate the whole code snippet, or you spit out a webpage on demand and the user scrolls to the bottom immediately, etc.