Hacker News
by keketi 857 days ago
Every output token costs GPU time, and therefore money. They could have tuned the model to be less verbose in this way.