Hacker News
by Gigachad 28 days ago
Limiting token quotas would be fine. Encourage developers to use efficient models, plan the work first, and not burn thousands of GPU hours on waste.

It's much like when developers would waste tons of money on AWS by spinning up massive test VMs and leaving them running without care, until the finance people cracked down on it.