by becomevocal 14 days ago
First thought was "only 30 tasks"; however, the findings map to what I've seen personally: code review consumes the majority of tokens.

Code review could also be run as an unattended/batched task though, possibly with at least some use of on-prem inference (which excels at this). That would be a major saving compared to the usual cloud inference scenario.
with which models, though?
Yeah, wasn’t there a report recently on how local models, after energy costs, weren’t actually more efficient at completing the same task?