by bisonbear
106 days ago
Yikes, using AI without tests is not fun. With tests, at least you have some confidence that the AI isn't going completely off track; without them you're pretty much flying blind.

Having linters is super important IMO - I never try to make the AI do a linter's job. Let the AI focus on the hard stuff (architecture, maintainability, cleanliness), and the linter can handle the boring pieces.

I also definitely see the AI making changes that are way larger than necessary. I try to capture that in the eval with a "footprint risk" score, which is essentially how many unnecessary changes the AI made compared to the original PR. I would certainly like to move beyond using PRs as the sole source of truth, since humans don't always write great code either. Maybe having an LLM-as-a-judge look for scope creep/bloat would be a decent band-aid?
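A minimal sketch of one way a "footprint risk" score could be computed, assuming each diff has already been reduced to a set of (file path, line number) pairs. The function name, the `Change` type, and the "fraction of AI-touched lines outside the original PR" definition are all hypothetical choices for illustration, not the actual eval described above.

```python
from typing import Set, Tuple

# Hypothetical representation: one entry per line touched by a diff.
Change = Tuple[str, int]  # (file path, line number)


def footprint_risk(ai_changes: Set[Change], reference_changes: Set[Change]) -> float:
    """Fraction of the AI's touched lines that the reference PR never touched.

    0.0 means the AI stayed inside the original PR's footprint;
    1.0 means every line it changed was outside that footprint.
    """
    if not ai_changes:
        return 0.0  # no changes at all, so nothing unnecessary
    extra = ai_changes - reference_changes  # lines only the AI touched
    return len(extra) / len(ai_changes)


# Made-up example data: the original PR touched two lines in app.py,
# while the AI touched those plus two lines in utils.py.
reference = {("app.py", 10), ("app.py", 11)}
ai = {("app.py", 10), ("app.py", 11), ("utils.py", 3), ("utils.py", 4)}

print(footprint_risk(ai, reference))  # 2 of 4 touched lines are extra -> 0.5
```

A line-set comparison like this inherits the weakness noted above: it treats the original PR as ground truth, so a legitimately better but larger change would still score as risky.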