Hacker News
by shinycode 312 days ago
I’m curious to know how people use PR review platforms with LLMs. My feeling is that I need to do the review and then review the LLM's review, which is more work in the end. If I stop reviewing (or if no one reviews), knowledge is kind of lost. It surely depends on team size, but do people use these tools only to get better hints, or to speed up reviews with little or no oversight?
3 comments

Disclosure: my current employer has a product in this space (graphite.dev)

IME the highest value (at the moment) is having an LLM integrated into the PR page, that reads your code + CI log, and effectively operates as a sanity check / semantic linter.

A common workflow for us is: Draft PR -> Passes CI (inclusive of an LLM 'review') -> Published -> Passes human review -> Scheduled to merge

The goal is to get a higher margin of confidence that your code (1) will not blow up in production (2) faithfully does what it's trying to do.

The value of the LLM reviewer is maybe 80% in the first bucket and 20% in the second bucket, IME. It often catches bugs like "off by one" and "you meant this to be `if not x`, based on the flag name and behavior, not `if x`".
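
To make those two bug classes concrete, here is a minimal sketch of the kind of thing that gets flagged. Every name in it (`last_n_buggy`, `should_notify_buggy`, `notifications_disabled`, and so on) is made up for illustration and not taken from any real codebase.

```python
def last_n_buggy(items, n):
    """Meant to return the last n items, but is off by one."""
    # Bug: the slice starts one element too early, so n + 1 items come back.
    return items[max(len(items) - n - 1, 0):]


def last_n_fixed(items, n):
    """Return the last n items (or all of them if n exceeds the length)."""
    return items[max(len(items) - n, 0):]


def should_notify_buggy(notifications_disabled: bool) -> bool:
    # Bug: the flag name says "disabled", yet this notifies when it is set.
    if notifications_disabled:
        return True
    return False


def should_notify_fixed(notifications_disabled: bool) -> bool:
    # Matches the flag name: notify only when notifications are NOT disabled.
    if not notifications_disabled:
        return True
    return False
```

Both buggy versions run without errors and can pass a shallow test, which is why a reviewer that reads names and intent catches them more easily than a linter does.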

Thank you for the feedback. It answers my question: as of now it’s just another step in a human review. Nothing fully automatic (which is reassuring in a way), just another step to review & validate.

Only as a sanity check/better hints. But I use it for my own PRs, not others'. Usually there's not much to review and it's easy to agree/disagree with.

I haven't found it to be really useful so far, but it's also very little added work, so for now I keep on using it. If it saves my ass even just once, it will probably be worth it overall.

> If it saves my ass even just once, it will probably be worth it overall.

That's a common fallacy of safety by the way :)

It could very well "save your ass" just once (whatever that means) while costing you more in time, opportunity, effort, or even a false sense of safety, and so end up doing more harm than it ultimately saves you.

Sure, but so far the cost is very minimal. Like 1 minute per PR on average. A crash in production and the subsequent fallout is probably a good week of work and quite a bit of stress. That buys me quite a few PRs.

And it's not even safety critical code.
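
A rough back-of-the-envelope version of that trade-off, as a sketch: it assumes "a good week of work" means 40 hours (my assumption, not a measured figure) and it counts only review time, so it leaves out the softer costs raised in the parent comment, such as a false sense of safety. The function name `break_even_prs` is hypothetical.

```python
MINUTES_PER_HOUR = 60


def break_even_prs(review_minutes_per_pr, incident_hours):
    """Number of PRs whose added review time equals the cost of one incident."""
    return incident_hours * MINUTES_PER_HOUR / review_minutes_per_pr


# Assumption: one production incident costs about a 40-hour week,
# and the LLM review adds about 1 minute per PR.
print(break_even_prs(review_minutes_per_pr=1, incident_hours=40))  # 2400.0
```

Under those assumptions the tool pays for itself if it prevents one such incident per roughly 2400 PRs; a higher per-PR cost shrinks that budget proportionally.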

I give the MR id to CC and let it review. I have the glab CLI installed, so it knows how to pull the MR and even add a comment. Unfortunately it can't attach the comment to a specific line number, afaict. I also have the Atlassian MCP, so CC can also add a comment to the Jira work item (fka issue).
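
For anyone who wants to script the same loop by hand, here is a minimal sketch built on the `glab mr diff` and `glab mr note` subcommands. It assumes glab is installed, authenticated, and run from inside the repository checkout. The helper names and the `reviewer` callable (whatever turns a diff into review text) are hypothetical placeholders. The note it posts is a general MR comment, not a line-specific one.

```python
import subprocess
from typing import Callable


def diff_command(mr_id: int) -> list[str]:
    """argv for printing the diff of a merge request."""
    return ["glab", "mr", "diff", str(mr_id)]


def note_command(mr_id: int, message: str) -> list[str]:
    """argv for adding a general (not line-specific) comment to an MR."""
    return ["glab", "mr", "note", str(mr_id), "--message", message]


def run(argv: list[str]) -> str:
    """Run a command and return its stdout; raises if the command fails."""
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout


def review_merge_request(mr_id: int, reviewer: Callable[[str], str]) -> None:
    """Fetch the MR diff, pass it to `reviewer`, and post the result as a note."""
    diff = run(diff_command(mr_id))
    run(note_command(mr_id, reviewer(diff)))


# Usage (hypothetical MR id and reviewer):
# review_merge_request(123, lambda diff: "Looks fine, " + str(len(diff)) + " chars reviewed.")
```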