Hacker News new | ask | show | jobs
by CompoundEyes 480 days ago
Sorry I meant that it's javascript / typescript so we can deterministically orchestrate a series of prompts and shape their output exactly how we'd like. Returning the review as structured output as a JSON object is very helpful for this. If the review result seems bungled, run a judge prompt at the end and tell it to go try again ^_^.
1 comments

That makes sense. You could even run against multiple models or future models then, right? I can see some value in that because maybe 2 years from now the models will be able to surface issues that weren’t detected originally. I suppose you could run against the whole codebase in the future, but could also imagine something that could track down where a bug was introduced.

Do you save the reviews or discard them?

Not presently saving the reviews but we could go back and export the DIFF text for two commit hashes with git to recreate what was reviewed in a CI pipeline. This is also handy for tweaking the prompt towards what you're expecting to see on local machine.

I agree the script should be versioned because the output of the one model to the next varies so much and the potential that even an updated version of one model could break it. Treating model versions like an npm package dependency almost.