by DelightOne 735 days ago
The proposed approach uses multiple LLM chains, so you don't know in advance how much context you'll need. I imagine it can get expensive, especially per commit.
2 comments

OpenAI's latest (GPT-4o) model is US$5.00 / 1M tokens [1]

For large repos it might get expensive, but many will have orgs footing the bill.

To satisfy my curiosity, I made a fresh Rails app and it has 9,141,023 characters. Dividing by 5 to estimate tokens (a wild guess) gives roughly 1.8M tokens, so scanning an entire Rails app comes to a little over $9. Not nothing, but not back-breakingly expensive. Scanning could be reserved for non-trivial PRs and important applications where vulnerabilities are especially likely (so perhaps not static sites, demos, or apps that aren't in production or another live environment).
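For anyone who wants to redo the back-of-the-envelope math with their own numbers, here is a minimal sketch. The function name and parameters are made up for illustration; the 5 characters per token ratio is the same wild guess as above, not a measured value, and the price is the US$5.00 / 1M tokens figure quoted earlier.

```python
def estimate_scan_cost(
    num_chars: int,
    chars_per_token: float = 5.0,          # assumption: rough guess, not measured
    usd_per_million_tokens: float = 5.00,  # price quoted in the comment above
) -> float:
    """Return a rough USD cost for sending num_chars of source to the model once."""
    tokens = num_chars / chars_per_token
    return tokens / 1_000_000 * usd_per_million_tokens


# The fresh Rails app from the comment: 9,141,023 characters.
cost = estimate_scan_cost(9_141_023)
print(f"${cost:.2f}")  # prints $9.14
```

This only counts input tokens for a single pass. Multiple chained calls or long model outputs would push the real number higher.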

Scanning only new or edited files, along with those they interact with (where sensible patterns could emerge over time) could decrease the total volume of files needing scans, thereby reducing costs.
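A rough sketch of the "only new or edited files" part, assuming the scan runs inside a git checkout. The function names, the `origin/main` base ref, and the extension list are hypothetical choices for illustration. Finding the files that the changed ones interact with is not covered here.

```python
import subprocess


def parse_name_only(output: str) -> list[str]:
    """Turn `git diff --name-only` output into a list of paths."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def select_for_scan(paths: list[str], extensions: tuple[str, ...] = (".rb", ".erb")) -> list[str]:
    """Keep only paths with extensions we care about (the list is an assumption)."""
    return [p for p in paths if p.endswith(extensions)]


def changed_files(base: str = "origin/main", head: str = "HEAD", repo_dir: str = ".") -> list[str]:
    """List files added, copied, modified, or renamed between base and head."""
    result = subprocess.run(
        ["git", "diff", "--name-only", "--diff-filter=ACMR", f"{base}...{head}"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_name_only(result.stdout)


if __name__ == "__main__":
    # Only these files would be sent to the scanner.
    for path in select_for_scan(changed_files()):
        print(path)
```

The three-dot form `base...head` compares against the merge base, so changes that landed on the base branch after the PR branched off are not counted.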

[1] https://openai.com/api/pricing/

Or $2.50 / million tokens if you run it in batch mode (results in up to 24 hours, though in practice much faster than that): https://platform.openai.com/docs/guides/batch/getting-starte...
You don’t need this per commit unless you’re especially paranoid; you can easily do it on a per-release basis. The problem is that right now it’s still all wasted cost: this thing can’t really do what’s needed.