Hacker News new | ask | show | jobs
by qsera 68 days ago
> it's looking like assessment and evaluation are massive bottlenecks.

So I think LLMs have moved the effort that used to be spent on fun part (coding) into the boring part (assessment and evaluation) that is also now a lot bigger..

3 comments

You could build (code, if you really want) tools to ease the review. Of course we already have many tools to do this, but with LLMs you can use their stochastic behavior to discover unexpected problems (something a deterministic solution never can). The author also talks about this when talking about the security review (something I rarely did in the past, but also do now and it has really improved the security posture of my systems).

You can also setup way more elaborate verification systems. Don't just do a static analyis of the code, but actually deploy it and let the LLM hammer at it with all kinds of creative paths. Then let it debug why it's broken. It's relentless at debugging - I've found issues in external tools I normally would've let go (maybe created an issue for), that I can now debug and even propose a fix for, without much effort from my side.

So yeah, I agree that the boring part has become the more important part right now (speccing well and letting it build what you want is pretty much solved), but let's then automate that. Because if anything, that's what I love about this job: I get to automate work, so that my users (often myself) can be lazy and focus on stuff that's more valuable/enjoyable/satisfying.

Fuzz testing has existed long before LLMs...
When writing banal code, you can just ask it to write unit tests for certain conditions and it'll do a pretty good job. The cutting edge tools will correctly automatically run and iterate on the unit tests when they dont pass. You can even ask the agent to setup TDD.
Cars removed the fun part (raising and riding horses) and automatic transmissions removed the fun part (manual shifting), but for most people it's just a way to get from point A to B.