by Aachen 588 days ago
What worries me most, at least with the current level of LLMs, is this kind of not having to think about the implementation, especially in a language that we have by now well established can't be written safely by humans (including by Google's own research into Android vulnerabilities, if I'm not mistaken).

Time will tell whether it outputs worse, equal, or better quality than skilled humans, but I'd be very wary of anything it suggests beyond obvious boilerplate (like all the symbols needed in a for loop) or naming things (function name and comment autocompletes like the person above you described)

> worries me the most

It isn't something I worry about at all. If it doesn't work and starts creating bugs and horrible code, the best places will adjust to that and it won't be used or will be used more judiciously.

I'll still review code like I always do and prevent bad code from making it into our repo. I don't see why it's my problem to worry about. Why is it yours?

Because I do security audits

Functional bugs in edge cases are annoying enough, and I seem to run into these regularly as a user, but there's yet another class of people who create edge cases for their own purposes. The nonchalant "if it doesn't work"... I don't know whether that confirms my suspicion that not all developers are even aware of the risks, let alone control for them.

And especially if it generates bugs in ways different from humans - human review might be less effective at catching it...

It generates bugs in pretty similar ways. It’s based on human-written code, after all.

Edge cases will usually be the ones to get through. Most developers don’t write tests that correctly exercise the limits of each input (or indeed have time both to unit test every function that way and to integration test to be sure the bigger stories are working correctly). Nothing about AI assistance changes any of this.

(If anybody starts doing significant fully unsupervised “ai” coding they would likely pay the price in extreme instability so I’m assuming here that humans still basically read/skim PRs the same as they always have)

Except that no one trusts Barney down the hall who has Stack Overflow open 24/7. People naturally trust AI implicitly.

It's worrying, yes, but we've had Stack Overflow copy-paste coding for over a decade now already, which has exactly the same effects.

This isn't a new concern. Thoughtless software development started a long time ago.

As a security consultant, I think I'm aware of security risks all the time, including when I'm developing code just as a hobby in my spare time. I can't say that I've come across a lot of Stack Overflow code that was unsafe. It has happened (like unsafe SVG file upload handling advice), and I know of analyses that find it in spades, but I personally correct the few that I see (I've got enough Stack Overflow rep to downvote, comment, or even edit without the user's approval, though I'm not sure I've ever needed that). The ones found in studies may be in less-popular answers that people don't come across as often, because otherwise we should be seeing more of them, both personally and in customers' code.

So that's not to say there is nothing to be concerned about on Stack Overflow, just that the risk seems manageable and understood. You also nearly always have to fit the answer to your own situation anyway. With the custom solutions from generative models, none of this is established yet, and if the model made a plausible-looking suggestion you don't have to customise it (and so look at it) any further.

Perhaps this way of coding ends up introducing fewer bugs. Time will tell, but we all know how many wrong answers these things generate in text, as well as what they were trained on, which gives grounds for worry while we gather experience, of course. I'm not saying to not use it at all. It's a balance and something to be aware of.

I also can't say that I find it to be thoughtless when I look for answers on Stack Overflow. Perhaps as a beginning coder you might copy bigger bits, or copy without knowing what the code does? That's not my current experience, though.