by eckesicle 124 days ago
My experience has been a mixed bag.

AI has led us into a deep spaghetti hole in one product where it was allowed free rein. But when applied to localised contexts, sort of a class at a time, it’s really excellent and productivity explodes.

I mostly use it to type out implementations of individual methods after it has suggested interfaces that I modify by hand. Then it writes the tests for me too, very quickly.

As soon as you let it do more though, it will invariably tie itself into a knot - all the while confidently asserting that it knows what it’s doing.


On localised context stuff, yeah no. I spent a couple of hours rewriting something Claude did terribly a couple of weeks back. Sure it solved the problem, a relatively simple regression analysis, but it was so slow that it crapped out under load. Cue emergency rewrite by hand. 20s latency down to 18ms. Yeah it was that bad.
For me it's just wildly unpredictable. Sometimes it gets a small task perfectly right in one shot, sometimes it invents an absurd new way to be completely wrong.

Anyone trusting it to just "do its own thing" is out of their mind.

For me I would ask it to do a simple thing and it would give me the tutorial code you could find anywhere on the Internet. Then you ask it to modify it in a way that you can't find in any example online, it will tell you it's fixed everything, but actually nothing has changed at all or it's completely broken.

I think if someone's goal was just the tutorial code, it would have been very impressive to them that the AI can summon it.

It only takes a cursory knowledge of what LLMs really are to understand why recreating tutorials is easy, but making actual new stuff that is well engineered (takes way way more than "passes the test suite") is difficult.

Actual novel stuff is so far out on the long tail of iterations that it's a gamble: it might pop up in an early run, or might take 2000 prompts and $20,000 worth of tokens. And it's still not really engineered, it's 10,000 monkeys with typewriters copying random Shakespeare snippets off the chalkboard. At some point you'll get all of Hamlet, but most of the time you'll get garbage, and sometimes you'll get Romeo & The Taming of The Tempest.

This is what I've mostly been using the freebie Gemini chat for: example code, like reminding me of C stdlib stuff, JavaScript, a bit of web server stuff here and there. I think it would be fun to give Google's agent or CLI stuff a spin, but when I read up here and there about Antigravity, I'm reading that people are getting their accounts shut down for stuff I would have thought was OK, even if they paid for it (well, actually, as usual, the actual reasons for accounts getting zapped remain unknown, as is today's trend for cloud accounts).

I'm too poor for local LLMs. I think there might be a 2 or 4 GB graphics card in one of my junk PCs, but that's about it lol

I found that unpredictability to be interesting. I'm doing super simple projects with these models, and a year or even six months ago, it would give me a block of code and as soon as you ran it, it would fail. And you'd have to paste the error in and keep going until it was smoothed out.

The other day though I asked for something simple and it one-shotted the problem. To me, that's new.

I know this success was a statistical outlier, however. I grok how to use it and to not trust it. I'm just shocked so many smart people fail to understand it.