Hacker News
by dmje 6 days ago
Great piece, well written and succinctly sums up my thoughts.

The bit I still don’t understand is how we all put up with the hallucinations. I was questioning Gemini last night about whether it could analyse a Four Tet song and give me a breakdown of the structure from beginning to end. “Sure!” it said, with the endless enthusiasm you get from gen AI tools, and then proceeded to spit out an absolute sack of fabricated shit. I pushed back, it apologised, and then it generated more crap that had nothing to do with reality. I pushed back again, we looped again, and it was still total fiction: “the drums don’t come in until bar 16” on a song that opens with a drum loop, that kind of crap.

We’re so so far away from tools here that are anywhere near being trustworthy and accurate. And yet we (including myself) are chunking out code after code. It’s so bizarre.

I’m guessing it’s that humans don’t have the capacity to deal with this kind of scenario - it’s like having a junior staff member who is utterly incredible 90% of the time, completely convincing in their certainty and skill level, and then 10% of the time you catch them doing a shit in their desk drawer because they couldn’t be arsed to walk to the toilet. AIs are basically sociopaths.


> We’re so so far away from tools here that are anywhere near being trustworthy and accurate. And yet we (including myself) are chunking out code after code. It’s so bizarre.

I think one more thing this whole LLM charade of the last few years has revealed is that no-one really cares. As long as it "looks" like it works, it turns out, it's all fine.

Convenience trumps quality.

Bic pens. Disposable razors. Whipped cream in a can.

Add LLM code to the list.

Seems correct. Weird thing is, every single piece of software that I use feels like it got shittier in the last couple of years.

But I am not using more software. I mean, the codebases might have gotten larger, but the count of tools/services I use is basically the same.

So this feels more like giving up nice handcrafted fountain pens for bic pens. But I am still using a couple pens overall. So no added convenience, just shittier quality.

“Looks like code. Feels like code. Works (mostly) like code. It’s code”

Doesn’t the article make the argument that since you can write tests, this is not as much of a problem for code gen?

It’s arguable whether it is a foolproof solution (I don’t think it is), but it definitely makes it look like you can build a harness around the stochastic machine that will validate the correctness of the generated randomness.

Monkeys and typewriters, when you can quickly validate whether the output is Shakespeare or not, is a costly but theoretically feasible scenario. No?
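
To make the harness idea concrete, here is a rough Python sketch of a generate-and-validate loop. Everything in it is hypothetical: `unreliable_generator` is just a stand-in that picks a random candidate function (it is not a real LLM call), and `passes_tests` is a toy test suite for an "add two numbers" function.

```python
import random
from typing import Callable, Optional, Tuple

Candidate = Callable[[int, int], int]


def passes_tests(candidate: Candidate) -> bool:
    """The harness: a fixed set of checks the candidate must satisfy."""
    cases = [((1, 2), 3), ((0, 0), 0), ((-4, 9), 5)]
    return all(candidate(*args) == expected for args, expected in cases)


def unreliable_generator(rng: random.Random) -> Candidate:
    """Hypothetical stand-in for a code generator: usually wrong, sometimes right."""
    options = [
        lambda a, b: a - b,
        lambda a, b: a * b,
        lambda a, b: a + b,  # the only option that is actually correct
        lambda a, b: a,
    ]
    return rng.choice(options)


def generate_until_valid(
    rng: random.Random, max_attempts: int = 50
) -> Tuple[Optional[Candidate], int]:
    """Keep sampling candidates until one passes the harness or we give up."""
    for attempt in range(1, max_attempts + 1):
        candidate = unreliable_generator(rng)
        if passes_tests(candidate):
            return candidate, attempt
    return None, max_attempts


if __name__ == "__main__":
    candidate, attempts = generate_until_valid(random.Random())
    if candidate is None:
        print(f"gave up after {attempts} attempts")
    else:
        print(f"accepted a candidate after {attempts} attempt(s)")
```

The catch is the one already raised above: the loop only rejects what the tests can see. A wrong candidate that happens to satisfy every case in `passes_tests` would be accepted just the same, so the harness is only as trustworthy as its test cases.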

Yeh I think you’re probably right. But in the wider use of these tools - less so. Yet the uptake for report writing and email sending and …whatever else - massive.