| HN Mirror

by CoolGuySteve 71 days ago

"give them difficult tasks beyond your own intellect?"

Lol no, I've yet to find a model with those properties. Sounds like a fast track to AI psychosis.

The domain I work in doesn't have enough public documentation for these models to be particularly helpful without a lot of handholding though.

hombre_fatal 71 days ago

I've been working on a luks+btrfs+systemd tool (for managing an encrypted raid1 pool). While I have worked with each individually, it's not obvious what kind of cases you have to handle when composing them together. A lot of it is simply emergent, and the status quo has been to do your best and then see what actually happens at runtime.

Documentation is helpful for describing high-level intentions, but the beauty is when you have access to source code. Now a good model can derive behavior from the implementation instead of from docs, which are inherently limited.

I implemented the luks+btrfs part by hand a few years ago, and I resurrected the project a couple months ago. Using source code for local reference, Claude discovered so many major cases I missed, especially in the unhappy-path scenarios. Even in my own hand-written tests. And it helped me set up an amazing NixOS VM test system, including reproduction tests on the libraries to see what they do in weird undocumented cases.

So I think "tasks beyond our intellect (and/or time and energy)" can be fitting. Otherwise I'd only be capable of polishing this project if luks+btrfs+systemd were specifically my day job. I just can't fit so much in my head and working memory.

zekica 70 days ago

And it can fail in great ways. Latest example: I asked Claude for a non-trivial backup and recovery script using restic. I gave it the whole restic repo and it still made up parameters that don't exist in the code (but do exist in a pull request that's been sitting unmerged for 10+ months).

hombre_fatal 70 days ago

Interesting. I don't think I've seen hallucinations at that level when it's referencing source code.

Though my workflow always starts in plan mode where Claude is clearly more thorough (which is the reason it takes 10x as long as going straight to impl). I rarely skip it.

kccqzy 69 days ago

> I've yet to find a model with those properties

You can just look at examples like Knuth’s “Claude’s Cycles”, where Claude solved the problem. I have no doubt that if Claude didn’t exist, Knuth would perhaps have come up with a solution anyway, but given a limited amount of time/patience, Claude came up with a solution while Knuth did not. That’s what I meant here.

Similarly, the problems I give to Claude are also in that category: I did not come up with a solution myself within a set amount of time, and instead of continuing to work on them manually I decided to give them to Claude.
