by UncleEntity 500 days ago
The problem is I'm not doing anything as complicated as what you're describing.

The task was, and still is, to take a grammar for APL from some long-forgotten paper and turn it into a Lemon parser. Easy peasy, well within Claude's wheelhouse, and it had spectacular initial results with the help of DeepSeek-R1 analyzing its work.

"Oh, good job, robot," me types, "let's work on a lexer. Hmm... you seem to have clipped out some important rules at some point, we need to add those back." Then, boom, Claude is completely worthless.

I want Claude to succeed. It was doing so well, then it hit a self-reinforcing wall of failure that it just can't get over, even though it can analyze its own behavior and say exactly why it keeps failing.

I mean, exactly zero people think the world needs an APL interpreter written by the robots, but the point of the project is to see how far they can get without having a human write a single line of code. I know they have limitations, and I have no problem helping them work around them.

But, alas, this project is shelved until the next big hype cycle.