Hacker News new | ask | show | jobs
by polotics 61 days ago
"Surgical "is the kind of wordage that LLMs seem to love to output. I have had to put in my .md file the explicit statement that the word "surgical" should only be used when referring to an actual operation at the block...
2 comments

you're right, they are tools. that's kind of the point. PAL is a subprocess that runs a python expression. Z3 is a constraint solver. regex is regex. calling them "surgical" is just about when they fire, not what they are. the model generates correctly 90%+ of the time. the guardrails only trigger on the 7 specific patterns we found in the tape. to be clear, the ~8.0 score is the raw model with zero augmentation. no tools, no tricks. just the naive wrapper. the guardrail projections are documented separately. all the code is in the article for anyone who wants to review it.
The core issue is that the LLM is using rhetoric to try to convince or persuade you. That's what you need to tell it not to do.
Which will not work. Don't think of a pink genitalia, I mean elephant...
An LLM that can't follow instructions wouldn't be able to write code anyway.
Nonsense. But even an LLM that can follow instructions cannot follow that one.
What is intrinsic to an LLM or its training that would prevent it from following the directive that it should not try to convince you of something?