HN Mirror



	by polotics 61 days ago
	"Surgical" is the kind of wordage that LLMs seem to love to output. I have had to put in my .md file the explicit statement that the word "surgical" should only be used when referring to an actual operation at the block...

2 comments

fredmendoza 61 days ago

you're right, they are tools. that's kind of the point. PAL is a subprocess that runs a python expression. Z3 is a constraint solver. regex is regex. calling them "surgical" is just about when they fire, not what they are. the model generates correctly 90%+ of the time. the guardrails only trigger on the 7 specific patterns we found in the tape. to be clear, the ~8.0 score is the raw model with zero augmentation. no tools, no tricks. just the naive wrapper. the guardrail projections are documented separately. all the code is in the article for anyone who wants to review it.
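
To make "only trigger on specific patterns" concrete, here is a minimal sketch of a pattern-gated guardrail. It is not the code from the article: the `ARITHMETIC` trigger, `safe_eval`, and `guarded_answer` are hypothetical names, and an in-process AST evaluator stands in for the PAL subprocess so the example stays self-contained. The model's answer passes through untouched unless the question matches the trigger.

```python
import ast
import operator
import re

# Hypothetical trigger: a question of the form "what is <arithmetic>?"
ARITHMETIC = re.compile(r"^\s*what is\s+([\d\s+\-*/().]+)\?\s*$", re.IGNORECASE)

# Only these binary operators are allowed in the evaluated expression.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def safe_eval(expr: str):
    """Evaluate +, -, *, / on numeric literals without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")

    return walk(ast.parse(expr.strip(), mode="eval"))


def guarded_answer(question: str, model_answer: str) -> str:
    """Return the model's answer unless the arithmetic trigger fires."""
    match = ARITHMETIC.match(question)
    if match is None:
        # No pattern matched: the raw model output is used as-is.
        return model_answer
    try:
        return str(safe_eval(match.group(1)))
    except (ValueError, SyntaxError, ZeroDivisionError):
        # The tool could not handle the input: fall back to the model.
        return model_answer


print(guarded_answer("What is 17 * 23?", "390"))
print(guarded_answer("Summarize this thread", "a summary"))
```

The first call overrides the model's answer with the computed value; the second never touches the tool. A real setup would have one such gate per pattern, and the fallback keeps a tool failure from making things worse than the unaugmented model.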


mrtesthah 61 days ago

The core issue is that the LLM is using rhetoric to try to convince or persuade you. That's what you need to tell it not to do.


throwanem 61 days ago

Which will not work. Don't think of a pink genitalia, I mean elephant...


mrtesthah 56 days ago

An LLM that can't follow instructions wouldn't be able to write code anyway.


throwanem 55 days ago

Nonsense. But even an LLM that can follow instructions cannot follow that one.


mrtesthah 55 days ago

What is intrinsic to an LLM or its training that would prevent it from following the directive that it should not try to convince you of something?
