

by spdustin 1008 days ago

Oh, it’s definitely ambiguity. The attention weight any given token gets varies with its context, and less-ambiguous terms are more likely to be used “near” the other terms that matter. For example, if you tell GPT not to ‘omit’ code from a code sample, it has to disambiguate the meaning of omit. Tell it not to ‘elide’ any code, and it performs a lot better. “Prompt engineering” is far more linguistic than people seem to realize. It’s not just “say what you mean”; the model has an easier time when you “say what you mean in the most linguistically precise way possible”. Simplified, but workable: it’s a matter of finding less ambiguous, more context-specific tokens/words with a better tf-idf in the pre-training corpus, without getting too esoteric.
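To make the tf-idf heuristic a bit more concrete, here’s a rough sketch of the “idf” half: scoring candidate words by how few documents they appear in. Everything in it is an assumption for illustration. The corpus is a handful of made-up sentences (not real pre-training data), and `inverse_document_frequency` and `rank_candidates` are hypothetical helper names. It only shows rarity as a proxy for specificity, and says nothing about how a model’s attention actually behaves.

```python
import math

# Toy stand-in for a pre-training corpus. These sentences are invented
# purely for illustration; real corpus statistics would differ.
TOY_CORPUS = [
    "please omit the salt from the recipe",
    "do not omit your middle name on the form",
    "the editor chose to omit the final chapter",
    "the compiler may elide the copy of the returned object",
    "omit the header row when you import the file",
]


def inverse_document_frequency(term, documents):
    """Smoothed IDF: log((1 + N) / (1 + df)) + 1.

    N is the number of documents and df is the number of documents
    that contain the term as a whitespace-separated word.
    """
    doc_count = sum(1 for doc in documents if term in doc.split())
    return math.log((1 + len(documents)) / (1 + doc_count)) + 1.0


def rank_candidates(candidates, documents):
    """Sort candidate words from rarest (highest IDF) to most common."""
    return sorted(
        candidates,
        key=lambda term: inverse_document_frequency(term, documents),
        reverse=True,
    )


if __name__ == "__main__":
    for word in rank_candidates(["omit", "elide"], TOY_CORPUS):
        print(word, round(inverse_document_frequency(word, TOY_CORPUS), 3))
```

In this toy corpus ‘elide’ shows up in one document and ‘omit’ in four, so ‘elide’ ranks first. A word that never appears at all would score highest of all, which is the “without getting too esoteric” caveat: rarity alone isn’t the goal.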

Another example: storytelling prompts that include “I dislike open-ended conclusions and other rhetorical hooks” often result in fewer (or no) closing statements like “as night fell, they wondered about their future.”

Edit: GPT-4 is surprisingly good at answering these things if asked to: https://chat.openai.com/share/b97ad65f-f005-49b4-a64e-eb537d...