by codingdave
79 days ago
I fail to understand why people anthropomorphize LLMs. They are word calculators. Sure, impressive ones. Useful ones. But still just calculators. So it should be self-evident that limiting their output will change their output. It may be interesting to note how wide those changes are, but that is all it is.
Also, the original title is "Umwelt Engineering: Designing the Cognitive Worlds of Linguistic Agents". HN frowns on editorializing titles. From the guidelines: "Otherwise please use the original title, unless it is misleading or linkbait; don't editorialize." https://news.ycombinator.com/newsguidelines.html
|
I don't think it's anthropomorphizing to study how vocabulary constraints change reasoning quality. The paper doesn't claim LLMs think. It measures accuracy on tasks with known correct answers under different constraints and finds structured patterns.
"Limiting output changes output" is true but undersells what's happening. If you removed random words from a calculator's input language you'd expect degraded or noisy results. Instead, removing possessive "to have" (so the model can't say "the argument has a flaw" and has to say "the argument fails because...") improves ethical reasoning by 19pp across all three models. Removing "to be" helps Gemini by 42pp on that same task but collapses GPT-4o-mini by 27pp on a different one. The cross-model correlation is r=-0.75, meaning the same restriction systematically helps one model and hurts another.
That's not just different output. The restrictions are forcing different reasoning paths depending on the task and the model. Why specific vocabulary removals produce specific, predictable accuracy changes is the question. Running a 15,600-trial follow-up now to dig into it further.
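For anyone wondering what a figure like r=-0.75 means mechanically, here is a rough sketch of how a cross-model correlation over accuracy changes could be computed. This is not the paper's code or data: the function names, the accuracy values, and the per-condition deltas below are all made-up placeholders, and the paper's exact procedure may differ. The idea is just that each condition (one restriction on one task) gives a percentage-point change per model, and Pearson's r over those paired changes comes out negative when a restriction that helps one model tends to hurt the other.

```python
from math import sqrt


def pp_delta(baseline_acc, constrained_acc):
    """Accuracy change in percentage points (accuracies given as fractions)."""
    return round((constrained_acc - baseline_acc) * 100, 1)


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL per-condition accuracy changes (pp) for two models.
# These numbers are invented for illustration, not taken from the paper.
model_a_deltas = [10.0, -5.0, 20.0, -15.0]
model_b_deltas = [-8.0, 6.0, -12.0, 9.0]

r = pearson_r(model_a_deltas, model_b_deltas)
print(f"cross-model r = {r:.2f}")  # negative: gains for A pair with losses for B
```

A value near -1 would mean the two models respond to the same restrictions in nearly opposite ways; a value near 0 would mean no consistent relationship.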