Hacker News
Language Modeling in the Limit of Infinite Data (matthewfl.com)
7 points by matthewfl 850 days ago
2 comments

This formulation of a language model in the limit is bad: conditioning on what day it is and what you actually ate just assumes that the language model is God, which gives no intuition for how GPT4 will behave.

A much better formulation is "Search the internets of an infinite collection of earths until you have found the exact prompt 100 times (this will require looking at an exponentially growing number of earths as the prompt grows) and then return the frequencies of the next word."
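As a rough illustration of that procedure, here is a minimal Python sketch. Everything in it is an assumption made for the example: `sample_earth` is a hypothetical stand-in that emits uniformly random words from a five-word vocabulary (a real internet is nothing like this), and `next_word_frequencies`, `needed`, and `max_earths` are made-up names. It only shows the counting scheme, not how any actual model is trained.

```python
import random
from collections import Counter

# Toy vocabulary; an assumption for illustration only.
VOCAB = ["i", "ate", "rice", "soup", "today"]

def sample_earth(rng, n_words=2000):
    """Hypothetical stand-in for one earth's internet: uniformly random words."""
    return [rng.choice(VOCAB) for _ in range(n_words)]

def next_word_frequencies(prompt, needed=100, seed=0, max_earths=10_000):
    """Scan earth after earth until the exact prompt has been seen `needed`
    times, then return the relative frequencies of the word that followed it,
    together with the number of earths that had to be searched."""
    rng = random.Random(seed)
    counts = Counter()
    found = 0
    earths = 0
    k = len(prompt)
    while found < needed and earths < max_earths:
        words = sample_earth(rng)
        earths += 1
        # Slide a window over this earth's text looking for exact matches.
        for i in range(len(words) - k):
            if words[i:i + k] == prompt:
                counts[words[i + k]] += 1
                found += 1
                if found == needed:
                    break
    # If the prompt was never found, this is an empty dict.
    freqs = {word: count / found for word, count in counts.items()}
    return freqs, earths

freqs, earths = next_word_frequencies(["i", "ate"])
print(freqs, earths)
```

In this toy setup each extra prompt word makes an exact match about five times rarer, so the number of earths that must be searched grows exponentially with prompt length, which is the point of the parenthetical above.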

This seems to make certain metaphysical assumptions, such as that the apparently deterministic nature of the universe extends flawlessly into reality, which in my experience seems like a bit of a leap of faith.