Hacker News
by mjburgess 792 days ago
text is not a valid measure of the world, so there is no "informative model", i.e., a model of the data-generating process, to fit it to. there is no sine curve; indeed there is no function from world->text -- there is an infinite family of functions, none of which is uniquely sampled by what happens to be written down
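
a toy sketch of the underdetermination point, by analogy only: it uses made-up numeric samples rather than text, and the sample points, the function names and the choice of sin(pi * x) as the perturbation are all assumptions for illustration. it shows many different functions agreeing on every sample while disagreeing everywhere else, so the samples alone don't pick one out.

```python
import math

# hypothetical sample points: the integers 0..9 (an assumption for this sketch)
xs = list(range(10))

def f(x):
    """the 'sine curve' one might imagine the samples came from."""
    return math.sin(x)

def make_alternative(k):
    """build a different function that matches f at every integer.

    sin(pi * x) is zero at every integer, so adding any multiple k of it
    leaves the sampled values unchanged (up to floating-point error).
    """
    def g(x):
        return math.sin(x) + k * math.sin(math.pi * x)
    return g

def max_gap_on_samples(g):
    """largest disagreement between f and g over the sample points."""
    return max(abs(f(x) - g(x)) for x in xs)

def gap_off_samples(g, x=0.5):
    """disagreement between f and g at a point that was never sampled."""
    return abs(f(x) - g(x))

if __name__ == "__main__":
    for k in (1.0, 5.0, 50.0):
        g = make_alternative(k)
        print(k, max_gap_on_samples(g), gap_off_samples(g))
```

each value of k gives a different function with effectively zero gap on the samples and a gap of about k at x = 0.5. and this is the easy case, where a single function is assumed to exist in the first place.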

transformers, certainly, aren't "informative" in this sense: they start with no prior model of how text would be distributed given the structure of the world.

these arguments all make the radical assumption that we are in something like a physics experiment -- rather than scraping glyphs from books and replaying their patterns