HN Mirror

by sunrunner 405 days ago

> Since an average PhD dissertation is around 70K words (translates to roughly 90-100K tokens in most LLM tokenization schemes), perhaps one benchmark could be whether an AI system can maintain the equivalent context.

This is a really interesting idea. My immediate question about the average dissertation size is how many tokens it would take to represent all of the implicit, unstated knowledge that the dissertation builds on. If the dissertation really is just the tiny bump in the boundary of human knowledge that Matt Might's 'The illustrated guide to a Ph.D.' [1] shows, then what's the token count for everything up to that bump?

> Current AI systems, regardless of context window size, don't truly "understand" information the way humans do. They recognise patterns in data and generate outputs based on statistical relationships, but lack the deeper conceptual understanding, intentionality, and the other characteristics of human intelligence.

Whether or not I'm an AI believer, I'm not sure I could genuinely answer the question 'Do _you_ truly understand information?' if someone posed it to me, as I have no real idea how to measure that. I want to say it's meta-cognition, my ability to think about thinking and reason about my own knowledge, but that starts to feel pretty fuzzy, and I wonder how much of it is anthropocentric thinking.

[1] https://matt.might.net/articles/phd-school-in-pictures/