>A crawler need to differentiate generated content from "real" content somehow.

"Somehow" here means spending computing power and storing the results, and that still turns into an explosion of computing time and data storage. I mean, what is the difference between the example I listed and Facebook's front page? They are both 'real' content in a generated format.

And a converse argument for your Sanskrit example: what if I have the Sanskrit numeral and don't know what it is? I put it into Google, and the site tells me it is the number one.

> linked somewhere on it

And those links can all be generated by algorithms.

Anyway, back to your original statement. There is no 'real' content. Only data exists. Most content systems used on the internet allow this data to be combined and displayed in a multitude of different ways depending on the call method and the attributes of the viewer. Many times these combinations of data can present novel value to the user. And with the future only bringing more automated data collection and presentation methods, search engines have lost this battle.
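
To put rough numbers on the "multitude of different ways" point, here is a toy sketch. Everything in it is made up for illustration (the record count, the parameter names, the URL scheme); it only shows how a small data set multiplies into a large number of distinct pages once a few view parameters are allowed to vary.

```python
from itertools import product

# Hypothetical site: all names and values below are assumptions, not a real system.
RECORD_IDS = range(1, 1001)               # 1,000 underlying data records
LANGUAGES = ["en", "de", "sa"]            # display language
LAYOUTS = ["list", "grid"]                # presentation format
UNITS = ["metric", "imperial", "raw"]     # how values are rendered
VIEWER_SEGMENTS = ["anon", "member", "admin"]  # attribute of the viewer

def generated_urls():
    """Yield every distinct URL this hypothetical site could serve."""
    for rec, lang, layout, units, seg in product(
        RECORD_IDS, LANGUAGES, LAYOUTS, UNITS, VIEWER_SEGMENTS
    ):
        yield f"/view?id={rec}&lang={lang}&layout={layout}&units={units}&as={seg}"

url_count = sum(1 for _ in generated_urls())
print(len(RECORD_IDS), "records ->", url_count, "distinct pages")
# prints: 1000 records -> 54000 distinct pages
```

Each extra parameter multiplies the page count, and a crawler looking only at URLs has no way to tell which of those 54,000 pages a user would consider 'real'.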