by rightbyte
2611 days ago
True. A crawler needs to differentiate generated content from "real" content somehow. E.g. a service:
www.thenumberinsanskrit.com/?q=1
that returns the queried number in Sanskrit should not be indexed (except the entry page), while:
www.news.com/?article=major-jones-in-scandal-20190103
needs to be indexed. Usually interesting pages are indexed on the site or linked somewhere on it, though.
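A rough sketch of that heuristic, not how any real crawler decides: index entry pages (no query string) unconditionally, and index query-string URLs only if a link to them was found somewhere on the site. The function name `should_index` and the `linked_urls` set are made up for illustration, and the `https://` scheme is added to the example URLs only so they parse cleanly.

```python
from typing import AbstractSet
from urllib.parse import urlsplit


def should_index(url: str, linked_urls: AbstractSet[str]) -> bool:
    """Hypothetical rule: index entry pages, plus query URLs the site links to."""
    parts = urlsplit(url)
    if not parts.query:
        # No query string: treat it as an entry page and index it.
        return True
    # Query-string URL: index it only if a link to it was discovered on the site.
    return url in linked_urls


# Links the crawler is assumed to have found while walking the two sites.
found_links = {"https://www.news.com/?article=major-jones-in-scandal-20190103"}

print(should_index("https://www.thenumberinsanskrit.com/", found_links))
print(should_index("https://www.thenumberinsanskrit.com/?q=1", found_links))
print(should_index("https://www.news.com/?article=major-jones-in-scandal-20190103", found_links))
```

The whole rule rests on the assumption that the set of discovered links is itself meaningful.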
|
"Somehow" means spending computing power and storing results, and that still turns into an explosion of computing time and data storage. I mean, what is the difference between the example I listed and Facebook's front page? They are both 'real' content in a generated format.
And a converse argument for your Sanskrit example: what if I have the Sanskrit numeral and don't know what it is? I put it into Google, and the site tells me it is the number one.
> linked somewhere on it
And those links can all be generated by algorithms.
Anyway, back to your original statement. There is no 'real' content. Only data exists. Most content systems used on the internet allow this data to be combined and displayed in a multitude of different ways, depending on the call method and the attributes of the viewer. Many times these combinations of data can present novel value to the user. And with the future only bringing us more automated data collection and presentation methods, search engines have lost this battle.