by borapdx 1547 days ago
The principal step will be to move on from keywords and indexing, which are now a legacy technology, almost 26 years after Google started it all.

Returning blue links is a thing of the past, as the Web of yesterday is long gone. Blue links were always about surfing, i.e. following hyperlinks for its own sake, on the premise that most of them were of high quality and quickly proliferating.

All that is gone now, and links have become a promotional device, a way to get paid one way or another. This is why Google results have been deteriorating, despite tens of trillions of archived pages on the Web. Google lost its principal ranking signal years ago.

The next huge-scale smart information system will be based on dense vectors (a few hundred dimensions), as in AI, but the key will be a much bigger scale: (tens of) billions of vectors. Contemporary AI works on datasets 4-5 orders of magnitude smaller, getting bogged down in gigantic transformer models such as GPT-3, with 175B+ parameters, that take weeks and millions of dollars just to train. One might wonder how much innate knowledge such a huge model has; it is not much, as one can see for oneself now that GPT-3 is open (until Apr 1).
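
To make "based on dense vectors" concrete, here is a minimal sketch of retrieval by cosine similarity over a toy index. Everything in it is an assumption for illustration: the names `cosine`, `top_k` and `index`, the document ids, and the 3-dimensional vectors standing in for a few hundred dimensions. It scans every vector, which would not be feasible at billions of vectors; a system at that scale would presumably need some form of approximate nearest-neighbor index instead.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the ids of the k vectors most similar to the query (brute-force scan)."""
    scored = ((cosine(query, vec), doc_id) for doc_id, vec in index.items())
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]


# Hypothetical toy index: document id -> dense vector.
index = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.3, 0.1],
}

results = top_k([1.0, 0.0, 0.0], index, k=2)
print(results)
```

No keyword matching or inverted index is involved here: ranking comes purely from vector geometry.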

The future will be based on embeddings that are NOT contextualized, i.e. no separate vectors for different senses in superposition. Such systems will be based on neither ads nor tracking, as the resources required will be orders of magnitude smaller than what Google currently requires.
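
One way to read "NOT contextualized" is sketched below: each word has exactly one fixed vector no matter which sentence it appears in, and a text is embedded by averaging the vectors of its words. The table `STATIC`, its 2-dimensional values, and the helper `embed_text` are made up for illustration, and averaging is just one simple pooling choice, not something the comment specifies.

```python
# Hypothetical static embedding table: one fixed vector per word.
# "bank" sits between "river" and "money", so both senses share a single vector.
STATIC = {
    "bank": [0.5, 0.5],
    "river": [0.0, 1.0],
    "money": [1.0, 0.0],
}


def embed_text(text, table=STATIC):
    """Average the static vectors of the known words; None if no word is known."""
    vecs = [table[w] for w in text.lower().split() if w in table]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


v1 = embed_text("river bank")
v2 = embed_text("money bank")
print(v1, v2)
```

The vector for "bank" is identical in both calls; only the neighboring words shift the averaged result. A contextualized model would instead produce a different vector for "bank" in each sentence, which is where the large transformer cost comes in.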