by quantadev 391 days ago
Thought Experiment: Imagine if we had a true "Perma-Web" where something like IPFS was keeping a permanent record of every web page. Then we could have a "Semantic Web" where each webpage maps to a single point (its vector embedding) in a higher-dimensional space.

This would mean you could do things like:

1) Write a blog post, and then find all the other blog posts ever written that are 'closest to yours'.

2) Do basic "Search" in a way that's probably more powerful than even Google's PageRank algorithm, by being able to look up every web page that exists based on cosine similarity.
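To make the "closest to yours" lookup concrete, here is a minimal sketch in plain Python. The page names and the 3-dimensional vectors are made up for illustration; real embeddings would come from a model and have hundreds or thousands of dimensions, and a real index would not scan every page.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def nearest_pages(query_vec, page_vecs, k=2):
    """Return the k pages most similar to query_vec as (url, score) pairs."""
    scored = [(cosine_similarity(query_vec, vec), url)
              for url, vec in page_vecs.items()]
    scored.sort(reverse=True)
    return [(url, score) for score, url in scored[:k]]

# Hypothetical pages with toy embeddings (assumption, not real data).
pages = {
    "blog/a": [0.9, 0.1, 0.0],
    "blog/b": [0.8, 0.2, 0.1],
    "blog/c": [0.0, 0.1, 0.9],
}
my_post = [1.0, 0.0, 0.0]

top = nearest_pages(my_post, pages, k=2)
for url, score in top:
    print(url, round(score, 3))
```

This is a brute-force scan, which is fine for a handful of pages; at web scale you would presumably need some approximate nearest-neighbor index instead.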

It's a shame Web3 didn't really ever "go viral" in a big way, or else we'd be able to do this stuff right now.

2 comments

Cosine similarity alone can't give you good results. For example, if you want to search specific names or acronyms, cosine similarity won't help much.

People act like embeddings are all you need for search.
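A rough sketch of the point about names and acronyms, assuming a simple hybrid of exact keyword overlap and cosine similarity. The documents, the 2-dimensional vectors, and the `alpha` weight are all invented for illustration; the vectors are deliberately chosen so that the embedding alone ranks the wrong page first.

```python
import math
import re

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def keyword_score(query, text):
    """Fraction of query tokens that appear verbatim in the text."""
    q = set(re.findall(r"\w+", query.lower()))
    t = set(re.findall(r"\w+", text.lower()))
    return len(q & t) / len(q) if q else 0.0

def hybrid_score(query, query_vec, text, text_vec, alpha=0.5):
    """Weighted mix of embedding similarity and exact keyword overlap."""
    return (alpha * cosine_similarity(query_vec, text_vec)
            + (1 - alpha) * keyword_score(query, text))

# Hypothetical documents: (text, toy embedding). Not real model output.
docs = {
    "doc1": ("IPFS pins content by hash", [0.6, 0.8]),
    "doc2": ("Distributed storage overview", [0.9, 0.44]),
}
query, query_vec = "IPFS", [1.0, 0.0]

by_embedding = sorted(
    docs, key=lambda d: cosine_similarity(query_vec, docs[d][1]), reverse=True)
by_hybrid = sorted(
    docs, key=lambda d: hybrid_score(query, query_vec, *docs[d]), reverse=True)

print("embedding only:", by_embedding)
print("hybrid:", by_hybrid)
```

With these toy numbers the embedding-only ranking puts doc2 first even though only doc1 mentions the acronym, and the keyword term flips the order. Production systems typically use something like BM25 for the lexical side rather than raw token overlap.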

The curse of dimensionality also means what you think is the most similar is not necessarily the most similar thing in vector space. See the last HN discussion on word embeddings for some examples.
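One way to see the dimensionality effect is to measure how spread out cosine similarities are between a query and a set of points as the dimension grows. This sketch uses random Gaussian vectors, which is an assumption made purely for illustration: real embeddings are not random, so this only shows the general concentration tendency, not how any particular model behaves.

```python
import math
import random

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def similarity_spread(dim, n_points=200, seed=0):
    """Gap between the best and worst cosine similarity to a random query."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    query = [rng.gauss(0, 1) for _ in range(dim)]
    sims = []
    for _ in range(n_points):
        point = [rng.gauss(0, 1) for _ in range(dim)]
        sims.append(cosine_similarity(query, point))
    return max(sims) - min(sims)

for dim in (2, 10, 100, 1000):
    print(dim, round(similarity_spread(dim), 3))
```

As the dimension increases the spread shrinks, so the "nearest" and "farthest" random points end up with similar scores and the ranking carries less signal.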

I'm not claiming Cosine Similarity can do things it can't do. I'm claiming it's useful to find related pages, in a very powerful way, and I'm correct.

You literally wrote it can be more powerful than Google search.

Because Cosine Similarity is that powerful.

Not the exact implementation you suggest, but similar:

https://news.ycombinator.com/item?id=43797896

The idea is to then allow bloggers to link relevant posts so readers can easily traverse the web.

Lots of cool information there that's right in my wheelhouse, and I'll be going through it. Thanks!