by gamegoblin
1261 days ago
There are semantic embedding libraries that are fast enough to run on entire webpages in ~hundreds of millis or low single-digit seconds. I've been thinking a lot recently about making a browser plugin that simply does semantic embedding at the paragraph level on every web page I ever visit and stores the vectors in a vector database. This would enable querying my little private search engine with things like "the HN story a few weeks ago that talked about ancient Greek mining techniques" or "the reddit comment that had an analysis comparing Orwell's 1984 to the Bible". For those not familiar, semantic embeddings take a chunk of text and embed it in a high-dimensional vector space (~hundreds of dimensions) where semantically similar texts are closer together.
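To make the idea concrete, here is a rough sketch of the index-and-query loop, not a real implementation. Everything named here (`embed`, `PageIndex`, `DIM`, the example URLs and page text) is made up for illustration. In particular, `embed` is a toy hashed bag-of-words stand-in so the example runs with only the standard library; it matches on shared words, not meaning, and an actual plugin would swap in a real semantic embedding model and a proper vector database.

```python
import hashlib
import math
import re
from collections import Counter

DIM = 256  # toy dimensionality; real models use hundreds of dimensions


def embed(text):
    """Toy stand-in for a semantic embedding model (ASSUMPTION).

    Hashes each word into one of DIM buckets and L2-normalises the result.
    This captures word overlap only, not semantic similarity.
    """
    vec = [0.0] * DIM
    words = re.findall(r"[a-z0-9]+", text.lower())
    for word, count in Counter(words).items():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += count
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class PageIndex:
    """In-memory stand-in for a vector database (hypothetical helper)."""

    def __init__(self):
        self.rows = []  # list of (url, paragraph, vector)

    def add_page(self, url, text):
        # Split on blank lines and embed each non-empty paragraph separately.
        for para in re.split(r"\n\s*\n", text):
            para = para.strip()
            if para:
                self.rows.append((url, para, embed(para)))

    def search(self, query, k=3):
        # Vectors are unit length, so the dot product is cosine similarity.
        q = embed(query)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), url, para)
            for url, para, vec in self.rows
        ]
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:k]


index = PageIndex()
index.add_page(
    "https://example.com/hn-story",
    "A story about ancient Greek mining techniques.\n\nThe silver mines were deep.",
)
index.add_page(
    "https://example.com/reddit-comment",
    "An analysis comparing Orwell's 1984 to the Bible.",
)

hits = index.search("ancient greek mining techniques", k=2)
for score, url, para in hits:
    print(round(score, 3), url, para)
```

A linear scan like this is fine for a few thousand paragraphs; a lifetime of browsing history would probably want an approximate nearest-neighbour index instead.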
Hundreds of millis is what I experienced.