I also built such an extension half a year ago [0]. The first iteration was a local-first, natural-language full-text search for the browser history [1]. The second iteration focused on bookmarks [2].
None of these could spark enough interest to get feedback on what users want. I am sharing this experience so that you may study my attempt, if you want.
I love it. I wrote an RFS[0] based on a similar idea. The big value prop in my mind isn’t the single user experience but rather automated trust signals on content quality based on access by members of your network graph and their proximity.
Hopefully you, the author, and others continue to work on this idea.
Thank you! Will take a look at these. We mostly made this to solve something for ourselves and as a learning exercise, but we're hoping it resonates with others too.
This is great, but I just want simple full-text search over all my history. Not title and URL search, but full text. If it has semantic embeddings on top, all the better. I am losing too many of the things I find.
Wondering why browsers neglected bookmarks and search history so much. They never progressed in the last 2 decades. Storage is cheap, computers are fast and multi-core, yet we live with the mentality of paucity and don't save our digital crumbs.
Thank you! Yeah, history is something we've been asked about multiple times now. I'm sure this could be extended easily once we solve a few things (scraping pages faster, being smarter about what text we embed, parsing out irrelevant stuff). Will keep you posted.
Curious how this works. I've experimented with in-browser vector search using victor[1] with mixed results. Hadn't heard of this orama lib before checking out your project.
So the extension scrapes all your bookmarks' content, selects key parts of it (we have a naive heuristic for now), embeds them using Sentence Transformer, and indexes them in the browser local storage with Orama's vector DB. When you want to search, we embed the query and do a vector search against the index to get the semantically most similar ones. All in-browser so no data going to any API.
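For anyone who wants to picture the embed-then-search step described above, here is a minimal sketch. It is not the extension's actual code and does not use Orama's or Sentence Transformer's APIs: `embed` is a toy hashed bag-of-words stand-in for a real embedding model (so it only matches shared words, not meaning), and `buildIndex`, `search`, and the example URLs are hypothetical names made up for illustration. The search itself is a brute-force cosine-similarity scan, which is one simple way to rank by vector similarity.

```typescript
// Toy stand-in for a sentence-embedding model (assumption, not the real thing):
// each word is hashed into one of DIM buckets, giving a bag-of-words vector.
const DIM = 64;

function embed(text: string): number[] {
  const vec = new Array<number>(DIM).fill(0);
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    let h = 0;
    for (let i = 0; i < word.length; i++) {
      h = (h * 31 + word.charCodeAt(i)) % DIM;
    }
    vec[h] += 1;
  }
  return vec;
}

// Cosine similarity between two equal-length vectors; 0 if either is all zeros.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

interface IndexedBookmark {
  url: string;
  text: string;
  vector: number[];
}

// "Indexing": embed the selected text of every bookmark once, up front.
function buildIndex(pages: { url: string; text: string }[]): IndexedBookmark[] {
  return pages.map((p) => ({ ...p, vector: embed(p.text) }));
}

// "Searching": embed the query, score every bookmark, return the top k.
function search(index: IndexedBookmark[], query: string, k = 3) {
  const q = embed(query);
  return index
    .map((b) => ({ url: b.url, score: cosine(q, b.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Usage with made-up bookmarks.
const demoIndex = buildIndex([
  { url: "https://example.com/cake", text: "chocolate cake recipe" },
  { url: "https://example.com/db", text: "rust vector database wasm" },
]);
console.log(search(demoIndex, "vector database", 2));
```

A linear scan like this is fine for a few thousand bookmarks; an approximate index such as HNSW only starts to matter at much larger sizes.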
I haven't tried Victor. Is it just for the Node.js runtime, or does it run at the edge as well? Orama's been pretty good, at least semantically. Haven't done any speed benchmarking, so not sure if it's as fast as, say, HNSW.
I would assume it runs on the edge, since it runs in the browser via WASM. It's also implemented in Rust, which provides some flexibility. It will likely depend on the specific edge runtime though. The runtime would need something resembling a file system.
The thing I like about Victor is that it uses OPFS for storage when running in the browser, meaning it doesn't keep everything in memory. I looked at Orama a bit and from my reading I think they keep everything in memory, although I didn't dig too deep and would like to be wrong about that.
This is similar to Voy [1], which runs entirely in memory.
Unfortunately, for my own usage running in memory feels like a non-starter since the data size is unbounded.
Why is this necessary? I can just export my bookmarks, use ChatGPT to create a Python script to download all of their contents, put it all in a big text file, and then CMD+F what I am looking for.
[0] https://getpinbot.com/
[1] https://www.youtube.com/watch?v=GYwJu5Kv-rA
[2] https://www.youtube.com/watch?v=PQh1qhvxZzc