by solotronics 1336 days ago
It would be interesting to save every single thing you ever see online to a local cache and index it using machine learning so you can search by topic. This is the kind of storage I imagine you could use for something like that.
3 comments

I do this by printing-to-PDF every web page I've ever read for longer than 30 seconds. 70,000+ files since 1997; it's my own private internet. I take great delight in being able to grep for subjects through this directory.
Yeah, seconding the question as to what you use to do this?

Also I’m curious how large the collection is in total file size.

What do you use to do that?
You can also run Recoll on it: an awesome full-text-searchable personal history and knowledge base.

grep is nice, but Recoll gives you stemming and such.
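
To make the stemming point concrete, here is a rough sketch of why a stemmed index finds things a literal grep misses. This is not how Recoll works internally; `crude_stem`, `build_index`, and `search` are hypothetical names, and the suffix-stripping rule is a toy stand-in for a real stemmer.

```python
import re
from collections import defaultdict

# Toy suffix list, illustrative only; real stemmers use far richer rules.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word: str) -> str:
    """Lowercase the word and strip one common English suffix, if any."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each stemmed token to the set of document names containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, text in docs.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[crude_stem(token)].add(name)
    return index


def search(index: dict[str, set[str]], query: str) -> list[str]:
    """Return the sorted names of documents matching the stemmed query word."""
    return sorted(index.get(crude_stem(query), set()))
```

With documents containing "Indexing local archives" and "He indexed the cache", a literal search for "indexes" matches neither, while `search(index, "indexes")` returns both because all three forms reduce to "index".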

> ever see online

Or ever seen. Screen-record the desktop at 1080p with a codec optimized for text, maybe even grayscale with small samples of color data. Then have the system index the content (text, label graphics). When viewing, the system would upscale, colorize, and interpolate frames. You could probably throw in a neural audio codec to store speech/audio.

I think there are at least a dozen comments on HN suggesting the same, with 10x as many upvotes. It would be a great feature to add to a NAS.
I have an idea for implementing this: it would just need a DNS server, the routing techniques used in CDNs, and SSL decryption with a custom cert.