Hacker News
by Kholo 3322 days ago
Hey Opera folks who might be reading this: please work on simplifying offline access to web content. I want to be able to maintain a ~100 GB offline data store that is full-text searchable. Wikipedia, Stack Exchange/Stack Overflow, Khan Academy, zealdocs and a whole bunch of other sources of useful, browser-renderable web content provide dumps of their data.

And then people end up having to build special apps, extensions and other garbage that work around and hack the cross-domain/local-file policy, just so the content already on my disk can be read and searched by the browser.
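
To illustrate the "read and searched" part: a minimal sketch of a full-text index over pages that are already on disk, written in Python. It assumes the page text has already been extracted from the dump. The `InvertedIndex` class and the document ids are made up for illustration, and at ~100 GB you would want an on-disk index (for example SQLite's FTS5 extension) rather than this in-memory dictionary.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return TOKEN_RE.findall(text.lower())


class InvertedIndex:
    """Hypothetical helper: maps each token to the ids of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index one document under the given id."""
        for token in tokenize(text):
            self.postings[token].add(doc_id)

    def search(self, query):
        """Return the ids of documents that contain every token of the query."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        result = set(self.postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self.postings.get(token, set())
        return result


# Illustrative document ids and text, not real dump contents.
index = InvertedIndex()
index.add("wiki/Opera.html", "Opera is a web browser with offline cache search")
index.add("so/12345.html", "How do I search a local cache in the browser?")
index.add("khan/algebra.html", "Introduction to algebra")

print(sorted(index.search("cache search")))
```

This only does exact-token AND matching. Ranking, stemming and snippet generation are left out on purpose.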

When such a treasure trove of web content is accessible offline the browser can and should be the way to access it.

As the web becomes more and more of a corporate-sponsored attention sink, strong offline support can be a valuable browser feature.

4 comments

There was a time when Opera was not a Google Chrome skin and offered indexed full-page search of the local cache of visited pages. This was incredibly useful and practical. It also had the option to save a whole web page in a single file.

Sadly, this kind of feature disappeared when Opera moved to a quarterly-profit-based strategy.

The kind of feature you want is achievable today, but it requires a set of tools beyond a web browser and some human work. It's also a bit tedious to keep up to date.

This would be possible with a Chrome extension. Extensions can implement their own search engines.

Are you talking about the 12.x versions? The pre-Chrome era?

Whilst I don't know many people who would be in your situation, I do think that an offline, indexed data store could be very useful. Bookmarks, even to this day, are dreadfully implemented in all of the major browsers.

Bookmarks are obsolete and have been badly implemented for decades now.

Bookmarks are subject to link rot, having more than a hundred of them is a timesink nightmare to manage, bookmarks are useless when you are offline... the list goes on and on, and I have yet to see any browser even try to do something to address these.

These are UI complaints that can be addressed with a bit of investment and thought, but it's unfashionable work - Mozilla even tried to outsource it to Pocket, sorta...

> Bookmarks are subject to link rot

That can be fixed pretty easily: when the user bookmarks a page, submit a job to the Internet Archive, then use the archived copy whenever you get an error page from that link. Firefox already does half of it with the experimental no-404 extension. Alternatively, you can have your own Internet Archive-like service to do that, which is basically what services like Pocket and Pinboard do.
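
A rough sketch of that flow in Python, assuming the Internet Archive's Wayback availability endpoint and its "Save Page Now" URL prefix behave as publicly documented. The function names are hypothetical, and rate limits, retries and error handling are ignored.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoints as publicly documented by the Internet Archive (assumed unchanged).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"
SAVE_ENDPOINT = "https://web.archive.org/save/"


def availability_url(page_url):
    """Build the query URL that asks whether a snapshot of page_url exists."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


def save_url(page_url):
    """Build the URL that asks the archive to capture page_url."""
    return SAVE_ENDPOINT + page_url


def closest_snapshot(response):
    """Extract the closest snapshot URL from a decoded availability response."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def resolve_dead_bookmark(page_url, timeout=10):
    """Network call: return an archived copy of page_url, or None if there is none."""
    with urlopen(availability_url(page_url), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))
```

On bookmarking, the browser would request `save_url(url)`. On an error page, it would call `resolve_dead_bookmark(url)` and offer the snapshot if one comes back.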

> having more than a hundred of them is a timesink nightmare to manage

You don't really have to manage them: these days they show up as soon as you type in the address bar. Ideally you'd have some sort of AI-powered categorisation system auto-filing them in dedicated folders, which could probably work with a few specialized meta tags. It's basically the age-old problem of filing documents of any type.

> bookmarks are useless when you are offline

Yeah well, a browser is also kinda useless when offline...

What's wrong with them? Bookmarks work great.

They are not designed to be used as a personal knowledge base or a wiki.

Edit: One of my most used bookmarks is to a specific URL on localhost. I'd say bookmarks are only as useless as the browser itself is while you're offline.

Is this actually possible with an extension or another program? I'd love that!

Kiwix doesn't support Stack Overflow...

http://zestdocs.org/ is an old project of mine with a similar idea that I've since abandoned. If anyone feels like continuing/rewriting it, or perhaps even taking over the domain, GitHub organization, macOS app, or whatever really, you can email me via hn at the website's domain. (Note that I set up the MX records there only a few minutes ago, so it may take some retries to reach this email.)

Kiwix does. But you have to download the dump and run a couple of scripts to convert the XML to the Kiwix archive format (ZIM). They have a repo on GitHub with the scripts. But Kiwix is built on top of XULRunner, which is basically Firefox but is not supported by Mozilla anymore. And that goes back to my original post: why not just bake the offline functionality into the browser?

Kiwix