Hacker News
by ninalanyon 3 days ago
> kage serve $HOME/data/kage/paulgraham.com

If the result is static why does it need a server? Isn't it possible to make it so that it can simply be opened by the browser? Like:

$ firefox $HOME/data/kage/paulgraham.com

Then the result would be usable on machines without kage installed.

3 comments

You could use python -m http.server instead. I haven't tried it yet, but it should work.

Actually, Kage has two parts: a crawler that crawls pages and converts them to clean HTML by capturing the DOM after rendering in Chrome/Chromium, and a pack/serve component that packages the result as either a ZIM file for Kiwix or an executable file.

Usually JavaScript is blocked when you load pages that way.
Not all JavaScript, but a lot of APIs are restricted.
I thought all the JS was stripped?
Since when? You won't be able to make HTTP requests to localhost, as it'd be a different Origin, but I don't think any mainstream browser blocks JS outright when you use file:// to load and view HTML files.
Somewhere around 2019, each document loaded from file:// became its own origin in Firefox: https://bugzilla.mozilla.org/show_bug.cgi?id=1500453 (I didn't check when this happened in Chromium)

Related WHATWG discussion: https://github.com/whatwg/html/issues/3099
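To make the origin point concrete, here is a minimal sketch that compares how the standard URL parser serializes origins for file:// and http:// URLs. The helper names (describeOrigin, sameOrigin) and the example paths are made up for illustration. It only shows the URL parser's view as I understand it (run in Node); a browser's actual handling of file:// documents is its own policy and may differ in the details.

```javascript
// Hypothetical helpers, for illustration only.

// Returns the serialized origin of a URL, or "opaque" when the parser
// reports the string "null" (which is what it does for file: URLs).
function describeOrigin(href) {
  const origin = new URL(href).origin;
  return origin === "null" ? "opaque" : origin;
}

// Two URLs are treated as same-origin here only if both have a real
// (scheme, host, port) origin and those serialize identically.
// An opaque origin never matches anything, not even another opaque origin.
function sameOrigin(a, b) {
  const originA = new URL(a).origin;
  const originB = new URL(b).origin;
  return originA !== "null" && originA === originB;
}

// Example paths below are assumptions, not taken from the thread.
console.log(describeOrigin("file:///home/user/site/index.html"));
console.log(describeOrigin("http://localhost:8000/index.html"));
console.log(sameOrigin("file:///home/user/site/index.html", "file:///home/user/site/app.js"));
console.log(sameOrigin("http://localhost:8000/index.html", "http://localhost:8000/app.js"));
```

This is why serving the same directory over http://localhost gives scripts a shared origin to work with, while two files opened directly from disk do not get one.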

Yeah, but that's fine. The document is .html, and it can load ./app.js or ./style.css just fine even if loaded via file:// (as long as the load isn't initiated by JS itself; then Origin starts to matter a lot more). Otherwise basically every single local HTML file would suddenly be broken, and I don't think anyone would have accepted that, even with the origin changes.
I tried this on a small example and it does indeed work. In my head this would have behaved like a restrictive CSP script-src directive, even if not exposed in response headers or anything.
> I tried this on a small example and it works indeed.

I was thinking "of course it works, how else would people get started creating websites?" then I remembered what the most common approaches in the frontend ecosystem are nowadays.

Back in the days of yore, every tutorial/book started with "First we create an index.html file which you open in your browser ...", even a JavaScript resource would start with this of course :)

React and Angular are completely broken through file://
I don't know about Angular, but React works perfectly fine through file://. I'd think the bundler/packager matters more than what JS libraries you use. Are you sure you're not actually thinking of something else not handling file:// properly?
I am quite familiar with this and it is factually false
JS modules don't work on file URLs (classic JS does).
They can be made to work with blob urls. I have done this.
Okay that’s super interesting and I would love to see an example or writeup - I have a project which would benefit from being able to do that.
It's a technique I created (someone else must have done it first??) for a sandbox demonstrating a web UI framework I made. https://mutraction.dev/sandbox

To see it work, click "Download self contained .html" from the menu.

Here's the source file that handles this part: https://github.com/tomtheisen/mutraction/blob/master/mutract...

The idea is to use <script type="inline-module" name="foo">...</script> to define modules. That's something I just made up. For each such script, provision a blob URL. The main blocker is usually the same origin policy. Crucially, these blob URLs count as the same origin. So then you need to rewrite the imports from the named modules to the blob URLs. I used some regex rather than a proper parser, but it was more than good enough for me.

It seems quite doable to make some proper bundling tools around this concept.
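For anyone who wants to try the idea, here is a rough sketch of the steps described above. It is not the mutraction source: every function name (rewriteImports, linkInlineModules, bootInlineModules) is hypothetical, the regex is a naive stand-in for a real parser, and it assumes modules are declared in dependency order (dependencies first, no cycles) so that each blob URL exists before anything imports it.

```javascript
// Hypothetical sketch of the "inline-module" technique; names are illustrative.

// Naive match for static specifiers: import "x";  ... from "x";
// It can be fooled by look-alike text inside strings or comments.
const IMPORT_RE = /(\bimport\s*|\bfrom\s*)(["'])([^"']+)\2/g;

// Replace any specifier that names a known inline module with its URL.
// Unknown specifiers (for example "./other.js") are left untouched.
function rewriteImports(source, urlsByName) {
  return source.replace(IMPORT_RE, (match, keyword, quote, specifier) => {
    const url = urlsByName.get(specifier);
    return url ? `${keyword}${quote}${url}${quote}` : match;
  });
}

// Walk the modules in order, rewrite each one against the URLs created so
// far, then provision a URL for the rewritten text via createUrl.
function linkInlineModules(modules, createUrl) {
  const urlsByName = new Map();
  for (const { name, source } of modules) {
    const rewritten = rewriteImports(source, urlsByName);
    urlsByName.set(name, createUrl(rewritten));
  }
  return urlsByName;
}

// Browser-only entry point: collect the made-up <script type="inline-module">
// tags, give each a blob URL, and dynamically import the entry module.
function bootInlineModules(entryName) {
  const scripts = document.querySelectorAll('script[type="inline-module"]');
  const modules = Array.from(scripts, (script) => ({
    name: script.getAttribute("name"),
    source: script.textContent,
  }));
  const urls = linkInlineModules(modules, (code) =>
    URL.createObjectURL(new Blob([code], { type: "text/javascript" }))
  );
  return import(urls.get(entryName));
}
```

In a page you would call something like bootInlineModules("main") from a classic script placed after the inline-module tags. Passing createUrl in as a parameter keeps the rewriting step testable outside a browser.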

You’ll likely run into a ton of CORS issues doing that.
I don't think so. There are no HTTP requests being made from JS, as it's stripped away, and all the other resources are pulled down (and I assume their references are made relative), so there really shouldn't be any issues because of CORS at all.