Hacker News new | ask | show | jobs
by Hupo 4885 days ago
>But you know when the data is changing -- when an article has been updated and republished ... or when you've done another load of the government dataset that's powering your visualization. Waiting for a user request to come in and then caching your response to that (while hoping that the thundering herd doesn't knock you over first) is backwards, right?

>I think it would be fun to play around with a Node-based framework that is based around this inverted publishing model, instead of the usual serving one. The default would be to bake out static resources when data changes, and you'd want to automatically track all of the data flows and dependencies within the application. So when your user submits a change, or your cron picks up new data from the FEC, or when your editor hits "publish", all of the bits that need to be updated get generated right then.

You mean most things don't already do this? I've been working on a personal blog engine with this as one of the core ideas (basically all static assets and pages are compiled on edit), and I thought it was a pretty obvious way to go about it. Looks like I'm indeed not the only one to think of it, but how "new" you present the idea as is a bit surprising to me.

1 comments

For simple problem domains (Blog, Mom and Pop store website, etc) it's trivial to pre-generate content. For larger content systems you can run into a more complicated dependency tree. Then you have the choice between keeping the dependency logic accurate vs regenerating the entire content set on any change.

It also turns out that content sets that change infrequently, but also unpredictably are a pain to cache. You can cache them for a short time (as long as stale content can be tolerated), but then you lose cache effectiveness. Or you can cache it forever with some sort of generation/versioned cache, but that doesn't interface with named, public resources very well. Telling your visitors and Google that it's yourdomain.com/v12345/pricing not yourdomain.com/v12344/pricing doesn't really fly.

I definitely concur with your surprise about it being novel though. I think that for many situations it's just easier to run extra boxes to handle the increased load of generating dynamic content on the fly over and over again. It's good for SuperMicro and AWS. It's not so good for the planet.

I'm very excited to see Jeremy's approach to addressing the problem.