Hacker News
by jedberg 3888 days ago
When I was heading up reliability at Netflix, we considered, and even began evaluating, turing the whole thing into one big static site. Each user had a custom listing, but generating 60+ million static sites is a very parallelizable problem.

At the time, the recommendations only updated once a day, but an active user would have to dynamically load that content repeatedly, and at the same time, the recs were getting updated for users who hadn't visited that day. By switching to static, we could generate a new static site for you every time you watched something (which could change your recommendations), and increase reliability at the same time, so it would have been a much better customer experience. Unfortunately we couldn't get enough of the frontend engineers to buy into the idea to get it off the ground, and also they were already well along the path to having a data pipeline fast enough to update recs in real time.
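For illustration only, here is a minimal sketch of the "regenerate a static page per user" idea described above. It is not how Netflix built or evaluated anything; the names `render_page`, `publish`, `publish_all`, and `on_watch`, the HTML layout, and the one-file-per-user layout on local disk are all assumptions made for the example.

```python
import html
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def render_page(user_id: str, recs: list[str]) -> str:
    """Render one user's listing as a complete static HTML document."""
    items = "\n".join(f"  <li>{html.escape(title)}</li>" for title in recs)
    return (
        "<!doctype html>\n"
        f"<title>Recommendations for {html.escape(user_id)}</title>\n"
        "<ul>\n"
        f"{items}\n"
        "</ul>\n"
    )


def publish(out_dir: Path, user_id: str, recs: list[str]) -> Path:
    """Write (or overwrite) the static page for a single user.

    Assumes user_id is already safe to use as a file name.
    """
    path = out_dir / f"{user_id}.html"
    path.write_text(render_page(user_id, recs), encoding="utf-8")
    return path


def publish_all(out_dir: Path, recs_by_user: dict[str, list[str]]) -> list[Path]:
    """Bulk build: every page is independent, so the work fans out across workers."""
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(publish, out_dir, user_id, recs)
            for user_id, recs in recs_by_user.items()
        ]
        return [future.result() for future in futures]


def on_watch(out_dir: Path, user_id: str, new_recs: list[str]) -> Path:
    """Event hook: a watch event changes one user's recs, so rebuild only that page."""
    return publish(out_dir, user_id, new_recs)
```

The point of the sketch is that the expensive step runs when the data changes (a watch event), not when the user loads the page, so serving is just a file read.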

7 comments

I might be wrong, but I believe that 4chan does something similar to this: every time a post is made, the board is updated and new static pages are generated. All the server does then is serve these static pages.

(I can't find any official reference for this, but another user mentioned it some time ago: https://news.ycombinator.com/item?id=8060200)

That's also how the most popular guestbooks and forums in the 90s worked, e.g. WWWBoard, which seemed to be used almost everywhere for quite a while. A Perl script would generate a new HTML file and update the index HTML for each post.
The only trouble was they didn’t do it atomically, so if two people posted at the same time everything would get horribly mangled.

There was a reason we moved to database-backed sites.
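The mangling described above is usually avoided today with a write-to-temp-then-rename pattern. The following is a sketch of that general technique in Python, not a reconstruction of what WWWBoard did; `atomic_write` is a hypothetical helper name.

```python
import os
import tempfile


def atomic_write(path: str, text: str) -> None:
    """Replace `path` with `text` so readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem) as the
    # target, otherwise the final rename is not atomic.
    # Note: mkstemp creates the file with mode 0600; chmod it if the web
    # server runs as a different user.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk first
        # Swap the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This only prevents torn files. Two posters who both read the old index, append their post, and write it back can still lose one of the posts, so the read-modify-write step would additionally need a lock (for example `fcntl.flock` on POSIX systems).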

I surely hope that wasn't the reason.
I've seen several startups using this technique not only for their content but also for their API responses: they would just store them in S3 with the right headers and serve them through CloudFront. My guess is that this will only get more common with AWS Lambda and other "serverless" technologies.
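As a rough idea of what "the right headers" might mean here, this sketch builds the arguments for uploading a precomputed JSON response. The helper name `static_api_object`, the bucket and key, and the 60-second `max-age` are assumptions for the example; the keyword names match boto3's S3 `put_object` parameters, but the upload itself is left as a comment so the snippet runs without AWS credentials.

```python
import json


def static_api_object(bucket: str, key: str, payload: dict, max_age: int = 60) -> dict:
    """Build upload arguments for a precomputed JSON API response."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": json.dumps(payload).encode("utf-8"),
        # Served back as the Content-Type header, so clients parse it as JSON.
        "ContentType": "application/json",
        # Served back as Cache-Control, which the CDN and browsers can honor.
        "CacheControl": f"public, max-age={max_age}",
    }


# Hypothetical bucket and key, for illustration only.
args = static_api_object("example-bucket", "api/v1/items.json", {"items": [1, 2, 3]})

# With boto3 installed and credentials configured, the upload would be:
#   import boto3
#   boto3.client("s3").put_object(**args)
```

A `public` cache policy only makes sense for responses that are the same for everyone; per-user responses would need a different policy.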
This is a neat story. I'd love to hear more stuff like this from other well-known companies. Stories of ideas that never got off the ground but could have worked.
How is what you're saying different from "normal" caching?
The question is whether you pre-warm the cache or cache a response after it goes through. If every single response is personalized, caching doesn't buy you a huge amount because the first request will be a cache miss, and the user may never call back a second time. If you pre-warm every single possible personalized result then there is almost always going to be a cache hit, which in effect is the same thing as serving a static site.

I guess there's also a question of whether you are caching data from the backend that powers a front end app, or actually caching the full front end itself.
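A toy sketch of the difference between the two strategies, assuming every user visits exactly once. `PageCache` and `render` are made-up names, and the in-memory dict stands in for whatever cache or CDN would really be used.

```python
from typing import Callable


class PageCache:
    """Tiny in-memory cache that counts requests that had to hit the backend."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render
        self._store: dict[str, str] = {}
        self.misses = 0

    def get(self, user_id: str) -> str:
        """Cache-on-miss: the first request for each user pays the render cost."""
        if user_id not in self._store:
            self.misses += 1
            self._store[user_id] = self._render(user_id)
        return self._store[user_id]

    def prewarm(self, user_ids: list[str]) -> None:
        """Pre-warming: render every page ahead of time, off the request path."""
        for user_id in user_ids:
            self._store[user_id] = self._render(user_id)


def render(user_id: str) -> str:
    """Stand-in for an expensive personalized page render."""
    return f"<h1>Listing for {user_id}</h1>"


users = ["alice", "bob", "carol"]

lazy = PageCache(render)
for user in users:
    lazy.get(user)  # each user visits once, so every visit is a miss
lazy_misses = lazy.misses  # 3

warm = PageCache(render)
warm.prewarm(users)
for user in users:
    warm.get(user)  # already rendered, served straight from the cache
warm_misses = warm.misses  # 0
```

With one visit per user, cache-on-miss never gets a hit, while the pre-warmed cache never misses. The cost is rendering pages for users who may never show up.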

This sounds crazy enough to work.
What did these frontend engineers dislike about the idea?
"turing the whole thing", Alan should be flattered.