by netaustin 4166 days ago
High-traffic, high-volume sites driven by CMSes (newspapers, TV stations, etc.) largely cannot rely on static files to deliver their content. Instead, they use caching layers for speed and security. There are two better ways to improve security for sites like these, which are highly targeted and poor candidates for static sites:

1) Use a headless CMS: for example, WordPress on the backend providing an API that is consumed by a Node app.

2) Shift any user-facing dynamic feature off the CMS. Commenting, login, subscription management, etc., can be handled by purpose-built apps that tie into the CMS-driven site via JavaScript, preserving the security and cacheability of the CMS.

That's not to say that it's impossible to drive a large-scale news site with static files. I believe CNN does exactly that with their in-house CMS. But no open-source CMS that generates static files is powerful enough to use in a newsroom context, or popular enough to gain traction.

3 comments

It's simply a matter of inversion. Are page generation and publication performed upon each change (by editors, authors, etc.) or upon each access? Obviously, the former is much more efficient, even for frequent changes, and even across millions of data points. (Just ask Twitter.)
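
To make the inversion concrete, here's a rough sketch that just counts how often the renderer runs under each model. Everything in it (`render`, `RenderOnAccess`, `RenderOnPublish`, `simulate`) is a made-up name for illustration, not any real CMS's API, and the "article" is a plain dict.

```python
# Hypothetical sketch: count renderer invocations under each publishing model.

def render(article: dict) -> str:
    """Stand-in for the expensive, dynamic page-generation step."""
    return f"<h1>{article['title']}</h1><p>{article['body']}</p>"


class RenderOnAccess:
    """Every reader request runs the renderer."""

    def __init__(self) -> None:
        self.article: dict = {}
        self.renders = 0

    def publish(self, article: dict) -> None:
        self.article = article  # just store the source data

    def get(self) -> str:
        self.renders += 1
        return render(self.article)


class RenderOnPublish:
    """The renderer runs once per edit; readers get the stored output."""

    def __init__(self) -> None:
        self.page = ""
        self.renders = 0

    def publish(self, article: dict) -> None:
        self.renders += 1
        self.page = render(article)  # generate the static page up front

    def get(self) -> str:
        return self.page  # no dynamic code on the read path


def simulate(site, edits: int, reads_per_edit: int) -> int:
    """Publish `edits` revisions, each read `reads_per_edit` times."""
    for n in range(edits):
        site.publish({"title": f"Story v{n}", "body": "..."})
        for _ in range(reads_per_edit):
            site.get()
    return site.renders


print(simulate(RenderOnAccess(), edits=3, reads_per_edit=1000))   # 3000
print(simulate(RenderOnPublish(), edits=3, reads_per_edit=1000))  # 3
```

Both models serve identical HTML; the only difference is whether the dynamic step sits on the write path or the read path.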

Just because we don't really have common, enterprise-grade authoring tools for non-technical people that publish static sites anymore doesn't mean that it's not the better way.

Sure we do, it's called caching.

There is a minimal difference between a webserver serving static pages and a caching server serving static content. When you get down to it, caching is simply a more flexible approach to the classic (authoring tool) -> static webpage approach. In many ways the only difference is that the authoring tool is a website, not a standalone program.

Caching is usually done on demand, though, not ahead of time. That means the dynamic portion is still launched by the request, which increases the probability of security flaws being exploited.
Exactly.

In other words, some fraction of the requests are responded to dynamically and then the result is cached. That dynamic nature can be exploited. Site search engines, etc., are also often (but not always) dynamic, server-generated results that have a greater likelihood of exploits via XSS, CSRF, SQL injection, etc. Login forms almost always require server interaction and are great targets.
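
A toy version of that, assuming a cache keyed on the full request URL (real caching layers may normalize or strip query strings, so treat this as a sketch; `OnDemandCache` and `dynamic_backend` are invented names):

```python
from typing import Callable, Dict


class OnDemandCache:
    """Fill-on-miss cache: the backend runs whenever a URL is not yet stored."""

    def __init__(self, backend: Callable[[str], str]) -> None:
        self.backend = backend
        self.store: Dict[str, str] = {}
        self.backend_calls = 0

    def get(self, url: str) -> str:
        if url not in self.store:
            # Cache miss: the request reaches the dynamic application.
            self.backend_calls += 1
            self.store[url] = self.backend(url)
        return self.store[url]


def dynamic_backend(url: str) -> str:
    """Stand-in for the CMS; in practice this is where the exploitable code lives."""
    return f"<html>rendered {url}</html>"


cache = OnDemandCache(dynamic_backend)

# Ordinary readers: 1000 requests for one page, one backend call.
for _ in range(1000):
    cache.get("/news/today")

# Requests with unique query strings never hit the cache,
# so each one goes straight through to the dynamic code.
for i in range(5):
    cache.get(f"/search?q=probe{i}")

print(cache.backend_calls)  # 6
```

The cache absorbs the repeat traffic, but whoever sends a URL the cache hasn't seen still decides when the dynamic code runs.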

(I say "almost" because REST interactions might be stateless and thus login forms really just serve to generate an access token and verify that it's working; this is how Userify works, for instance. It's still theoretically more exploitable than pure static files, but it raises the bar quite a bit.)

Nice to see your input here, and relevant. Granted, it's no Wonderfile, but then what is? ;)
News is about as static as it gets.