Hacker News
by yen223 4488 days ago
Replacing the database with flat files? We have gone full circle, haven't we.

I can think of a lot of disadvantages with not using an actual database. What's the benefit of going back to flat files?

3 comments

> What's the benefit of going back to flat files?

Edit in your editor of choice. Use your source control for edit history. It's trivial to keep a separate instance to edit/test on, and push with rsync. It also makes it trivial to treat the code and the content as one unit, so that e.g. if you change the parsing of the articles, update the articles, and then need to revert, you don't need to mess with reversing database updates separately.

Fewer moving parts. For a typical modern blog with comments farmed out to Disqus or similar, the data is likely to be tiny and very static. My blog's data, for example, is about 8.3MB of text that changes maybe a couple of times a month, so the 1-2 second cost of reading every article I've ever written in from individual files on disk is hardly an issue.
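To make the "read everything from disk" approach concrete, here is a minimal sketch, not the commenter's actual code. It assumes one article per `.txt` file in a single directory; the `load_articles` name and the file layout are made up for illustration.

```python
import tempfile
import time
from pathlib import Path


def load_articles(content_dir):
    """Read every *.txt file in content_dir into a dict keyed by file stem.

    Assumption: one article per file, UTF-8 encoded. Hypothetical layout.
    """
    articles = {}
    for path in sorted(Path(content_dir).glob("*.txt")):
        articles[path.stem] = path.read_text(encoding="utf-8")
    return articles


if __name__ == "__main__":
    # Build a throwaway content directory so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(3):
            Path(tmp, f"post-{i}.txt").write_text(
                f"Title {i}\n\nBody {i}\n", encoding="utf-8"
            )
        start = time.perf_counter()
        articles = load_articles(tmp)
        elapsed = time.perf_counter() - start
        print(f"loaded {len(articles)} articles in {elapsed:.4f}s")
```

Loading once at startup and keeping the dict in memory means the read cost is paid per restart, not per request.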

I wonder what simple blogging / CMS engines are out there that use sqlite as the backend. This also has the benefits of simple backups without the complexity of another process running. Most (if not all) webhosts will have this baked into their PHP installs.
Perhaps using sqlite as the "backend", in that you build it like any other blog, so you get the simplicity of building it out, using one good SQL framework, etc.

But then, when you click "post" it rips through the database using a library for markdown -> HTML or whatever the case may be. There can't be much overhead to a single include to pull in the HTML, but you could render the entire page. I just can't see PHP falling down too much with a few includes and functions being called to generate a header, the included rendered HTML, and a footer with some design elements.
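A rough sketch of that "render once at post time" idea, in Python and not the PHP the comment has in mind. Everything here is an assumption for illustration: the `publish` and `render_body` names, the header/footer strings, and the paragraph wrapper, which is only a stand-in for a real Markdown library.

```python
import html
from pathlib import Path

# Hypothetical page chrome; a real site would load these from template files.
HEADER = "<html><body><header>My blog</header>\n"
FOOTER = "<footer>The end</footer></body></html>\n"


def render_body(text):
    """Stand-in for a Markdown library: escape the text and wrap each
    blank-line-separated paragraph in <p> tags."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs) + "\n"


def publish(slug, text, out_dir):
    """Render the whole page once, at post time, and write it to disk.

    Serving the post afterwards is just a file read; no rendering and no
    database query happen per request.
    """
    out = Path(out_dir) / f"{slug}.html"
    out.write_text(HEADER + render_body(text) + FOOTER, encoding="utf-8")
    return out
```

Whether you cache only the rendered body (and include it between a header and footer on each request) or the entire page is a trade-off: the first lets a design change apply without re-rendering every post, the second needs no code at all at request time.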

It's not much fun building your own database out of flat files. I cut my teeth on a Mac-only system that more or less only talked to FileMaker, which was only as fast as the actual screen could redraw and search out the data. It could be painfully slow.

In a way I am glad, as I learned how to do things and think differently when most were just running a "select * from foo where bar = 'x'" which was a 15 minute luxury I didn't have. There were no joins, no tables, you actually ran applescripts on the database and it returned the data somehow back to a web server on the Mac.

So we used the database on the backend, where admins could be more patient, or do more intricate things, but almost always generated out some HTML, so in the end, the site was semi-dynamic. I think I was doing "caching" of data as a result almost 15 years ago.

The rule was: no more than 2 database calls per page, ever. And you couldn't do things like update foo set name = 'me' where id = 1, because there was no exposed notion of a record id; it was internal. So you had to select name from foo where name = 'whatever', which would return the name but also a RecID value, which you then used in another query: update foo set name = 'me' where id = <special RecID token>

Made me think a bit different.

cp/tar/rsync/git/svn to name a few.

HN is apparently flat file based.

I actually have a 100% static site deployed in production. It is served off nginx, built with make, shell and sed (does some include processing and index generation) and is deployed with rsync from make and uses git for version control.
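For anyone curious what "include processing and index generation" can look like, here is a small sketch in Python, not the make/shell/sed setup described above. The `<!-- include "file" -->` directive syntax and both function names are invented for this example.

```python
import re
from pathlib import Path

# Hypothetical directive syntax: <!-- include "header.html" -->
INCLUDE = re.compile(r'<!--\s*include\s+"([^"]+)"\s*-->')


def expand_includes(text, include_dir):
    """Replace each include directive with the contents of the named file."""
    def replace(match):
        return Path(include_dir, match.group(1)).read_text(encoding="utf-8")
    return INCLUDE.sub(replace, text)


def build_index(page_names):
    """Generate a plain HTML list linking to every page, sorted by name."""
    items = "\n".join(
        f'<li><a href="{name}">{name}</a></li>' for name in sorted(page_names)
    )
    return f"<ul>\n{items}\n</ul>\n"
```

This version does a single pass, so an included file that itself contains a directive is not expanded again.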

The design of news.arc is notoriously quirky. Given the fact that paging is entirely based on continuations, I don't think "flat-file-based" covers it.
That explains a lot. I get the principle of why this might be done but boy is it ugly. You'd have to maintain state between pages. State is bad. Which probably explains all those link expired errors...