Hacker News new | ask | show | jobs
by mantrax5 4394 days ago
It's always interesting to observe my bad developer practices (such as stuffing JSON in SQL table columns) become flexible "architectural patterns" for building "schema-free, scalable data storage".
5 comments

Wait until you hear about "fractal storage" where every value in a table is a blob that contains a SQLite database, which in turn has blobs which contain SQLite databases, until it's turtles all the way down.
My God, it's full of stars. This is a joke, right?

Please?

Almost, I've seen these 2 levels deep before. I'm waiting to hear about the real thing.
If you watched the talk, he explains the specific kinds of that should be stuffed into a JSON text column. It makes a difference.

This practice been done by many production sites over the years http://backchannel.org/blog/friendfeed-schemaless-mysql Might not be considered bad practice now.

We actually use a storage format very similar to the FriendFeed schema in Dari. Dari is a open source Java persistence layer with a full query API that stores all it's data in a JSON blob in one table with a few extra tables for indexing the JSON data.

https://github.com/perfectsense/dari

Here is the SQL schema: https://github.com/perfectsense/dari/blob/master/db/src/main...

We've used this model for almost five years now with great success. It's simplified rolling out "schema changes" since no tables need to be changed. It's also been optimized to a point where it's extremely fast.

Well I'm saying it tongue in cheek. For a long time I and many others have stuffed JSON in SQL table columns, and I will continue to do so (heck, databases have started supporting JSON as a result).

But every time a developer sees an interesting twist on a piece of technology and goes for it, peers call it a bad practice.

I've been through many cycles like this, and inevitably some time passes, and one day you wake up to see yesterday's bad practices have turned into exciting advancements.

Moral of the story is, ignore the wisdom of the day and go for it, tiger. Stuff that JSON in an SQL table.

I'm not sure which authority gets to pronounce which code practices are good and which ones are bad (the Vatican?), but one thing that I've observed plainly is that this authority can't make up its mind and contradicts itself with regularity. My thoughts are that the whole project of trying to decide for each possible snippet of code whether it is good or bad is foolish and will never succeed. The fact is that everything in programming is a trade-off. Being able to make decisions that don't lead to disaster comes down to experience and wisdom.
What rebelidealist said. In the talk, I try to be pretty clear about what kinds of data are a good case for this kind of storage and what aren't. (In short? Data you only ever retrieve for a single row at a time, and that you never query on. Also, you really need to store it in a separate table that's 1-1 with the main table, or all you've really done is bloated the original table and made things worse.)

My own experience is that there's actually more data than you'd expect that can fit into this model. On the other hand, I am absolutely not pushing this as a panacea: if you don't really know what you're doing, tossing JSON in a RDBMS is probably a really, really bad idea. After all, that's part of the talk -- to discuss when it's a good idea and when it isn't.

(I personally think the low-card tables part of the talk -- http://github.com/ageweke/low_card_tables -- is the most interesting idea.)

Its not always a bad choice. For example, I use it for caching built JSON for requests when a less than reliable API goes down. I can now serve up the cached response for the view without rebuilding it and let the user know we're still awaiting real time data.

Its not storage friendly but it is what I believe to be a valid use case.

If you're working with a 3rd party api and want to store their response, it's waaay simpler to just stuff their response into a json blob, add a couple indexes if needed, and go on with life. (as opposed to making 30 tables to store all the data in a normalized format and writing the sql or configuring the orm to deal with that data)
Thats a great point. If the 3rd party changes their format or syntax then your inserts would likely continue to work.