Hacker News new | ask | show | jobs
by SmellTheGlove 3339 days ago
I don't see a problem with what you're doing as a means of development, since you're basically stuffing a bunch of data that you may or may not need in the future into a single field with the intent of breaking it out later if you do. I will ask whether this is really saving you any time or simplifying your development at all, given that you essentially impose structure iteratively as the need arises. You're not really schema-less, you're schema-lite, so you're maintaining some structure somewhere. Also, depending on the contents of the jsonb field, you may be defining its structure somewhere, even if it's just naming the elements.

I don't have a lot of experience with it, but Postgres supports indexing on jsonb fields. If you do end up in a situation where your keys are in the right place and you have some data in a jsonb, you may still be able to get at it efficiently.

Part of this is how we approach development as well. For one, I'm now some management type that makes powerpoints, so any code I write is on my own projects. But when I did more of this, my approach was always to start with an RDBMS unless I had a pretty specific use case to not use one. Having done it long enough, I am being honest that I find maintaining a database schema, migrating changes, and all of that to be pretty trivial. That's my approach, but I don't see an issue with yours if you think it's making you more efficient.

I should ask, because I'm assuming it, but make sure whatever you're stuffing into the jsonb field is true row-level data. Try to avoid putting data in there that "belongs" on another row - at the extreme it could provide some unintended exposure. Hypothetically, a row containing Customer A's demographics in the "person" table has a jsonb field - it may be tempting to put some basic demos about other customers B, C and D connected to that customer A in some way, but that's data you'd not want to expose as belonging to Customer A. Yet, if it comes back on the same row, it's as trivial to leak it as it is to retrieve it.

EDIT: To actually answer your question, I don't see how schema-less can have foreign constraints. Foreign constraints are defined in the schema. What I was saying is that if you're going to need joins later, have enough schema to support that.

1 comments

I like your schema-lite terminology and I think it describe pretty well my approach.

I've only start to explore all possibilities, but main use so far:

-Store experimental properties before next schema revision

-Particularly good alternative to composite-type in early stage dev if you want to make live modifications without replaying the whole deployement script. (Composite type usually have huge cascading constraints)

-Storing "metadata row" for later use

PS: Actually it may even be an alternative to composite type in production because I can totally picture case when you need row level "objects" that vary depending on the value of another column. Yes you can normalize, but it might be more maintainable to keep everything in one table an write an appropriate check constraint. And it will still be easily human readable as opposed to a literal representation of a custom type or whatever (think of PostGIS geometry type that is very versatile but totally opaque to humans).