Hacker News new | ask | show | jobs
by SmellTheGlove 3339 days ago
> PS: But of course don't store ID in an object value or you will lost major benefit of RDMS relational constraints.

Or anything that you might want to use as a foreign key down the road. It requires you to know that now, or modify later, so we might as well declare a schema and manage it unless we're totally certain we won't need any additional keys. But if we're certain of that, we could just use a schema.

1 comments

But do schema-less implement foreign constraints at all????

I'm currently implementing a mixed solution where clearly defined properties (and thus ID) have columns and constraints but I also include a schema-less "all you can eat" jsonb field so that experimental properties can be stored.

Upon each release I will migrate validated shema-less properties to static columns or even sub-tables. Similarly if a column is remove on next iteration I might transfer attributes to the jsonb to revert it later if needed. This does include a migration step but this seems only like good practice. Of course scaling... but I'm really not limited by this now.

The client will interact with the model through views or functions so that he don't have to care if columns are real or not. But anyway as soon as PK/FK are needed I will probably straight-out think that jsonb will be insufficient and take 5 minutes to think of a relational way to implement.

I don't see a problem with what you're doing as a means of development, since you're basically stuffing a bunch of data that you may or may not need in the future into a single field with the intent of breaking it out later if you do. I will ask whether this is really saving you any time or simplifying your development at all, given that you essentially impose structure iteratively as the need arises. You're not really schema-less, you're schema-lite, so you're maintaining some structure somewhere. Also, depending on the contents of the jsonb field, you may be defining its structure somewhere, even if it's just naming the elements.

I don't have a lot of experience with it, but Postgres supports indexing on jsonb fields. If you do end up in a situation where your keys are in the right place and you have some data in a jsonb, you may still be able to get at it efficiently.

Part of this is how we approach development as well. For one, I'm now some management type that makes powerpoints, so any code I write is on my own projects. But when I did more of this, my approach was always to start with an RDBMS unless I had a pretty specific use case to not use one. Having done it long enough, I am being honest that I find maintaining a database schema, migrating changes, and all of that to be pretty trivial. That's my approach, but I don't see an issue with yours if you think it's making you more efficient.

I should ask, because I'm assuming it, but make sure whatever you're stuffing into the jsonb field is true row-level data. Try to avoid putting data in there that "belongs" on another row - at the extreme it could provide some unintended exposure. Hypothetically, a row containing Customer A's demographics in the "person" table has a jsonb field - it may be tempting to put some basic demos about other customers B, C and D connected to that customer A in some way, but that's data you'd not want to expose as belonging to Customer A. Yet, if it comes back on the same row, it's as trivial to leak it as it is to retrieve it.

EDIT: To actually answer your question, I don't see how schema-less can have foreign constraints. Foreign constraints are defined in the schema. What I was saying is that if you're going to need joins later, have enough schema to support that.

I like your schema-lite terminology and I think it describe pretty well my approach.

I've only start to explore all possibilities, but main use so far:

-Store experimental properties before next schema revision

-Particularly good alternative to composite-type in early stage dev if you want to make live modifications without replaying the whole deployement script. (Composite type usually have huge cascading constraints)

-Storing "metadata row" for later use

PS: Actually it may even be an alternative to composite type in production because I can totally picture case when you need row level "objects" that vary depending on the value of another column. Yes you can normalize, but it might be more maintainable to keep everything in one table an write an appropriate check constraint. And it will still be easily human readable as opposed to a literal representation of a custom type or whatever (think of PostGIS geometry type that is very versatile but totally opaque to humans).

> But do schema-less implement foreign constraints at all????

In MySQL 5.7 you can have tables with JSON columns (like PostgreSQL's JSONB), and also "Generated columns" which are expressions over other columns (e.g. extract a value from the JSON). You can then use those generated columns as the source or destination of foreign key constraints.

Neat, I didn't know that. Is that more or less just having FK constraints on a view, though, or am I missing something conceptually here?
Right, "Computed Columns" are "view-like" columns added to normal tables. So you define your table with a few normal columns, and a few generated columns (which are each expressions over the normal columns, or over other generated columns).

On those "generated columns" you can create indexes, reference other tables (foreign keys) or be referenced by other tables (foreign keys).

MySQL has a binary JSON data type (like most databases) and also expressions to extract data from those JSON fields.

MySQL has a lot of disadvantages vs PostgreSQL, but having foreign keys on elements within JSON (via this mechanism) is a thing MySQL can do that PostgreSQL can't :) (Unless I'm mistaken?)

I think you are mistaken, even if implementation will vary you can have the same behavior (However I won't ever try in production because jsonb base FK is a pretty scary concept IMHO)

However on a solution I've implemented on PostgreSQL I expand jsonb as regular field in a view and made that view editable via trigger. This way users can update jsonb_field without being aware of because they edit it through a view as if they are normal fields.