Hacker News new | ask | show | jobs
by cstavish 3571 days ago
I use JSONB columns for similar use-cases, but to play devil's advocate, you can accomplish that a fourth way, which is almost certainly better than 1 or 2. A table for fields, one row per field. A table for forms, which has a many-to-many relation to fields. Entries in a link table compose a form of arbitrary fields. Answers can be stored in a separate responses table, indexed by form_id and column_id. I don't know enough about database implementation to speculate on how that would perform at scale, but conceptually that's how I think of the problem.
3 comments

This is roughly EAV. EAV is where you have a with a schema:

    entity_id | attribute_id | value
EAV is typically considered to be an anti-pattern for several reasons: it becomes very expensive to rematerialize a entity with all of its attributes/values, it becomes difficult to have any kind of constraints on the values (this is also a problem with jsonb), and it's hard for the database to maintain good statistics on the values of different attributes as most databases don't keep multi-column statistics. Don't worry, I've had similar ideas before.
EAV is an anti-pattern?

If you don't have JSONB, EAV is the only remotely-reasonable way to implement user-defined fields (e.g. product-specific attributes in e-commerce).

If you want to allow user-defined fields in a relational database, your realistic are either EAV or stuff json into a text column. EAV, if done extremely carefully, can be a good solution, but 99% of the time, it's going to be a huge pain.
> stuff json into a text column

You did see the article was about JSONB, which is significantly more sophisticated than "json in a text column", yes?

Yes, the person I was replying to was asking about what to do if you don't have jsonb.
Sounds kind of like the EAV relational data model: https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80....
I would expect that to perform poorly at scale.

There is also a fifth option, though it is not very space efficient. That is to store every field in the JSONB column. Personally, that is what I would do.

This approach is known to be used successfully by many high scale companies. It ensures the highest degree of flexibility and still allows for full indexing of the fields. The schema would need to be enforced by your application, but that should happen anyway.