Hacker News new | ask | show | jobs
by etatoby 3616 days ago
That's because custom fields in WordPress, as well as in Liferay and in a bunch of other products, are implemented as EAV, Entity–attribute–value model, which is an abomination.

It is a relational "design pattern" used to implement a database inside a database, instead of using the database itself as it was designed to be used.

This has all sorts of fun (not) and useful (neither) effects, including: worsened performance; inability to ensure data integrity through constraints; bloated / unreadable queries; hard to find bugs; needlessly complicated migrations; and so on.

3 comments

So, EAV and even "inner platforms" are not always necessarily bad, so long as you know that's what you're doing.

For example, Firefox extensions could be easily considered an inner platform but they have the benefit of providing a platform which is safer and more extensible than the underlying OS.

Likewise, EAV provides a poor functional replacement for an actual database. But they also prove a lower curve for implementing very simple extensions and limit the capacity for trivial mistakes. Speaking from experience, it's the WordPress sites where plugins have created a whole bunch of extra poorly-built tables that worry me most.

The alternative to EAV for custom fields is allowing users to modify the database schema. I'd rather stick with EAV.
What’s wrong with just namespacing tables per module/app/whatever?

`wordpress_%app_table1` etc. if you want to do it ad-hoc.

PostgresSQL support namespacing out of the box[1][2].

[1] https://www.postgresql.org/docs/9.3/static/ddl-schemas.html

[2] http://jerodsanto.net/2011/07/building-multi-tenant-rails-ap...

Because giving a webapp sufficient permissions to CREATE, ALTER, and DROP tables is risky.
It gets harder to make assumptions about what tables are there when you want to migrate things. I also think it's a hassle if you want to do something like have multiple tenants in a single DB (not impossible, I suppose).
Having dealt with both, I'm split. I think, in the end, I'd rather model the DDLs and have everything point to separate database on the same server or a new namespace in the existing database. You give enough rope for the customer to hang themselves, but sometimes their requirements are to build a gallows.
One does not imply the other. Whether the schema itself is modified or just the EAV definition does not mean that everybody has the permission to do it. EAV is just the way it is stored/described.
OK... you are nevertheless talking about some end user having the power to modify the database schema, which IMO is a bigger headache.
I also meant that either way, I would not grant a end user the possibility to modify the data structure, be it through the modification of the schema, or through the addition of an EAV definition.

Granted one is more dangerous than the other, but to me it's no reason to give them access. It's still a headache.

If only the developers have access to modify the structure, the way it is done is not important (it can be done through a webform that alters a database, though the addition of a json definition, through coding, etc.) After all, you can modify the schema through PHPMyAdmin. It's some kind of admin interface. You just don't grant your users the right to do it. Just as you don't grant them the right to add custom fields, unless they know what they are doing, which is seldom the case :)

I think that EAV gives a false sense of security, in that it does not "break" much if done by a non-knowledgable user. But still, if it is done badly, good luck reconciling the data.

"Reconciling" the data isn't my job; the user can use the custom fields for whatever they want.

It's certainly true that life is easier if you never implement certain kinds of features but at that rate why make a program at all?

> "Reconciling" the data isn't my job; the user can use the custom fields for whatever they want.

I disagree strongly here and see this attitude a lot. It's easy to say "not my problem, you fucked it up, you fix it" but that is a very customer-hostile attitude.

Instead we should be giving users functionality they want in a way that prevents them from screwing things up through a combination of feedback and constraints. Sure, if a user types in blatantly incorrect data there's not much you can do. But you can do a lot more than say "here's a narrow ledge and a pit of snakes, if you fall in it's not my fault". This just results in unhappy customers who don't care whose fault it is, all they know is that your software ate their data.

Sometimes, the thing you need to model in a database behind an application you sell to a customer turns out to be... a sort of limited database with a schema the customer can control and design. When that happens, EAV is one of the common attractors in design space that everyone gravitates towards.

Of course, it's worth remembering that real databases do not typically use an EAV storage pattern for their data, so maybe it's not the only way to model a custom data store within a relational system.

It's also an option when the properties of objects going in the table are going to be wildly different or are not known at write time.
This is a candidate for something like a schemaless datastore or a PostgreSQL JSONB column. Depending on your choice of datastore you can still at least get some indexing and schema help if you choose. With EAV you generally throw all of that out.
A whole separate datastore sounds like overkill in a lot of cases.