by theptip
2147 days ago
I am skeptical (with an open mind) of the JSONField pattern. We’ve used it in a few places where you have something like inheritance: you want to handle all Event objects the same way at the top level, and then dispatch differently based on Type. This works fine.

My concern is that this is a pattern which experts advocate, but which less precise/experienced engineers are likely to break when it comes time for a data migration. With normal SQL fields, if you change the schema you need to create a migration, and that gives an easy point to check for data migrations too. If any commit can silently change the write-schema, it’s a lot harder to police.

So I suspect that if your team skews towards very experienced engineers, the JSONField pattern gains in value. If that’s right, it’s probably best to caveat the recommendation. Interested to know your experiences with schema changes though.

I’m also interested to note that you are pointing out a general issue with Django that I’ve experienced: the Models spread through the whole system. You can solve that in other ways, say by having a repository layer that uses normal (non-JSONField) models and maps them through to a POPO. I’m not sure you _need_ the JSONField pattern to get the separation you’re looking for.
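A rough sketch of the repository idea, assuming nothing about the actual codebase: `EventRow` stands in for an ORM model with ordinary columns, and `Event`, `EventRepository`, and all field names are hypothetical. The point is only that callers receive a plain object and never touch the model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


class EventRow:
    """Stand-in for an ORM model instance with ordinary (non-JSON) columns."""

    def __init__(self, pk, event_type, created_at, user_email):
        self.pk = pk
        self.event_type = event_type
        self.created_at = created_at
        self.user_email = user_email


@dataclass(frozen=True)
class Event:
    """Plain object handed to the rest of the codebase; no ORM dependency."""
    id: int
    type: str
    created_at: datetime
    user_email: str


class EventRepository:
    """The only layer that touches rows; callers only ever see Event objects."""

    def __init__(self, rows):
        # A real project would wrap a queryset here; a dict keeps the sketch runnable.
        self._rows = {row.pk: row for row in rows}

    def get(self, event_id: int) -> Optional[Event]:
        row = self._rows.get(event_id)
        return None if row is None else self._to_event(row)

    @staticmethod
    def _to_event(row: EventRow) -> Event:
        # The single place where column names are mapped to the plain object.
        return Event(
            id=row.pk,
            type=row.event_type,
            created_at=row.created_at,
            user_email=row.user_email,
        )
```

With this shape, a column rename only requires changing `_to_event`, and it still goes through a normal schema migration.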
|
In addition to schema validators, you really also need a data integrity async task that runs once a day to catch these kinds of issues. They're not that hard to fix if you catch them early; the bigger problems come when you discover a data integrity issue years later and the original team isn't even there anymore. If you have a lot of data, a reasonable solution is just having it run on everything from the last 24 hours plus a random sampling of data older than that.
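The "last 24 hours plus a random sample" selection could look something like this. It is a minimal sketch: the function name, the `(record_id, created_at)` input shape, and the `sample_size` parameter are all assumptions for illustration, and the scheduling of the daily task itself is left out.

```python
import random
from datetime import datetime, timedelta


def select_for_integrity_check(records, now, sample_size, rng=None):
    """Return ids of every record from the last 24 hours plus a random sample of older ones.

    `records` is an iterable of (record_id, created_at) pairs.
    """
    rng = rng or random.Random()
    cutoff = now - timedelta(hours=24)
    recent, older = [], []
    for record_id, created_at in records:
        # Anything at or after the cutoff is always checked.
        (recent if created_at >= cutoff else older).append(record_id)
    # Sample without replacement, capped at however many older records exist.
    sampled = rng.sample(older, min(sample_size, len(older)))
    return recent + sampled
```

Passing a seeded `random.Random` makes a run reproducible, which helps when re-checking a batch that flagged a problem.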
> Interested to know your experiences with schema changes though.
Haven't done a ton, so admittedly there could be complex cases I'm not seeing. I think it helps though, in cases like the example I gave of using it for form-like things, to set it up like:
This way you have four models rather than an indefinite number. But because it's not just one big model with lists of JSON objects for questions and responses, migrations and data integrity checks are much simpler to write and easier to reason about.

The idea is also that you have JSON schema validators for the different types of JSON blobs that get stored in QuizQuestionModel and QuizQuestionAnswerModel.
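To make the per-type validation concrete, here is a small sketch. It uses hand-written checks as a stand-in for real JSON Schema validators (a project would more likely use a JSON Schema library), and the question types and field names are made up for illustration, not taken from any actual setup.

```python
def _require(blob, field, expected_type):
    """Return an error string if `field` is missing or has the wrong type, else None."""
    if field not in blob:
        return f"missing field: {field}"
    if not isinstance(blob[field], expected_type):
        return f"wrong type for {field}"
    return None


# Hypothetical question types and required fields, keyed by the blob's "type".
QUESTION_SCHEMAS = {
    "multiple_choice": {"prompt": str, "choices": list},
    "free_text": {"prompt": str, "max_length": int},
}


def validate_question(blob):
    """Return a list of problems found in one stored question blob (empty if valid)."""
    if not isinstance(blob, dict):
        return ["blob is not an object"]
    schema = QUESTION_SCHEMAS.get(blob.get("type"))
    if schema is None:
        return [f"unknown question type: {blob.get('type')!r}"]
    errors = [_require(blob, field, t) for field, t in schema.items()]
    return [e for e in errors if e is not None]
```

The same function can be called both on write and from the daily integrity task, so the two never disagree about what counts as valid.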