| Every decision is a trade-off between up-front work and later eventual work. Given "The law changed at one point, and we were legally bound to be able to locate a customers record by a piece of data in a blob.", just a few of the possible solutions might include: - Move to a fully relational schema: - - Pros: You can leverage the power of the database to efficiently index, search, and aggregate any specific column without too much trouble. - - Cons: Schema change may be expensive. Requires processing all data into a new structure. Requires massive changes to large swaths of your data handling routines. Must be done all at once, or you need to run both systems in parallel for a while during conversion. In this case, took 9+ months. - Add a field on that table: - - Pros: Simple. Requires very little change in processing routines. Efficient. Can rely on DB typing to require data be present for a record. - - Cons: Depending on database may require excessive downtime while schema is updated. - Add a table that tracks this field and links to the main record. - - Pros: Simple. Requires relatively little change in processing routines. Efficient. Zero downtime. Can fill old records with background process. - - Cons: Hard to enforce that the data exists for every record. Can be mitigated with report generated for records without this link. Now, given those choices, I would say the correct choice in any particular situation depends quite a bit on external constraints. Do you have 9+ months and the free developer time to essentially redesign what may be large swaths of your back-end? Do you have assurance (tests, language features, etc) that you won't increase enough bugs to negate the benefits of moving to a fully relational schema? Is there a looming deadline on when the works needs to be completed by? Is downtime not really an option, and/or is building a parallel data store for the migration not feasible? How useful is normalizing the data expected to be in the short, medium and long term? Like I said, I assume they had a reason for the full relational migration. I was just pointing out that there are easier solutions that in some cases fit business needs better than that. If your company goes under because your large migration project stole focus and manpower from other needed projects when there were simpler solutions available, then you made the wrong decision, period. |
But the relational model is essentially so good an approach that if you are building something that's close to a relational model but not fully relational, you're almost certainly creating more present and future potential problems than if you moved to fully relational model (edit: and that's not saying other approaches are bad if you have a problem a ways away from the relational domain).
Neither of us know the particulars of gp's problem and I so couldn't say if you're alternatives are better tradeoffs. However, I would wager if someone wound-up adding several fields and tables to an existing blob-filled database, the result would haunt it's creator on many lonely nights at the office.