Hacker News
by acquiesce 340 days ago
I wouldn’t do this personally because the downstream code very often has to handle differences where polymorphism breaks and you end up having to query the type. Polymorphism shouldn’t be used for data, only behavior, and only in very specific circumstances. Subclassing is a different topic.

You wouldn’t do what? Have polymorphic data to begin with? I don't see how you can choose to avoid the scenario where record A has one of several possible kinds of related metadata, other than just ignoring it and allowing invalid representations.
Correct. This data doesn’t meet the criteria for polymorphism because, from a bookkeeping perspective, insured and uninsured processes have very distinct flows and requirements in reality. Using OOP here is a mistake. Straightforward procedural code, in-flight as well as in any batch jobs, should deal separately with insured and uninsured data, and there should be zero overlap unless we need to extract aggregate data like how many payments there were in total and what the total amount is. For those situations you can use a separate domain model that deals distinctly with these queries as value objects themselves, if you want to go down that rat’s nest.

Other top-level comments covered what I wanted to say, but my comment is the OG one. I deal with payments, transactions and all that, with multiple currencies and other complexities. Just keep it simple and don’t use OOP for this stuff; it’s the wrong tool for the job.
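
A minimal Python sketch of the "separate flows, shared aggregates only" idea described above. Everything here is an assumption for illustration: the record fields (`payer`, `payment_plan`), the sample data, and all function names are hypothetical, and what each flow actually does in a real billing system is not stated in the thread.

```python
from decimal import Decimal

# Hypothetical records: plain dicts, one list per kind, never mixed.
insured_payments = [
    {"billing_id": 1, "amount": Decimal("120.00"), "payer": "ACME Health"},
    {"billing_id": 2, "amount": Decimal("80.50"), "payer": "ACME Health"},
]
uninsured_payments = [
    {"billing_id": 3, "amount": Decimal("45.25"), "payment_plan": "monthly"},
]


def process_insured(payments):
    """Insured flow only: here, sum the amounts owed per payer."""
    owed = {}
    for p in payments:
        owed[p["payer"]] = owed.get(p["payer"], Decimal("0")) + p["amount"]
    return owed


def process_uninsured(payments):
    """Uninsured flow only: here, list billing ids per payment plan."""
    plans = {}
    for p in payments:
        plans.setdefault(p["payment_plan"], []).append(p["billing_id"])
    return plans


def payment_totals(*payment_lists):
    """Aggregate query across both kinds: count and total amount only."""
    amounts = [p["amount"] for payments in payment_lists for p in payments]
    return len(amounts), sum(amounts, Decimal("0"))
```

The only place the two kinds meet is `payment_totals`, which reads nothing but `amount`, so neither flow ever has to ask a record what type it is.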

The primary impetus is to enforce the constraint: a billing is either insured or uninsured. Insured has additional metadata X, Y, Z. Uninsured has additional metadata A, B, C. A billing cannot be both.

If you separate them into different insured and uninsured tables, then any table associated with billing generally needs to be split into insured and uninsured variants as well. billing_customer and billing_customer_details now become uninsured_billing_customer, insured_billing_customer, uninsured_billing_customer_details, etc.

As you add more data with this kind of constraint, everything fragments again, scaling to 2^n tables for n such either/or splits. This is similar to the async coloring problem; what you wanted is to locally fragment the data model, but instead anything that touches it gets poisoned.
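
To make the 2^n growth concrete, here is a small sketch that enumerates the table names a fully split schema would need. The insured/uninsured split is from the thread; the second split (`inpatient`/`outpatient`) and the helper `table_variants` are made up purely for illustration.

```python
from itertools import product


def table_variants(base, dimensions):
    """Name one table per combination of either/or choices."""
    return ["_".join(combo) + "_" + base for combo in product(*dimensions)]


# One either/or split: 2 variants of the dependent table.
one = table_variants("billing_customer", [("insured", "uninsured")])

# Two independent splits (the second is hypothetical): 4 variants.
two = table_variants(
    "billing_customer",
    [("insured", "uninsured"), ("inpatient", "outpatient")],
)
```

Each additional independent split doubles the list again, and that applies to every table that depends on billing.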

Ideally the DB would let you enforce a cross-table constraint like UNIQUE(uninsured.billing_id, insured.billing_id), and the split-table approach would suffice.
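
Since no cross-table UNIQUE exists, one common workaround (not something proposed in this thread, and possibly not what the article does) is a discriminator column plus a composite foreign key. A hedged sketch using Python's built-in `sqlite3`; the columns `kind`, `policy_number`, and `payment_plan` are hypothetical stand-ins for the real metadata.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript("""
    CREATE TABLE billing (
        id   INTEGER PRIMARY KEY,
        kind TEXT NOT NULL CHECK (kind IN ('insured', 'uninsured')),
        UNIQUE (id, kind)  -- target for the composite foreign keys below
    );
    CREATE TABLE insured_billing (
        billing_id    INTEGER PRIMARY KEY,
        kind          TEXT NOT NULL DEFAULT 'insured' CHECK (kind = 'insured'),
        policy_number TEXT NOT NULL,
        FOREIGN KEY (billing_id, kind) REFERENCES billing (id, kind)
    );
    CREATE TABLE uninsured_billing (
        billing_id   INTEGER PRIMARY KEY,
        kind         TEXT NOT NULL DEFAULT 'uninsured' CHECK (kind = 'uninsured'),
        payment_plan TEXT NOT NULL,
        FOREIGN KEY (billing_id, kind) REFERENCES billing (id, kind)
    );
    """)
    return conn


def try_execute(conn, sql, params=()):
    """Return True if the statement succeeds, False on a constraint violation."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

A billing row has exactly one `kind`, and each subtype table can only reference a billing of its own kind, so the same billing cannot appear in both. Note what this sketch does not enforce: a billing with no subtype row at all is still allowed.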

I’m not seeing any top-level comments that resolve this, other than ignoring the problem altogether (let the data model encode invalid states, handle it in app code), or switching to a different database.

Letting the app code enforce these constraints isn't ignoring the problem, it's how you solve this. Your DB is never going to represent all the business logic by itself. You can also add the slightly clunky constraint it mentions if you really want.
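
For what enforcing this in app code might look like, a minimal sketch: a single check run before any write. The function name and the convention that missing metadata is passed as `None` are assumptions, not something from the thread or the article.

```python
def validate_billing(insured_meta, uninsured_meta):
    """Return the billing kind, or raise if the either/or rule is violated."""
    if insured_meta is not None and uninsured_meta is not None:
        raise ValueError("billing cannot be both insured and uninsured")
    if insured_meta is None and uninsured_meta is None:
        raise ValueError("billing must be either insured or uninsured")
    return "insured" if insured_meta is not None else "uninsured"
```

This only holds as long as every writer goes through the check; anything that touches the tables directly can still create a state the check would have rejected.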

I wouldn't do this with separate tables. I also wouldn't do this with polymorphism, or OOP in general, even if the DBMS properly supported OOP. Trying to represent these constraints by classifying things will get confusing fast.

You can have different tables for different data. You don’t have to put it all in the same table.
Only the first strategy shoves it into the same table? The fundamental problem is that record A can have type X, Y, or Z, with each type having additional metadata. You could flatten the model and have a table for each type X, Y, Z and query them independently, paying the cost of duplicate schema structure and of manually keeping them synchronized (including any dependent tables), or you pluck out the common core and run into the article’s problem.
I think the article alludes to the difficulty of this solution by discussing the need for invariants to be upheld when an insured patient becomes uninsured or vice versa. Different tables for each 'subclass' could be an option, but if that can change later on, you now need to move patients between the insured_patient and uninsured_patient tables and make sure you don't have duplicates.
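
A rough sketch of what that move involves with split tables, using Python's built-in `sqlite3`. The table names come from the comment above; the columns and the function `mark_uninsured` are hypothetical. The point is that the insert and delete must happen in one transaction, or a failure halfway leaves the patient in both tables or in neither.

```python
import sqlite3


def make_patient_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE insured_patient (
        patient_id    INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        policy_number TEXT NOT NULL
    );
    CREATE TABLE uninsured_patient (
        patient_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    """)
    return conn


def mark_uninsured(conn, patient_id):
    """Move one patient from insured_patient to uninsured_patient atomically."""
    with conn:  # commits on success, rolls back if anything below raises
        row = conn.execute(
            "SELECT name FROM insured_patient WHERE patient_id = ?", (patient_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"patient {patient_id} is not in insured_patient")
        # Fails with IntegrityError if the patient is already in the target table.
        conn.execute(
            "INSERT INTO uninsured_patient (patient_id, name) VALUES (?, ?)",
            (patient_id, row[0]),
        )
        conn.execute("DELETE FROM insured_patient WHERE patient_id = ?", (patient_id,))
```

Even in this small version the insured-only column (`policy_number`) is silently dropped on the move, and every dependent table would need the same treatment.
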
Want to elaborate on how you're going to magically disappear the inherent polymorphism in your problem domain every time?

Sometimes you can indeed view things from a different perspective and come up with a simpler data model, but sometimes you just can't.

There is no polymorphism. There’s nothing polymorphic about the 2 types of payments. And furthermore you’re likely to run into situations where you have to have both an insured amount and an uninsured amount for a given treatment/procedure. So now you’re dealing with arrays of heterogeneous data.

The process for handling the two cases is distinct. This is the classic OOP issue of trying to use a Shape object to represent both Box and Sphere. Just don’t. Stick with transaction/linear code and use transforms where it makes sense for certain cases (i.e., MVVM style). Handle the two cases distinctly so there is no room for ambiguity.

People get this confused and they think it can’t be simpler.