by rpedela 4105 days ago
I think the data format is more the problem. What you describe is a problem, but of the annoying sort. If all the vendors spoke JSON (or some common, generic format) then you would only need to figure out a mapping for each vendor to your unified, internal schema. Granted, it would be time-consuming because of the high dimensionality, but it is doable. Instead, each vendor has its own data format, and in some cases it is both binary and proprietary, which is the worst combination. And they may not be willing to share a data format spec for some reason, so you have to reverse-engineer their format.

Regarding my last statement, overall the healthcare software landscape feels like the greater software industry did pre-2000, where everyone is trying to build every possible product to own every market and there is little or no cooperation.

1 comments

I would actually think otherwise. The message format commonly used to pass HL7 messages right now is quite easy to parse; it is basically a CSV file using pipes and carets as field separators. Any technology worth integrating right now speaks HL7 one way or another already. Even if some reverse-engineering is needed, it is a 100% technical endeavour that does not need any political buy-in (except budgeting) - this is what I like to define as an "annoying sort of problem".
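
To give a feel for how little parsing is involved, here is a rough Python sketch that splits one pipe-and-caret segment. The sample segment is invented for illustration, and the sketch deliberately ignores escape sequences, repetitions (`~`), subcomponents (`&`) and the special handling the header segment needs, so treat it as a toy rather than a real HL7 parser.

```python
from typing import List


def parse_segment(segment: str) -> List[List[str]]:
    """Split one segment into fields ('|'), and each field into components ('^')."""
    return [field.split("^") for field in segment.split("|")]


# Invented sample: an observation segment carrying a body weight reading.
obx = "OBX|1|NM|29463-7^Body weight^LN||72.5|kg"
fields = parse_segment(obx)

segment_type = fields[0][0]   # "OBX"
code, label = fields[3][:2]   # coded identifier and its display text
value = float(fields[5][0])   # the numeric reading
unit = fields[6][0]           # unit of measure
```

A real integration would more likely use an existing HL7 library than hand-rolled splitting, but the point stands that the wire format itself is not the hard part.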

> If all the vendors spoke JSON (or some common, generic format) then you would only need to figure out a mapping for each vendor to your unified, internal schema.

This is not easy at all if your internal schema has less fidelity/dimensionality than the vendor's. Expanding on my example above, suppose you have a smart weighing scale (a la the Withings scale) that integrates with the EMR. The weighing scale has the patient's height as an input, so it is able to send a BMI <http://en.wikipedia.org/wiki/Body_mass_index> reading to the EMR as well. However, your EMR does not have a field for BMI because it is a computed/derived value of weight and height.
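
A small Python sketch of that mismatch, with made-up field names and numbers (`INTERNAL_FIELDS`, `to_internal` and the sample reading are all hypothetical): the mapping has nowhere to put the device's BMI, and recomputing it from the height the EMR holds does not necessarily reproduce what the device reported.

```python
from typing import Dict, Tuple


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


# Hypothetical internal schema: the EMR stores weight but has no BMI field.
INTERNAL_FIELDS = {"patient_id", "weight_kg"}


def to_internal(reading: Dict[str, object]) -> Tuple[dict, dict]:
    """Split a vendor reading into what the schema can hold and what is lost."""
    kept = {k: v for k, v in reading.items() if k in INTERNAL_FIELDS}
    dropped = {k: v for k, v in reading.items() if k not in INTERNAL_FIELDS}
    return kept, dropped


# Made-up reading: the device computed BMI from the height entered on the scale.
vendor_reading = {"patient_id": "P1", "weight_kg": 72.5, "bmi": 23.7}
emr_height_m = 1.78  # made-up height on record in the EMR

kept, dropped = to_internal(vendor_reading)
recomputed = round(bmi(kept["weight_kg"], emr_height_m), 1)
mismatch = recomputed != dropped["bmi"]
```

Here the device's value is dropped and the EMR-side recomputation disagrees with it, so you have to decide which number is "the" BMI, and that is not a parsing question.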

If your internal schema has higher fidelity than the vendor, you also are forced to impute data - this is not as bad but may cause unintended behaviours as well. A contrived example: the weighing scale only has the patient's identity information. However in your EMR the weight readings can only be stored associated with a visit/encounter (aka a hospital stay or appointment). You can associate the reading with the last open visit/encounter, but this will have undesirable repercussions e.g. during system downtime (it might become associated with the wrong visit).
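
The downtime case can be sketched in a few lines of Python. Everything here is hypothetical (the `Encounter` type, the `last_open_encounter` helper, the dates); it only illustrates the "attach to the last open visit" heuristic and how it goes wrong when a reading is delivered late.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Encounter:
    encounter_id: str
    patient_id: str
    opened_at: datetime
    closed_at: Optional[datetime] = None  # None means the visit is still open


def last_open_encounter(
    encounters: List[Encounter], patient_id: str, as_of: datetime
) -> Optional[Encounter]:
    """Naive imputation: the most recently opened visit that is open at `as_of`."""
    candidates = [
        e for e in encounters
        if e.patient_id == patient_id
        and e.opened_at <= as_of
        and (e.closed_at is None or as_of < e.closed_at)
    ]
    return max(candidates, key=lambda e: e.opened_at, default=None)


encounters = [
    Encounter("A", "P1", datetime(2015, 1, 5, 9), datetime(2015, 1, 5, 17)),
    Encounter("B", "P1", datetime(2015, 1, 12, 9)),  # still open
]

measured_at = datetime(2015, 1, 5, 10)       # weight taken during visit A
delivered_at = datetime(2015, 1, 12, 9, 30)  # arrives a week late, after downtime

by_delivery = last_open_encounter(encounters, "P1", delivered_at)
by_measurement = last_open_encounter(encounters, "P1", measured_at)
```

Matching on delivery time attaches the reading to visit B, while matching on the measurement time finds visit A. The second option only works if the device sends a trustworthy timestamp, which is one more thing the integration has to assume.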

There are ways to solve the above integration issue but they would cost a lot and may significantly impact the EMR all the way to end-user UI. So it is not only a technical issue but also whether the users are comfortable with the amount of complexity introduced in the UI, data imputation, etc.

I think the big assumption I was making, and probably should have stated, is that you have complete control over your own internal schema: the sort of control that allows you to modify that internal schema to incorporate a new vendor's schema regardless of whether it is more or less complex. If you don't have that control then yes, it is a very difficult problem.