by libc 3069 days ago
It is using JSON (or XML), but the underlying data model is largely carried over from V3 (or more specifically, the RIM). The problem with HL7 isn't the data format itself, but how information is encoded and the amount of variation that exists. The parent comment was a little misleading in posting a V2 message since that isn't what Apple is using, but as someone who works with HL7 on a regular basis, I find V2 is actually more straightforward to work with a lot of the time.

FHIR is definitely a step in the right direction but it is plagued with the same issues as their other "standards" so I'm not holding my breath.


Very confusing. You say it uses JSON/XML (for data encoding) and also the "problem" is "how information is encoded". What is the encoding problem? And how could JSON/XML be the problem? These are fairly simple encoding formats and well understood.

Are you referring to the data model as the problem? How exactly?

The data model is a big part of the problem. There are lots of different ways to encode the same information within HL7. People refer to different "flavors" of HL7, different ones for different implementations (and implementations are not just vendor specific, but site specific — EHR software is very heavily customized for each health system). Add to that doctors and nurses entering information in their own ways within a given health system (since the software isn't very clear or usable), and the data that's getting transferred is a huge mess.

Beyond that, a lot of the most important information is encoded in free text fields, and so isn't directly analyzable. And even when information is mapped to codes from standard medical ontologies, there's no guarantee that when that information is transferred in HL7 formats it includes the code from the ontology.

It's not at all clear what the optimal way to structure medical information should be, so it's no surprise that there's a huge amount of variance out in the world. HL7 is quite old (v2 was made in 1989), and every new variant has to support the existing variants. EHRs were originally designed around billing and administrative workflows, so it's also not surprising that the data structures aren't great for analyzing data or treating patients.

Not the poster you were replying to, but I do healthcare data integration for my day job.

The problem is that HL7 is ostensibly a standardized interchange format, but there's enough ambiguity in the spec that literally every vendor implements things differently which leads to... my job existing.

Vendors implement the spec selectively. They may or may not support any given message trigger. They may have a different idea of what exactly constitutes something as basic as a patient account number and choose to send it in an unexpected field. Or they may send a piece of data there that you weren't expecting at all. There may be a business case for capturing data that wasn't in the spec for the version of HL7 being used -- email addresses are a common one today -- that leads to user-defined fields being added ad hoc.

Honestly, working with HL7 v2 messages like posted above isn't really any substantially harder than working with CSVs. The real headache comes from actually integrating the underlying data.
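
To make the CSV comparison concrete, here is a minimal sketch of splitting a v2 message on its default delimiters (carriage return between segments, `|` between fields, `^` between components). The sample message is invented for illustration and is not the one posted upthread; `parse_v2` and `component` are hypothetical helper names, and the sketch ignores repetitions, escape sequences, and non-default delimiters.

```python
# Hypothetical sample message, invented for illustration only.
SAMPLE = "\r".join([
    "MSH|^~\\&|SENDER|SITE_A|RECEIVER|SITE_B|20180125120000||ADT^A01|MSG0001|P|2.3",
    "PID|1||12345^^^SITE_A^MR||DOE^JANE||19800101|F",
])


def parse_v2(message):
    """Split a v2 message into {segment_id: [list_of_fields, ...]}."""
    segments = {}
    for line in filter(None, message.split("\r")):
        fields = line.split("|")
        segments.setdefault(fields[0], []).append(fields)
    return segments


def component(field, index):
    """Return the 1-based component of a field, or '' if it is absent."""
    parts = field.split("^")
    return parts[index - 1] if index <= len(parts) else ""


parsed = parse_v2(SAMPLE)
pid = parsed["PID"][0]

# For non-MSH segments the list index matches the field number (pid[3] is PID-3).
# MSH is off by one because MSH-1 is the field separator itself.
patient_id = component(pid[3], 1)   # PID-3, first component
family_name = component(pid[5], 1)  # PID-5, first component
message_type = parsed["MSH"][0][8]  # MSH-9

print(patient_id, family_name, message_type)
```

Getting the fields out is the easy part. Deciding what `pid[3]` actually means at a given site is where the integration work goes.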

Poster of the V2 message here. You're describing the integration problem, and that is the exact problem faced. As far as I understand it, isn't integration exactly what the format is for? It doesn’t do it well.

The standard will generally get you 80-90% of the way there.

There are a lot of factors that go into why the standard fails to be plug-and-play. The fact that v2 is essentially a glorified, somewhat standardized CSV instead of a prettier JSON has next to nothing to do with it.

troyastorino's sibling comment nails a lot of it. There's no standard model for the underlying data, which makes it incredibly difficult, if not impossible, to have a standard transmission format for the data. Literally every individual facility you'll look at is unique and will have their own registration workflows, code sets, etc.

The old V2 spec isn't what I'd call good, but it works. It's ugly to look at, but it's not difficult to work with, either.

The problems you're addressing, however, are far more fundamental to the industry itself and aren't going to be solved by an interchange format.

Maybe so, but I would argue a lot of the key information isn't that different for each type of medical event. (I'm leaving aside scheduling and insurance claims for now because I'm less familiar with things there but there are still probably some commonalities).

Each medical event should have:

- Patient it relates to

- Date it happened (possibly date it started and date it ended instead)

- Who did/prescribed/ordered it

- List of medical codes+coding system tuples that happened on that event

There's tons of other information of course, but these very basic things are universal and should always be in the same place (I refer to FHIR in this case, but format is somewhat irrelevant if the API is good). I understand they're not for historic reasons, and that some might complain because it doesn't exactly fit how they think about things, but a consistent API provides more value and I think will lead to better process down the line.
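
The four items listed above could be sketched as a small data structure. This is only an illustration of the shape being argued for, not FHIR's actual model or anyone's real API; `Coding`, `MedicalEvent`, and every field name and value here are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Coding:
    """One (coding system, code) tuple."""
    system: str
    code: str


@dataclass
class MedicalEvent:
    """The universal core of an event; everything else would be an extension."""
    patient_id: str
    started: datetime
    ended: Optional[datetime]  # None when the event is a single point in time
    performer_id: str          # who did/prescribed/ordered it
    codings: list = field(default_factory=list)

    def codes_in(self, system):
        """Return every code on this event that belongs to the given system."""
        return [c.code for c in self.codings if c.system == system]


# Placeholder values, invented for illustration.
event = MedicalEvent(
    patient_id="example-patient",
    started=datetime(2018, 1, 25, 9, 0),
    ended=None,
    performer_id="example-practitioner",
    codings=[Coding("icd-10", "EXAMPLE-1"), Coding("site-local", "EXAMPLE-2")],
)

print(event.codes_in("icd-10"))
```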

HL7 is moving along with FHIR, and I think it's a good start. I look forward to seeing where it ends up.

>Maybe so, but I would argue a lot of the key information isn't that different for each type of medical event.

Well, I mean, that's pretty much the entire basis of the HL7 segment paradigm.

>- Patient it relates to

This is a much, much, much harder problem than you'd think.

Patients are going to have multiple identifiers attached to them and resolving them cleanly is literally an industry of its own within healthcare.

And that's precisely why it's a common problem during integrations - which identifier gets used how is generally a workflow and design decision made for a specific site-level implementation.
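
A rough sketch of what that site-level decision looks like in code: the same patient carries several identifiers, and each site's configuration decides which one counts. The authority names, values, and the `pick_identifier` helper are all made up for illustration, and real identity resolution (matching, merges, duplicates) is far more involved than a lookup.

```python
def pick_identifier(identifiers, preferred_authorities):
    """Return (authority, value) for the first authority this site prefers, else None."""
    by_authority = {i["authority"]: i["value"] for i in identifiers}
    for authority in preferred_authorities:
        if authority in by_authority:
            return authority, by_authority[authority]
    return None


# Hypothetical identifiers attached to one patient.
PATIENT_IDS = [
    {"authority": "SITE_A_MRN", "value": "12345"},
    {"authority": "ENTERPRISE_MPI", "value": "E-998"},
    {"authority": "SITE_B_MRN", "value": "77-001"},
]

# Hypothetical per-site preferences, in priority order.
SITE_A_CONFIG = ["SITE_A_MRN", "ENTERPRISE_MPI"]
SITE_C_CONFIG = ["ENTERPRISE_MPI"]
SITE_Z_CONFIG = ["SITE_Z_MRN"]

for config in (SITE_A_CONFIG, SITE_C_CONFIG, SITE_Z_CONFIG):
    print(pick_identifier(PATIENT_IDS, config))
```

The same input yields a different "patient ID" per site, or none at all, which is the kind of thing no interchange format decides for you.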

>- Date it happened (possibly date it started and date it ended instead)

>- Who did/prescribed/ordered it

These usually aren't sticking points for integration because they're the easy ones to get people to agree on.

>- List of medical codes+coding system tuples that happened on that event

These aren't really standardized at the industry level beyond ICD-10 diagnosis codes. Things like insurance provider codes, procedure codes, order codes, etc are individual to sites; even things like ethnicity and gender codes are variable by location.
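
In practice this tends to turn into per-site translation tables. A minimal sketch, assuming two hypothetical sites that encode gender differently; the site names, local codes, and target values are invented and do not come from any real code set.

```python
# Hypothetical per-site code maps, invented for illustration.
SITE_CODE_MAPS = {
    "site_a": {"gender": {"1": "male", "2": "female", "9": "unknown"}},
    "site_b": {"gender": {"M": "male", "F": "female", "U": "unknown"}},
}


def normalize(site, category, local_code):
    """Translate a site-local code to a shared value; return None when unmapped."""
    return SITE_CODE_MAPS.get(site, {}).get(category, {}).get(local_code)


print(normalize("site_a", "gender", "2"))  # site A's local code
print(normalize("site_b", "gender", "F"))  # site B's local code for the same thing
print(normalize("site_b", "gender", "X"))  # unmapped codes surface as None
```

Returning `None` for unmapped codes instead of guessing is a deliberate choice in this sketch: an unexpected local code is usually something a person needs to look at.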

I don't want it to sound like I'm down on FHIR or that I think HL7v2 is the greatest thing since sliced bread because I don't think either is the case.

The point I'm getting at is that there are huge problems with healthcare data interchange that just plain aren't going to be solved by a better interchange format.

That's inherent in the problem domain. The FHIR data model is distinct from the V3 RIM but clearly influenced by it. Health care has a high degree of irreducible complexity. If we oversimplify the model then it won't support important use cases and implementers will be forced to use proprietary extensions.

As a practical matter we can't rely on low-level standards to achieve real interoperability. Since there is often more than one way to model the same clinical information we also need high-level implementation guides with comprehensive examples to show everyone exactly what to do, along with automated conformance testing tools.

I work with FHIR extensively and find the standard to be decent but not great. Perhaps my biggest complaint is around naming strategies.

"What date did this event occur?" can be assertedDate, effectiveDateTime, performedDateTime, or any number of other things depending on the resource.

In the old versions it was even hard to answer "What's the patient ID associated with this resource?": sometimes it was "patient", sometimes it was "subject". This has gotten better in STU3, and improvements have also been made to the way practitioners are identified.
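
Client code ends up papering over these naming differences with lookup tables. A minimal sketch over FHIR resources already parsed from JSON into dicts: the pairing of resource types with date fields is my reading of STU3 and should be checked against the spec for whatever version is in use, the helper names are hypothetical, and choice-type variants such as effectivePeriod are not handled.

```python
# Assumed pairing of resource type to date field (check against the spec).
DATE_FIELDS = {
    "Condition": "assertedDate",
    "Observation": "effectiveDateTime",
    "Procedure": "performedDateTime",
}

# The two names mentioned above for the patient reference.
PATIENT_REF_FIELDS = ("subject", "patient")


def event_date(resource):
    """Return the resource's event date string, or None if we don't know where to look."""
    field_name = DATE_FIELDS.get(resource.get("resourceType"))
    return resource.get(field_name) if field_name else None


def patient_reference(resource):
    """Return the patient reference regardless of which field name carries it."""
    for name in PATIENT_REF_FIELDS:
        ref = resource.get(name, {}).get("reference")
        if ref:
            return ref
    return None


# Hypothetical resources, invented for illustration.
observation = {
    "resourceType": "Observation",
    "subject": {"reference": "Patient/example"},
    "effectiveDateTime": "2018-01-25T09:00:00Z",
}
allergy = {
    "resourceType": "AllergyIntolerance",
    "patient": {"reference": "Patient/example"},
}

print(event_date(observation), patient_reference(observation))
print(event_date(allergy), patient_reference(allergy))
```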

I think there's a ton of irreducible complexity, but there are also common themes and common parts of medicine that should form the foundation of the API. It seems that instead of thinking, "What are the real commonalities here? Let's make them flexible enough to handle 90% of cases, but with a standard interface for extension," HL7 started from the current cumbersome standard and shoehorned it in.

I really am optimistic though, it's getting better all the time and FHIR is actually remarkably extensible to handle corner cases. I really hope it becomes the standard moving forward, especially as it smooths out the rough edges.

If you see a naming discrepancy then just go ahead and submit a tracker to fix it. None of the resources are normative yet and in my experience the work groups are quite willing to fix this kind of stuff as long as you provide sufficient justification. You might need to explain why the change is needed and ensure it doesn't get delayed. Don't sit back and expect someone else to identify and fix the problems; most implementers only work with a few specific resources so they might not even notice inconsistencies.

https://gforge.hl7.org/gf/project/fhir/tracker/?action=Track...