Hacker News
by chrisfosterelli 143 days ago
Health metrics are absolutely tarnished by a lack of proper context. Unsurprisingly, it turns out that you can't reliably take a concept as broad as health and reduce it to a number. We see the same arguments over and over with body fat percentages, VO2 max estimates, BMI, lactate thresholds, resting heart rate, HRV, and more. These are all useful metrics, but it's important to consider them in the proper context that each of them deserves.

This article gave an LLM a bunch of health metrics and then asked it to reduce them to a single score, didn't tell us any of the actual metric values, and then compared that to a doctor's opinion. Why anyone would expect these to align is beyond my understanding.

The most obvious thing that jumps out to me is that I've noticed doctors generally, for better or worse, consider "health" much differently than the fitness community does. It's different toolsets and different goals. If this person's VO2 max estimate was under 30, that's objectively a poor VO2 max by most standards, and an LLM trained on the internet's entire repository of fitness discussion is likely going to give this person a bad score in terms of cardio fitness. But a doctor who sees a person come in who isn't complaining about anything in particular, moves around fine, doesn't have risk factors like age or family history, and has good metrics on a blood test is probably going to say they're in fine cardio health regardless of what their wearable says.

I'd go so far as to say this is probably the case for most people. Your average person is in really poor fitness-shape but just fine health-shape.


Many of those metrics are population or sampling measures and are confounded by many factors at an individual level. The most notorious of which is BMI; it is practically a category error to infer someone's health or risk by individual BMI, and yet doing so remains widespread amongst people that are supposed to know better.

Instrumentation and testing become primarily useful at an individual level to explain or investigate someone's disease or disorder, or to screen for major risk factors, and the hazards and consequences of unnecessary testing outweigh the benefits in all but a few cases. For those few, your GP and/or government will (or should) routinely screen those at actual risk, which is why I pooped in a jar last week and mailed it.

An athlete chasing an ever-better VO2max or FTP hasn't necessarily got it wrong, however. We can say something like, "Bjorn Daehlie’s results are explained by extraordinary VO2max", with an implication that you should go get results some other way because you're not a five-sigma outlier. But at the pointy end of elite sport, there's a clear correlation between marginal improvement of certain measures and competitive outcomes, and if you don't think the difference of 0.01sec between first and third matters then you've never stood on a podium. Or worse, next to one. When mistakes are made and performance deteriorates, it's often due to chasing the wrong metric(s) for the athlete at hand, generally a failure of coaching.

> The most notorious of which is BMI; it is practically a category error to infer someone's health or risk by individual BMI, and yet doing so remains widespread amongst people that are supposed to know better.

BMI works fine for people who aren't very muscular, which is the great majority of people. Waist to height ratio might be more informative for people with higher muscle mass.

As a person who has been told I'm "morbidly obese" for decades now, I will say that doctors at almost every level look at your chart not you. I've been told time and time again that until I get my weight under control, my health will suffer.

I'm 5'8" and weigh on average 210lbs. My BMI isn't even morbidly obese, it is about 32, which is just "regular" obese, but on top of that, a DEXA scan shows that I am actually only 25% body fat, with only 1lb of visceral fat.
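The arithmetic behind a BMI figure like this is easy to check: it is weight in kilograms divided by height in metres squared. A minimal sketch, with a function name and constants chosen only for illustration:

```python
LB_TO_KG = 0.45359237   # exact definition of the avoirdupois pound
IN_TO_M = 0.0254        # exact definition of the inch

def bmi(weight_lb: float, height_in: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    weight_kg = weight_lb * LB_TO_KG
    height_m = height_in * IN_TO_M
    return weight_kg / height_m ** 2

# 5'8" is 68 inches; 210 lb is the weight quoted in the comment.
print(round(bmi(210, 68), 1))  # 31.9
```

Note that nothing in the formula knows about body composition: two people with the same height and weight get the same number regardless of how much of that weight is muscle.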

Doctors don't care about that, they see on the Epic chart that my BMI is > 30 and have to tell me some spiel about a healthier lifestyle so they can check off a checkbox and continue to the next screen.

I'd consider 5'8 and 210lbs morbidly obese. An average male at 5'8 should generally weigh about 150lbs and no more than 164lbs.
> I'd consider 5'8 and 210lbs morbidly obese. An average male at 5'8 should generally weigh about 150lbs and no more than 164lbs

You would consider incorrectly then.

This person has ~155 pounds of lean body mass. 164 would put him at roughly a bodybuilder level of fat, which basically requires a part-time job in cooking and nutrition to maintain.

For reference, I’m in a similar situation to this person. I’m 5’11” (180cm) and about 200 lbs (91kg) with about 170 lbs of lean body mass. My DEXA scan says that I’m 15% body fat, but I get the same lectures from doctors about being obese and needing a lifestyle change, all based on BMI and (I assume) my size (I’m barrel chested). It’s completely absurd.
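The lean-mass numbers in this exchange can be reproduced from weight and body fat percentage alone. A rough sketch, assuming lean mass stays constant while weight drops (a simplification, since real weight loss usually costs some lean mass too); the function names are made up for illustration:

```python
def lean_mass(weight_lb: float, body_fat_pct: float) -> float:
    """Lean body mass: total weight minus fat mass."""
    return weight_lb * (1 - body_fat_pct / 100)

def body_fat_pct_at(target_weight_lb: float, lean_lb: float) -> float:
    """Body fat percentage at a target weight, assuming lean mass is unchanged."""
    return 100 * (target_weight_lb - lean_lb) / target_weight_lb

# 210 lb at 25% body fat: roughly the "~155 pounds" of lean mass quoted above.
lean = lean_mass(210, 25)
print(round(lean, 1))

# Body fat that person would carry at the suggested 164 lb ceiling.
print(round(body_fat_pct_at(164, lean), 1))

# 200 lb at 15% body fat: the "about 170 lbs" of lean mass in this comment.
print(round(lean_mass(200, 15), 1))
```

Under that assumption, 164 lb works out to about 4% body fat for the first commenter, which is the point being made about bodybuilder-level leanness.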

DEXA scans are notoriously inaccurate. Your DEXA scan is probably wrong, and you are fatter than you think. I've been lifting over a decade, so I have far more muscle mass than the average person, and I am 6'1", yet am still easily over 20% BF if I'm 200 lbs or more. Don't believe me? Try to get truly shredded. You'll see for yourself that you will have to lose far more weight than you think. Everyone is fatter and less muscular than they think they are, even if they're active. Unless of course you are a heavy steroid user, in which case you may actually be muscular enough for that to be valid. But for the average natural trainee? Nobody who's truly lean is getting an obese or morbidly obese BMI. Overweight at worst, maybe.

BMI is definitely inaccurate for those with greater amounts of muscle mass, but not as inaccurate as many would like to believe.

If I got rid of all of my fat and bones, I'd still weigh more than 150lbs. I have the most muscular 150lbs man inside of me.

Ideal body fat percentage is 18-24% - I'm at 25% (or was in November - might be +/- 2% since then - gained a few pounds weight, but not waist size).

So I would say I'm not morbidly obese or even regular obese based on the percentage of my body that is muscle vs fat.

You are fat, though. For a man, the ideal fat percentage is 15-20%. 20+%, let alone 25%, is not healthy at all.
Or that guy could be a burly bricklayer / concrete worker who can casually carry hundreds of pounds of weight all day every day in brutal conditions.

It's really hard to tell with the data provided.

burly - maybe, but I haven't done any hard labor most of my life. I ran track as a kid, and kept my high metabolism - (RMR: 2460kcal, TDEE: 3380kcal); well lost it when my thyroid failed, but medicated myself back to it. I eat what I want, but it's a very high lean-meat diet (lots of chicken breast and turkey because my wife likes them), but I don't limit my carb intake either, as I mostly burn sugar for energy (according to my Respiratory Exchange Ratio).

Somehow my body is just amazing at working without any help from me. I don't even exercise much. Maybe a few pushups a day, up and down my stairs at my house a couple dozen times a day, and probably 5-10k steps a day max.

Huh. The standard in your case is to measure waist circumference if BMI is high. Did no doctor do that? As long as you are below 40” or 37” if Asian you are considered good to go.
None ever did.

On top of that, I'm not sure if that is a real indication of anything, either.

The reason to do that is to get an idea of your abdominal fat (which is the more dangerous place for fat to store), but there are two types of abdominal fat, one is dangerous (visceral fat) and one is completely benign (subcutaneous fat). And a measurement around your waist won't tell you which you have.

I personally have almost all of my fat subcutaneous, with only 1lb of visceral fat (which is right in the perfect range).

> Doctors don't care about that

Literally all of them?

When humans talk, they use generalizations (and don't need to announce them). Here it means that most doctors don't care about that.

Follow that rule next time you read such a statement in a context that's not formal math.

> most

That is not even true. We are talking anecdotal evidence here.

> When humans talk, they use generalizations

All humans?

Sorry :)

I can't say literally all, but I've had to get a new GP almost every year because of health insurance changes, location changes, hospital consolidation buying my GP's practice, and multiple doctors retiring or just quitting medicine (my last GP was tired of medicine after practicing for only 3 years). Over the last 20 years, I've had almost 15 GPs across 5 states (NY, NJ, CT, TX, LA). I also have multiple autoimmune diseases, so I have had a handful of specialists of various flavors (endocrine, oncology - not for cancer, cardiology, and urology), but only need them occasionally.

Almost every single start of every single appointment (including a follow up from just a couple days prior), they comment about my BMI. It is the rare time they don't that I remember. My last urology appointment the doctor was very congenial, didn't even go over the lab work, just said, everything is looking good, asked how I was feeling, everything good, alright, refilled my prescriptions and left.

I mean those stats aren't good...
No. BMI does not work as a diagnostic measure for the general population. The range of "normal" BMI values depends at least on genetic lineage, gender and individual development history. Fine to compare two Scandinavian-lineage men, but if you compare e.g. a Dutch man with an African woman, oh boy, your error margins would be mid-to-high single-digit units.

> Waist to height ratio

Again, while not a bad metric per se, translates poorly between cohorts.

My understanding is that it doesn't even do that, because it creates false negatives for the so-called skinny fat body type: significant visceral fat mass, which is what we are concerned about, but not much muscle or peripheral fat mass, thereby not being flagged by BMI screens, even though they are at risk.
> BMI works fine

An individual learns nothing from its calculation and it has no clinical value. I receive more constructive feedback from an auntie jabbing me in the chest and saying "you got fat".

> the great majority of people

There is wide morphological variety across human populations, so, no.

I dunno, basing life decisions off a metric that has a fudge factor built into it to make the regression work feels sub-optimal to me.
BMI underestimates in most cases and your body fat is higher than the chart would predict.

When people say "oh BMI isn't accurate" it means you are more overweight than it suggests unless you are literally an extreme body builder.

This underestimation has a name, "Normal Weight Obesity." Known by the slang "hot guy/girl fit" where the person looks like they would be physically fit because they're skinny but there's no muscle under there.
> But a doctor who sees a person come in who isn't complaining about anything in particular, moves around fine, doesn't have risk factors like age or family history, and has good metrics on a blood test is probably going to say they're in fine cardio health regardless of what their wearable says.

This is true of many metrics and even lab results. Good doctors will counsel you and tell you that the lab results are just one metric and one input. The body acclimates to its current conditions over time, and quite often achieves homeostasis.

My grandma was living for years with an SpO2 in the 90-95% range as measured by pulse oximetry, but this was just one metric measured with one method. It doesn't mean her blood oxygen was actually repeatedly dropping, it just meant that her body wasn't particularly suited to pulse oximetry.

It doesn't help when doctors are often unaware of outliers affecting the test results. E.g. I've had a number of doctors freak out over my eGFR (kidney function) test results because the default test they use is affected by body mass and diet, and made even worse by e.g. preworkout supplements with creatine. None of my doctors have been aware of this, and I've had to explain it to them.
I've not seen evidence that creatine actually has significant impact on eGFR. Anecdotally, mine does not budge even on 5g a day. Meta-analyses show minimal impact, e.g. https://pmc.ncbi.nlm.nih.gov/articles/PMC12590749/

Muscle mass obviously does, though. Cystatin C is a better marker if your body composition differs from the "average".

I did end up taking a Cystatin C test privately to be able to prove to my GP that the results he freaked out over were nonsense. I'm in the UK, and for whatever reason the NHS just doesn't typically do them for basic kidney function - presumably cost, but they were dirt cheap to do privately so...
NICE guidelines. "Evidence on the specific eGFR equations or ethnicity adjustments seen by the committee was not from UK studies so may not be applicable to UK black, Asian and minority ethnic groups. None of the studies included children and young people. The committee was also concerned about the value of P30 as a measure of accuracy (P30 is the probability that the measured value is within 30% of the true value), the broad range of P30 values found across equations and the relative value or accuracy of ethnicity adjustments to eGFR equations in different ethnic groups. The committee agreed that adding an ethnicity adjustment to eGFR equations for different ethnicities may not be valid or accurate...."

https://www.nice.org.uk/guidance/ng203/chapter/rationale-and...
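For anyone unfamiliar with P30 as described in that quote, it is just the share of estimates that land within 30% of the reference measurement. A small sketch with made-up numbers (the values below are hypothetical, not from any study, and the function name is mine):

```python
def p30(estimated: list[float], measured: list[float]) -> float:
    """Fraction of estimates within 30% of the reference (measured) value."""
    if len(estimated) != len(measured) or not measured:
        raise ValueError("need two non-empty lists of equal length")
    within = sum(
        abs(est - ref) <= 0.30 * ref
        for est, ref in zip(estimated, measured)
    )
    return within / len(measured)

# Hypothetical eGFR estimates vs. reference GFR measurements, not real data.
estimated = [55, 90, 62, 110, 48]
measured = [80, 95, 60, 100, 70]
print(p30(estimated, measured))  # 0.6
```

A 30% window is wide, so an equation can score well on P30 and still be far enough off for a given individual to cross a diagnostic threshold.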

What does ethnicity have to do with anything?

My creatinine levels are high because my body mass - including muscle mass - is well above average. On the basic kidney tests my GP did, my numbers indicated kidney disease. Doing a Cystatin C test showed very clearly that my numbers were firmly in the normal range.

The page does go on to point out the muscle mass issue:

> The committee highlighted the 2008 recommendation, which states that caution should be used when interpreting eGFR and in adults with extremes of muscle mass and on those who consume protein supplements (this was added to recommendation 1.1.1).

Further down they do mention Cystatin C, and seem to have basically decided that a risk of false positives is acceptable because of a lower risk of false negatives. That part is interesting, and it may well be the right decision at a population level.

But if your muscle mass is sufficiently above average, the regular kidney tests done will flag up possible kidney disease every single damn time you do one, and my experience is that UK doctors are totally oblivious to the fact that this is not necessarily cause for concern for a given patient and will often just assume a problem and it will be up to the patient to educate them.

EDIT: What's worse, actually, is the number of times I've had doctors or nurses try to help me to "game" this test by telling me to e.g. drink more before the test next time, seemingly oblivious that, irrespective of precision, changing the test conditions also invalidates it as a way to track changes in eGFR.

I'm not sure what point you're trying to make here. Have I missed somewhere in the discussion where eGFR equation adjustment based on ethnicity has been discussed?

Creatinine is the standard marker used for eGFR. It is also a byproduct of muscle metabolism. People who regularly lift weights or have lifestyles that otherwise result in a higher-than-normal muscularity will almost universally have higher creatinine levels than those who don't, assuming similar baseline kidney function. It's also problematic for people with extremely low muscle mass, for the opposite reason.
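To make the dependence concrete, here is a sketch of the 2021 CKD-EPI creatinine equation. This assumes a lab that uses that particular equation; labs (including the UK ones discussed in this thread) may use a different version, and UK results are usually reported from creatinine in µmol/L rather than mg/dL. The creatinine values in the example are hypothetical.

```python
def egfr_ckd_epi_2021(creatinine_mg_dl: float, age: float, female: bool) -> float:
    """CKD-EPI 2021 creatinine equation, result in mL/min/1.73 m^2.

    Inputs are serum creatinine (mg/dL), age (years) and sex only.
    """
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = creatinine_mg_dl / kappa
    egfr = (
        142
        * min(ratio, 1) ** alpha
        * max(ratio, 1) ** -1.200
        * 0.9938 ** age
    )
    return egfr * 1.012 if female else egfr

# Same hypothetical 40-year-old man, two creatinine levels.
print(round(egfr_ckd_epi_2021(0.9, 40, female=False)))
print(round(egfr_ckd_epi_2021(1.4, 40, female=False)))
```

The equation has no term for muscle mass, so a higher creatinine reading always lowers the estimate, whether it comes from reduced filtration or simply from carrying more muscle.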

It's one of the reasons enhanced bodybuilders can get bit with failing kidney function - they know that their eGFR is going to look worse and worse based on creatinine formulas so they ignore it, when the elevated blood pressure from all the dbol they're popping is killing their kidneys.

Cystatin C is the better option for people with too much (or too little) muscle for creatinine to be accurate.

>I'd go so far as to say this is probably the case for most people. Your average person is in really poor fitness-shape but just fine health-shape.

Modern medicine has failed to move into the era of subtlety and small problems and many people suffer as a result. Fitness nerds and general non-scientists fill the gap poorly so we get a ton of guessing and anecdotal evidence and likely a whole lot of bad advice.

Doctors won't say there's a problem until you're SICK and usually pretty late in the process when there's not a lot of room to make improvements.

At the same time, doctors won't do anything if you're 5% off optimal, but they'll happily give you a medicine that improves one symptom that's 50% off optimal that comes along with 10 side effects. Although unless you're dying or have something really straightforward wrong with you, doctors don't do much at all besides giving you a sedative and/or a stimulant.

Doctors don't know what to do with small problems because they're barely studied and the people who DO try to do something don't do it scientifically.

A worthwhile book to read on this topic is Outlive by Peter Attia (MD). The core premise is that American healthcare focuses far too much on treating problems after they’re extremely severe. It would be cheaper and healthier to invest more into conservative & preventative care, trying to prevent or minimize problems early in life before they become incredibly dangerous and expensive/difficult/impossible to treat.

I have a close friend who works in conservative care, and it’s astonishing what they see. For example, someone went to a number of specialists and doctors about a throat condition where they really struggled swallowing. They even had to swallow a radioactive pill to do some kind of imaging. Unnecessary exposure, and an expensive process to go through, and ultimately went exactly nowhere.

Meanwhile, it was a simple musculoskeletal issue which my friend was able to resolve in a single visit with absolutely no risk to the patient.

Medical schools need to stop producing MDs who reach for pills as the first line of defense without trying to root cause issues. Do you really need addictive pain killers, or would some PT, exercise, massage, etc. help resolve your pain?

It’s not medicine. It’s the healthcare system. Doctors aren’t paid enough to go through the complaint thoroughly and dig deeper. In Germany, health insurance covers a 5-minute diagnosis and that’s all, and that’s with the better doctors. With a normal one, the diagnosis comes from a 2-minute interaction. Believing that the diagnosis is right is very naive.
> Doctors won't say there's a problem until you're SICK and usually pretty late in the process when there's not a lot of room to make improvements.

As someone who is fit and active, in their 60s with zero obvious symptoms, but is nonetheless on cholesterol and blood pressure medication, this isn't true (in the UK, at least).

One of the things the NHS does surprisingly well, and is only really possible because it's a completely vertically integrated system, is population-level preventative medicine. Distributing insulin and salbutamol. Screening for various sorts of cancer. Cholesterol and BP checks. Encouraging people to stop smoking.
I think one of the major problems is that biologists/scientists cannot legally treat people. Physicians take their studies and have monopolistic treatment powers over them.

I think this creates a huge knowledge gap.

It’s also cultural. Most American doctors don’t bother to tell people if they are overweight and out of shape. It’s not something their customers reward.
> customers

And there's the problem. That they are "customers" that pay, either direct or via insurance, or via government insurance vs. a nationalized healthcare system, and I mean healthcare not nationalized health insurance

I mean... most people already know, it's not like either of those things come as a surprise to anybody.
Most obese people think "I am a bit on a heavy side, but I am not that fat and definitely not obese".

People are generally in denial about their fat percentage and their muscle mass. Even somewhat healthy people (~20% fat) who are calculating how much they must lose in order to get to a healthy 12-15% are surprised when reality shows that their calculations were 5-15kg off.

Most people are wrong about their body type, in the wrong direction (overweight think they aren't that overweight, skinny think they need to lose weight).

Having an objective voice from your doctor giving you annual feedback and suggestions is better than ignoring the topic entirely.

That's gym bro science; there's no compelling health reason to lower your fat percentage to 12-15%, and it carries as much risk as being rather obese when accounting for all-cause mortality, particularly for women and people getting older.
Maybe I'm not getting you right, but IMO it hasn't? I, as a customer/patient, just don't weekly converse with my MD about small issues, and frankly, they have better things to do, for example treating sick people.

Instead I use the health benefits programs of my health care insurer. My insurer has an interest in prevention, so I can get consulting for free (or very low fees), and even kickbacks if I regularly participate in fitness courses and maintain my yearly check-up routine. Now, I live in Germany and it probably is different in other countries, but it just makes economic sense from the insurer's point of view so that I would be surprised if it were very different elsewhere.

>This article gave an LLM a bunch of health metrics and then asked it to reduce it to a single score, didn't tell us any of the actual metric values, and then compared that to a doctor's opinion. Why anyone would expect these to align is beyond my understanding.

This gets to one of LLMs' core weaknesses: they blindly respond to your requests and rarely push back against the premise.

I read somewhere that LLM chat apps are optimized to return something useful, not correct or comprehensive (where useful is defined as the user accepts it). I found this explanation to be a useful (ha!) way to explain to friends and family why they need to be skeptical of LLM outputs.
Measuring metrics is easy, it's the algorithm on the backend that matters.

There's a reason why Oura rings are expensive and it's not the hardware - you can get similar stuff for 50€ on Aliexpress.

But none of them predicted my Covid infection days in advance. Oura did.

A device like the Apple Watch that's on you 24/7 is good with TRENDS, not absolute measurements. It can tell you if your heart rate, blood oxygen or something else is more or less than before, statistically. For absolute measurements it's OK, but not exact.

And from that we can make educated guesses on whether a visit to a doctor is necessary.
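The "trends, not absolutes" idea can be illustrated by comparing today's reading against the wearer's own recent baseline rather than against a fixed cutoff. This is only a toy sketch of that general idea, not how Oura, Apple or Garmin actually do it; the function name, the 2-standard-deviation threshold and the heart rate values are all assumptions for illustration.

```python
from statistics import mean, stdev

def deviates_from_baseline(history: list[float], today: float,
                           threshold: float = 2.0) -> bool:
    """Flag today's reading if it is more than `threshold` standard
    deviations away from the wearer's own recent baseline."""
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return today != baseline
    return abs(today - baseline) / spread > threshold

# Hypothetical resting heart rates (bpm) for the previous two weeks.
resting_hr = [58, 57, 59, 58, 60, 57, 58, 59, 58, 57, 59, 58, 60, 58]
print(deviates_from_baseline(resting_hr, 59))  # False: an ordinary day
print(deviates_from_baseline(resting_hr, 66))  # True: a clear jump
```

A sensor that reads a few bpm high or low every day shifts the baseline and today's reading together, so the comparison still works even when the absolute number is off.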

> But none of them predicted my Covid infection days in advance. Oura did.

It actually warned you, or retrospectively looking at the metrics you could see that there was a pattern in advance of symptoms? (If the latter, same here with my Garmin watch - precipitous HRV decline in the 7 days before symptoms. But no actual warning.)

It actually told me, they've been doing this for a while: https://ouraring.com/blog/early-covid-symptoms/

Of course it didn't tell me "you have COVID19-B variant C" - but it did tell me I'm probably sick and should seek care.

I'm curious how the ring detected it in advance? I also discovered my Covid when I looked at my Garmin watch and my resting heart rate was 100, until then I had thought I had too much sun that day.
Some of the metrics were out of whack, I think my average body temp was up along with my resting heart rate both asleep and awake.

It somehow took all that and gave me a "you might be sick" notification.

How is that predicting in advance though? Sounds like it measured active symptoms like a change in body temp etc. That's not prediction, that's reaction.
I think it is fair to assume they meant before symptoms? Which, yes, your heart rate is a symptom. No, it isn't one most people consider.
Device detects 0.1 degree average temp increase. I don’t.

Like your car will start with a small noise first, you can’t hear it. But in time the small noise becomes a big noise just before things break.

If you catch it in the small noise part, you can proactively prepare.

On the other hand, if compressing to a single number is not possible, a doctor will just refuse to give a grade in that way. In my experience, most doctors tend to be very careful to avoid saying anything definitive that they're not actually sure of, even if they're reasonably confident, in large part because part of their job involves understanding how patients react to how things are communicated to them. Being willing to confidently give a misleading answer to a bad question is itself a bad thing when it comes to health data, because regular people aren't able to (and shouldn't be expected to) figure out which inferences happen to be feasible from a given data set.
>But a doctor who sees a person come in who isn't complaining about anything in particular, moves around fine, doesn't have risk factors like age or family history, and has good metrics on a blood test is probably going to say they're in fine cardio health regardless of what their wearable says.

The standard CVD risk models based on SCORE-2- and PREVENT-like parameters perform very poorly, as reported in a recently published paper on their accuracy by a Swedish team [1]. As with all CVD risk stratification followed by cardiologist review, the most important accuracy measure is sensitivity (avoiding false negatives that will escape review); for SCORE-2 and PREVENT it is 48% and 26%, respectively.

The paper's alternative proposal increased the sensitivity to 58% by performing clustering instead of the conventional regression models used in the standard SCORE-2 (Europe) and PREVENT (US).
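To put those sensitivities in concrete terms: sensitivity is true positives divided by all actual cases, so one minus sensitivity is the share of real cases that never get flagged for review. A quick sketch using the percentages quoted here; the cohort of 1,000 cases is hypothetical and the function names are mine.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of actual cases that the model flagged."""
    return true_positives / (true_positives + false_negatives)

def missed_cases(sens: float, cases: int) -> int:
    """Expected number of actual cases the model fails to flag."""
    return round(cases * (1 - sens))

# Sensitivities quoted in the comment; the 1,000 cases are hypothetical.
for name, sens in [("SCORE-2", 0.48), ("PREVENT", 0.26), ("clustering", 0.58)]:
    print(name, missed_cases(sens, 1000))
# SCORE-2 520
# PREVENT 740
# clustering 420
```

Even the best of the three would leave roughly four in ten future cases unflagged.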

These types of models, including the latest proposal, performed very poorly, as shown by the paper's otherwise excellent and intuitive graphical abstract [1].

[1] Risk stratification for cardiovascular disease: a comparative analysis of cluster analysis and traditional prediction models:

https://academic.oup.com/eurjpc/advance-article/doi/10.1093/...

The problem is that the product itself invites the wrong expectation.