by btilly 2286 days ago
Multi armed bandit methods work best with immediate success-fail metrics. This one has time delays.

One way machine learning goes wrong here is a treatment that slows down the progression of the disease but increases the death rate. Given exponential ramp up in the incoming cases, it will look good until the final horrifying numbers are in. You need to slice and dice the numbers by cohort to detect and react to this.


I decided that some numbers on how things go wrong would help.

Suppose that the treatment increased deaths by 50% but delayed death by a week. And we have a doubling rate for the disease of 1 week.

Back of the envelope: at any point in time, the deaths showing up in the treated group come from people who got sick a week earlier than the people behind the untreated deaths, when the disease had only 0.5 times as many cases. So the treatment shows 1.5 x 0.5 = 0.75 of the deaths at any point in time. It looks like it saves 25% of lives when in fact it kills 50% more people. The raw numbers will look good until you look at a cohort over time.
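To make the arithmetic concrete, here is a rough Python sketch of that scenario. Everything in it is an assumption for illustration: the function names, the 10% baseline death rate, the 2-week delay from enrollment to death, and equal weekly enrollment in both arms are all made up. Only the 1.5x harm, the 1-week extra delay, and the weekly doubling come from the numbers above.

```python
def simulate(weeks=12, base_cases=1000, death_rate=0.10,
             control_delay=2, extra_delay=1, harm=1.5):
    """Return weekly enrollment per arm and deaths per calendar week.

    Hypothetical setup: each arm enrolls the same number of new patients
    every week, and that number doubles weekly.
    """
    enrolled = [base_cases * 2 ** w for w in range(weeks)]
    control = [0.0] * weeks
    treated = [0.0] * weeks
    for w, n in enumerate(enrolled):
        c_week = w + control_delay    # week in which untreated deaths appear
        t_week = c_week + extra_delay  # treated deaths appear a week later
        if c_week < weeks:
            control[c_week] += death_rate * n
        if t_week < weeks:
            treated[t_week] += harm * death_rate * n
    return enrolled, control, treated


def naive_ratio(control, treated, week):
    """Treated / control deaths observed in the same calendar week."""
    return treated[week] / control[week]


def cohort_ratio(control, treated, cohort, control_delay=2, extra_delay=1):
    """Treated / control deaths among patients enrolled in the same week."""
    c = control[cohort + control_delay]
    t = treated[cohort + control_delay + extra_delay]
    return t / c


enrolled, control, treated = simulate()
print(round(naive_ratio(control, treated, week=11), 2))   # 0.75
print(round(cohort_ratio(control, treated, cohort=0), 2))  # 1.5
```

The same-week comparison shows the treatment with 25% fewer deaths, while following one enrollment cohort to completion shows 50% more.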

Current doubling time for deaths has been about 3 days. My assumption of a week is therefore optimistic. Perhaps we get there with social distancing.

"Multi armed bandit methods work best with immediate success-fail metrics. This one has time delays."

Well, sure, but everything works best with immediate success-fail metrics. One of the most basic results from learning theory is that the longer the latency between stimulus and response, the slower the learning rate can be. I'm not sure how multi-armed bandit is special in this regard. All learning techniques are going to be susceptible to the problem you outline in your second paragraph.

This is one of those "there is no perfect solution" situations. It's really easy to say that out loud. It's quite difficult to internalize it.

(Also, just as a note to your other post, bear in mind that our hard-core "social distancing" efforts in the US are just about to reach approx. 1 incubation period. It is only just this week that we're going to start seeing the results of that, and it'll phase in as slowly as our efforts 1-2 weeks ago did. My state just went to full lockdown today, though we've been on a looser lockdown for a week before that.)

Everything works better with immediate success/fail metrics. However, the simplest approach is the easiest to analyze, including after the fact in as many ways as you want. The more complex the decision making, the less we should be willing to put it under the control of a computer program. (Unless that program has been well-studied for our exact problem so that we trust it more.)

Which medicine looks effective? Which medicine gets people out of the hospital faster? What underlying conditions interacted badly with given medicines? These questions do not have to be asked up front. But they can be answered afterwards. And knowing the answers matters.

Here is an example. Suppose that we find one medication that gets people out of bed faster but kills some. In areas with overwhelmed hospitals, cycling people through the bed may save net lives. If your hospital is not overwhelmed, you wouldn't want to give that medicine. Now I'm not saying that any of these medicines will turn out like that. But they could. And if one did, I definitely want human judgement to be applied about when to use it.

I don't think anyone is proposing actually removing all humans from the loop, so I think that's an argument against a strawman.

Even if they were proposing it, there's no realistic chance of it happening.

I don't want people blindly copying "standard" scientific procedures either, where we run high-statistical-power, double-blind studies for months, then carefully peer-review them and come up with some result somewhere in 2022.

So, hopefully there will be blinded researchers who analyse the data.

They'll probably use sequential stopping rules to take samples of incoming data.
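As one illustration of what a sequential stopping rule can look like, here is a minimal Python sketch of Wald's sequential probability ratio test (SPRT) on a stream of died/survived outcomes. This is an assumption about the kind of rule meant, not a claim about what any trial actually uses. The function name `sprt` and the values of `p0`, `p1`, `alpha` and `beta` are hypothetical. The outcomes fed in should come from patients whose follow-up is complete, otherwise the time-delay problem raised earlier in the thread applies.

```python
import math


def sprt(outcomes, p0=0.10, p1=0.15, alpha=0.05, beta=0.20):
    """Wald's SPRT for a death rate of p0 (H0) versus p1 (H1).

    outcomes: iterable of truthy (died) / falsy (survived) values, in the
    order they arrive. Returns (decision, number_of_outcomes_used).
    """
    upper = math.log((1 - beta) / alpha)  # cross this: accept H1
    lower = math.log(beta / (1 - alpha))  # cross this: accept H0
    llr = 0.0  # running log-likelihood ratio of H1 against H0
    used = 0
    for died in outcomes:
        used += 1
        if died:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept_h1", used
        if llr <= lower:
            return "accept_h0", used
    return "continue", used  # not enough evidence yet, keep sampling


print(sprt([1, 0, 0, 1, 0, 0, 0, 0]))
```

The point of a rule like this is that the data can be checked after every new outcome without inflating the error rates the way repeated fixed-sample tests would.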

If one of the treatments works much much better, then they'll almost certainly recommend that (but doctors will probably figure this out first, anyway).