by godelski 246 days ago

  > Someone has probably studied this
There's even a name for it

https://en.wikipedia.org/wiki/Goodhart%27s_law


Thanks for sharing. I did not know this law existed and had a name. I'm no expert, but it appears that interpreting metrics for policies implicitly assumes the "shape" of the domain. E.g. in RL for games we see a lot of outlier behavior from policies that just game the signal.

There seem to be two types:

- Specification failure: the signal is bad-ish and produces completely broken behavior --> policies reach local optima that phenomenologically do not represent what was expected/desired --> this signals that the reward definition can be improved

- Domain constraint failure: the signal is still good and the optimization is "legitimate", but you are prompted with the question "do I need to constrain my domain of solutions?"

  - finding a bug that reduces time to completion of a game in a speedrun setting would be a new acceptable baseline, because no rule forbids finishing the game earlier
  
  - shooting amphetamines before a 100m run would probably minimize time, but other factors will make people consider disallowing such practices.
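
The two failure types above can be sketched in a few lines. This is only a toy illustration, not anything from a real RL setup: the `Policy` class, the three candidate policies, and their scores are all made up, and the `require_goal` flag is a stand-in for "fix the reward definition".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    name: str
    score: float         # what the reward signal pays out (higher is better)
    achieves_goal: bool  # what the designer actually wanted
    within_rules: bool   # whether the chosen domain of solutions permits it


# Hypothetical candidates, invented purely for illustration.
CANDIDATES = [
    Policy("play normally", 100.0, achieves_goal=True, within_rules=True),
    Policy("farm respawning bonus", 500.0, achieves_goal=False, within_rules=True),
    Policy("skip glitch", 300.0, achieves_goal=True, within_rules=False),
]


def best(policies, require_goal=False, require_rules=False):
    """Return the highest-scoring policy among those that pass the filters."""
    pool = [
        p for p in policies
        if (p.achieves_goal or not require_goal)
        and (p.within_rules or not require_rules)
    ]
    return max(pool, key=lambda p: p.score)


# Specification failure: the raw signal prefers a broken behavior.
print(best(CANDIDATES).name)
# Signal repaired: the optimum is legitimate, but is it acceptable?
print(best(CANDIDATES, require_goal=True).name)
# Domain constrained: the optimizer is limited to permitted solutions.
print(best(CANDIDATES, require_goal=True, require_rules=True).name)
```

Whether "skip glitch" should be inside the rules is exactly the speedrun vs. amphetamines question: the code can't answer it, someone has to decide.
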
I view Goodhart's law more as a lesson for why we can never achieve a goal by offering specific incentives if we are measuring success by the outcome of the incentives and not by the achievement of the goal.

This is of course inevitable if the goal cannot be directly measured but is composed of many constantly moving variables such as education or public health.

This doesn't mean we shouldn't bother having such goals, it just means we have to be diligent at pivoting the incentives when it becomes evident that secondary effects are being produced at the expense of the desired effect.

  > This is of course inevitable if the goal cannot be directly measured
It's worth noting that no goal can be directly measured[0].

I agree with you, this doesn't mean we shouldn't bother with goals. They are fantastic tools. But they are guides. The better aligned our proxy measurement is with the intended measurement then the less we have to interpret our results. We have to think less, spending less energy. But even poorly defined goals can be helpful, as they get refined as we progress in them. We've all done this since we were kids and we do this to this day. All long term goals are updated as we progress in them. It's not like we just state a goal and then hop on the railroad to success.

It's like writing tests for code. Tests don't prove that your code is bug free (you can't write a test for a bug you don't know about: an unknown unknown). But tests are still helpful because they provide evidence that the code is bug free and constrain the domain in which bugs can live. It's also why TDD is naive, because tests aren't proof and you have to continue to think beyond the tests.
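
A minimal sketch of that point, using a hypothetical `is_leap_year` function (not from the thread): the suite below is green, yet a bug lives just outside the cases it covers.

```python
def is_leap_year(year: int) -> bool:
    # Buggy on purpose: ignores the century rule
    # (years divisible by 100 are leap years only if divisible by 400).
    return year % 4 == 0


def run_tests() -> bool:
    # The cases the author happened to think of; every one passes.
    cases = {2024: True, 2023: False, 2000: True, 2022: False}
    return all(is_leap_year(y) == expected for y, expected in cases.items())


print(run_tests())         # True: the suite is green
print(is_leap_year(1900))  # True, although 1900 was not a leap year
```

The passing suite narrows where the bug can hide (it isn't in the years tested), but it says nothing about 1900 until someone thinks to ask.
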

[0] https://news.ycombinator.com/item?id=45555551

You can measure revenue exactly; it has limited precision.
It’s a false law tho. Collapses under scrutiny
If I hadn't seen it in action countless times, I would believe you. Changelists, line counts, documents made, collaborator counts, teams led, reference counts in peer-reviewed journals... the list goes on.

You are welcome to prove me wrong though. You might even restore some faith in humanity, too!

Sorry, remind me; how many cobras are there in India?
The Zoological Survey of India would like to know but hasn't figured out a good way to do a full census. If you have any ideas they would love to hear them.

Naja naja has Least Concern conservation status, so there isn't much funding in doing a full count, but there are concerns as encroachment both reduces their livable habitat and puts them into more frequent contact with humans and livestock.

Could you elaborate or link something here? I think about this pretty frequently, so would love to read something!
Metric: time to run 100m

Context: track athlete

Does it cease to be a good metric? No. After this you can likely come up with many examples of target metrics which never turn bad.

If it were a good metric there wouldn't be a few phone books worth of regulations on what you can do before and during running 100 meters. From banning rocket shoes, to steroids, to robot legs the 100 meter run is a perfect example of a terrible metric both intrinsically as a measure of running speed and extrinsically as a measure of fitness.
> Metric: time to run 100m

> Context: track athlete

> Does it cease to be a good metric? No.

What do you mean? People start doping or showing up with creatively designed shoes and you need to layer on a complicated system to decide if that's cheating, but some of the methods are harder to detect and then some people cheat anyway, or you ban steroids or stimulants but allow them if they're by prescription to treat an unrelated medical condition and then people start getting prescriptions under false pretexts in order to get better times. Or worse, someone notices that the competition can't set a good time with a broken leg.

So what is your argument, that it doesn't apply everywhere therefore it applies nowhere?

You're misunderstanding the root cause. Your example works because the metric is well aligned. I'm sure you can also think of many examples where the metric is not well aligned and maximizing it becomes harmful. How do you think we ended up with clickbait titles? Why was everyone so focused on clicks? Let's think about engagement metrics. Is that what we really want to measure? Do we have no preference over users being happy vs users being angry or sad? Or are those things much harder to measure, if not impossible to, and thus we focus on our proxies instead? So what happens when someone doesn't realize it is a proxy and becomes hyper fixated on it? What happens if someone does realize it is a proxy but is rewarded via the metric so they don't really care?

Your example works in the simple case, but a lot of things look trivial when you only approach them from a first order approximation. You left out all the hard stuff. It's kinda like...

Edit: Looks like some people are bringing up metric limits that I couldn't come up with. Thanks!

> So what is your argument, that it doesn't apply everywhere therefore it applies nowhere?

I never said that. Someone said the law collapses, someone asked for a link, and I gave an example to show that it does break down in at least some cases, and in many cases once you think more about it. I never said all cases.

If it works sometimes and not others, it's not a law. It's just an observation of something that can happen or not.

  > I never said all cases.
You're right. My bad. I inferred that through the context of the conversation.

  > If it works sometimes and not others, it's not a law.
I think you are misreading and that is likely what led to the aforementioned misunderstanding. You're right that it isn't a scientific law, but the term "law" gets thrown around a lot in a more colloquial manner. Unfortunately words are overloaded and have multiple meanings. We do the same thing to "hypothesis", "paradox", and lots of other things. I hope this clarifies the context. (even many of the physics laws aren't as strong as you might think)

But there are many "laws" used in the same form. They're eponymous laws[0], not scientific ones. Read "adage". You'll also find that word used in the opening sentence on the Wiki article I linked as well as most (if not all) of them in [0]

[0] https://en.wikipedia.org/wiki/List_of_eponymous_laws

it doesn't break down - see comments about rules above. it was the perfect example to prove yourself wrong.
> Does it cease to be a good metric?

Yes if you run anything other than the 100m

Do you have an example that doesn't involve an objective metric? Of course objective metrics won't turn bad. They're more measurements than metrics, really.

  > an objective metric
I'd like to push back on this a little, because I think it's important to understanding why Goodhart's Law shows up so frequently.

*There are no /objective/ metrics*, only proxies.

You can't measure a meter directly, you have to use a proxy like a tape measure. Similarly you can't measure time directly, you have to use a stopwatch. In a normal conversation I wouldn't be nitpicking like this because those proxies are so well aligned with our intended measures and the lack of precision is generally inconsequential. But once you start measuring anything with precision you cannot ignore the fact that you're limited to proxies.

The situation when our goals get more abstract is not too dissimilar. Our measuring tools are just really imprecise. So we have to take great care to understand the meaning of our metrics and their limits, just like we would if we were doing high precision measurements with something more "mundane" like distance.

I think this is something most people don't have to contend with because frankly, very few people do high precision work. And unfortunately we often use algorithms as black boxes. But the more complex a subject is the more important an expert is. It looks like they are just throwing data into a black box and reading the answer, but that's just a naive interpretation.

This isn't what Goodhart's law is about.

Sure, if you get a ruler from the store it might be off by a fraction of a percent in a way that usually doesn't matter and occasionally does, but even if you could measure distance exactly that doesn't get you out of it.

Because what Goodhart's law is really about is bureaucratic cleavage. People care about lots of diverging and overlapping things, but bureaucratic rules don't. As soon as you make something a target, you've created the incentive to make that number go up at the expense of all the other things you're not targeting but still care about.

You can take something which is clearly what you actually want. Suppose you're commissioning a spaceship to take you to Alpha Centauri and then it's important that it go fast because otherwise it'll take too long. We don't even need to get into exactly how fast it needs to go or how to measure a meter or anything like that, we can just say that going fast is a target. And it's a valid target; it actually needs to do that.

Which leaves you already in trouble. If your organization solicits bids for the spaceship and that's the only target, you better not accept one before you notice that you also need things like "has the ability to carry occupants" and "doesn't kill the occupants" and "doesn't cost 999 trillion dollars" or else those are all on the chopping block in the interest of going fast.

So you add those things as targets too and then people come up with new and fascinating ways to meet them by sacrificing other things you wanted but didn't require.

What's really happening here is that if you set targets and then require someone else to meet them, they will meet the targets in ways that you will not like. It's the principal-agent problem. The only real way out of it is for principals to be their own agents, which is exactly the thing a bureaucracy isn't.