by DrRavenstein 1586 days ago
This headline and article are horrible and misrepresent the problem and the outcome.

The paper is about the manual process of re-assigning a credit score on a scale of 1 to 15 based on other customer criteria. Really the fact that this process exists at all shows that their initial credit scoring approach is flawed or too simplistic. The argument of "just replace it with an if statement" does not hold up in this scenario.

So this is not an "if number big lend. If number small no lend" problem. It's a 15-way multi-class classification problem. They even give a baseline in the paper for what happens if they pick randomly or always pick the biggest class:

> As is typical in machine learning we also report the Accuracy p-value computed from a one-sided test (Kuhn et al., 2008) which compares the prediction accuracy to the "no information rate", which is the largest class percentage in the data (23.85%).

So yeah, 95% is somewhat better than 23.85%.
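
For anyone who wants to see what that comparison amounts to, here is a rough sketch using only the standard library. It assumes the test is a one-sided exact binomial test of the observed accuracy against the no-information rate, which is one common way to do it; I have not checked that it matches the paper's exact procedure. The test-set size `n = 200` and the function names are made up for illustration; only the 23.85% and 95% figures come from the thread.

```python
from math import comb


def no_information_rate(class_counts):
    """Accuracy of always predicting the largest class."""
    return max(class_counts) / sum(class_counts)


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Figures quoted in the thread.
nir = 0.2385       # no-information rate reported in the paper
accuracy = 0.95    # reported model accuracy

# Hypothetical test-set size, NOT taken from the paper.
n = 200
correct = round(accuracy * n)

# One-sided p-value: chance of getting at least `correct` hits
# if the model were no better than always guessing the biggest class.
p_value = binom_tail(correct, n, nir)
print(f"NIR={nir:.4f}, accuracy={correct / n:.2f}, p-value={p_value:.3g}")
```

Even with a small hypothetical test set the p-value is vanishingly small, which is the point: 95% on a 15-way problem is nowhere near the trivial baseline.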

I agree with the general sentiment that it is likely a fairly straightforward problem to predict if you are familiar with the bank's operating procedures, as there is no way these individuals are building their own risk models and making independent decisions. They are there to follow the rules and provide human accountability.

An error analysis on the items the model couldn't predict would definitely have been most interesting.


Plus, systemic risk of a repeatable exploitation is more likely without humans in the loop. Making a bad loan for $1M is bad, but if “attackers” can repeatedly probe until they get a bad-risk $1B loan, it becomes business shattering.

Yes, which is why there are hard breakpoints where banks' credit policies change, and specifically an absolute ceiling for automation in terms of risk of loss on lending. That ceiling is set in the bank's credit policies, and any changes have to be approved by its regulator.

No bank in any credible jurisdiction will have an automated system approving 1bn USD equivalent loans any time soon. In a typical system, loans up to a certain amount can be approved automatically, up to $xM by a credit officer, up to $yM by a senior credit officer, and anything over that by the credit committee. Regulators push back very hard on automated decision-making for large loans, particularly because of "default correlation skew"[1] problems of the kind revealed in the 2008 crisis. Relatively few bad decisions on big loans can push a bank into difficulties if it is not well capitalized. This is particularly a problem for automated decision-making because as loans get larger they also get more idiosyncratic, so it is much harder to fit a model with confidence because there simply aren't enough data points.[2]

[1] Often credit quality for a group of loans rises because of idiosyncratic factors but deteriorates together, so as loans become more risky in an adverse economic environment the correlation of the default probability between the loans goes up. An intuitive way of thinking about this is housing loans. If 3 or 4 of your neighbours default on their loans, property prices on your street will go down (because the banks will be trying to sell all those houses at once), making it much more likely that you and the rest of the residents will default too.

[2] Say I'm trying to approve a 200k loan to expand a pizza restaurant. If I'm in a big bank I have hundreds of similar loans to use as data points for pricing and risk. If I'm trying to approve a 200M loan to build a luxury hotel complex that includes 5 restaurants, accommodation, retail, etc., it is completely a one-off. Even if I'm the largest commercial lender, this loan will be unique in my portfolio. I will have many other large loans, but there will be lots of idiosyncratic factors that make them different.
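
To make the tiered approval structure concrete, here is a minimal sketch. Everything in it is illustrative: the `ApprovalLimits` class, the `approval_route` function and the dollar thresholds in the usage lines are hypothetical placeholders for the "$xM" and "$yM" above, not any real bank's policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovalLimits:
    """Hypothetical per-tier ceilings, as set in a bank's credit policy."""
    auto_max: float            # ceiling for fully automated approval
    officer_max: float         # "$xM": ceiling for a credit officer
    senior_officer_max: float  # "$yM": ceiling for a senior credit officer


def approval_route(amount: float, limits: ApprovalLimits) -> str:
    """Return who must sign off on a loan of the given size."""
    if amount <= 0:
        raise ValueError("loan amount must be positive")
    if amount <= limits.auto_max:
        return "automated"
    if amount <= limits.officer_max:
        return "credit officer"
    if amount <= limits.senior_officer_max:
        return "senior credit officer"
    # Anything above the last ceiling goes to the committee.
    return "credit committee"


# Made-up thresholds purely for illustration.
limits = ApprovalLimits(auto_max=50_000, officer_max=2_000_000,
                        senior_officer_max=20_000_000)

print(approval_route(200_000, limits))        # the pizza restaurant loan
print(approval_route(200_000_000, limits))    # the hotel complex loan
```

The point of hard-coding the ceilings in one policy object is that the automated path can never approve anything above `auto_max`, no matter what score the model produces.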

Do they remove the human in the loop? That doesn't seem like a smart idea. A model would be good just for suggestions.

Humans are both biased and high-variance (not to mention corruptible), whereas an algorithm can be scrutinized much more thoroughly and can ensure uniform application of the criteria. If a human overrides it, they had better have a good reason.

In Europe you have an absolute right to a human making your loan decision[1]. So if a loan decision is made automatically you can request a human decision instead.

[1] https://ec.europa.eu/info/law/law-topic/data-protection/refo...