by andreilys 1331 days ago
Having worked as a data scientist at multiple companies (from FANG to startups), the first thing I check when I get my hands on data is whether the Pareto principle holds.

I still haven’t found one company where this principle didn’t show up.

4 comments

What does this prove? If you have lots of data and dimensions, I bet you could just as likely find distributions that are roughly 50/50, 60/40, 90/10, 100/0 if you looked for them.
Agreed. It’s not necessarily 80/20, it’s just that power laws show up a LOT. 90/10, 99/1, etc.
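For anyone who wants to check this on their own data, here is a minimal sketch of one way to measure the split: sort the values and compute what share of the total comes from the top fraction of items. The function name `top_share` and both sample datasets are made up for illustration; nothing here comes from the original post.

```python
def top_share(values, top_fraction=0.2):
    """Return the share of the total contributed by the largest
    `top_fraction` of items (hypothetical helper, illustration only)."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    total = sum(values)
    if total == 0:
        raise ValueError("values must not sum to zero")
    ordered = sorted(values, reverse=True)
    # Always count at least one item in the "top" group.
    k = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:k]) / total


# Made-up data: every item contributes the same amount.
uniform = [10] * 100
# Made-up data: Zipf-like, the item at rank r contributes 1000 / r.
heavy_tailed = [1000 / rank for rank in range(1, 101)]

print(f"uniform:      top 20% hold {top_share(uniform):.0%} of the total")
print(f"heavy-tailed: top 20% hold {top_share(heavy_tailed):.0%} of the total")
```

With evenly spread values the top 20% hold exactly 20% of the total, so a share well above `top_fraction` is what signals concentration. Whether that concentration is specifically 80/20, 90/10, or 99/1 depends on the data, and a high share alone does not show that the data follows a power law.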
So you massage your features until they produce an 80/20 split?

Very scientific.

The 50/50 principle always shows up in my histograms when I use two intervals.