by minimaxir 3395 days ago
> And please read up on Simpson's paradox, which is clearly the case here just by looking at your plot.

What is the instance of Simpson's paradox (https://en.wikipedia.org/wiki/Simpson's_paradox) in the scatterplot? There are skews on both X and Y axes, but I don't see disparate trends.

Would faceting by preferred language/experience/LinkedIn age show different trends in this context?

1 comment

If you break up the plot by any number of categories of technical ability, trends do appear. But to your question: what I was suggesting is that the aggregations done here, especially averages taken over categorical data, are very susceptible to this. And those clusters are reminiscent of situations like this:

https://www.researchgate.net/figure/256074671_fig3_Visualiza...

I'd rather see this analyzed against who got to the next round (their binary signal), or, yes, against preferred language, which I suspect will be much more telling.

The takeaway from that plot is that there is more to the story.
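
To make the aggregation concern concrete, here is a minimal sketch with made-up numbers. Nothing in it comes from the plot being discussed: the group names `lang_a` and `lang_b` and every data point are hypothetical, chosen only so that each group trends upward on its own while the pooled data trends downward.

```python
# Synthetic, made-up data: NOT taken from the plot under discussion.
# Each point is (x, y); the group names are hypothetical.
groups = {
    "lang_a": [(1, 6), (2, 7), (3, 8), (4, 9)],
    "lang_b": [(6, 1), (7, 2), (8, 3), (9, 4)],
}


def slope(points):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    return cov / var


# Slope within each group (the "faceted" view).
per_group = {name: slope(pts) for name, pts in groups.items()}

# Slope over all points with the grouping ignored (the aggregated view).
pooled = slope([p for pts in groups.values() for p in pts])

print("per-group slopes:", per_group)
print("pooled slope:", pooled)
```

In this toy data both per-group slopes are positive and the pooled slope is negative, which is the sign flip that faceting by something like preferred language would either reveal or rule out in the real data.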