
To be fair, you said "There’s not really a principled way of doing this" but you were referring to cleaning up the data, not the analysis itself. My bad.

That said, you started with the following prior: Simpsons ratings went down over time. Then you looked at one variable: the writing staff. You noticed: aha, if we assign each writer a rating, then we see that writers who joined later have lower ratings than writers who only worked on early episodes!

However, that's a tautology. Of course the writers who worked on later episodes have lower scores than those who worked on earlier episodes. The fact that the ratings went down over time was the prior we started with! If you average episode ratings per writer over a series that declines over time, the later writers come out lower no matter what they contributed. The experimental setup was wrong from the beginning.
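To make that concrete, here is a toy sketch with made-up numbers, not the actual Simpsons data. The rating depends only on the episode number, so the writers (hypothetical names like `early_0` and `late_3`) have zero effect by construction. The per-writer averages still split cleanly by era.

```python
from statistics import mean

def writer_averages(n_episodes=400, writers_per_era=5):
    """Average rating per writer when ratings depend ONLY on episode number."""
    # Assumption: a made-up linear decline from 9.0; no writer effect at all.
    ratings = [9.0 - 0.01 * ep for ep in range(n_episodes)]
    half = n_episodes // 2

    by_writer = {}
    for ep, rating in enumerate(ratings):
        era = "early" if ep < half else "late"
        # Hypothetical staffing: writers rotate round-robin within their era.
        writer = f"{era}_{ep % writers_per_era}"
        by_writer.setdefault(writer, []).append(rating)

    return {writer: mean(rs) for writer, rs in by_writer.items()}

averages = writer_averages()
early = [avg for w, avg in averages.items() if w.startswith("early")]
late = [avg for w, avg in averages.items() if w.startswith("late")]

for writer, avg in sorted(averages.items()):
    print(f"{writer}: {avg:.2f}")

# Every early writer outscores every late writer, with no writer effect.
print(min(early) > max(late))
```

The gap between the two groups here is entirely the time trend. To say anything about the writers themselves you would need to control for when each episode aired.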

As for why I think you would want to do proper statistics, the reason is simple: I assume that people who publish these things are well-intentioned and want to show actual statistical correlations.