by TimPC 979 days ago
I think it makes a lot of sense. The best replacement for an outlier is the closest thing to the outlier that isn't an outlier. Resampling doesn't make a lot of sense because your new point is completely disconnected from your old one.

I don't really like the Winsorized mean, but not for the reason you list. I think the main issue is that it assumes exactly the top and bottom 10% are outliers, instead of looking at the actual data distribution to see which points are outliers and then applying a similar replacement technique to only those points.
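
To make the difference concrete, here is a rough sketch of both approaches. The function names and the sample data are made up for illustration, and the outlier rule (Tukey's 1.5 x IQR fences) is just one common choice I'm assuming here, not the only way to decide what counts as an outlier.

```python
import statistics

def winsorize_fixed(values, frac=0.10):
    """Clamp the lowest and highest `frac` of points, outliers or not."""
    s = sorted(values)
    k = int(frac * len(s))                # points clamped in each tail
    lo, hi = s[k], s[len(s) - 1 - k]      # nearest values that are kept
    return [min(max(v, lo), hi) for v in values]

def clamp_detected_outliers(values, k=1.5):
    """Detect outliers from the data, then clamp only those points.

    Assumption: a point is an outlier if it falls outside Tukey's
    fences, [Q1 - k*IQR, Q3 + k*IQR].
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    inliers = [v for v in values if low_fence <= v <= high_fence]
    # Replace each outlier with the closest value that isn't an outlier.
    lo, hi = min(inliers), max(inliers)
    return [min(max(v, lo), hi) for v in values]

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]   # hypothetical sample

print(winsorize_fixed(data))          # [2, 2, 3, 4, 5, 6, 7, 8, 9, 9]
print(clamp_detected_outliers(data))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 9]
```

The fixed 10% version changes the 1 to a 2 even though nothing about the data suggests 1 is an outlier. The data-driven version only touches the 100.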