by kilbuz 4085 days ago
From the abstract: "forced to obey a certain constraint that is pathologically unlikely to be satisfied by any dataset"

Like assuming your data come from a normal distribution? Box guides us: 'all models are wrong, some are useful.'

1 comment

We found the same issue with non-normal distributions in our time series datasets when using SAX (developed by this paper's authors; it assumes normality when choosing its discretization breakpoints) and addressed it by using quantile-based breakpoints on the piecewise aggregate approximation (PAA) values. The quantile breakpoints behaved much better than the "normal" breakpoints.
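
For anyone curious what that swap looks like, here is a rough sketch, not the commenter's actual code. It assumes a plain NumPy setup, a synthetic exponential series standing in for skewed real data, and an alphabet of 4 symbols; the helper names (`paa`, `gaussian_breakpoints`, `quantile_breakpoints`, `symbolize`) are made up for illustration. The only difference between the two variants is where the cut points come from: the standard normal quantiles, or the empirical quantiles of the PAA values.

```python
from statistics import NormalDist

import numpy as np


def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each segment."""
    chunks = np.array_split(np.asarray(series, dtype=float), n_segments)
    return np.array([chunk.mean() for chunk in chunks])


def gaussian_breakpoints(alphabet_size):
    """Cut points that give equiprobable bins under N(0, 1)."""
    nd = NormalDist()
    return np.array([nd.inv_cdf(k / alphabet_size)
                     for k in range(1, alphabet_size)])


def quantile_breakpoints(values, alphabet_size):
    """Cut points that give equiprobable bins under the observed values."""
    probs = np.arange(1, alphabet_size) / alphabet_size
    return np.quantile(values, probs)


def symbolize(values, breakpoints):
    """Map each value to the index of the bin it falls into."""
    return np.searchsorted(breakpoints, values, side="right")


# Hypothetical skewed series (assumption: exponential noise, fixed seed).
rng = np.random.default_rng(0)
raw = rng.exponential(size=2000)
z = (raw - raw.mean()) / raw.std()  # z-normalize before PAA

alphabet_size = 4
segments = paa(z, n_segments=500)

gauss_symbols = symbolize(segments, gaussian_breakpoints(alphabet_size))
quant_symbols = symbolize(segments,
                          quantile_breakpoints(segments, alphabet_size))

# How many segments land on each symbol under each scheme.
gauss_counts = np.bincount(gauss_symbols, minlength=alphabet_size)
quant_counts = np.bincount(quant_symbols, minlength=alphabet_size)

print("gaussian breakpoints:", gauss_counts)
print("quantile breakpoints:", quant_counts)
```

The quantile version spreads the segments evenly across symbols by construction, while the Gaussian cut points do not on data like this. One design caveat: the sketch fits the quantiles on the same values it discretizes; in practice you would probably fit them on a reference sample and reuse them, so that symbols stay comparable across series.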