by Bostonian 802 days ago
If the data is continuous, use kernel density estimation (KDE) instead of histograms to visualize the probability density, since KDE gives a smoother estimate. A similar idea is to fit a mixture of normals -- there are numerous R packages for this, and sklearn.mixture.GaussianMixture in scikit-learn.
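
For anyone who wants to see what KDE does under the hood, here is a minimal standard-library sketch of a 1-D Gaussian KDE. The function name `gaussian_kde`, the synthetic bimodal data, and the normal-reference bandwidth rule are all assumptions made for illustration, not something from the post; in practice a library implementation is probably a better choice.

```python
import math
import random
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function that estimates the density at x with a Gaussian kernel."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, not a tuned choice).
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Average of Gaussian bumps centred on every stored sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


random.seed(0)
# Synthetic bimodal data, made up purely for illustration.
data = [random.gauss(-2, 0.5) for _ in range(200)]
data += [random.gauss(1, 1.0) for _ in range(300)]

kde = gaussian_kde(data)
grid = [-6 + 0.05 * i for i in range(241)]  # evenly spaced points from -6 to 6
curve = [kde(x) for x in grid]              # smooth density values to plot
```

Note that `density` closes over the full sample list, so every evaluation touches every data point.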

Yep! The next post will be on kernel density estimation -- I wanted to start with histograms because they are still a useful tool for 1-D and 2-D density estimation, and, unlike KDE, you don't have to store the data.
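
To make the storage point concrete, here is a rough sketch of a fixed-bin histogram that is updated one value at a time and keeps only the bin counts. The class name `StreamingHistogram`, the range, and the bin count are hypothetical choices for illustration, and the bin edges have to be fixed up front.

```python
import random


class StreamingHistogram:
    """Fixed-bin 1-D histogram that keeps only counts, never the raw data."""

    def __init__(self, lo, hi, bins):
        self.lo, self.hi, self.bins = lo, hi, bins
        self.width = (hi - lo) / bins
        self.counts = [0] * bins
        self.total = 0  # includes values that fall outside [lo, hi)

    def _index(self, x):
        # Clamp to guard against floating-point rounding at the upper edge.
        return min(int((x - self.lo) / self.width), self.bins - 1)

    def add(self, x):
        self.total += 1
        if self.lo <= x < self.hi:
            self.counts[self._index(x)] += 1

    def density(self, x):
        if self.total == 0 or not (self.lo <= x < self.hi):
            return 0.0
        return self.counts[self._index(x)] / (self.total * self.width)


random.seed(0)
hist = StreamingHistogram(lo=-5.0, hi=5.0, bins=40)
for _ in range(10_000):
    hist.add(random.gauss(0, 1))  # each value is discarded after the update
```

Memory stays at 40 counters no matter how many values are added, whereas a plain KDE has to keep all of them around to evaluate the density.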
I should have read to the end of your post:

'I will describe a very popular nonparametric method, Kernel Density Estimation, that also follows strategy 1 and is much more scalable to higher dimensions than histograms.'

Haha no worries!