| HN Mirror



	by Bostonian 802 days ago
	If the data is continuous, consider kernel density estimation (KDE) instead of histograms to visualize the probability density, since KDE gives a smooth estimate rather than a piecewise-constant one. A similar idea is to fit a mixture of normals -- there are numerous R packages for this, and sklearn.mixture.GaussianMixture in scikit-learn.
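
To make the comparison concrete, here is a minimal sketch of both estimators in plain Python. Everything in it is an assumption for illustration rather than anything from the post: the function names, the synthetic standard-normal sample, the bin range of [-4, 4), and the normal-reference bandwidth rule h = 1.06 * sigma * n^(-1/5). In practice a library routine would be the more usual choice.

```python
import math
import random
import statistics


def histogram_density(data, lo, hi, n_bins):
    """Per-bin density estimate: count / (n * bin_width). Piecewise constant."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        if lo <= x < hi:
            # Clamp guards against floating-point rounding at the right edge.
            idx = min(int((x - lo) / width), n_bins - 1)
            counts[idx] += 1
    n = len(data)
    return [c / (n * width) for c in counts]


def gaussian_kde(data, bandwidth):
    """Return a function x -> density estimate using a Gaussian kernel."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Every stored sample contributes one smooth bump centred on itself.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density


# Synthetic data (an assumption for the demo): 500 draws from N(0, 1).
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(500)]

# Normal-reference rule of thumb for the bandwidth (one common default).
bandwidth = 1.06 * statistics.stdev(data) * len(data) ** (-1 / 5)

hist = histogram_density(data, lo=-4.0, hi=4.0, n_bins=16)
kde = gaussian_kde(data, bandwidth)

print("histogram density in the bin containing 0:", hist[8])
print("KDE density at 0:", kde(0.0))
```

Note that the KDE closure keeps the whole sample around and touches every point on each evaluation, while the histogram only needs its bin counts.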

1 comments

vvanirudh 802 days ago

Yep! The next post will be on kernel density estimation. I wanted to start with histograms because they are still a useful tool for 1-D and 2-D density estimation, and, unlike KDE, they don't require you to store the data.
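
The storage point can be sketched as follows: with fixed bin edges, a histogram can be updated one sample at a time and each sample can then be discarded. The class name `StreamingHistogram` and its methods are hypothetical, chosen for this example, and the sketch assumes the range `[lo, hi)` is known in advance.

```python
class StreamingHistogram:
    """Fixed-bin 1-D histogram that keeps only counts, never the samples."""

    def __init__(self, lo, hi, n_bins):
        self.lo = lo
        self.hi = hi
        self.n_bins = n_bins
        self.width = (hi - lo) / n_bins
        self.counts = [0] * n_bins
        self.n = 0  # total samples seen, including out-of-range ones

    def _bin_index(self, x):
        # Clamp guards against floating-point rounding at the right edge.
        return min(int((x - self.lo) / self.width), self.n_bins - 1)

    def update(self, x):
        """Fold one sample into the counts; the sample itself is not kept."""
        self.n += 1
        if self.lo <= x < self.hi:
            self.counts[self._bin_index(x)] += 1

    def density(self, x):
        """Density estimate at x: count / (n * bin_width), or 0 outside range."""
        if self.n == 0 or not (self.lo <= x < self.hi):
            return 0.0
        return self.counts[self._bin_index(x)] / (self.n * self.width)


# Usage: memory stays O(n_bins) no matter how many samples arrive.
h = StreamingHistogram(lo=0.0, hi=1.0, n_bins=4)
for sample in (0.1, 0.2, 0.6, 0.9):
    h.update(sample)
print(h.counts)
print(h.density(0.15))
```

A plain KDE, by contrast, evaluates a kernel at every stored sample, so its memory grows with the number of samples.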


Bostonian 802 days ago

I should have read to the end of your post:

'I will describe a very popular nonparametric method, Kernel Density Estimation, that also follows strategy 1 and is much more scalable to higher dimensions than histograms.'


vvanirudh 802 days ago

Haha no worries!
