| HN Mirror



	by taeric 1233 days ago
	Right, but this just reinforces my thought here. In order to simulate sampling, I have to know the data well enough to simulate it. Which, for many things I'd care about, if I knew the underlying distribution that well, I probably don't need to sample. :(

3 comments

sfifs 1233 days ago

I meant doing it as a theoretical planning exercise. You can throw in any number of weird distributions you might guess, and you'll be surprised at how quickly sampling fairly reliably picks up patterns. This helps you plan your sampling around uncertainty.

Of course, if your underlying distribution is likely to be Gaussian, which is true for many phenomena, you don't need to bother except as a pedagogical exercise.
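
A minimal sketch of that kind of planning exercise, using only the Python standard library. The "weird" population here (a lognormal bulk plus a rare uniform heavy tail) and the names `draw_population` and `sampling_error` are made up for illustration; swap in whatever distribution you suspect. It only shows how the error of a sample mean shrinks as the sample grows.

```python
import random
import statistics


def draw_population(rng, size):
    """Hypothetical skewed population: lognormal bulk plus a rare heavy tail."""
    values = []
    for _ in range(size):
        if rng.random() < 0.05:
            # Assumed rare "outlier" regime, 5% of the population.
            values.append(rng.uniform(50.0, 100.0))
        else:
            values.append(rng.lognormvariate(0.0, 1.0))
    return values


def sampling_error(population, sample_size, trials, rng):
    """Median absolute error of the sample mean over repeated random samples."""
    true_mean = statistics.fmean(population)
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, sample_size)
        errors.append(abs(statistics.fmean(sample) - true_mean))
    return statistics.median(errors)


rng = random.Random(0)  # fixed seed so the run is repeatable
population = draw_population(rng, 100_000)

# Compare a few candidate sample sizes before committing to a real survey.
results = {n: sampling_error(population, n, 200, rng) for n in (10, 100, 1000)}
for n, err in results.items():
    print(f"n={n:5d}  median |error| of sample mean: {err:.3f}")
```

Rerunning this with a few different guessed populations gives a rough feel for how large a sample you would need under each guess.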


sn9 1232 days ago

If you know a bit of programming, that's actually sufficient to explore these ideas and verify them for yourself.

Allen Downey has a ton of open source books that use this philosophy [0] and Peter Norvig has used Python notebooks in a similar manner (look at the ones in the Probability section) [1].

[0] https://greenteapress.com/wp/
[1] https://github.com/norvig/pytudes#pytudes-index-of-jupyter-i...


KRAKRISMOTT 1233 days ago

> Which, for many things I'd care about, if I knew the underlying distribution that well, I probably don't need to sample

You don't have to sample directly. The entire field of Bayesian variational learning exists to deal with that very problem. Look up Markov chain Monte Carlo, the Metropolis algorithm, conjugate priors, and reparametrization tricks.
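
For a concrete starting point on the Metropolis algorithm, here is a rough random-walk sketch in plain Python. The bimodal target, the step size, and the names `log_target` and `metropolis` are assumptions chosen for illustration, not anything from a particular library. The point is that you only need the density up to a constant, not a way to draw from it directly.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of a hypothetical bimodal target.

    Two unit-variance Gaussian bumps centered at -2 and +2, combined with
    a log-sum-exp so the value stays finite far from the modes.
    """
    a = -0.5 * (x - 2.0) ** 2
    b = -0.5 * (x + 2.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def metropolis(log_density, n_steps, step_size, x0, rng):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    samples = []
    x = x0
    log_p = log_density(x)
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        # The current state is recorded whether or not the move was accepted.
        samples.append(x)
    return samples


rng = random.Random(1)  # fixed seed so the run is repeatable
samples = metropolis(log_target, n_steps=20_000, step_size=2.0, x0=0.0, rng=rng)

mean = sum(samples) / len(samples)
mean_sq = sum(s * s for s in samples) / len(samples)
print(f"mean = {mean:.3f}, mean of x^2 = {mean_sq:.3f}")
```

By symmetry the target has mean 0 and second moment 5 (variance 1 plus offset 2 squared), so the printed estimates should land near those values if the chain is mixing between the two modes.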


taeric 1232 days ago

Thanks for the pointers, will be looking into these!
