Hacker News
by taeric 1233 days ago
Right, but this just reinforces my thought here. In order to simulate sampling, I have to know the data well enough to simulate it. Which, for many things I'd care about, if I knew the underlying distribution that well, I probably don't need to sample. :(
3 comments

I meant doing it as a theoretical planning exercise. You can throw in any number of weird distributions you might guess at, and you'll be surprised at how quickly sampling picks up patterns fairly reliably. That helps you plan your sampling around uncertainty.
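A rough sketch of what I mean, in plain Python. The two distributions here (`bimodal` and `heavy_tailed`) are made-up guesses for illustration, not anything from real data; swap in whatever shapes you suspect. It repeats the sampling many times at each sample size and reports how much the sample mean wobbles from one run to the next.

```python
import random
import statistics


def bimodal(rng):
    # Hypothetical "weird" guess: a mix of two well-separated Gaussians.
    return rng.gauss(0, 1) if rng.random() < 0.7 else rng.gauss(10, 2)


def heavy_tailed(rng):
    # Hypothetical skewed guess: a lognormal with a long right tail.
    return rng.lognormvariate(0, 1)


def spread_of_sample_means(draw, n, trials=500, seed=0):
    """Std dev of the sample mean across `trials` repeated samples of size n."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(draw(rng) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.stdev(means)


if __name__ == "__main__":
    for name, draw in [("bimodal", bimodal), ("heavy_tailed", heavy_tailed)]:
        for n in (10, 100, 1000):
            spread = spread_of_sample_means(draw, n)
            print(f"{name:13s} n={n:5d} spread of mean={spread:.3f}")
```

If the spread at a sample size you can afford is already small enough under every shape you consider plausible, that sample size is probably fine for planning purposes.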

Of course, if your underlying distribution is likely to be Gaussian, which is true for many phenomena, you don't need to bother except as a pedagogical exercise.

If you know a bit of programming, that's actually sufficient to explore these ideas and verify them for yourself.

Allen Downey has a ton of open source books that use this philosophy [0] and Peter Norvig has used Python notebooks in a similar manner (look at the ones in the Probability section) [1].

[0] https://greenteapress.com/wp/
[1] https://github.com/norvig/pytudes#pytudes-index-of-jupyter-i...

> Which, for many things I'd care about, if I knew the underlying distribution that well, I probably don't need to sample

You don't have to sample from the distribution directly. The entire field of Bayesian variational learning exists to deal with that very problem. Look up Markov chain Monte Carlo, the Metropolis algorithm, conjugate priors, and the reparameterization trick.
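To make the Metropolis idea concrete, here is a minimal random-walk Metropolis sketch in plain Python. It only needs the log density up to an additive constant, so you never have to normalize the distribution or know how to draw from it. The target (`log_unnormalized_gaussian`), the step size, and the burn-in length are all assumptions picked for illustration, and a Gaussian is used only so the result is easy to check.

```python
import math
import random


def metropolis(log_density, n_samples, start=0.0, step=1.0, seed=0):
    """Random-walk Metropolis using a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x = start
    log_p = log_density(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        diff = log_p_new - log_p
        # Accept with probability min(1, p(proposal) / p(x)).
        if diff >= 0 or rng.random() < math.exp(diff):
            x, log_p = proposal, log_p_new
        # A rejected proposal repeats the current state in the chain.
        samples.append(x)
    return samples


def log_unnormalized_gaussian(x, mean=3.0, sd=2.0):
    # Illustrative target: Gaussian log density with the constant dropped.
    return -0.5 * ((x - mean) / sd) ** 2


if __name__ == "__main__":
    draws = metropolis(log_unnormalized_gaussian, 20000, step=2.0)
    kept = draws[2000:]  # discard an (assumed) burn-in period
    print("estimated mean:", sum(kept) / len(kept))
```

Replacing `log_unnormalized_gaussian` with any other function that returns a log density up to a constant is the whole point; the sampler itself does not change.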

Thanks for the pointers, will be looking into these!