
Simulations are fantastic, and often necessary for tricky statistics problems. However, what you are describing reinvents so much of the wheel through simulation that you will spend multiple orders of magnitude more computation to get an approximately correct solution. Your plan also has some conceptual errors.

For example:

> Assuming the true rate of nurses quitting is X, what is the chance that a random sample of 200 nurses has the expectation of quitting >= 90%.

You have just described the Binomial distribution [0], which is probably the most elementary distribution you learn about when studying probability and statistics (even the Bernoulli is just a special case of it). There's no need to run simulations to answer this particular question.

There are also some fundamental misunderstandings with your approach:

> increasing until I get >95% chance that a random sample of 200 nurses has E(quitting) > 90%.

The probability of getting > 90% 'yes/quitting' (i.e. more than 180 out of 200) if the true probability of 'yes' is in fact 0.9 is only 0.46. You won't cross your threshold of 95% here until you reach X = 0.933.
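For anyone who wants to check those two numbers without a stats package, here is a minimal sketch that sums the Binomial PMF directly. The helper name `binom_tail` and the 0.001 search grid are my own choices for illustration, not anything from the original plan.

```python
from math import comb

def binom_tail(n: int, k: int, p: float) -> float:
    """Exact P(X > k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1, n + 1))

n, k = 200, 180  # more than 180 of 200 is "more than 90%"

# Chance of seeing more than 180 'yes' answers when the true rate is exactly 0.9.
p_at_090 = binom_tail(n, k, 0.9)

# Smallest true rate, on an assumed 0.001 grid, at which that chance exceeds 95%.
threshold = next(
    x / 1000 for x in range(900, 1000) if binom_tail(n, k, x / 1000) > 0.95
)

print(f"P(X > {k} | p = 0.9) = {p_at_090:.3f}")
print(f"first grid rate with P > 0.95: {threshold:.3f}")
```

This is a direct sum of 20 terms per evaluation, so the whole search finishes almost instantly, which is the point about not needing a simulation here.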

If you wanted to construct the 95% CI from pure simulation, a better approach would be to sample 200 observations from a Bernoulli random variable with p = 0.9 (just sample from a uniform and check whether it's less than 0.9), compute the mean of the sample, and repeat this 10,000 or so times. Then look at the empirical CDF [1] of those means (fairly easy to implement in code) and read off the values at 2.5% and 97.5%. Those are your bounds (which will be the same as the ones I posted, within some epsilon).
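A rough sketch of that simulation, using only the standard library. The fixed seed, the helper name `sample_mean`, and the exact index used for each percentile are assumptions on my part; any reasonable percentile convention should give nearly the same bounds.

```python
import random

random.seed(0)  # fixed seed only so the sketch is repeatable

n, p, reps = 200, 0.9, 10_000

def sample_mean(n: int, p: float) -> float:
    """Mean of n Bernoulli(p) draws: a uniform draw below p counts as a 'yes'."""
    return sum(random.random() < p for _ in range(n)) / n

# Sorting the simulated means gives the empirical CDF implicitly:
# the value at sorted position i has roughly i / reps of the mass at or below it.
means = sorted(sample_mean(n, p) for _ in range(reps))

lower = means[int(0.025 * reps)]      # about the 2.5th percentile
upper = means[int(0.975 * reps) - 1]  # about the 97.5th percentile

print(f"simulated 95% interval: [{lower:.3f}, {upper:.3f}]")
```

Note this takes two million uniform draws to approximate an interval that the Binomial distribution gives you directly.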

If you're seriously interested in understanding this, I do recommend picking up a basic probability/stats book and working your way through it.

0. https://en.wikipedia.org/wiki/Binomial_distribution
1. https://en.wikipedia.org/wiki/Empirical_distribution_functio...