by throwawaymath 2620 days ago
You really should. If the sample is "all the gas turbines you own" and you disproportionately use Siemens sensors, your turbine failure forecast will, with high likelihood, reduce to a Siemens sensor forecast. This is entirely plausible even if the correlation between Siemens sensors and turbine failures in your sample is completely spurious.
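To make that concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the fleet, the counts, and especially the assumption that turbine age is the real driver of failures and that Siemens sensors happen to sit mostly on older turbines. It only shows how a vendor can look predictive in the pooled data while carrying no information once you condition on the actual cause.

```python
# Hypothetical fleet of 24 turbines: (sensor_vendor, turbine_age, failed).
# All counts are invented; "age" as the true cause is an assumption for this sketch.
fleet = (
    [("siemens", "old", True)] * 6 + [("siemens", "old", False)] * 2
    + [("siemens", "new", True)] * 1 + [("siemens", "new", False)] * 3
    + [("other", "old", True)] * 3 + [("other", "old", False)] * 1
    + [("other", "new", True)] * 2 + [("other", "new", False)] * 6
)


def failure_rate(rows):
    """Fraction of turbines in `rows` that failed."""
    rows = list(rows)
    return sum(failed for _, _, failed in rows) / len(rows)


vendors = ("siemens", "other")
ages = ("old", "new")

# Pooled view: failure rate per sensor vendor, ignoring age.
by_vendor = {
    v: failure_rate(r for r in fleet if r[0] == v) for v in vendors
}

# Stratified view: failure rate per vendor within each age group.
by_vendor_and_age = {
    (v, a): failure_rate(r for r in fleet if r[0] == v and r[1] == a)
    for v in vendors
    for a in ages
}

print("pooled:", by_vendor)
print("stratified:", by_vendor_and_age)
```

In the pooled numbers Siemens-equipped turbines fail more often (7 of 12 versus 5 of 12), so a naive model would happily learn "Siemens sensor means failure". Within each age group the two vendors have identical failure rates, so the vendor signal is entirely an artifact of which turbines the sensors were installed on.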

You can't have a sampling bias when 'sampling' the entire population, because the definition of 'sampling bias' includes 'some members are not included in the sample'.
Precisely, yes. I'm talking about a sample including all representative gas turbine failures, across all sensor vendors.
You can't make predictions when sampling the entire population.