by currymj 1911 days ago
I don't disagree with any of that, but I still think a responsible, clear-thinking ML practitioner can avoid having to assume the form of the data-generating process, depending on their application.

In some cases, for example if what you care about is PAC generalization bounds, the bounds really do hold for all possible distributions.
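
To make the distribution-free point concrete, here is a minimal sketch of the textbook Hoeffding-plus-union-bound result. It assumes a finite hypothesis class, i.i.d. samples, and a loss bounded in [0, 1]; the function name `pac_gap_bound` and its parameters are made up for illustration. Nothing about the shape of the data distribution enters the formula.

```python
import math


def pac_gap_bound(n_samples: int, n_hypotheses: int, delta: float) -> float:
    """Uniform bound on |true risk - empirical risk| over a finite class.

    With probability at least 1 - delta over an i.i.d. sample of size
    n_samples, every hypothesis in a class of size n_hypotheses satisfies
        |R(h) - R_hat(h)| <= sqrt((ln|H| + ln(2/delta)) / (2 n)),
    assuming the loss takes values in [0, 1]. The data distribution does
    not appear anywhere in the bound.
    """
    if n_samples <= 0 or n_hypotheses <= 0:
        raise ValueError("n_samples and n_hypotheses must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    numerator = math.log(n_hypotheses) + math.log(2.0 / delta)
    return math.sqrt(numerator / (2.0 * n_samples))


if __name__ == "__main__":
    # Illustrative numbers only: the gap shrinks as the sample grows.
    for n in (100, 1_000, 10_000):
        print(n, round(pac_gap_bound(n, n_hypotheses=100, delta=0.05), 4))
```

The i.i.d. assumption is still doing real work here; what you avoid is committing to a parametric form for the distribution.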

1 comment

I think it's more meaningful to have this discussion in a specific problem domain, since statistical inference and ML are just tools for better modeling a problem or phenomenon. Domain (prior) knowledge, meaning everything that isn't stats / ML, is the key to building a more robust model. Leave the problem domain out and we are left with pure mathematical theory, where the points can only be demonstrated on simulated data.