Hacker News new | ask | show | jobs
by DataDaoDe 2124 days ago
If you think about the scientific lifecycle: Gather Information => Form Hypothesis => Test Hypothesis => Analyze Data => Interpret => Repeat.

Then I would say the hardest parts are the "Gather Information" and "Test Hypothesis" phases. But its like this in every scientific endevour and this is nothing unique to data science.

One interesting point is, perhaps, that we as data scientists are aware that our sources for gathering information and our means for testing hypothesis are often tied to man made software or hardware systems - as opposed to dynamical real world structures. This means that theoretically and practically there is only ingenuity and will-power keeping us from building better and less time consuming ways for automating away the tedius (data cleansing/prep/etc.) parts of those processes.