Hacker News
by semi-extrinsic 2259 days ago
There was a remark in the old-school linear algebra book we had at university (Edwards & Penney) that stuck with me, to the effect (I probably recall the details wrong) that one of the authors was once involved in analysing water samples collected from a bunch of rivers by 15 engineers, and it turned out that no 6 of these engineers' measurements were internally consistent. The moral of the story was that real-world data is messy, and you need to learn least squares and related methods to make sense of it.

Now, with "data science", you've gone a step further: instead of applying the math to lab reports on meticulously filled-out forms, you're going to aggregate all the messy sources you can get your hands on. Of course your headaches will multiply.