Hacker News
by pimlottc 491 days ago
Obviously this is just for fun, but I am a bit disturbed about similar projects that try to gain some meaningful insights out of vast public data sets without the slightest attention paid to the quality of the data used. It doesn’t matter how much processing you do or how clever your algorithms are if the underlying data is inaccurate, out of date, inconsistent, non-normalized, incomplete, etc. No dataset is perfect, but at least take some time to address it.
2 comments

Also they didn't consider that a green lake, while it may not appear green, can still be more green than all the other types of lake.
> I am a bit disturbed about similar projects that try to gain some meaningful insights out of vast public data sets without the slightest attention paid to the quality of the data used.

Indeed, this story is a very good example of that in the academic world (Economics): https://nitter.poast.org/andreloez/status/181747110580160124...