by thanatropism 3244 days ago
Yes. The blog post is about the organizational difficulties in unlocking the value of technically sound "data science" projects, but these in turn are the tip of an iceberg of "omg watson" on the executive side and "machine learning does well on $archetypal_dataset, it can do anything!" on the techie side.

A while ago there was a Kaggle project to solve certain conjectures on prime number theory. Seriously?

2 comments

> A while ago there was a Kaggle project to solve certain conjectures on prime number theory. Seriously?

I've seen a surprising number of Kaggle projects setting (or claiming to achieve) objectives that look impossible - things like extracting complex insights from such short signals that they apparently violate the pigeonhole principle.

The worst demonstration was looking at the results of a college class with "do a Kaggle project" as the final task. It was painfully obvious that all of the 'best' results were either extreme overfitting or fake data science (that is, using a strong algorithm to start and getting no gains from training).

Which means that many of the soon-to-graduate students had concluded that good data science meant getting strong results, not producing reliable and novel insights. It felt a bit like a software-centered version of what social psychology has been suffering from.
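
To make the overfitting failure mode above concrete, here is a minimal sketch (not taken from any of the class projects; the data and the `fit_memorizer` helper are made up for illustration). The labels are coin flips, so there is nothing to learn, yet a model that simply memorizes its training rows still reports a perfect training score:

```python
import random

def accuracy(model, rows):
    """Fraction of rows whose label the model predicts correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)

def fit_memorizer(train_rows):
    """Hypothetical 'model': memorize every (input, label) pair.

    Inputs never seen in training get the most common training label.
    """
    table = dict(train_rows)
    labels = [y for _, y in train_rows]
    fallback = max(set(labels), key=labels.count)
    return lambda x: table.get(x, fallback)

rng = random.Random(0)
# Inputs are unique IDs and labels are coin flips: pure noise by construction.
rows = [(i, rng.randint(0, 1)) for i in range(2000)]
train_rows, held_out_rows = rows[:1000], rows[1000:]

model = fit_memorizer(train_rows)
train_acc = accuracy(model, train_rows)
held_out_acc = accuracy(model, held_out_rows)
print(f"train accuracy:    {train_acc:.2f}")
print(f"held-out accuracy: {held_out_acc:.2f}")
```

Training accuracy is perfect by construction, while held-out accuracy stays near chance. Reporting only the first number is the "strong results" trap; the gap between the two is what a reliable result has to survive.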

Got a link to that Kaggle competition?

Here's the link. It was a playground competition (i.e., no rewards) - "This competition challenges you [to] create a machine learning algorithm capable of guessing the next number in an integer sequence. While this sounds like pattern recognition in its most basic form, a quick look at the data will convince you this is anything but basic!"

https://www.kaggle.com/c/integer-sequence-learning
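
For a feel of why "guess the next number" is harder than it sounds, here is a rough sketch of one naive baseline: extrapolating from a finite-difference table. This is only an illustration, not the competition's method or anyone's submission, and `next_term_by_differences` is a name made up here. It fits any finite prefix exactly, which is precisely the problem:

```python
def next_term_by_differences(seq):
    """Extrapolate one term by summing the last entry of each difference row.

    This equals the next value of the lowest-degree polynomial that passes
    through every given term, so it always "explains" the data perfectly.
    """
    total = 0
    row = list(seq)
    # Stop once a row is all zeros (polynomial detected) or runs out of terms.
    while row and any(row):
        total += row[-1]
        row = [b - a for a, b in zip(row, row[1:])]
    return total

# Squares are a genuine polynomial, so the extrapolation is right.
print(next_term_by_differences([1, 4, 9, 16]))      # 25

# Powers of two are not: a degree-4 polynomial fits all five terms exactly
# and then predicts 31, where the "obvious" continuation is 32.
print(next_term_by_differences([1, 2, 4, 8, 16]))   # 31
```

A perfect fit on the terms you were given says nothing about the next one unless the sequence happens to follow the rule you assumed.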