by conjectures 3386 days ago
Kaggle is a great idea, but it's steadily getting more annoying to use.

1) Cruft on all landing pages, and having to click through to get to the comps page, which is effectively the site.

2) Annoying focus on exploratory notebooks. Inevitably they aren't powerful enough and people link through to external sites.

3) Forcing the use of 3rd party compute platforms to enter comps. Half the fun for me is messing around with my own ideas and this just gets in the way. These should be optional rather than required.

4) Poor incentives. Many of the comps have tiny prizes relative to the value of the work that gets done. The prizes are also concentrated way too much at the top. Unless there's something I want to try out, the expected value of participating is way too low to do it just for the giggles.


I do analytics for a huge corporation and have been quite happy; however, some of my peers who are unhappy with the pay here participate in Kaggle for the opportunity to do well and get a better (higher-paying) job.

Despite the small prize pools, part of the value of the work is the opportunity to do well and be recognized for it.

Data science, or trendy statistics, is inherently fun, which is also what makes Kaggle fun. Discovery in data will always be popular among people who love to solve problems.

To your other points, I don't disagree with you: all the steps just to participate are becoming more work than it's worth, at least for me. I already work on a lot of the same kinds of problems posed on Kaggle at my job.

> participate in Kaggle for the opportunity to do well and get a better (higher paying) job

Obviously it's anecdotal data at best, but I'm still curious: what are the results? It sounds very similar to the frequently given advice for software engineers, 'push code to GitHub to land a great job'.

I can say anecdotally that my ranking on Kaggle recently helped me land a good data scientist job offer, transitioning from academia. I have spent a lot of time on Kaggle though; it would probably have been more efficient (but less fun) to spend that time spamming job boards and studying machine learning, stats, and computer science.

I've hired many people, and I don't know anyone who has ever looked at Kaggle, Stack Overflow, or GitHub commits for anything. I've seen them on resumes before, but only from very junior people, and typically from people outside of the US.

Quite frankly it's a rather bullshit signal, since its presence only tells you that the person spends all their free time on the computer. Maybe they know something, but a traditional interview will tell you that and more.

I disagree. From junior people, it shows that they can actually do something in practice, and it's not all theory that they don't know how to apply.

A person just outside of university does not have heaps of past jobs to show. So they should just leave it blank and describe their hobbies?!

No one cares about hobbies, and Kaggle is a hobby.

An NCG should write more about class projects. Everyone has class projects.

If an NCG wants to put it down, fine. But don't color me impressed. Why should I select someone who spends their evenings alone squeezing an extra 0.001% out of an AUC curve, when I could conceivably get a more rounded individual with better team skills?

> 3) Forcing the use of 3rd party compute platforms to enter comps. Half the fun for me is messing around with my own ideas and this just gets in the way. These should be optional rather than required.

Especially this. I loved playing around with data there, but the moment most competitions have datasets with sizes on the order of tens of GBs, I'm out. I can always take the opportunity to learn AWS / Google Cloud processing methodologies, but that kills a bit of the fun of the first days.

Points 2) and 3) seem limited in scope. For 3), that was only one competition that I can think of, but I agree it was a terrible competition format and poorly executed. For point 2), you can just ignore the exploratory notebooks if you so choose. I agree with point 1): the website was already pretty slow and is steadily getting slower and worse. I am not a web developer, so I don't really know why it is so slow, but it is really noticeable. I agree wholeheartedly with point 4): if you want to make money directly from the competitions, Kaggle is not a wise investment. However, if you want to transition to data science, Kaggle is a great resource for learning.