by sklink 3251 days ago
Thanks! They are different sources. I use OpenDota's open source software to gather matches and dump them into Google's BigQuery, where they're processed with algorithms we've created ourselves. The algorithms are run against 15 to 30 million matches, depending on how deep into the patch we are.

My guess is we've landed on similar algorithms to DotaPicker. It's usually the set of matches we're using that makes the difference, so I'm surprised the suggestions were the same.

Feel free to add me on Steam if you'd like: http://steamcommunity.com/id/sklink

Are you partitioning the matches by MMR, or just doing global suggestions?

I ask because this always seems to be the biggest weakness of the existing picking tools. A 1k player and a 5k player throwing the same lineup into a picking tool should ideally get dramatically different suggestions, both because of their own skill and the skill of their teammates.

There is no access to MMR from the Steam API. The best we have is the Normal, High, and Very High skill splits, and the problem then becomes having a large enough sample size to get accurate results.

There are 114 heroes that all influence each other when they play together, so if you're looking at matchups for two heroes that don't get played frequently, the accuracy is already limited.

That said, my strength is in front end, not statistics. Getting a Normal and High split might be fine as is.
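
To put rough numbers on the sample size concern, here is a minimal sketch. It assumes picks are spread uniformly across heroes (they are not, which is exactly why rare matchups suffer) and uses a Wilson score interval as one common way to gauge how trustworthy a win rate is. The function name and the example counts are made up for illustration, not taken from any real tool.

```python
import math

HEROES = 114
# Unordered hero pairs that could face each other.
hero_pairs = math.comb(HEROES, 2)  # 6441

# Each 5v5 match contains 5 * 5 = 25 cross-team hero pairs.
# Under the (unrealistic) uniform-pick assumption, the low end of the
# 15 to 30 million match range gives roughly 58k samples per pair.
avg_samples_per_pair = 15_000_000 * 25 / hero_pairs

def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win rate (z=1.96 is about 95%)."""
    if games == 0:
        return (0.0, 1.0)
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return (centre - half, centre + half)

# Hypothetical matchup with a 55% observed win rate at two sample sizes.
rare = wilson_interval(55, 100)         # wide interval, includes 50%
common = wilson_interval(5500, 10000)   # narrow interval, above 50%
```

With only 100 games the interval still straddles 50%, so the matchup can't be called either way. Splitting the pool into skill brackets shrinks every pair's sample further, and unpopular pairs hit that wall first.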

True, but the OpenDota API does give access to estimated MMR stats for many games, which are pretty accurate. So you could use those if you wanted to.

That would limit the available match pool, granted. It depends on how many thousand games you need to make predictions, I suppose. I was able to pull some interesting data a while back from the 2k pool but I wasn't trying to solve as hard a problem as a global hero picker.

They do, but they don't allow sequential pulling of match data.

I would have to collect my own data and poll their API with each match id to get the MMR estimate.

Although that works in theory, their API rate limit is 3 requests/s, and last time I checked there were about 1M matches/day (~12/s), so we couldn't keep up.
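
The arithmetic behind that, as a quick sketch. It uses only the two rough figures quoted above (3 requests/s and about 1M matches/day) and assumes one request per match id; the variable names are just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

matches_per_day = 1_000_000  # rough figure quoted above
rate_limit_per_s = 3         # assumed: one API request per match id

# How fast new matches arrive, in matches per second.
arrival_rate = matches_per_day / SECONDS_PER_DAY

# The most matches a single client could look up in a day at the limit.
max_fetched_per_day = rate_limit_per_s * SECONDS_PER_DAY

# Share of each day's matches that could be covered, and the leftover.
coverage = max_fetched_per_day / matches_per_day
backlog_growth_per_day = matches_per_day - max_fetched_per_day

print(f"arrival rate: {arrival_rate:.1f} matches/s")
print(f"fetchable per day: {max_fetched_per_day:,}")
print(f"coverage: {coverage:.0%}")
print(f"backlog growth per day: {backlog_growth_per_day:,}")
```

So a single client at the limit would cover roughly a quarter of each day's matches, and the unfetched backlog would keep growing unless the matches were sampled rather than pulled in full.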