| Hi all, I'm one of the founders of OpenSignal and wanted to add some technical detail on what we're doing here. It's a V1 (although 4 years in the making!) and we would love any feedback. The source of the Wifi database is our existing app OpenSignal [1], which crowdsources data on the coverage of mobile networks. We also collect data on Wifi hotspots and have mapped over 500 million of them since 2010. Although we can automatically detect if a password is required to connect, determining whether a hotspot is free is another challenge: there are plenty of non-password-protected walled-garden hotspots, and plenty of (what we consider) free hotspots where you still need to find out the password (e.g. cafes, restaurants etc). So our algorithm looks at a number of heuristics for each hotspot including (but not limited to): - Are there clues in the SSID (e.g. does it mention positive keywords like 'free' or 'cafe', or negative keywords like 'staff', 'private', 'employees' etc)? - Is it part of a wider network that we know more about (e.g. 'ATT-wifi', 'Starbucks-Wifi' etc)? - Do we know what kind of place it is (e.g. if it's a cafe, is it more or less likely to be free)? - Is there a walled garden behind the hotspot? (We attempt some automatic background checking of this, similar to the way Android & iOS do on new hotspots.) - How many distinct users have we detected connecting to a particular hotspot? (If many, is that a sign it's a public place?) This is just a sample: we look at over 20 different heuristics on each Wifi hotspot, none of which are individually conclusive, but which together give us a strong indication of whether a Wifi hotspot is free or not. However, it's unlikely we will ever be able to classify free Wifi with complete accuracy through a purely heuristic algorithm, which is why with this new app we are asking for user input to help us curate the results. 
Not only can users help correct any hotspots that we have classified incorrectly, but we can train our algorithms to be smarter by learning from their input. We believe that this dual approach of an automated algorithm combined with manual curation from the crowd is the best way to solve a problem like this in the long run (not least because we didn't want to provide users with a blank canvas and ask them to classify Wifi without us doing any of the hard work first!). Any feedback is much appreciated! [1] http://opensignal.com EDIT: Tweaked line formatting |
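
To make the idea of combining several weak signals with crowd input more concrete, here is a minimal sketch. It is illustrative only and is not OpenSignal's actual algorithm: the `Hotspot` fields, the weights, the bias, the vote threshold and the function names are all hypothetical. Only the keyword examples and the kinds of signals come from the description above.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence

# Keyword examples taken from the post; a real list would be much longer.
POSITIVE_KEYWORDS = ("free", "cafe")
NEGATIVE_KEYWORDS = ("staff", "private", "employees")

# Hypothetical constants, chosen only for illustration.
BIAS = -1.0      # with no evidence at all, lean towards "not free"
MIN_VOTES = 3    # user votes needed before they override the heuristics


@dataclass
class Hotspot:
    ssid: str
    known_free_network: bool = False   # part of a wider network known to be free
    venue_type: Optional[str] = None   # e.g. "cafe", if the place is known
    has_walled_garden: bool = False    # result of a background connectivity check
    distinct_users: int = 0            # distinct users seen connecting


def free_wifi_score(h: Hotspot) -> float:
    """Combine individually inconclusive signals into a score in (0, 1)."""
    score = BIAS
    name = h.ssid.lower()
    if any(k in name for k in POSITIVE_KEYWORDS):
        score += 1.5
    if any(k in name for k in NEGATIVE_KEYWORDS):
        score -= 2.0
    if h.known_free_network:
        score += 2.0
    if h.venue_type == "cafe":
        score += 1.0
    if h.has_walled_garden:
        score -= 1.5
    # Many distinct users hints at a public place; log-scaled so it saturates.
    score += 0.5 * math.log10(1 + h.distinct_users)
    # Squash the raw score with a logistic function.
    return 1.0 / (1.0 + math.exp(-score))


def classify(h: Hotspot, user_votes: Sequence[bool] = ()) -> bool:
    """Return True if the hotspot is considered free.

    Each vote is True for "free" and False for "not free". With enough
    votes, a strict majority overrides the heuristic score.
    """
    if len(user_votes) >= MIN_VOTES:
        return 2 * sum(user_votes) > len(user_votes)
    return free_wifi_score(h) >= 0.5


if __name__ == "__main__":
    cafe = Hotspot(ssid="Joes Cafe Free WiFi", venue_type="cafe", distinct_users=200)
    office = Hotspot(ssid="ACME-staff", distinct_users=15)
    print(classify(cafe), classify(office))
    # Enough user corrections flip the heuristic verdict for the office hotspot.
    print(classify(office, user_votes=[True, True, True, False]))
```

In a sketch like this the weights would be hand-tuned at first. The "learning from user input" part would then mean fitting them to the user votes (for example with logistic regression) rather than only using the votes as an override.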