Biomedical applications of machine learning have great potential, but they suffer from a lack of data. It's great to see Little championing a more open database; hopefully it has a positive effect.
For more traditional machine learning research, there are common data sets (e.g. MNIST for handwriting recognition) which serve to benchmark new algorithms.
The main problems with biomedical data are the difficulty of acquisition and the fact that many researchers are afraid of others discovering findings that they themselves may have missed.
I agree completely. Every time I give a talk on Big Data, especially to practitioners, I throw in a little speech about how sharing medical data could lead to one of the greatest periods of medical advancement ever.
I really think this is cool. Machine learning has a lot to offer the world, and this can improve quality of life for lots of people. However, two things come to mind:
1. http://archive.ics.uci.edu/ml/datasets/Parkinsons
Does it really take 5 years for research to go mainstream? Max Little's original research on this was published in 2007. I think if I were able to better diagnose Parkinson's I would want to get it out to the public as soon as possible.
2. Why the need for clinical testing? It's not like it's a drug. Last time I checked, a voice recording wasn't something that had too many side effects.