There is also DeCAF, which actually includes a way to load a pretrained ImageNet network based on cuda-convnet. I have recently had good success using this blob as preprocessing for image classification, a la http://arxiv.org/abs/1310.1531.
My current code as an example of combining sklearn/pylearn2 with DeCAF preprocessing (under the decaf folder, sklearn usage is under previous commits):
Thanks for the DeCAF plug! Here's a demo of the classifier with the pre-trained ImageNet weights in action: http://decaf.berkeleyvision.org/
I also have to take the opportunity to plug Caffe [1] - Yangqing's replacement for DeCAF which he actually open sourced just a few hours ago. All the heavy processing (e.g., forward/backprop) can be run either on your (CUDA-enabled) GPU or on the CPU, and the GPU implementation is actually a bit faster than cuda-convnet. The entire core is (imo) very well-engineered and written in clean lovely C++, but it also comes with Python and Matlab wrappers. I've personally been hacking around inside the core for about a month and it has really been a pleasure to work with.
Quick question about your code for the conv net: why do you resize the images down to 32x32? I thought one of the big features of conv nets was that the inputs do not all have to be the same size, since the net just slides a window around the image. Am I completely wrong on this one?
Would you be willing to maybe print out the weights for each layer? I'd be interested to see what features your conv net is capturing.
I was (and still am) trying to use an already trained CIFAR10 net in a similar manner to DeCAF/ImageNet. Because CIFAR10 operates on 32x32 color images, I did the same thing for the input of the DeCAF experiment. As far as I know, the inputs to the network need to have identical dimensions across the train and test sets. They can be 0-padded or color-filled to make the dimensions match, but that may affect results; I haven't tried anything but scaling personally. I am pretty sure there are 2 sets of scaling happening in my DeCAF experiment: down to 32x32 with convert, then UP to 512x512, after which the center 256x256 is pulled out. I think this may affect my results a little :)
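To make the double scaling described above concrete, here is a minimal pure-Python sketch of it. This is not the actual convert/DeCAF code: the function names are made up for illustration, the image is assumed to be a single-channel list of rows, and nearest-neighbour resampling is assumed (the real tools most likely interpolate and handle three color channels). A `zero_pad` helper is included to show the padding alternative mentioned above.

```python
# Hypothetical helpers; names, single-channel layout and nearest-neighbour
# resampling are assumptions for illustration, not the DeCAF/convert code.

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D image stored as a list of rows."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def center_crop(img, size):
    """Return the central size x size window of the image."""
    h, w = len(img), len(img[0])
    top, left = (h - size) // 2, (w - size) // 2
    return [row[left:left + size] for row in img[top:top + size]]


def zero_pad(img, out_h, out_w, fill=0):
    """Center the image on a larger canvas filled with a constant value."""
    h, w = len(img), len(img[0])
    top, left = (out_h - h) // 2, (out_w - w) // 2
    padded = [[fill] * out_w for _ in range(out_h)]
    for r in range(h):
        padded[top + r][left:left + w] = img[r]
    return padded


def double_scale(img):
    """Down to 32x32, up to 512x512, then take the center 256x256."""
    small = resize_nearest(img, 32, 32)
    large = resize_nearest(small, 512, 512)
    return center_crop(large, 256)
```

If the pipeline really works this way, each 32x32 pixel becomes a 16x16 block after upscaling, so the 256x256 center crop would only cover the middle 16x16 pixels of the thumbnail. That would be one reason to expect the results to be affected.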
The plan is to operate on 32x32 data for now, then try scaling up the input images or just scaling to 512x512 to see how input data size/resolution affects the DeCAF/pylearn2 classification result, either positively or negatively.
As far as network weights, I haven't tried to print/plot the DeCAF weights yet (though there are images in the DeCAF paper itself). For pure pylearn2 networks, there is a neat utility called show_weights.py in pylearn2/scripts.
[1] http://daggerfs.com/caffe/