| As a user of and contributor to Caffe [1], I have to take the opportunity to plug it here. Like the linked CCV classifier, Caffe is fully open-source [2], has a downloadable state-of-the-art model pre-trained on ImageNet [3], and comes with scripts and documentation that make it very easy to compute features using our pre-trained model or other models [4]. Unlike the linked CCV release (unless I'm misinformed -- I haven't actually tried it, so please correct me if I say anything inaccurate), Caffe supports completely customizable architectures via a configuration language, fully supports training [5], finetuning [6], and inference (feature extraction/classification) in these customizable architectures, and runs seamlessly on both CPU and GPU. Caffe is also very fast: twice as fast at CPU feature computation as its predecessor DeCAF, and faster than cuda-convnet at training/testing ImageNet architectures on a Titan/K40 GPU.
The linked CCV release does mention Caffe, but quickly dismisses it due to the license. It's true that our pre-trained model [3] is licensed only for non-commercial use, but ALL of the Caffe code is BSD-licensed, including the exact script we used to train said model. So if you're a commercial entity, using Caffe for feature extraction/classification from a state-of-the-art network is a matter of purchasing a $1000 GPU (NVIDIA Titan -- I'm assuming you own a computer), downloading the ImageNet dataset, and waiting about a week for training to converge. This will buy you the ability to adapt the classifier to YOUR visual classification problem by finetuning [6], rather than being stuck with the particular 1000 categories the pre-trained model knows about.
[1] http://caffe.berkeleyvision.org/
[2] https://github.com/BVLC/caffe
[3] http://caffe.berkeleyvision.org/getting_pretrained_models.ht...
[4] http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/ex...
[5] http://caffe.berkeleyvision.org/imagenet_training.html
[6] http://caffe.berkeleyvision.org/caffe-presentation.pdf (see page 14) |
This is a preliminary implementation, but it is a complete one that includes both training and testing code. The big difference is that ccv is a general computer vision library, while Caffe is an artificial neural network library. This means the two approach things quite differently. For example, ccv's implementation does let you specify the network topology, but it doesn't implement local non-weight-sharing layers (because CIFAR-10 and ImageNet don't need that type of layer).
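To make the weight-sharing distinction concrete, here is a rough sketch that counts parameters for a convolutional layer versus a local non-weight-sharing (locally connected) layer of the same shape. The function names and the layer sizes are made up for illustration, and the sketch assumes one bias per output unit in the local case; it is not taken from ccv or Caffe.

```python
def out_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a sliding-window layer along one dimension."""
    return (size + 2 * pad - kernel) // stride + 1


def conv_params(in_ch, out_ch, kernel):
    """Convolution: one bank of filters shared across every position."""
    weights = out_ch * in_ch * kernel * kernel
    biases = out_ch
    return weights + biases


def local_params(in_ch, out_ch, kernel, out_h, out_w):
    """Locally connected: a separate filter bank at every output position.

    Assumption: biases are also per position (one per output unit).
    """
    per_position = out_ch * in_ch * kernel * kernel + out_ch
    return out_h * out_w * per_position


# Hypothetical first layer on a CIFAR-10-sized input: 32x32 RGB,
# 32 filters of 5x5, stride 1, padding 2 (so the output stays 32x32).
side = out_size(32, kernel=5, stride=1, pad=2)
shared = conv_params(in_ch=3, out_ch=32, kernel=5)
unshared = local_params(in_ch=3, out_ch=32, kernel=5, out_h=side, out_w=side)
print("conv params: ", shared)
print("local params:", unshared)
print("ratio:       ", unshared // shared)
```

The ratio is simply the number of output positions, which is why a local layer is much heavier than a convolution of the same shape and is usually only added when the task calls for position-specific filters.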
You can also chop off the last fully connected layer and train an SVM on top of it with ccv. I actually plan to do exactly what you guys did with that and train on the VOC 2012 dataset.
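As a rough illustration of the "SVM on top of the features" idea, the sketch below trains a linear SVM with plain SGD on the hinge loss. It does not call ccv or Caffe: the feature vectors are synthetic stand-ins for the activations you would get after removing the last fully connected layer, and all names and hyperparameters here are assumptions chosen for the example.

```python
import random


def make_fake_features(n_per_class, dim, seed=0):
    """Synthetic stand-in for extracted convnet features (NOT real activations)."""
    rng = random.Random(seed)
    feats, labels = [], []
    for y in (+1, -1):
        for _ in range(n_per_class):
            feats.append([rng.gauss(y * 1.0, 0.5) for _ in range(dim)])
            labels.append(y)
    return feats, labels


def score(w, b, x):
    """Linear decision value w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(feats, labels, lam=0.01, lr=0.1, epochs=20, seed=0):
    """SGD on the L2-regularised hinge loss; labels must be +1 or -1."""
    rng = random.Random(seed)
    w = [0.0] * len(feats[0])
    b = 0.0
    order = list(range(len(feats)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = feats[i], labels[i]
            margin = y * score(w, b, x)
            # Gradient step for the regulariser: shrink the weights.
            w = [(1.0 - lr * lam) * wj for wj in w]
            # Gradient step for the hinge loss, only when the margin is violated.
            if margin < 1.0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def accuracy(w, b, feats, labels):
    """Fraction of examples whose predicted sign matches the label."""
    hits = sum(1 for x, y in zip(feats, labels)
               if (1 if score(w, b, x) >= 0 else -1) == y)
    return hits / len(labels)


train_x, train_y = make_fake_features(100, dim=8, seed=1)
test_x, test_y = make_fake_features(50, dim=8, seed=2)
w, b = train_linear_svm(train_x, train_y)
print("held-out accuracy:", accuracy(w, b, test_x, test_y))
```

For a multi-class dataset such as VOC 2012 you would train one such classifier per category (one-vs-rest) on the real extracted features, most likely with an off-the-shelf SVM package rather than hand-written SGD.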
All in all, ccv 0.6 is a preliminary implementation of convnets, but it is important for a library that claims to be "modern" to contain such an implementation. Providing the pre-trained data model under a liberal license (so that you can fine-tune for your own classification problem on top of it) is also aligned with ccv's goals.