
by jdonahue 4467 days ago

As a user of and contributor to Caffe [1], I have to take the opportunity to plug it here. Like the CCV classifier linked, Caffe is fully open-source [2], has a downloadable state-of-the-art model pre-trained on ImageNet [3], and scripts/documentation that make it very easy to compute features using our pre-trained model or other models [4].

Unlike the linked CCV release (unless I'm misinformed -- haven't actually tried it, please correct me if I say anything inaccurate), Caffe supports completely customizable architectures via a configuration language, fully supports training [5], finetuning [6], and inference (feature extraction/classification) in these customizable architectures, and seamlessly runs on both CPU and GPU.

Caffe is also very fast: twice as fast at CPU feature computation as its predecessor DeCAF, and faster than cuda-convnet at training/testing ImageNet architectures on a Titan/K40 GPU.

The linked CCV release does mention Caffe, but quickly dismisses it due to the license. It's true that our pre-trained model [3] is licensed only for non-commercial use, but ALL of the Caffe code is BSD-licensed, including the exact script we used to train said model. So if you're a commercial entity, using Caffe for feature extraction/classification from a state-of-the-art network is a matter of purchasing a $1000 GPU (NVIDIA Titan -- I'm assuming you own a computer), downloading the ImageNet dataset, and waiting about a week for training to converge. This will buy you the ability to adapt the classifier to YOUR visual classification problem by finetuning [6], rather than being stuck with the particular 1000 categories the pre-trained model knows about.
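To make the finetuning idea concrete, here is a minimal Python sketch of the weight-initialization step only. It is not Caffe's API: the function `init_for_finetuning`, the layer names (`conv1`, `fc_1000`, `fc_mine`), and the toy shapes are all hypothetical. It assumes the common scheme in which layers that match the pre-trained net keep their weights, while a replaced final layer (here 20 categories instead of 1000) starts from small random values and is then trained on the new data.

```python
import random


def init_for_finetuning(pretrained, new_shapes, seed=0):
    """Build starting parameters for a new net from pre-trained ones.

    pretrained: dict mapping layer name -> weight matrix (list of rows)
    new_shapes: dict mapping layer name -> (n_out, n_in) for the new net
    Layers whose name and shape match are copied; all others are
    initialized with small Gaussian noise.
    """
    rng = random.Random(seed)
    params = {}
    for name, (n_out, n_in) in new_shapes.items():
        old = pretrained.get(name)
        shape_matches = (
            old is not None
            and len(old) == n_out
            and all(len(row) == n_in for row in old)
        )
        if shape_matches:
            # Reuse the pre-trained weights (copied, so the original is untouched).
            params[name] = [row[:] for row in old]
        else:
            # New or resized layer: start from small random weights.
            params[name] = [
                [rng.gauss(0.0, 0.01) for _ in range(n_in)] for _ in range(n_out)
            ]
    return params


# Hypothetical toy example: keep "conv1", swap the 1000-way layer for a 20-way one.
pretrained = {
    "conv1": [[0.5, -0.2], [0.1, 0.3]],
    "fc_1000": [[0.0, 1.0] for _ in range(1000)],
}
new_shapes = {"conv1": (2, 2), "fc_mine": (20, 2)}
params = init_for_finetuning(pretrained, new_shapes)
```

Training would then continue from `params` on the new dataset, typically with a smaller learning rate for the copied layers than for the fresh one; that part is omitted here.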

[1] http://caffe.berkeleyvision.org/

[2] https://github.com/BVLC/caffe

[3] http://caffe.berkeleyvision.org/getting_pretrained_models.ht...

[4] http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/ex...

[5] http://caffe.berkeleyvision.org/imagenet_training.html

[6] http://caffe.berkeleyvision.org/caffe-presentation.pdf (see page 14)


liuliu 4467 days ago

Hey, Donahue, I actually referenced Caffe extensively in the detailed documentation: http://libccv.org/doc/doc-convnet/

This is a preliminary implementation, but a complete one that includes both training and testing code. The big difference is that ccv is a general computer vision library, while Caffe is an artificial neural network library. This leads to quite a few differences in approach. For example, ccv's implementation does allow you to specify the network topology, but it doesn't have an implementation of a local non-weight-sharing layer (because CIFAR-10 and ImageNet don't need that type of layer).

You can also chop off the last fully connected layer and train an SVM on top of it with ccv. I actually plan to do exactly what you guys did with that and train on the VOC 2012 dataset.
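As a rough illustration of the "SVM on top of the features" idea, here is a minimal pure-Python sketch. It does not use ccv's or Caffe's API: `train_linear_svm`, `predict`, and the toy 2-D vectors standing in for convnet features are all hypothetical, and the solver is plain subgradient descent on the L2-regularized hinge loss rather than whatever a real SVM package would use.

```python
import random


def train_linear_svm(features, labels, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Train a binary linear SVM by subgradient descent on the hinge loss.

    features: list of feature vectors (lists of floats)
    labels:   list of -1 / +1 class labels
    Returns the weight vector w and bias b.
    """
    rng = random.Random(seed)
    w = [0.0] * len(features[0])
    b = 0.0
    order = list(range(len(features)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = features[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # L2 regularization shrinks the weights a little on every step.
            w = [wj * (1.0 - lr * reg) for wj in w]
            if margin < 1.0:
                # Example is inside the margin: step along the hinge subgradient.
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy stand-ins for features taken from the layer below the chopped-off one.
features = [
    [2.0, 1.0], [1.5, 2.0], [2.5, 1.5],
    [-1.0, -2.0], [-2.0, -1.5], [-1.5, -1.0],
]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(features, labels)
predictions = [predict(w, b, x) for x in features]
```

For a multi-class dataset such as VOC 2012, one common choice is to train one such classifier per category (one-vs-rest) on the same fixed features.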

All in all, ccv 0.6 is a preliminary implementation of convnets, but it is important for a library that claims to be "modern" to contain such an implementation. Providing the pre-trained data model under a liberal license (so that you can fine-tune for your own classification problem on top of the pre-trained data model) is also aligned with ccv's goals.

jdonahue 4467 days ago

I hadn't seen the detailed documentation - thanks so much for the acknowledgments there!

And thanks for correcting me about CCV's support for custom architectures and training -- I'd just assumed that it wasn't supported since it wasn't mentioned in the post, but I guess this was more of a marketing decision as most users are probably just interested in feature extraction/classification from the pretrained net. :) I would argue that GPU support is pretty necessary for training modern network architectures a la Krizhevsky to be remotely practical, though.

I apologize if I came off as overly competitive or derisive; this is obviously very nice work and it seems like an attractive option for many users. Always happy to see deep learning made more accessible and open!