| HN Mirror


by nsthorat 3180 days ago

We're using SqueezeNet (https://github.com/DeepScale/SqueezeNet), which is similar to Inception (trained on the same ImageNet dataset) but is much smaller - 5MB instead of Inception's 100MB - and inference is much quicker.

The application takes webcam frames and runs each one through SqueezeNet, producing a 1000D logits vector per frame. These can be thought of as unnormalized log-probabilities (raw scores) for each of ImageNet's 1000 classes.
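To make the "unnormalized" part concrete, here is a minimal sketch of how a logits vector maps to class probabilities with a softmax. This is illustrative only and not code from the demo: the 4-element vector is a made-up stand-in for the real 1000D output, and the demo itself works on the raw logits.

```python
import math

def softmax(logits):
    """Turn raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 4-class logits (the real vector has 1000 entries).
logits = [2.0, 1.0, 0.1, -1.0]
probs = softmax(logits)
```

The largest logit always ends up with the largest probability, so the ranking of classes is the same before and after normalization.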

During the collection phase, we collect these vectors for each class in browser memory, and during inference we pass the frame through SqueezeNet and do k-nearest neighbors to find the class with the most similar logits vector. KNN is quick because we vectorize it as one large matrix multiplication.
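A rough sketch of the collect-then-KNN idea, under some assumptions: similarity is taken to be cosine similarity (the post only says "most similar"), the class name `KnnClassifier` and its methods are hypothetical, and the matrix product is written in plain Python so the example runs without dependencies. In the browser that product would be a single tensor-library matmul over all stored vectors.

```python
import math
from collections import Counter

def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def matmul(a, b):
    """Plain-Python matrix product: (N x D) times (D x M) gives (N x M)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

class KnnClassifier:  # hypothetical name, not the demo's API
    def __init__(self, k=3):
        self.k = k
        self.examples = []  # unit-length logits vectors, one row per example
        self.labels = []    # class label for each row

    def add_example(self, logits, label):
        """Collection phase: 'training' is just storing the vector."""
        self.examples.append(normalize(logits))
        self.labels.append(label)

    def predict(self, logits):
        """Inference: one matrix product scores the query against every example."""
        query = [[x] for x in normalize(logits)]  # D x 1 column
        sims = matmul(self.examples, query)       # N x 1 similarities
        ranked = sorted(range(len(sims)), key=lambda i: sims[i][0], reverse=True)
        votes = Counter(self.labels[i] for i in ranked[:self.k])
        return votes.most_common(1)[0][0]

# Tiny 3D stand-ins for the 1000D logits vectors.
knn = KnnClassifier(k=3)
knn.add_example([1.0, 0.1, 0.0], "left")
knn.add_example([0.9, 0.2, 0.1], "left")
knn.add_example([0.0, 0.1, 1.0], "right")
knn.add_example([0.1, 0.0, 0.8], "right")
prediction = knn.predict([0.95, 0.15, 0.05])
```

Because every stored vector is a row of one matrix, scoring a new frame costs a single (N x D) by (D x 1) product followed by a top-k vote, which is why the lookup stays quick as examples accumulate.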

I'll go deeper in a blog post soon :)

2 comments

eggie5 3179 days ago

So you're doing nearest neighbour search on the image features from the CNN. This is alluded to in Figure 4 of the DeCAF paper: https://twitter.com/eggie5/status/907120374575505408


eggie5 3179 days ago

AlexNet paper, not DeCAF paper!


amelius 3180 days ago

Interesting!

I'm curious why you've used a different classification algorithm on top of a neural network. I would expect that a neural network on top of a pretrained network could give similar results, with the benefit of simpler code. Is performance the reason?

Anyway, I'm looking forward to your blog post.


nsthorat 3180 days ago

Training a neural network on top would require a "proper" training phase, and finding the right hyperparameters that work everywhere turned out to be tricky. Actually, this is what we did originally; in the blog post we'll try to show demos of each of the approaches and explain why they don't work.

KNN also makes training "instant", and the code much much simpler.


amelius 3180 days ago

That makes sense.

By the way, I think your software could become very popular on the Raspberry Pi, because it would be very cheap and fun to use it for all sorts of applications (e.g. home automation).


nsthorat 3179 days ago

https://github.com/PAIR-code/deeplearnjs/issues/158


make3 3180 days ago

Basically, read this paper: https://www.cs.cmu.edu/~rsalakhu/papers/oneshot1.pdf
