| HN Mirror



	by angerbot 3597 days ago
	Very cool. I've been toying with the idea of using something like this, or perhaps the Cloud Vision API, to automatically generate image captions for screen readers (e.g. through a browser extension). But the cost of running something like an EC2 GPU instance is prohibitive for a project I wouldn't want to charge for. Training it locally on the user's machine would take far too long, especially since you would have to use the CPU in most cases, as many people don't have a separate GPU.

3 comments

GrantS 3597 days ago

While you would never do this kind of training on your users' machines (it takes multiple weeks even with a powerful GPU), you should be able to apply the trained model to a single photo nearly instantaneously. So the real roadblock is mostly that they don't appear to have included a completely pre-trained model with this release, and it will take you as a developer a lot of GPU time to train one. But your users would not necessarily have a problem captioning images on their machines.

angerbot 3597 days ago

I hadn't considered that (this is really out of my depth). Any ideas on what the actual size of a trained model would be to distribute? Taking up 150 GB on the user's hard drive is probably out as well.

dharma1 3597 days ago

Depends on the model and dataset. Inception v3 trained on ImageNet is about 150 MB, but you can quantise the weights to 8-bit and prune it to make it much smaller without affecting performance much.
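
To make the 8-bit point concrete, here is a rough sketch of simple linear (min/max) quantisation in NumPy. It is only an illustration, not what any particular framework does: the helper names `quantize_uint8` and `dequantize` are made up, and the random matrix is a stand-in for one layer's float32 weights.

```python
import numpy as np

def quantize_uint8(weights):
    """Map float32 weights linearly onto the integer codes 0..255."""
    w_min = float(weights.min())
    w_max = float(weights.max())
    # Fall back to 1.0 so a constant tensor does not cause division by zero.
    scale = (w_max - w_min) / 255.0 or 1.0
    codes = np.round((weights - w_min) / scale).astype(np.uint8)
    return codes, w_min, scale

def dequantize(codes, w_min, scale):
    """Recover approximate float32 weights from the 8-bit codes."""
    return codes.astype(np.float32) * scale + w_min

# Stand-in for one layer of a trained model (hypothetical data).
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.05, size=(256, 256)).astype(np.float32)

codes, w_min, scale = quantize_uint8(weights)
restored = dequantize(codes, w_min, scale)

size_ratio = weights.nbytes / codes.nbytes            # float32 vs. uint8 storage
max_error = float(np.abs(weights - restored).max())   # worst-case rounding error
print(size_ratio, max_error)
```

Storing one byte per weight instead of four shrinks the weights by 4x before any pruning, and the error per weight is bounded by about half of `scale`. Whether that error is acceptable depends on the model, so treat this as a starting point rather than a guarantee.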

matt4077 3597 days ago

Here's a complete model for image recognition that works fine on a notebook: https://www.tensorflow.org/versions/r0.10/tutorials/image_re...

nl 3597 days ago

You can run this model on a Raspberry Pi.

Training is another matter.
