

by tlb 3586 days ago

Indeed, deployment is a whole set of interesting issues. We haven't deployed any learned models in production yet at OpenAI, so it's not at the top of our list.

If the data and models were small and training was quick (on the order of compilation time), I'd just keep the training data in git and train the model from scratch every time I run make. But the data is huge, training requires clusters of machines and can take days, so you need a pipeline.
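For the small-data case only, the "retrain whenever the data changes" idea could be sketched roughly like this. It is a minimal illustration, not anything OpenAI actually runs: the function names and the stamp file are hypothetical, and where make compares timestamps, this swaps in a content hash of the data directory.

```python
import hashlib
from pathlib import Path


def dataset_fingerprint(data_dir: Path) -> str:
    """Hash every file under data_dir (relative path + contents) into one digest."""
    digest = hashlib.sha256()
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        digest.update(path.relative_to(data_dir).as_posix().encode())
        digest.update(b"\0")  # separator between the file name and its contents
        digest.update(path.read_bytes())
    return digest.hexdigest()


def needs_retrain(data_dir: Path, stamp_file: Path) -> bool:
    """True if no build was recorded yet or the data changed since the last one.

    stamp_file is a hypothetical marker file; keep it outside data_dir,
    otherwise writing it would change the fingerprint.
    """
    if not stamp_file.exists():
        return True
    return stamp_file.read_text().strip() != dataset_fingerprint(data_dir)


def record_build(data_dir: Path, stamp_file: Path) -> None:
    """Call after a successful training run to remember which data it used."""
    stamp_file.write_text(dataset_fingerprint(data_dir))
```

A build script would call needs_retrain() first, train only when it returns True, and then call record_build(). This reads every file into memory, so it only makes sense when the data is small enough to live in git in the first place.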

An industrial-strength system looks like this: https://code.facebook.com/posts/1072626246134461/introducing...

3 comments

platypii 3586 days ago

CTO of Algorithmia here. We've spent a lot of time thinking about the issues of deploying deep learning models. There is a whole set of challenges that crop up when trying to scale these kinds of deployments (not least of which is trying to manage GPU memory).

It would be interesting to compare notes since we have deployed a number of models in production, and seem to focus on a related but different set of challenges. kenny at company dot com.


programnature 3586 days ago

Yes, understandable. I encourage viewing this as part of the 'open' mandate.


agibsonccc 3586 days ago

When you're thinking of "deployment" here, wouldn't it make sense to use Google Compute Engine for this?

I'd be curious to see if there's a legit speedup there with the "real TensorFlow".

For "on prem" stuff I think "deployment" is going to depend on the actual end use case.

E.g., no one in industry will keep their "training data" in git. They'd have an actual database with other systems surrounding it.

If it's just "run the model locally to view a web page running in a Docker container", I wouldn't see the problem here, though.

The infra will also be different for training vs. inference. For training you'll want GPUs, but it's not realistic to run inference on GPUs yet.

I'd love someone to comment on https://developer.nvidia.com/gpu-inference-engine though.

There's going to be a lot of non-deep-learning "stuff" involved here.

Much of it will be connected to the use case. E.g., deep learning for log analytics in production will be different from a computer vision pipeline.

Warning: highly biased player in the space.
