| HN Mirror



	by ackbar03 1809 days ago
	So is this mainly focused on deployment for applications with high-speed inference requirements? I didn't dive into the product in detail. I run my own deep-learning-based web app, and inference speed optimization is pretty non-trivial. As far as I know, production-level speed requirements call for TensorRT, which is definitely not hot-start and takes more than a few minutes to load (I'm not too sure what's going on under the hood, not an expert) but gives inference speedups of 2x or more. So I'm not quite sure what you're targeting, or whether you've actually managed to solve that problem, which would be highly impressive.

2 comments

stingraycharles 1809 days ago

I suspect the audience is more about the GPU hosting aspect, effectively making GPU-based applications “serverless”.

To me, adding GPUs into the devops mix typically increases the complexity significantly, and I would definitely pay money to someone who can just take my model, host it, and deal with the complexities around it.


theo31 1809 days ago

We don't use TensorRT at the moment, but it is something that we are exploring.
