by vonnik 3626 days ago
The history of machine learning startups is littered with companies that thought a hosted web service was a good idea. The problem with this model is that big data, by definition, is costly to move. So if a managed service is not generating and storing the data you need to process with machine learning or deep learning (as you might conceivably with AWS), then you probably don't want to move your data to those algorithms or models. All you'll get are small-data users. The models and algos need to go to the data. That's the most efficient approach, and it means you have to go on prem... Fwiw, that's what we're trying to do with Skymind and Deeplearning4j.

https://skymind.io/ http://deeplearning4j.org/
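For a rough sense of what "costly to move" means, here is a back-of-the-envelope sketch in Python. The link speed, the 80% efficiency factor, and the dataset sizes are illustrative assumptions, not figures from the post, and the function name `transfer_hours` is made up for this example.

```python
def transfer_hours(dataset_bytes: float, link_bits_per_second: float,
                   efficiency: float = 0.8) -> float:
    """Estimate hours needed to move a dataset over a network link.

    efficiency is an assumed fraction of the nominal link speed that is
    actually usable (protocol overhead, contention); 0.8 is a guess.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    usable_bps = link_bits_per_second * efficiency
    seconds = dataset_bytes * 8 / usable_bps
    return seconds / 3600


TB = 10**12    # decimal terabyte, in bytes
GBIT = 10**9   # 1 Gbit/s, in bits per second

# Hypothetical dataset sizes pushed over an assumed 1 Gbit/s uplink.
for size_tb in (1, 100):
    hours = transfer_hours(size_tb * TB, 1 * GBIT)
    print(f"{size_tb} TB over 1 Gbit/s: roughly {hours:.1f} hours")
```

Under these assumptions the time grows linearly with dataset size, so a dataset a hundred times larger takes a hundred times longer to ship, before counting any egress fees.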

2 comments

My understanding of the post is that it would host learned models. You'd train them wherever, but host the learned model in Algorithmia, which exposes it through an API, making it easy for others to use your model.
It takes big data to train, but that trained model can work on "small data" or big data. One-off uses for apps really do lend themselves well to a hosted solution like this IMHO. If you need to classify lots of data, then you are probably at the point where you should either train your own model or buy it from the developer via this site, I'd think.
Because if I am capable of training a fancy-pants deep learning model to do something helpful, obviously I need a service to host my model so badly that giving someone my model is a better idea than paying $10 a month for an EC2 instance.
There's a clear trend in the industry to increasingly rely on cloud services, so it seems reasonable that machine learning would follow the same trend. As long as the compute is in the same data center, data transfer is rarely the bottleneck for these kinds of deep learning algorithms, which is why we designed Algorithmia to be able to operate anywhere -- on all the major cloud providers, as well as on premises.
Right, but the question is: Whose cloud and what kind of cloud? Are we talking private cloud, virtual private cloud? Who manages it? Even saying "as long as the compute is in the same data center" is a huge assumption. I think it's great that Algorithmia can operate anywhere. How do you do that? What do you need to operate well on prem?
At "enterprise" level there is a lot of interest in Hybrid Clouds, because on premises is still a requirement.
(Disclaimer: I'm biased, I'm vonnik's cofounder):

I agree that most startups need to get an MVP out the door as soon as possible, which leads them to the cloud. I think hybrid cloud will be the way to go long term.

If you think about it, on one side we have things like AWS, and on the other the devops tools that "make running your own infra at scale easy," like Docker and k8s. On prem in some form isn't going anywhere. What WILL be interesting are plays like, say, Convox, where you can manage a cloud like you would an on prem OpenStack/k8s deployment.

Can they anonymize the data before receiving it from their clients? It would be a great advantage to be able to use their service in a privacy-conscious way.
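Whatever the service does on its side, a client could also mask identifying fields before anything leaves their machine. A minimal sketch in Python, assuming records are plain dicts; the field names, the key, and the `pseudonymize` helper are hypothetical and not part of any Algorithmia API. Note this is keyed hashing (pseudonymization), which is weaker than true anonymization: the other fields may still identify someone.

```python
import hashlib
import hmac


def pseudonymize(record: dict, sensitive_fields: set, secret_key: bytes) -> dict:
    """Return a copy of record with sensitive fields replaced by HMAC-SHA256 digests.

    The same input and key always give the same token, so records can still
    be joined or grouped, but the raw value never leaves the client.
    """
    masked = {}
    for field, value in record.items():
        if field in sensitive_fields:
            digest = hmac.new(secret_key, str(value).encode("utf-8"), hashlib.sha256)
            masked[field] = digest.hexdigest()
        else:
            masked[field] = value
    return masked


# Hypothetical record and key, for illustration only.
key = b"keep-this-key-on-the-client"
row = {"email": "alice@example.com", "age": 34, "text": "great product"}
safe_row = pseudonymize(row, {"email"}, key)
print(safe_row)  # the email is now a 64-character hex token; other fields are unchanged
```

The key stays with the client, so the hosted service only ever sees the tokens.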