Hacker News
by m_ke 2198 days ago
I guess Sama plans on manufacturing growth metrics by forcing YC companies to pretend that they're using this.

Generic machine learning APIs are a shitty business to get into unless you plan on hiring a huge sales team and selling to dinosaurs or doing a ton of custom consulting work, which doesn't scale the way VCs like it to. Anybody with enough know-how to use their API properly can just grab an open source model and tune it on their own data.

If they plan on commercializing things they should focus on building real products.

3 comments

Not everyone wants to administer their own infrastructure. Services like Heroku and Squarespace are useful because, even if you know how to design and build a website from scratch, sometimes you just need something done quickly without worrying about details of the system that don't matter for your project at this point. I really don't see how this wouldn't apply to AI projects as well.

I could make a much better site coding my own website from scratch and setting up servers myself, but for some projects I wouldn't even think about it that way, because using Heroku or Squarespace I can save a LOT of time and get the results I need much quicker.

That's true, but machine learning models are not Twilio or SendGrid: you have to tune them for your use case, monitor their performance and handle the uncertainty of their outputs. Doing that well requires a data scientist, and if you have one they will be much more productive iterating on their own models than depending on a 3rd party black box.

Not a data scientist myself, but plenty of data scientists at a consultancy I used to work for said that they had to implement variants of a limited set of models over and over again, because they couldn't reuse code and infrastructure. The project contracts demanded that all IP created by the consultant be the property of the client. This even caused some of the data scientists to lose motivation, because the job wasn't intellectually challenging: it involved setting up the same stuff again and again. Very rarely was their actual expertise needed.

I am not sure if this particular service solves the problem for them in any way, but to my ear it sounds like there is a need for code and infrastructure reuse in the data science domain that is ripe for innovation.

I'm pretty sure people said the exact same thing about Algolia when it was getting started (you have to tune search for your use case! How could you possibly use a search provider?!?)

Truth about the situation:
- Transformers generalize well and don't need much fine-tuning.
- OpenAI can probably fine-tune for your use case better than you can.
- Getting new models into production takes 6 months to a year at companies of this size; even if you did have data scientists in house, it might just be better to go with a solution like this for velocity.
- Not every company has the talent to make an in-house ML program successful.

Except the point of these larger transformer models is they generalize well over a wide range of domains or only require a small amount of transfer learning for really specific domains.

I'd say they're perfect candidates for the API as a service model.

> I guess Sama plans on manufacturing growth metrics by forcing YC companies to pretend that they're using this.

That's wrong in almost too many ways to list. Sam left YC over a year ago, nor would he do such a thing. Nor does YC have that kind of power over companies, nor would it use it that way if it did. That would be wrong and also dumb.

Sorry, that was supposed to be sarcastic. What I meant to say is that Sam has a huge network and is a phone call away from pitching any CEO in the valley. One of the biggest benefits of YC these days is the huge network of companies in your portfolio, which makes getting intros and pilots a lot easier, leading to "traction" and more VC dollars.

I imagine they’re considering offering GPT-3, which would be cost-prohibitive to fine-tune for most people. I also heard inference was too slow to be practical. Perhaps they have some FPGA magic up their Microsoft sleeves.

Nobody is putting these huge models in production; even the smaller transformer models are still too expensive to run for most use cases.

With the way the field is moving, GPT-3 will be old news in a month, when more advances are made and open sourced.

I don't understand. If they run it for you and you apply transfer learning and fine-tuning to your specific use case, that would drastically reduce the costs, which is why their offer makes sense.

Precisely my point. If they could put a model as large as GPT-3 into production (at a reasonable price to the consumer), wouldn’t that be a 10x improvement?

GPT-3 isn't a 10x improvement. (At least from everything we know so far.)

If the OP is right that nobody is putting the largest models into production (which I think is an inaccurate statement), then GPT-3 in production would be a 10x (ok, 5x?) improvement over the small GPT-2s and BERTs in production? So 10x in practice, if the hypothesis is correct? Which, like I said, I don’t believe to be the case.