by karpierz 1505 days ago
Let me try to make the point clearer:

1. Investors expect Huggingface to extract more than $100M from the market. Otherwise they'd be called 'donors'.

2. If they openly publish models, then their APIs will be undercut by other providers who can take the published model and host it for cheaper. It would be cheaper for other companies because: they don't need to pay the cost of training the model, and they can specialize in simply hosting models.

3. Because of 2), Huggingface would need to avoid allowing other companies to host models, including internal APIs (because providers would then simply spin up to make hosting those internal APIs easy).

4. Because of 3), their policy of publishing trained models openly has to change.

So the question that the original poster was asking is: what Huggingface policies will change, given the need to make returns on this investment?

The original poster is likely thinking of OpenAI, which went down a similar route (started out training open models, took in a bunch of money, realized that openly publishing them wasn't sustainable, kept the models secret and created locked-down APIs for accessing them).

> So if you want someone to answer precisely how they'll extract hundreds of millions of dollars from an emerging market, I have to imagine this isn't the correct forum to expect such answers.

This market isn't new; Google, AWS, OpenAI, etc. all have APIs they charge for. They also have services to host trained models for you. How will Huggingface make money without resorting to hiding its models?

3 comments

"This market isn't new; Google, AWS, OpenAI, etc. all have APIs they charge for."

And if they were standalone businesses they'd be losing money. It's neither a big nor a profitable market.

When the business model for a project is not 'really obvious' it's usually a bad sign.

AirBnB, Uber, Stripe etc. - 'how' they make money is obvious, it's intrinsic to the product.

I don't think OpenAI is a valid comparison. Huggingface's mission, unlike in the case of OpenAI, is not training models, but being the standard service for sharing them. The vast majority of models and datasets available at the Huggingface Hub have been provided by third-party companies or researchers. They aim to be the Github of ML models and data, not an AI-building startup.

Yep - "Huggingface Enterprise" just like there's "Github Enterprise" seems like a straightforward way to make money, at least to me? Does Microsoft make good money from Github Enterprise?

Hugging Face is selling CPU cycles. They're also letting you upload your own datasets that aren't "limited" like others. I'm not quite sure where you think their approach of "open models" is wrong; they still sell the CPU cycles.

The idea that restricting access to the data is the only way to profit is such an archaic way of thinking. Hugging Face, if they keep making a good user interface and a good front end, will very much be able to fill the niche it is designed for: people who can't afford a $10-20k rig to run a model but who need to run it for their backend project.

Also, it may be due to using HN, but when I think of "where can I run a model" or "get a dataset" I think Hugging Face. They are leveraging the democratization of the data.

Thanks for clarifying, I misunderstood what Huggingface's product was.

I see the niche. The risks are:

- the mid market is constantly churning; either players become too big and you can't meet their requirements or they go bankrupt. Customer acquisition becomes a pretty big expense.

- selling CPU cycles is a cutthroat business which competes pretty directly with AWS, Azure, and Google Cloud. Their edge will likely be ease of use, but at some scale, the larger providers will be able to undercut them hard.

- selling a solution for managing datasets and training models using cloud CPUs is a crowded market.

- not sure how trustworthy the company is with private datasets. Easier to trust an established vendor.

But it wouldn't be a startup if there weren't risks.