	by jsheard 641 days ago
	Decentralized inferencing perhaps, but the training is very much centralized around Meta's continued willingness to burn obscene amounts of money. The open source community simply can't afford to pick up the torch if Meta stops releasing free models.

3 comments

leetharris 641 days ago

There's plenty of open source AI out there that isn't Meta. It's just not as good.

The #1 problem is not compute, but data and the manpower required to clean that data up.

The main thing you can do is support companies and groups who are releasing open source models. They are usually using their own data.

jsheard 641 days ago

> There's plenty of open source AI out there that isn't Meta. It's just not as good.

To my knowledge all of the notable open source models are subsidised by corporations in one way or another, whether by being the side project of a mega-corp which can absorb the loss (Meta) or coasting on investor hype (Mistral, Stability). Neither of those gives me much confidence that they will continue forever, especially the latter category, which will just run out of money eventually.

For open source AI to actually be sustainable it needs to stand on its own, which will likely require orders of magnitude more efficient training, and even then the data cleaning and RLHF are a huge money sink.

exe34 641 days ago

if you can do 100x more efficient training with open source, closeAI can simply take that and train a model that's 100x bigger/longer/more tokens.

bugglebeetle 641 days ago

AKA why Unsloth is now YC backed for their even better (but closed source) fine-tuning.

moffkalast 641 days ago

https://huggingface.co/datasets/HuggingFaceFW/fineweb

The #1 problem is absolutely compute. People barely get funding for fine tunes, and even if you physically buy the GPUs it'll cost you in power consumption.

That said, good data is definitely the #2 problem. But nowadays you can just get good synthetic datasets from calling closed model APIs or just using existing local LLMs to sift through trash. That'll cost you too.

citboin 641 days ago

>The main thing you can do is support companies and groups who are releasing open source models. They are usually using their own data.

Alternatively, we could create standardized open source training data from sources like Wikipedia and Wikimedia, as well as public domain literature and open courseware. I'm sure that there are many other such free and legal sources of data.

KaiserPro 641 days ago

but the training data is one of the key bits that makes or breaks your model's performance.

There is a reason why datasets are private and the model weights aren't.

Der_Einzige 641 days ago

Compute is for sure the number one problem. Look at how long it’s taking for anything better than Pony Diffusion to come out for NSFW image gen despite the insane amount of demand for it.

Look at how much compute purple AI actually has. It’s basically nothing.

cynicalpeace 641 days ago

One area that's interesting, but easy to dismiss because it's the ultimate cross-section of hype (AI and crypto) is bittensor.

AFAICT it decentralizes the training of these models by giving you an incentive to train models which will mine the crypto if you're improving it.

I learned about it years ago, mined some crypto, lost the keys and now kicking myself cuz I would've made a pretty penny lol

jsheard 641 days ago

Does it actually work? AIUI the current consensus is that you need massive interconnect bandwidth to train big models efficiently, and the internet is nowhere near that. I'm sure the Nvidia DGX boxes have 10x400Gb NICs for a reason.
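
For a rough sense of scale, here is a back-of-envelope sketch of how long it takes just to move one full copy of the gradients over a link. The 10x400Gb figure is the one quoted above; the 7B parameter count, 2 bytes per gradient value, and 1 Gb/s home connection are illustrative assumptions, and latency and all-reduce overhead are ignored.

```python
def sync_seconds(num_params, bytes_per_param, link_gbit_per_s):
    """Seconds to push one full set of gradients through a link.

    Ignores latency, protocol overhead, and all-reduce traffic patterns,
    so real numbers would be worse than this.
    """
    total_bits = num_params * bytes_per_param * 8
    return total_bits / (link_gbit_per_s * 1e9)


# Assumed, hypothetical workload: 7B parameters, 16-bit gradients.
params = 7e9
bytes_per_grad = 2

# 10 x 400 Gb/s NICs (figure from the comment) vs. an assumed 1 Gb/s home link.
datacenter = sync_seconds(params, bytes_per_grad, 10 * 400)
home = sync_seconds(params, bytes_per_grad, 1)

print(f"datacenter: {datacenter:.3f} s per sync")
print(f"home link:  {home:.0f} s per sync")
print(f"slowdown:   {home / datacenter:.0f}x")
```

Under those assumptions a naive every-step sync is thousands of times slower over a home connection, which is why training over the internet only looks plausible if you sync far less often or send far less data.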

bloatedGoat 641 days ago

There are methods that make it feasible to train models over the internet. DiLoCo is one [1] and NousResearch has found a way to improve on that using a method they call DisTrO [2].

1. https://arxiv.org/abs/2311.08105

2. https://github.com/NousResearch/DisTrO?tab=readme-ov-file
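
To give a feel for the DiLoCo idea in [1], here is a toy sketch, not the paper's implementation. Each worker takes many local steps without talking to anyone, then the workers exchange only the difference between the shared parameters and their local result, and an outer optimizer with Nesterov momentum applies the average. Assumptions that are mine: a single scalar parameter, a quadratic loss, plain SGD for the inner steps (the paper uses AdamW), and all hyperparameter values and function names.

```python
def local_steps(theta, shard, inner_lr, num_steps):
    """Run num_steps of plain gradient descent on one worker's shard.

    No communication happens inside this function.
    Loss per sample is 0.5 * (theta - x)^2, so the mean gradient
    over the shard is theta - mean(shard).
    """
    shard_mean = sum(shard) / len(shard)
    for _ in range(num_steps):
        theta -= inner_lr * (theta - shard_mean)
    return theta


def diloco_style_train(shards, rounds=50, inner_steps=20,
                       inner_lr=0.1, outer_lr=0.7, momentum=0.9):
    """Toy DiLoCo-style loop: many local steps, one sync per round."""
    theta = 0.0      # shared (global) parameter
    velocity = 0.0   # outer momentum buffer
    for _ in range(rounds):
        # Each worker starts from the shared parameter and trains alone.
        local = [local_steps(theta, s, inner_lr, inner_steps) for s in shards]
        # The only communication: average the "outer gradient",
        # i.e. how far each worker moved away from the shared parameter.
        outer_grad = sum(theta - w for w in local) / len(local)
        # Outer update: SGD with Nesterov momentum.
        velocity = momentum * velocity + outer_grad
        theta -= outer_lr * (outer_grad + momentum * velocity)
    return theta


# Three workers, each holding a different slice of the data.
shards = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
result = diloco_style_train(shards)
print(result)  # should land close to the overall data mean, 3.5
```

With these settings each worker does 1000 local steps but only syncs 50 times, which is the whole point: communication drops by a factor of `inner_steps` compared with syncing every step.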

cynicalpeace 641 days ago

I have no idea. The idea is certainly interesting but I've never actually understood how to run inference on these models... the people that run it seem to be unable to just talk simply.

CaptainFever 641 days ago

I've seen bittensor before. I think it makes sense, as a way to incentivise people to rent their GPUs, without relying on a central platform. But I've always felt it was kind of a scam because it was so hard to find any guides on how to use it.

Also, this doesn't seem to actually solve the issue of fine tuners needing funding to rent those GPUs? One alternative is something like AI Horde, which pays GPU providers with "labour vouchers" that allow them to get priority next time they want GPU. Requires a central platform to track vouchers and ban those who exchange them. Basically a sort of real-life comparison of mutualism (AI Horde) vs capitalism (bittensor).

numpad0 641 days ago

Centralized production, decentralized consumption.
