	by lz400 524 days ago
	I understood "SETI-style" to mean crowdsourced: instead of mining bitcoin, you "mine" LLMs. It's a nice idea, I think. I'm not sure about the technical details, bandwidth limitations, performance, etc.

HPsquared 523 days ago

Unfortunately, LLM training is not as computationally easy (embarrassingly parallel) as mining bitcoins.
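To illustrate what "embarrassingly parallel" means here, a toy sketch (not real Bitcoin mining, which double-hashes an 80-byte block header against a numeric target). The `search` function, the `toy-block` header, and the `"00"` prefix are all made up for the example. The point is only that each worker scans its own nonce range and never needs to talk to the others:

```python
import hashlib

def search(header, start, stop, prefix="00"):
    """Return every nonce in [start, stop) whose SHA-256 hex digest starts with prefix."""
    hits = []
    for nonce in range(start, stop):
        digest = hashlib.sha256(f"{header}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            hits.append(nonce)
    return hits

header = "toy-block"  # hypothetical block contents
# Four independent "workers", each with a disjoint nonce range.
ranges = [(0, 5000), (5000, 10000), (10000, 15000), (15000, 20000)]

# Each call could run on a different machine; no result depends on another.
per_worker = [search(header, a, b) for a, b in ranges]
combined = [n for hits in per_worker for n in hits]
```

Concatenating the per-worker results gives the same answer as one machine scanning the whole range, which is exactly the property a training step lacks.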

ghxst 523 days ago

If that were solved (if it is possible at all, and feasible and competitive), I can definitely see "LLM mining" being a historic milestone. It would also be much closer to the spirit of F@H in some sense, depending on how you look at it. Would there be a financial incentive? And how would it be distributed? Could you receive a stake in the LLM proportional to the contribution you made? Would that be similar in some sense to purchasing stock in an AI company, or mining tokens for a cryptocurrency? Potentially a lot of opportunity here.

rcxdude 523 days ago

This would require a revolution in the algorithms used to train a neural net: currently LLM training is at best distributed amongst GPUs in racks in the same datacenter, and ideally nearby racks, and that's already a significant engineering challenge, because each step needs to work from the step before, and each step updates all of the weights, so it's hard to parallelise. You can do it a little bit, because you can e.g. do a little bit of training with part of the dataset on one part of the cluster, and another part elsewhere, but this doesn't scale linearly (i.e. you need more compute overall to get the model to converge to something useful), and you still need a lot of bandwidth between your nodes to synchronize the networks frequently.

All of this makes it very poorly suited to a collection of heterogeneous compute connected via the internet, which wants a large chunk of mostly independent tasks which have a high compute cost but relatively low bandwidth requirements.
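A toy sketch of the "part of the dataset on one part of the cluster" idea (synchronous data parallelism), using a one-parameter model in plain Python. Everything here (`grad`, `data_parallel_step`, the data, the learning rate) is invented for illustration and is not taken from any real training framework:

```python
# Toy synchronous data-parallel SGD on a 1-D linear model y = w * x.

def grad(w, shard):
    """Mean gradient of 0.5 * (w*x - y)**2 with respect to w over one shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_step(w, shards, lr):
    # Each "worker" computes a gradient on its own shard; this part is parallel.
    local_grads = [grad(w, s) for s in shards]
    # Synchronisation point: every worker needs the average of all gradients
    # before the next step can start. On a real cluster this is the all-reduce,
    # and it moves data proportional to the number of weights on every step.
    avg = sum(local_grads) / len(local_grads)
    return w - lr * avg

data = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]  # true w is 3.0
shards = [data[:2], data[2:]]  # two equal-sized shards, one per worker

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards, lr=0.05)
```

With equal-sized shards the averaged gradient equals the full-batch gradient, so the maths is unchanged. What changes is that step N+1 cannot start until every worker has exchanged its result from step N, which is where the bandwidth and latency requirements come from.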

HPsquared 523 days ago

The models are too large to fit in a desktop GPU's VRAM. Progress would require either smaller models (MoE might help here? not sure) or bigger VRAM. For example, training a 70-billion-parameter model would require at least 140GB of VRAM in each system (at 2 bytes per parameter, for the weights alone), whereas a large desktop GPU (4090) has only 24GB.

You need enough memory to run the unquantized model for training, then stream the training data through - that part is what is done in parallel, farming out different bits of training data to each machine.
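The arithmetic behind those numbers, as a quick sketch. It assumes 16-bit weights (2 bytes per parameter) and counts only the weights; gradients and optimizer state would add to this during training. The `weights_gb` helper is just for illustration:

```python
import math

def weights_gb(n_params, bytes_per_param=2):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

model_gb = weights_gb(70e9)  # 140.0 GB for a 70B-parameter model at 16 bits
gpu_gb = 24                  # VRAM of one 4090

# Lower bound on how many such GPUs are needed just to hold the weights.
min_gpus = math.ceil(model_gb / gpu_gb)  # 6
```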

mr_toad 523 days ago

Data-parallel training is not the only approach. Sometimes the model itself needs to be distributed across multiple GPUs.

https://www.microsoft.com/en-us/research/blog/zero-deepspeed...

The communications overhead of doing this over the internet might be unworkable though.

htrp 523 days ago

Or if internet connections became significantly faster (e.g. fiber).

HPsquared 523 days ago

A single GPU has memory bandwidth of around 1000 GB/s ... that's a lot of fiber! (EDIT: although the PCIe interconnect isn't as fast, of course. NVLink is pretty fast, though, and it's the sort of thing you'd be using in a large system.)
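To put rough numbers on the gap, a back-of-the-envelope sketch. The 140 GB payload (one full copy of 16-bit weights for a 70B model) and the 1 Gbit/s home connection are assumptions chosen for illustration, and `transfer_seconds` is a made-up helper:

```python
def transfer_seconds(gigabytes, gigabits_per_second):
    """Time to move a payload over a link, ignoring latency and protocol overhead."""
    return gigabytes * 8 / gigabits_per_second

payload_gb = 140  # assumed: 70B parameters at 2 bytes each

# Assumed 1 Gbit/s home fiber link.
home_link_s = transfer_seconds(payload_gb, 1)        # 1120 seconds

# GPU memory bandwidth of about 1000 GB/s is 8000 Gbit/s.
gpu_memory_s = transfer_seconds(payload_gb, 8000)    # about 0.14 seconds
```

Under these assumptions, moving one copy of the weights takes roughly 19 minutes over the home link versus a fraction of a second inside the GPU, a gap of about 8000x before latency is even considered.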

brookst 523 days ago

Latency still matters a lot…

lz400 523 days ago

Damn it! But it's a nice research area.

Mountain_Skies 523 days ago

SETI had a clear purpose that donors of computer resources could get behind. The LLM corps early on decided to drink the steering poison that will keep there from ever being a united community for making open LLMs. At best you'll get a fractured world of different projects, each with its own steering directives.

Aerroon 523 days ago

The internet is for ____.

That could be a factor that unites enough people to donate their compute time to build diffusion models. At least if it was easy enough to set up.

CaptainFever 523 days ago

Related: people donating computing power to run diffusion and text models, which is definitely largely used for porn.

https://stablehorde.net/

Or the large amount of community effort (not exactly crowdsourced, though) that goes into diffusion fine-tunes and tools! Pony XL and other uncensored models, for example. I haven't kept up with the rest, because there's just too much.

lostmsu 522 days ago

You don't have to donate, we will pay you for idle time of your gaming GPU: https://borg.games/setup
