| HN Mirror



	by stcredzero 888 days ago
	Open Source and Free Software weren't formulated to deal with the need for such gargantuan amounts of data and compute. Can the public compete? What percentage of the technical public could we expect to participate, and how much data, compute, and data quality improvement could they bring to the table? I suspect that large corporations have an economic advantage of at least an order of magnitude.

3 comments

RandomWorker 888 days ago

There is a big effort under way in China; Yuanqing Lin gave an interview on the deep learning course about work at this magnitude [1]. They suggest that they will host the resources to store the data and train on it, and make all those algorithms available in China.

[1] https://www.youtube.com/watch?v=3GfOnI3goAk


tikhonj 888 days ago

The public doesn't have the resources to train the largest state-of-the-art LLMs, but training useful LLMs seems doable. Maybe not for most individuals but certainly for a range of nonprofits, research teams and companies.


stcredzero 888 days ago

Isn't it relatively easy for a smaller model to poke holes in the output of a larger model?


jncfhnb 888 days ago

But not nearly as in reach as modifying open source models.


edgarvaldes 888 days ago

Open Source and Free Software are not about the amount of data.
