HN Mirror

by austinkhale 664 days ago

There are legit criticisms of Sam Altman that can be leveled, but none of them are in this article. This is just reductive nonsense.

The arguments are essentially:

1. The technology has plateaued, not in reality, but in the perception of the average layperson over the last two years.

2. Sam _only_ has a record as a deal maker, not a physicist.

3. AI can sometimes do bad things & utilizes a lot of energy.

I normally really enjoy the Atlantic since their writers at least try to include context & nuance. This piece does neither.

3 comments

BearOso 664 days ago

I think LLM technology, not necessarily all of CNN, has plateaued. We've used up all the human discourse, so there's nothing to train it on.

It's like fossil fuels. They took billions of years to create and centuries to consume. We can't just create more.

Another problem is that the data sets are becoming contaminated, creating a reinforcement cycle that makes LLMs trained on more recent data worse.

My thoughts are that it won't get any better with this method of just brute-forcing data into a model like everyone's been doing. There need to be some significant scientific innovations. But all anybody is doing is throwing money at copying the major players and applying some distinguishing flavor.

theptip 664 days ago

What data are you using to back up this belief?

Progress on benchmarks continues to improve (see GPT-o1).

The claim that there is nothing left to train on is objectively false. The big guys are building synthetic training sets, moving to multimodal, and are not worried about running out of data.

o1 shows that you can also throw more inference compute at problems to improve performance, so it gives another dimension to scale models on.

KaiserPro 664 days ago

> Progress on benchmarks continues to improve (see GPT-o1).

That's not evidence of a step change.

> The big guys are building synthetic training sets

Yes, that helps to pre-train models, but it's not a replacement for real data.

> not worried about running out of data.

They totally are. The more data, the more expensive it is to train. Exponentially more expensive.

> o1 shows that you can also throw more inference compute

I suspect that it's not actually just compute; it's changes to training and model design.

senko 664 days ago

Actually, the sources we had (everything scraped from the internet) turn out to be pretty bad.

Imagine not going to school and instead learning everything from random blog posts or Reddit comments. You could do it if you read a lot, but it's clearly suboptimal.

That's why OpenAI, and probably every other serious AI company, is investing huge amounts in generating (proprietary) datasets.

chuckledog 664 days ago

GitHub, especially filtered by starred repos, is a pretty high-quality dataset.

askafriend 664 days ago

Any thoughts on synthetic data?

rocho 664 days ago

See "AI models collapse when trained on recursively generated data"

https://www.nature.com/articles/s41586-024-07566-y
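
A toy sketch of the effect the paper's title describes, not the paper's own experiment: fit a Gaussian to a small sample, draw the next "generation" of data from the fitted model, and repeat. The function name `simulate_collapse`, the sample size, and the generation count are all illustrative assumptions.

```python
import random
import statistics


def simulate_collapse(generations=300, sample_size=10, seed=0):
    """Repeatedly refit a Gaussian to samples drawn from the previous fit.

    Returns the fitted standard deviation after each generation,
    starting with the "real" distribution's value of 1.0.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0: the original data distribution
    history = [std]
    for _ in range(generations):
        # Each generation only ever sees output of the previous model.
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)  # maximum-likelihood estimate
        history.append(std)
    return history


if __name__ == "__main__":
    history = simulate_collapse()
    for generation in (0, 50, 100, 200, 300):
        print(generation, history[generation])
```

With these settings the fitted spread tends to shrink toward zero, because every refit on a small sample is a noisy, slightly low estimate of the spread and the errors compound across generations.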

slashdave 664 days ago

Dead end. You cannot create information out of nothing.

NiloCK 662 days ago

How did AlphaZero succeed?

slashdave 661 days ago

By introducing new training data (train of thought) that was independently verified or explicitly constructed using external information.

Filligree 664 days ago

Which is why thought experiments are always useless.

Lerc 664 days ago

You're a creationist then?

_cs2017_ 664 days ago

To avoid disappointment, just think of the mass news media as a (shitty) LLM. It may occasionally produce an article that on the surface seems to be decently thought out, but it's only because the author accidentally picked a particularly good source to regurgitate. Ultimately, they just type some plausible sentences without knowing or caring about the quality.

croes 664 days ago

If Sam claimed we'll fix climate with the help of AI, he is either a liar or a fool.

Our problem isn't technology, it's humans.

Unless he suggests mass indoctrination via AI, AI won't fix anything.
