| HN Mirror



	by throwaway9870 1308 days ago
	"A fundamental problem with Galactica is that it is not able to distinguish truth from falsehood," In true science, it is exceptionally hard to distinguish truth from falsehood for many of the interesting subjects. It can take decades of work to reach consensus on what is "truth." Physics in the early 20th century is a great example of this debate.

6 comments

mannykannot 1308 days ago

To be clear, the fact that it is difficult is not a defense of Galactica and its proponents; it is a reason for suspecting that these sorts of language models are fundamentally unsuited to the task.

halpmeh 1308 days ago

Why “fundamentally unsuited”? Neural networks have solved tons of problems previously thought to be “too hard” for ML, e.g. playing Go.

skybrian 1307 days ago

Fundamentally unsuited because of how they train it using "fill in the blank."

Training a large model to guess when it doesn't know the answer results in fiction. They need to do something else to get nonfiction.

By contrast, for Go the model was trained not to make illegal moves, because checking for that as part of the training is easy and cheap.

halpmeh 1307 days ago

We have models that accurately classify things, e.g. whether or not an email is spam. There isn’t a fundamental limitation on building something like a truth classifier into a generative model so that it is optimized for outputting “true” statements. The hardest part is probably identifying what is truth and what is falsehood. That’s a fundamental problem with humanity, not neural networks.

skybrian 1307 days ago

Well, we could quibble about what "fundamental" means but my point is that the way they train large language models doesn't work for this. Something different needs to happen.

imtringued 1307 days ago

Truth has nothing to do with humanity unless you mean the specific way humans construct belief systems.

Anyway I already told you the answer. The AI will need a series of trainable belief systems to verify whether statements are internally consistent. The strange part about this is that the AI would need to have a way to obtain validation and each prompt would have to derive a new belief system which you must use in the next prompt.

In other words, the model must be able to learn continuously. That is something that these single shot AI models are not capable of.

nextaccountic 1307 days ago

> There isn’t a fundamental limitation into building something like a truth classifier into a generative model so that it optimized for outputting “true” statements.

The problem is, they didn't do that.

b4je7d7wb 1308 days ago

Go is not solved.

The AI doesn't know the best move. It just knows a good move.

pessimizer 1308 days ago

You're equivocating on "solved." Solved as in performing as well as humans, not solved in the mathematical sense which is both 1) not necessarily possible, and 2) nothing anybody has ever named as a test for AI.

Animats 1308 days ago

No, that's correct. Checkers is solved; there is an algorithmic solution. Chess and Go have computer systems that exceed human performance, but are not solved.

halpmeh 1307 days ago

"Solved" means having a solution to a problem. In context, we're talking bout whether or not neural networks can detect truth better than "decades of work by experts to reach consensus." So, in this case, solving would be detecting truth better than the status quo, not detecting truth 100% of the time. In the example of Go, the problem was "playing Go better than the best humans." So in that sense, the problem was solved. Adding your own, unfavorable definition of "solved" to the discussion is unwarranted.

FeepingCreature 1308 days ago

And yet, Go AIs are now unbeatable by humans. This demonstrates that "solved" is unreasonable and unnecessary.

BurningFrog 1308 days ago

Cars are much faster than humans.

That doesn't mean transportation is solved.

seanhunter 1307 days ago

Solved in game theory has a very specific, strong definition. Transportation isn't a game in the game theoretic sense.

halpmeh 1307 days ago

“Hmm how should I get to work tomorrow? Normally I’d take the car, but after adopting a stance of distractive pedantism I realized that a car isn’t an acceptable solution to my transportation problem.”

Like please explain what definition of solved you are using. It’s not one most people would be familiar with.

pessimizer 1308 days ago

> That doesn't mean transportation is solved.

What are you even talking about?

edit: if we call a cab to get us to the restaurant, and it arrives successfully and takes us to the restaurant, transportation was solved.

mannykannot 1307 days ago

Note that we are not talking about neural networks in general, but specifically the sort of generative autoregressive language model that Galactica is. What reason do we have to think that such a model is more likely to produce a true statement than a false one, especially as just one misplaced truth-valued function or operator is likely to turn a true proposition into a false one? Truthfulness (not to be confused with truthiness) of their productions does not seem to be something we should expect from how they work, and the empirical evidence from Galactica supports this view.

throwaway9870 1308 days ago

Yes, I would agree with that.

espadrine 1308 days ago

> In true science, it is exceptionally hard to distinguish truth from falsehood

I understand the sentiment, but I don’t think they referenced subtle proofs.

The system is unable to prove some high-school theorems and computations, see for instance: https://twitter.com/espadrine/status/1592879720269766659

(I don’t think that makes the system necessarily bad; it does mean that it has a long way to go still.)

yummypaint 1308 days ago

Not being able to definitively identify truth is different from not attempting to identify it.

throwaway9870 1308 days ago

Attempting to identify truth is called the scientific method.

layer8 1308 days ago

The problem is that Galactica spits out obvious nonsense while being completely unaware of that. Okay, the real problem is that it also spits out nonobvious nonsense, where the human reader may also be unaware of it, along with Galactica. The only thing it does reasonably well is to generate text that sounds plausible in tone and form.

mytydev 1308 days ago

Science can't identify the truth. It can only identify what is NOT true. As our knowledge expands, we get closer to discovering the truth; but we can never be sure we've arrived.

FeepingCreature 1308 days ago

Science also cannot identify falsehoods; it can only shift confidence.

layer8 1308 days ago

There’s still an asymmetry in that a single counterexample can destroy a theory.

thedorkknight 1308 days ago

They give the example of it "thinking" that the Soviets sent bears to space. It takes only trivial research to see that this is based on nothing.

basch 1308 days ago

That was my example that somebody screenshotted and cropped. There was more to the goof, that the cropper missed. For some reason the author at MIT cited the tweeter and not my post.

It appears Galactica interpreted bear to be a type of dog. Laika was not a Karelian Bear Dog. I also think there are something like 8 species of bear, not 250.

It also, as far as I can tell, named the bear dog Bars itself. "Bars the dog" and "dogs named bars" don't google well. There is no way to tell Google I am looking for the proper noun and not drinking establishments.

I made the original query because it was easily verifiable as false. The correct output should have been "there is no publicly available documented history of bears in space."

https://news.ycombinator.com/item?id=33613676

findalex 1308 days ago

What does science have to do with truth? I thought it was a process of supporting hypotheses with observations?

CWuestefeld 1308 days ago

> I thought it was a process of supporting hypotheses with observations?

Then you're doing it wrong. Science done properly is a process of coming up with hypotheses, and then attempting to disprove them. If you're just jumping in trying to support your pet theory, you're very likely to wind up fooling yourself.

gunapologist99 1308 days ago

Exactly. Also why identifying "misinformation" is a fool's errand, since yesterday's misinformation is today's truth.

dmix 1307 days ago

> Also why identifying "misinformation" is a fool's errand

Seems easy enough: as long as the content is inoffensive and fits into the Overton Window then it's not misinformation.

gunapologist99 1305 days ago

Now if we could only identify some content that isn't offensive to someone..
