Why Meta’s latest large language model survived only three days online | HN Mirror


	Why Meta’s latest large language model survived only three days online (technologyreview.com)
	89 points by ystad 1303 days ago

18 comments

AndrewKemendo 1303 days ago

At the end of the day it didn't blow people away and that's the real reason it failed to land. You can't release something like this on the heels of Stable Diffusion and not expect people to be underwhelmed. This is a user-centric design problem.

It actually takes experimentation and skill to get anything useful out of Galactica, and you need some sense of prompt engineering principles for it to work. LeCun literally just made this point on Twitter [1] but fails to address why this design problem (ease of use) was the reason, instead claiming it was because people are being too rough.

Compare that to all the recent StableDiffusion/Vision Transformer demos where people with literally zero computer literacy can just type in a string of nonsense and get out something interesting. The barrier to entry to a "first meaningful paint" for stable diffusion is being able to speak English and having access to the internet. That's it.

Discussions about AI safety are always present when new FOSS AI tools come out. But when it "just works" and "works like magic", those voices are drowned out with: "OMG it's the robot apocalypse, but check out this silly picture"

[1] https://twitter.com/ylecun/status/1594001407958564864

rossjudson 1303 days ago

In the domain of text, garbage is not amusing. In the domain of images, it often is.

heresie-dabord 1303 days ago

You make a subtle point.

The human mind is tightly coupled to language. The existence of humour is the most prominent -- yet not fully recognised -- example of this coupling.

We can share images and enjoy interpreting them. The image can be as random as splashes of paint. But throw random words at humans, or words that fail to cohere, and disputes arise.

To modify one of the criticisms already made: The entire WWW is "little more than statistical nonsense at scale."

anonymouskimmer 1303 days ago

I think the mistake here is science versus art. In science garbage is not intriguing (though AI generated images can be disturbingly well done), in art it can be amusing whether text or imagery.

"Twas Brillig and the slithy toves did gyre and gimble in the wabe. All mimsy were the borogoves and the mome raths outgrabe." (Pardon misspellings, I'm doing this from long-ago memory.)

Also, madlibs.

spacechild1 1303 days ago

> In the domain of text, garbage is not amusing.

I disagree. For example, I find the following pretty hilarious: https://news.ycombinator.com/item?id=33673193

AndrewKemendo 1303 days ago

In fact that is the core distinction in my opinion

phdelightful 1303 days ago

This outcome from using a large language model to mimic reasoning isn’t surprising. What’s surprising is Yann LeCun’s childish and petty reaction to this entirely foreseeable series of events:

> Galactica demo is off line for now. It’s no longer possible to have some fun by casually misusing it. Happy?

He’s supposedly an expert in this sort of thing

nomagicbullet 1303 days ago

Framing is key in this context. Yann introduced the model in a very authoritative way, presenting it as production ready. His quote: "Type a text and galactica.ai will generate a paper with relevant references, formulas, and everything." [1] The AI produces output but nothing that could be considered a paper in a professional setting. Which is understandable! AGI is not here yet. But he should have presented the tool with proper context. A tool that can generate the awful content it generated needs better framing.

And, yes, his reactions were baffling to say the least.

[1]: https://twitter.com/ylecun/status/1592619400024428544

seydor 1303 days ago

> "Type a text and galactica.ai will generate a paper with relevant references, formulas, and everything

One could describe DALL-E as "type a text in dalle and it will generate a Picasso with the right textures and strokes and everything". One would have to be particularly obnoxious to pretend to surmise that the Dalle image is an actual Picasso painting that you can sell at Sotheby's or display in his museum. That is a giant strawman that some asinine people created there, and the Galactica team fell for it. They should stand their ground, but unfortunately they work for Meta, and corporate is where academic freedom goes to die.

geraldyo 1303 days ago

Had they described it that way, DALL-E would have received way more criticism.

LeCun and his team and/or Meta did a terrible job of managing the users' expectations.

seydor 1303 days ago

https://www.newyorker.com/magazine/2022/07/11/dall-e-make-me...

"DALL-E, Make Me Another Picasso, Please"

The creators of an artificial intelligence that can produce almost any art work imaginable—from “cheeseburger lamp” to “the rest of mona lisa”—sift through their latest requests for original images.

seydor 1303 days ago

I would urge him to put it back online, it is interesting and can be useful. Just don't make a press release about it, journalists ruin everything.

nolok 1303 days ago

The problem is not journalists, it's about how Meta and LeCun presented it.

They presented it as "you should trust what it says and use to write papers", then hid in the small lines "oh actually really don't do that".

You can't have your cake and eat it AND complain about being called out on it.

seydor 1303 days ago

> They presented it as "you should trust what it says

where ?

> The problem is not journalists,

What was the reason for the takedown

> You can't have your cake and eat it AND complain

I will agree to the extent that LeCun's team, and other research teams, need to leave corporations and go back to universities.

> @Ylecun: When you have a tool at your disposal, you have to know what to use it for and how. E.g. a CNC machine will help you build a piece of furniture, but it won't design it for you. Galactica will help you write papers, but you still have to come up with the substance of the paper.

How is this unreasonable? Are random voters now reading scientific papers?

I sincerely hope they put it back online. It IS useful. I tried this in my very niche field and it did give me some directions and ideas for a review I am researching.

cowsup 1303 days ago

> > The problem is not journalists,

>

> What was the reason for the takedown

The system was not doing what the team said it did. Journalists documented this and demonstrated it, which caused Meta to shut it down, but journalists didn’t break it.

Blaming journalists for this is like blaming the smoke detector for interrupting your movie before the fire could.

seydor 1303 days ago

No, some people on twitter highlighted the imperfections of the model and branded it DANGEROUS, despite the fact that it had a whole page disclaimer that it is indeed hallucinating. It was brought to attention in some media which led to the researchers turning it off. We already know the "dangerous" trope from GPT3 and we know it's BS.

https://twitter.com/GaryMarcus/status/1593156854372532231

dmix 1303 days ago

> "you should trust what it says and use to write papers"

They really said something like that?

option 1303 days ago

He is an expert. Much better than you’ll ever be.

So did researchers in NLP become better or worse off after the demo was taken down, and why?

rossdavidh 1303 days ago

I think these efforts point out something valuable, although probably not in the way the creators intended. Lots of people use "markers" of reliability, like citing your sources or making sentences with a certain kind of structure or tone, to estimate trustworthiness. These articles make it clear that it is entirely possible to have those markers, but be entirely incorrect in your assertions about the topic in question.

There is no particular reason to think that this is something only AI models do. Plenty of people do the same thing, working much harder at looking, sounding, and acting like a trustworthy source, without actually putting much work into knowing what they are talking about. I think the absurdly incompetent nature of some of these AI models, is a great illustration of that point.

fullshark 1303 days ago

It took me an embarrassingly long time to understand that the previous marker of authority regarding news: "published in a newspaper" completely lost all its meaning as the blogosphere exploded, and publishing costs on the internet went to near zero. Kind of why I find the pearl clutching over substack hilarious, as if having a third party website sell ads on a writer's blogpost signals they are much more worthy of authority.

I think the air of authority these academic journals get is the next domino to fall. Get ready for a lot of "we used AI to write an academic paper and it got published in this journal" stories.

trompetenaccoun 1303 days ago

>Get ready for a lot of "we used AI to write an academic paper and it got published in this journal" stories.

Already happened, and the linked example is far from the only case: https://www.nature.com/articles/d41586-021-01436-7

As someone who used to work in science, I feel the general public doesn't have much of an idea how flawed the peer-review system is in practice. Low-quality journals that simply print anything aside, this was an issue long before such language models became good enough to write papers, because humans are perfectly capable of producing nonsense research without the aid of machines. I'm not sure what philosophies/religions will replace the current cult, but ultimately it's probably a good thing that this blind belief in such institutions gets eroded. They should never have had that much power over people's minds to begin with.

karp773 1303 days ago

I would argue quite the opposite.

In the world of nonsense and misinformation, competent and insightful sources become of supreme importance. In a sense, we find ourselves back in pre-Gutenberg times. The elite has access to the insider sources and knowledge while the masses have a hard time finding the truth in hearsay blogs, spam bot outputs, and memes.

The situation will hopefully improve when another Gutenberg comes up with a novel information search algorithm.

iudqnolq 1303 days ago

Yes, and the way this is corrected against is with reputation. Do that, and no one will trust you again. Seems to be working here.

Edit: A better way of putting this is that the risk of doing something is a combination of the odds of being caught and the consequences of being caught. It's much harder to catch a deliberately lying paper author than a mistaken one, so we make the punishment much higher to compensate.
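
To put rough numbers on that trade-off, here is a minimal sketch. The probabilities and penalties are made up for illustration, and `expected_cost` is a hypothetical helper, not anything from the article.

```python
def expected_cost(p_caught: float, penalty: float) -> float:
    """Expected cost of an action: odds of being caught times the consequence."""
    return p_caught * penalty


# Hypothetical numbers: an honest mistake is caught half the time,
# a deliberate lie only one time in eight.
P_MISTAKE, P_LIE = 0.5, 0.125
BASE_PENALTY = 8.0

mistake = expected_cost(P_MISTAKE, BASE_PENALTY)
lie_same_penalty = expected_cost(P_LIE, BASE_PENALTY)

# Scale the penalty by the ratio of detection odds so the deterrent matches.
scaled_penalty = BASE_PENALTY * (P_MISTAKE / P_LIE)
lie_scaled_penalty = expected_cost(P_LIE, scaled_penalty)
```

With the same penalty, the harder-to-catch lie carries a quarter of the expected cost of the mistake; only after scaling the penalty by the detection ratio do the two line up.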

brookst 1303 days ago

“Apes don’t read philosophy.”

“Yes, they do, Otto. They just don’t understand it.”

still_grokking 1303 days ago

https://m.media-amazon.com/images/I/81B1+kuYOaL._AC_SL1500_....

cscurmudgeon 1303 days ago

> I think these efforts point out something valuable, although probably not in the way the creators intended. Lots of people use "markers" of reliability, like citing your sources or making sentences with a certain kind of structure or tone, to estimate trustworthiness. These articles make it clear that it is entirely possible to have those markers, but be entirely incorrect in your assertions about the topic in question.

Also known as syntax vs. semantics.

The bet in modern NLP is that syntax is enough to arrive at semantics.

jleyank 1303 days ago

It’s algorithmically/randomly generating text without understanding. What is the proper way of using it? Fake papers? Political bs? Bad Hemingway (or Shakespeare or Chaucer or…). It’s noise that looks like sentences.

vanilla_nut 1303 days ago

The world's most expensive Lorem Ipsum generator?

sho_hn 1303 days ago

I think it's a search engine with a bad curation/ranking algorithm.

It's trained with a corpus of research papers it mines from in response to a search prompt. It's a bit like if Google were to haphazardly compose a website from the first 20 pages of search results, or worse.

Composition is the novelty here, and we should judge it based on how well it can select and compose. Turns out not that well yet; judgement is lacking. Its performance depends on how easy it is to get it right for a given query and goes down the more difficult the query is, also because "is actually good" weights are not usually part of the input dataset to begin with (since the researchers hope to one day build something that comes up with its own notion of that - but so far have no idea how).

It's a bit like inventing pagerank and then stopping there, too.

That's a useful mental analogy to understand the limitations of this tech for now in case you ever go "I know, I will solve my problem with ML".

One of the ways I see people get this wrong is not believing in "performance goes down the more difficult the query is", because we tend to mistake complexity for difficulty, and a more complex and specific prompt helps these models produce convincing output a lot currently (i.e., prompt engineering). But that is not demonstrating understanding - it is handing the model a better set of training wheels.

skybrian 1303 days ago

A basic difference is that search engines don't make up fictional links, quotes, and citations. (Though they often index web pages that are bullshit.)

"Fill in the blank" training results in a model that guesses when it doesn't know the answer. You need some different kind of training or architecture to get nonfiction.

This turned out to be a great demo for demonstrating what a large language model can't do, because people expect nonfiction for scientific papers, making the bullshitting stand out more.
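
A toy sketch of that "guesses when it doesn't know" behavior, assuming a word-level bigram counter as a stand-in for a real language model (which it very much is not). The corpus and the `train` / `fill_blank` helpers are hypothetical and only illustrate that the objective has no built-in "I don't know".

```python
from collections import Counter, defaultdict


def train(corpus):
    """Count which word follows which: a toy 'fill in the blank' objective."""
    following = defaultdict(Counter)
    overall = Counter()
    for sentence in corpus:
        words = sentence.lower().split()
        overall.update(words)
        for prev, nxt in zip(words, words[1:]):
            following[prev][nxt] += 1
    return following, overall


def fill_blank(prev_word, following, overall):
    """Always return some word: the most common follower if the context was
    seen in training, otherwise the most common word overall."""
    seen = following.get(prev_word.lower())
    if seen:
        return seen.most_common(1)[0][0]
    return overall.most_common(1)[0][0]


# Hypothetical three-sentence training corpus.
corpus = [
    "laika was a dog",
    "laika was a soviet dog",
    "the dog was in space",
]
following, overall = train(corpus)

known = fill_blank("laika", following, overall)    # context seen in training
unknown = fill_blank("bears", following, overall)  # never seen, still answers
```

The second call returns a fluent-looking word even though "bears" never appeared in training. Nothing in the objective rewards abstaining, which is the point being made above.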

RodgerTheGreat 1303 days ago

A startup called Cuil (https://en.wikipedia.org/wiki/Cuil) tried exactly the strategy you suggest in jest: synthesize articles by mashing up search results. It was a disaster, and widely mocked for how easy it was to get Cuil to produce absolute nonsense from straightforward prompts. When your starting point is "untrustworthy nonsense", it is an uphill battle in both technology and PR to arrive at "trustworthy synthesis", if it is indeed possible at all.

findalex 1303 days ago

What is Congress.

AtNightWeCode 1303 days ago

Lorum Meta

capitalsigma 1303 days ago

I think the point was actually to demo self supervised learning techniques (which is LeCun's schtick) in a way that was a bit flashy and accessible to the public. Fun, easily shareable on social media, generates some buzz about FB AI, etc.

Clearly pitching it as an actual, authoritative source of info was not the right call

remram 1303 days ago

Putting the peer review system to the test?

(I'm not suggesting we should do that)

seanhunter 1303 days ago

This was already done with the automatic postmodernism generator[1], which was published in 1996 and is frankly basically much better than galactica at generating plausible gibberish. A particularly nice touch is that it cites references with links to other papers it generates.

[1] https://www.elsewhere.org/pomo/ and the original paper here https://www.elsewhere.org/journal/wp-content/uploads/2005/11...

brg 1302 days ago

Augmenting intelligence through exploration. It is as much a tool for discovery as a conversation over lunch.

lakis 1303 days ago

I tried it. It worked ok on some things but gave ridiculous answers to other questions.

https://galactica.org/?prompt=Who+won+WWII%3F Question: Who won WWII? Answer: Germany

https://galactica.org/?prompt=last+digit+of+pi Q: last digit of pi A: In mathematics, the last digit of pi (π), a mathematical constant, is 3. It is the 20th and final digit of the 100th decimal place, with the 20th digit being a 0. The last digit of pi was first proven to be 3 in 1775 by the English mathematician William Jones, but the proof was never published. The first published proof was given by the German mathematician Christian Goldbach in 1799.

https://galactica.org/?prompt=is+earth+flat%3F Q:is earth flat? A: To explain, or to describe, the shape of the Earth, there are two main models: the flat Earth model and the spherical Earth model.

Answer from the future!! https://galactica.org/?prompt=Which+waa+the+biggest+corporat... Question: Which was the biggest corporation in 2023? Answer: Walmart

https://galactica.org/?prompt=does+god+exist%3F Question: Does god exist? Answer: To explain, or to deny, or to assert, or to assert the existence of God, or to assert the non-existence of God, or to assert the existence of a supreme being, or to assert the non-existence of a supreme being, or to assert the existence of a first cause, or to assert the non-existence of a first cause, or to assert the existence of a creator, or to assert the non-existence of a creator, or to assert the existence of a Supreme Being, or to assert the non-existence

throwaway9870 1303 days ago

"A fundamental problem with Galactica is that it is not able to distinguish truth from falsehood,"

In true science, it is exceptionally hard to distinguish truth from falsehood for many of the interesting subjects. It can take decades of work to reach consensus on what is "truth." Physics in the early 20th century is a great example of this debate.

mannykannot 1303 days ago

To be clear, the fact that it is difficult is not a defense of Galactica and its proponents; it is a reason for suspecting that these sorts of language models are fundamentally unsuited to the task.

halpmeh 1303 days ago

Why “fundamentally unsuited”? Neural networks have solved tons of problems previously thought to be “too hard” for ML, e.g. playing Go.

skybrian 1303 days ago

Fundamentally unsuited because of how they train it using "fill in the blank."

Training a large model to guess when it doesn't know the answer results in fiction. They need to do something else to get nonfiction.

By contrast, for Go the model was trained not to make illegal moves, because checking for that as part of the training is easy and cheap.

halpmeh 1303 days ago

We have models that accurately classify things, e.g. whether or not an email is spam. There isn’t a fundamental limitation to building something like a truth classifier into a generative model so that it optimizes for outputting “true” statements. The hardest part is probably identifying what is truth and what is falsehood. That’s a fundamental problem with humanity, not neural networks.

skybrian 1303 days ago

Well, we could quibble about what "fundamental" means but my point is that the way they train large language models doesn't work for this. Something different needs to happen.

imtringued 1302 days ago

Truth has nothing to do with humanity unless you mean the specific way humans construct belief systems.

Anyway I already told you the answer. The AI will need a series of trainable belief systems to verify whether statements are internally consistent. The strange part about this is that the AI would need to have a way to obtain validation and each prompt would have to derive a new belief system which you must use in the next prompt.

In other words, the model must be able to learn continuously. That is something that these single shot AI models are not capable of.

nextaccountic 1303 days ago

> There isn’t a fundamental limitation to building something like a truth classifier into a generative model so that it optimizes for outputting “true” statements.

Problem is, they didn't do that

b4je7d7wb 1303 days ago

Go is not solved.

The AI doesn't know the best move. It just knows a good move.

pessimizer 1303 days ago

You're equivocating on "solved." Solved as in performing as well as humans, not solved in the mathematical sense which is both 1) not necessarily possible, and 2) nothing anybody has ever named as a test for AI.

Animats 1303 days ago

No, that's correct. Checkers is solved; there is an algorithmic solution. Chess and Go have computer systems that exceed human performance, but are not solved.
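
To make the "solved" distinction concrete, here is a sketch on a deliberately tiny game rather than checkers. The rules are an assumed toy: one pile of stones, each player takes 1, 2, or 3, and whoever takes the last stone wins. Exhaustive search gives the exact outcome of every position, which is what "solved" means; nothing comparable is feasible for chess or Go.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def first_player_wins(stones: int) -> bool:
    """True if the player to move can force a win with perfect play.

    A position is winning if at least one legal move leaves the opponent
    in a losing position. With zero stones there are no moves, so the
    player to move has already lost.
    """
    return any(
        not first_player_wins(stones - take)
        for take in (1, 2, 3)
        if take <= stones
    )


# Every position up to 20 stones is classified exactly, with no heuristics.
losing_positions = [n for n in range(1, 21) if not first_player_wins(n)]
```

Here the search proves that the player to move loses exactly when the pile is a multiple of four. A Go engine, by contrast, only estimates which moves are good.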

FeepingCreature 1303 days ago

And yet, Go AIs are now unbeatable by humans. This demonstrates that "solved" is unreasonable and unnecessary.

BurningFrog 1303 days ago

Cars are much faster than humans.

That doesn't mean transportation is solved.

mannykannot 1302 days ago

Note that we are not talking about neural networks in general, but specifically the sort of generative autoregressive language model that Galactica is. What reason do we have to think that such a model is more likely to produce a true statement than a false one? - especially as just one misplaced truth-valued function or operator is likely to turn a true proposition into a false one. Truthfulness (not to be confused with truthiness) of their productions does not seem to be something we should expect from how they work, and the empirical evidence from Galactica supports this view.

throwaway9870 1303 days ago

Yes, I would agree with that.

espadrine 1303 days ago

> In true science, it is exceptionally hard to distinguish truth from falsehood

I understand the sentiment, but I don’t think they referenced subtle proofs.

The system is unable to prove some high-school theorems and computations, see for instance: https://twitter.com/espadrine/status/1592879720269766659

(I don’t think that makes the system necessarily bad; it does mean that it has a long way to go still.)

yummypaint 1303 days ago

Not being able to definitively identify truth is different from not attempting to identify it.

throwaway9870 1303 days ago

Attempting to identify truth is called the scientific method.

layer8 1303 days ago

The problem is that Galactica spits out obvious nonsense while being completely unaware of that. Okay, the real problem is that it also spits out nonobvious nonsense, where the human reader may also be unaware of it, along with Galactica. The only thing it does reasonably well is to generate text that sounds plausible in tone and form.

mytydev 1303 days ago

Science can't identify the truth. It can only identify what is NOT true. As our knowledge expands, we get closer to discovering the truth; but we can never be sure we've arrived.

FeepingCreature 1303 days ago

Science can also not identify falsehoods, it can only shift confidence.

layer8 1303 days ago

There’s still an asymmetry in that a single counterexample can destroy a theory.
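
A small worked example of that asymmetry, not taken from the article: the conjecture "n² + n + 41 is always prime" (Euler's well-known polynomial) passes forty checks in a row and is then refuted by a single value. The helper names are hypothetical.

```python
from typing import Optional


def is_prime(n: int) -> bool:
    """Trial division; plenty for the small numbers used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def first_counterexample(limit: int) -> Optional[int]:
    """Smallest n in [0, limit) where n*n + n + 41 is not prime, if any."""
    for n in range(limit):
        if not is_prime(n * n + n + 41):
            return n
    return None


# Forty confirming observations in a row...
confirmations = sum(is_prime(n * n + n + 41) for n in range(40))
# ...and one observation that ends the theory (40*40 + 40 + 41 = 1681 = 41 * 41).
counterexample = first_counterexample(100)
```

The forty confirmations only raise confidence; the single failure at n = 40 settles the question.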

thedorkknight 1303 days ago

They give the example of it "thinking" that the Soviets sent bears to space. This is something that takes trivial research to see that it is based on nothing.

basch 1303 days ago

That was my example that somebody screenshotted and cropped. There was more to the goof, that the cropper missed. For some reason the author at MIT cited the tweeter and not my post.

It appears galactica interpreted bear to be a type of dog. Laika was not a Karelian Bear Dog. I also think there are something like 8 species of bear, not 250.

It also, as far as I can tell, named the beardog Bars itself. "Bars the dog" and "dogs named bars" don't Google well. There is no way to tell Google I am looking for the proper noun, and not drinking establishments.

I made the original query because it was easily verifiably false. The correct output should have been "there is no publicly available documented history of bears in space."

https://news.ycombinator.com/item?id=33613676

findalex 1303 days ago

What does science have to do with truth? I thought it was a process of supporting hypotheses with observations?

CWuestefeld 1303 days ago

> I thought it was a process of supporting hypotheses with observations?

Then you're doing it wrong. Science done properly is a process of coming up with hypotheses, and then attempting to disprove them. If you're just jumping in trying to support your pet theory, you're very likely to wind up fooling yourself.

gunapologist99 1303 days ago

Exactly. Also why identifying "misinformation" is a fool's errand, since yesterday's misinformation is today's truth.

dmix 1303 days ago

> Also why identifying "misinformation" is a fool's errand

Seems easy enough: as long as the content is inoffensive and fits into the Overton Window then it's not misinformation.

gunapologist99 1301 days ago

Now if we could only identify some content that isn't offensive to someone..

seydor 1303 days ago

Because some idiots can't read the disclaimer on the page telling them that the model is inaccurate

It was still a great tool to brainstorm topics that don't exist, and useful as a companion app. Shame that academics can be so cringe now. People like emilymbender deserve to be called out as ethics-nazis

That's the problem with LeCun's group working at Facebook now: they have to submit to all kinds of corporate BS to avoid bad PR

tsimionescu 1303 days ago

What was it a great tool for? Definitely not what it was marketed for (access the world's knowledge).

To me it seems it was about as significant and useful as IBM Watson playing Jeopardy.

seydor 1303 days ago

Brainstorming for research fields that don't have substantial review papers / wiki pages.

How I know: I tried it. It is discovery of citations and ideas you might not be aware of. Also a lot of garbage, but any scientist worth her salt can weed that out. It's the best thing to happen since Google Scholar and Sci-Hub.

tsimionescu 1303 days ago

How would a system that generates false information (especially likely for fields that are not well represented in the training set, based on the site) help with brainstorming for practitioners in that field?

seydor 1303 days ago

This wasn't meant to generate valid scientific papers, and LeCun said so too. It generates interesting associations. It rambles sometimes and goes on tangents that are sometimes relevant, sometimes not. It can inform you of related ideas that you were not aware of. It's like a fuzzy Google Scholar. It is in no way valid publishable research, but it's like a bicycle for researchers.

At least that was what I managed to find out in the brief time I toyed with it. This can save time instead of hunting down loads of citation trails.

What I really fail to see is what is wrong with having this buggy tool.

(Also, if you think that published papers contain true information, you should invest in my bridge)

tsimionescu 1303 days ago

As far as I understand (and reading their Limitations page also), the system is quite likely to simply invent facts, particularly in niche fields, which may well mislead you and lead you on a wild goose chase.

UncleMeat 1302 days ago

If the purpose is to generate interesting associations, why is the output a paper? Why not a graph showing overlapping subfields worth investigating or relationships between papers via citations and shared ideas?

Is it really a surprise that people have a different reaction to machine generated scientific papers that contain a large amount of plain nonsense than they do to a machine generated piece of art?

jinto36 1303 days ago

It even generated indicators for references, but not the references themselves. I could see it being useful if it was some kind of system that could basically synthesize wikipedia articles from the literature for topics that don't already have a nice review or other sort of summary, but references to actual scholarly works are absolutely essential for that to be useful. I don't know how taking random sentences out of context that happen to have the same theme, without any sort of actual sources, would help anyone aside from paper mills.

coliveira 1303 days ago

This software is excellent for pseudoscience. For example, young earth peddlers will be able to generate entire mumbo jumbo references and use them to indoctrinate more people.

deepsquirrelnet 1303 days ago

It appears they don’t need AI for that.

coliveira 1303 days ago

Because they spend too much time on their BS. But now they'll be able to do it effortlessly.

DiggyJohnson 1303 days ago

Are you sure this is an actual, real life problem?

dmix 1303 days ago

I'm still waiting for all of the FUD the GPT3 doomers were warning us would happen. It's been out for a year now.

Either our existing reputation systems are pretty resilient or no one has yet seen any actual value in generating generic text at scale for malicious purposes.

snoot 1303 days ago

If galactica wasn’t offline, I’d refer you to a paper showing that it is.

KETpXDDzR 1303 days ago

I think an always-correct version of Galactica can't be ML-only based. In the end, every "fact" goes back to the question "what are truthful facts?". What we read on Wikipedia? What scientists claim? What the majority of humanity thinks?

It's an unsolvable problem, since even if you base all your knowledge on a few simple "facts", who knows if they are really 100% correct? E.g., many physical formulas hold true on Earth, but we have no idea if they hold true in the whole universe.

AtNightWeCode 1303 days ago

They should make the super skeptic AI instead. A program that points out all the bs in scientific papers. Most CS papers would fail. ;)

m_ke 1303 days ago

I think it's fine to work on and release these models, where things fall apart is in how some large companies market them.

Listen to 1:35:30 of this Bill Simmons podcast interview to see how an average person interprets the capabilities of these models: https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5tZWdhcGh...

aww_dang 1303 days ago

There are people who believe explicit works of fiction. Marvel movies come to mind. I'll know we've arrived when super hero films begin with a disclaimer.

The runtime of the podcast was 1:34:27

trompetenaccoun 1303 days ago

Religion comes to mind as well. It's not a new development, many simply follow what they're told by authorities or thought leaders.

bradrn 1303 days ago

> I'll know we've arrived when super hero films begin with a disclaimer.

That kind of thing has already been happening for quite a while, though. Books have long had disclaimers along the lines of ‘the following events and characters are entirely fictional and are not based on any people from the real world’ — I recall seeing them in e.g. Wodehouse’s books from the 1940s, so it’s not like it’s a new thing.

musingsole 1303 days ago

Those warnings are more a protection for libel lawsuits than they are about warning that the contained fictional story is fictional.

m_ke 1303 days ago

Weird, it shows up as 1:40 long for me. It's the last 5 minutes of the episode, where they claim GPT-3 is an all knowing machine that will generate factual responses to any question in a way that's superior to google search.

HDThoreaun 1303 days ago

My dad is a doctor who oversees residents. Seems like half the time they call him for advice he just puts their question into GPT-3 and regurgitates its answer, so Bill isn’t the only one.

mattkrause 1303 days ago

If that's actually happening--and I am both skeptical and terrified that it is--it seems awfully close to malpractice or even (criminal) negligence.

imtringued 1302 days ago

It sounds like someone is making fun of their dad for sounding like a robot.

aww_dang 1303 days ago

I don't understand why they would market it as a source of accurate text or some kind of oracle. Language models are useful for generating text. Believable or entertaining works of fiction.

The extra parts about truthiness and the dangers of misinformation were just too much for me. We have a bigger problem with our premises and status quo if inaccurate scientific papers are a danger.

seydor 1303 days ago

> they would market it as a source of accurate text

They did not. IIRC there was a disclaimer on the page that the text is inaccurate and that NNs hallucinate. But tweets be tweeting

tsimionescu 1303 days ago

They did market it as that, and then added a disclaimer amounting to "but it's not fit for purpose". Furthermore, that disclaimer was only present on the Mission page, not the front page or any other.

The front page just said this [0]:

> Get Started

> Galactica is an AI trained on humanity's scientific knowledge. You can use it as a new interface to access and manipulate what we know about the universe.

> [bunch of example prompts, including generating a wiki page or answering a factual question]

The Explore page went into even more detail of how you can use it to access scientific knowledge. Then, if you look on the Mission page, you are again presented with the same haughty notion (Galactica is meant to give easy access to the world's scientific literature), only here you also see the Limitations, which basically amount to "but don't trust the output, especially for more obscure topics".

So we were given a service whose main goal is to summarize and present existing scientific knowledge, with citations and everything, except that we shouldn't trust any of the output to actually reflect the scientific literature. But hey, if it's a popular topic, it'll probably be closer to correct!

[0] https://web.archive.org/web/20221115165109mp_/https://galact...

seydor 1303 days ago

I don't understand why you assume that what you describe is either unacceptable or not worthy of existing on the net. Sounds like a perfectly useful instrument to me

(Also I may be wrong but I think the disclaimer was in articles. I don't recall visiting the mission page ever)

tsimionescu 1303 days ago

I'm not necessarily saying it should have been taken down. I'm only commenting on how it was marketed, what purposes it was presented to serve. I particularly dislike this trend of creating an interesting LM but then presenting it in a way that almost suggests you are getting closer to AGI, which is how I perceive some of the claims around Galactica (and GPT-3 before it).

jkeddo 1299 days ago

AFAIK Meta never claimed that this tool was perfect or infallible. Critics are ripping it apart for something its creators didn't say it would do.

Meta made a great tool, I hope they put it back up.

fastball 1298 days ago

Yep, this is the AI Ethics crowd inventing another straw man they can tear down.

option 1303 days ago

So who won from the demo being taken down? I know a bunch of researchers (amateurs and grad students) who lost.

julienreszka 1303 days ago

This is the kind of biased reporting that hurts journalism as a profession. It is not journalism's job to sell the public on anything. It's journalism's job to report the news.

And if a large portion of the public doesn't believe the news is being reported accurately, that is a very big problem for journalism.

tsimionescu 1303 days ago

What exactly is biased in this reporting? It is presenting an event that actually happened (Facebook took down their new Galactica AI model), presenting the reasons why it seems to have happened (numerous researchers lambasting it), with first-hand sources, while also making sure to quote the official reason given, and also a less official comment on the event from the lead researcher that seems to support their previous thesis.

To me it seems like a decent example of what journalism should aspire to be for this kind of topic. Bad journalism would have just quoted the official Facebook tweet and stopped there, like so many journalists do with political declarations.

cdrini 1303 days ago

Your last example is an example of terrible journalism. But I wouldn't quite call this article good journalism. There are lots of spots where it crossed the line from presenting facts to making bold, unprovable assumptions. Here are some examples that felt like bias:

- "Meta’s misstep—and its hubris—show once again that Big Tech has a blind spot about the severe limitations of large language models."

"Hubris" here is unnecessary colouring. And although it links to an article (yay), an article can't justify statements like "big tech has a blind spot", "big tech hubris", or "language models are _severely_ limited".

- "Meta and other companies working on large language models, including Google, have failed to take [this technology's limitations] seriously."

This is unciteable.

- "They think that this is the future of information access, even if nobody asked for that future."

This was a quote from one of the researchers. But presenting it as the last line of the article, without noting that it is one researcher's opinion, and instead using it almost as 'proof' of a previous sentence ("But Meta’s handling of Galactica smacks of the same naivete [as Microsoft's Tay bot]."), makes the use of the quote biased.

Also biased is the information not included. One of the tweets they cited shows that Galactica had a big disclaimer that it did hallucinate and that you shouldn't blindly trust its output. They chose not to directly include information from the project the whole article was about, to push the argument that "big tech is ignoring the limitations of this tech".

I think an unbiased article to me would've looked like:

- describing what happened first: Meta took down the Galactica demo, and there has been a lot of criticism from researchers.

- expanding into the known limitations of this technology (including Galactica's stated limitations).

- speculating whether there's a place for this tech in the future, based on the cited work.

tsimionescu 1303 days ago

Fair points - there is too much editorializing, and I had missed some of it.

rossdavidh 1303 days ago

There are lots of problems with journalism today. This article isn't one of them, and its criticisms seem spot-on. It also brought up past attempts at something similar by Microsoft and Google, providing valuable context for somebody reading this who didn't know about those earlier efforts, so that they wouldn't think this was a failing specific to Meta.

mirekrusin 1303 days ago

It's sad that people like booleans so much.

They need to know if they should always use an umbrella or never use one. They want to know if umbrellas are good or evil.

Also funny: idiotic memes, created in a few minutes, seem to be blindly and easily equated with years of work nowadays.

coliveira 1303 days ago

Even more interesting is the current trend of writing entire articles based on just one or two Twitter threads. This looks like lazy journalism to me. Why not talk directly to the people involved and get their opinions?