by dekhn 1269 days ago
I would like to see the section on "Common-yet-boring" arguments cleaned up a bit. There is a whole category of "researchers" who just spend their time criticizing LLMs with common-yet-boring arguments (Emily Bender is the best example) such as "they cost a lot to train" (uhhh have you seen how much enterprise spends on cloud for non-LLM stuff? Or seen the power consumption of an aluminum smelting plant? Or calculated the costs of all the airplanes flying around taking tourists to vacation?)

By improving this section I think we can have a standard go-to doc to refute the common-yet-boring arguments. Anticipating what they will say (and yes, Bender is very predictable... you could almost make a chatbot that predicts her) greatly weakens their argument.

4 comments

I don't understand why there is such a big group of people who see AI research like a team sport, where it's "us" vs "them" and "we" are the cheerleaders of our home team, and "they" are the haters and "we" must do everything to shout those bad guys down.

Criticism is essential for progress in science, and even in AI research (which is far from science). Get over it. The role of the critic is not to be your enemy; the role of the critic is to help you improve your work. It makes no difference whether or not the critic is a bad person who wants your downfall. What makes a difference is whether you can convincingly demonstrate that your critic's criticism no longer holds. That is when people stop listening to the critic, not when you shout louder than the critic.

Oh and, btw, you do that demonstrating by improving your work, which implies that you need to be one of the researchers whose work is criticised to do that, rather than some random cheerleader of the interwebs. What you propose here, to compose some sort of document to paste all over twitter every time someone says something critical of the "home team", that's not what researchers do; it's organising an internet mob. And it has exactly 0 chance of being of any use to anyone.

Not to mention the focus on Emily Bender is downright creepy.

> Or calculated the costs of all the airplanes flying around

This is the key comparison. A 747-400 burns 10+ metric tons of kerosene per hour, which means its basic energy consumption is > 110 MW. The energy used to train GPT-3 was approximately the same as the energy spent by one 8-hour airline flight.
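
For anyone who wants to check the arithmetic, here is a rough sketch. It assumes a specific energy of about 43 MJ/kg for jet kerosene (a typical textbook value, not a figure from this thread) and takes the burn rate as exactly 10 metric tons per hour; the function names are made up for illustration.

```python
# Back-of-the-envelope check of the 747 figure.
# Assumption: jet kerosene holds roughly 43 MJ per kg (typical value).
KEROSENE_MJ_PER_KG = 43.0


def average_power_mw(fuel_kg_per_hour, mj_per_kg=KEROSENE_MJ_PER_KG):
    """Chemical power of the fuel flow in MW (MJ per second)."""
    mj_per_hour = fuel_kg_per_hour * mj_per_kg
    return mj_per_hour / 3600.0  # 3600 seconds in an hour


def flight_energy_mwh(fuel_kg_per_hour, hours, mj_per_kg=KEROSENE_MJ_PER_KG):
    """Total fuel energy burned over a flight, in MWh."""
    return average_power_mw(fuel_kg_per_hour, mj_per_kg) * hours


if __name__ == "__main__":
    burn_rate = 10_000  # kg per hour, i.e. 10 metric tons
    print(f"power:  {average_power_mw(burn_rate):.1f} MW")
    print(f"energy: {flight_energy_mwh(burn_rate, 8):.0f} MWh over 8 hours")
```

Under those assumptions the fuel flow works out to a bit under 120 MW, so "> 110 MW" holds, and an 8-hour flight burns on the order of 1 GWh of fuel energy.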

Equivalently, the energy used to train GPT-3 was the same as the energy consumed by Bitcoin in just four minutes.

Loved that section as well. An addendum I'd include is that many of these arguments are boring as criticisms, but super interesting as research areas. AIs burn energy? Great, let's make efficient architectures. AIs embed bias? Let's get better at measuring and aligning bias. AIs don't cite sources? Most humans don't either, but it sure would make the AI more useful if it did...

(As a PS, I've seen that last one mainly as a refutation for the "LLMs are ready to kill search" meme. In that context it's a very valid objection.)

It looks pretty good as it stands, I think - to spend too much time on these arguments is to play their game.

Having said that, I would add a note about the whole category of ontological or "nothing but" arguments: saying that an LLM is nothing but a fancy database, search engine, autocomplete or whatever. There's an element of question-begging when these statements are prefaced with "they will never lead to machine understanding because...", and beyond that, the more LLMs are conflated with everyday technology, the more noteworthy their performance appears.