| HN Mirror



	by dekhn 485 days ago
	So, I've been reading Google research papers for decades now; I also worked there for a decade and wrote a few papers of my own. When Google publishes papers, they tend to juice the significance of the results (Google is not the only group that does this, but they are pretty egregious). You need to be skilled in the field of the paper to be able to pare away the exceptional claims. A really good example is https://spectrum.ieee.org/chip-design-controversy. While I think Google did some interesting work there, and it's true they included some of the results in their chip designs, their comparison claims are definitely over-hyped, and they did not react well when they got called out on it.

5 comments

warbaker 485 days ago

The article you linked is not an example of this happening. Google open-sourced the chip design method, and uses it in production for TPU and other chips.

https://github.com/google-research/circuit_training

https://deepmind.google/discover/blog/how-alphachip-transfor...

dekhn 485 days ago

It's an ongoing debacle with multiple people making extremely good arguments that Google overstated the results.

Yes, I know it's in TPUs and I said exactly that.

You simply can't take Google press at face value.

warbaker 485 days ago

Google has responded to this "controversy" already:

https://x.com/JeffDean/status/1858540085794451906

https://arxiv.org/abs/2411.10053

dekhn 485 days ago

Yes, I am aware. I didn't find Jeff's argument particularly convincing. Please note: I've worked personally with Jeff before and shared many a coffee with him. He's done great work and messed up a lot of things, too.

confused_boner 485 days ago

From your perspective, which arguments were not convincing? Are you able to share why not?

Der_Einzige 484 days ago

Unironically "just trust me bro" is actually fine here. They're objectively right and you'll find they are when you do your painstaking analysis to figure it out.

mupuff1234 485 days ago

> You simply can't take Google press at face value.

I think that's true for virtually every company and also for most people (in the context of published work)

mmooss 484 days ago

Do you think that of most published scientific research?

tho3i4j3242324 484 days ago

Seems to be true. 'Published' scientific research, by its sheer social-dynamics (verging on highly toxic), is the academic equivalent of a pouty-girl vis-a-vis Instagram.

(academic-burnout resembles creator-burnout for similar reasons)

tsumnia 485 days ago

Remember, Google is a publicly traded company, so everything must be reviewed to "ensure shareholder value". Like dekhn said, it's impressive, but marketing wants more than "impressive".

dekhn 485 days ago

This is true for public universities and private universities; you see the same thing happening in academic papers (and especially the university PR around the paper)

hall0ween 485 days ago

I would say anecdotal. This hasn't been my case across four universities and ten years.

BeetleB 485 days ago

The actual papers don't overhype. But the university PR regarding those papers? It can really overhype the results. And of course, the media then takes it up an extra order of magnitude.

dekhn 485 days ago

I've definitely seen many examples of papers where the conclusions went far beyond what the actual results warranted. Scientists are incentivized to claim their discovery generalizes as much as possible.

But yes, it's normally: "science paper says an experiment in mice shows promising results in cancer treatment" then "University PR says a new treatment for cancer is around the corner" and "Media says cure for all cancer"

nyrikki 485 days ago

Depends on what you call "overhype".

Wishful mnemonics in the field were called out by Drew McDermott in the mid-1970s, and they are still a problem today.

https://www.inf.ed.ac.uk/teaching/courses/irm/mcdermott.pdf

And:

> As a field, I believe that we tend to suffer from what might be called serial silver bulletism, defined as follows: the tendency to believe in a silver bullet for AI, coupled with the belief that previous beliefs about silver bullets were hopelessly naive.

(H. J. Levesque. On our best behaviour. Artificial Intelligence, 212:27–35, 2014.)

hall0ween 485 days ago

Fair point!

killjoywashere 481 days ago

I have worked with Google teams as well, and they taught me a fair bit about how to be rigorously skeptical. It takes domain knowledge, statistical knowledge, data, time and the computational resources to challenge them. I've done it, but it took real resources.

That said, it's a useful exercise to figure out the plan of attack. My experience is the "juice" was mainly in "easy true negative" subclasses. They weren't oversampled, but the human brain wouldn't even consider most of that data. Once you ablate those subclasses from the dataset, (which takes a lot of additional labelling effort), you can start challenging their assertions. But it's hard.

And that said I also review a number of articles in that domain, and I haven't seen a group with stronger datasets overall.
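
A minimal sketch of the ablation idea described above, not the commenter's actual data or method. The `Example` record, the `subclass` tag, and every count below are made up for illustration; the only point is that a large "easy true negative" subclass can carry a headline metric, and that removing it exposes how the model does on the cases that matter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    label: int     # ground truth: 1 = positive, 0 = negative
    pred: int      # model prediction
    subclass: str  # hypothetical tag assigned during extra labelling


def accuracy(examples):
    """Fraction of examples where the prediction matches the label."""
    return sum(e.label == e.pred for e in examples) / len(examples)


def specificity(examples):
    """True negative rate: correct negatives over all actual negatives."""
    negatives = [e for e in examples if e.label == 0]
    return sum(e.pred == 0 for e in negatives) / len(negatives)


def ablate(examples, subclass):
    """Drop every example that belongs to the given subclass."""
    return [e for e in examples if e.subclass != subclass]


# Toy dataset (invented numbers): 90 trivially easy negatives, all correct,
# plus 10 hard cases on which the model is much weaker.
data = (
    [Example(0, 0, "easy_negative")] * 90
    + [Example(1, 1, "hard")] * 4   # hard positives, correct
    + [Example(1, 0, "hard")] * 1   # hard positive, missed
    + [Example(0, 0, "hard")] * 2   # hard negatives, correct
    + [Example(0, 1, "hard")] * 3   # hard negatives, false alarms
)

hard_only = ablate(data, "easy_negative")

print(f"all data:  accuracy={accuracy(data):.2f} specificity={specificity(data):.2f}")
print(f"hard only: accuracy={accuracy(hard_only):.2f} specificity={specificity(hard_only):.2f}")
```

On this toy data, accuracy drops from 0.96 to 0.60 once the easy negatives are removed. The expensive part in practice is the `subclass` labelling, which the sketch simply assumes is already done.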

ein0p 485 days ago

That applies to absolutely everyone. Convenient results are highlighted; inconvenient ones are either not mentioned or de-emphasized. You do have to be well read in the field to see what the authors _aren't_ saying; that's one of the purposes of being well read in the first place. That is also why 100% of science reporting is basically disinformation: journalists are not equipped with this level of nuanced understanding.

dekhn 485 days ago

Yes, but Google has a long history of being egregious, with the additional detail that their work is often irreproducible for technical reasons (rather than because of missing methods). For example, we published an excellent paper, but nobody could reproduce it because, at the time, nobody else had a million spare cores to run MD simulations of proteins.

ein0p 485 days ago

It's hardly Google's problem that nobody else has a million cores, wouldn't you agree? Should they not publish the result at all if it's using more than a handful of cores so that anyone in academia can reproduce it? That'd be rather limiting.

dekhn 485 days ago

Well, a goal of most science is to be reproducible, and it couldn't be reproduced, merely for technical reasons (and so we shared as much data from the runs as possible so people could verify our results). This sort of thing comes up when CERN is the only place that can run an experiment and nobody can verify it.

ein0p 485 days ago

It is probably reproducible, if you have the requisite million cores. That isn't even difficult today - a million cores is about 100 GPUs.

Der_Einzige 484 days ago

Actually, it IS Google's problem. They don't publish through traditional academic venues unless it suits them (much like OpenAI/Anthropic, often snubbing places like NeurIPS because they don't want to MIT open source their code/models, which peer reviewers demand), and their demand for so many GPUs chokes supply for the rest of the field, a field whose free labor they rely on to make complementary technologies for their models.

ein0p 484 days ago

It doesn't choke anything. Anyone can go to GCP or other cloud providers and get as many GPUs as they need, within reason.

ClumsyPilot 485 days ago

> Google's problem that nobody else has a million cores, wouldn't you agree

On the contrary, it's their advantage. They know it, and they can make outlandish claims that no one will disprove.

webmaven 485 days ago

For a while, anyway.

mmooss 484 days ago

> That applies to absolutely everyone.

Eating, drinking, sleeping apply to absolutely everyone. Deception varies greatly by person and situation. I know people who are painfully honest and people I don't trust on anything, and many in between.

Bloating 485 days ago

https://www.youtube.com/watch?v=shFUDPqVmTg
