by jessfyi 972 days ago
It's not fine when this "magic" is being advertised as on-device.

After reading both the RealFill[0] and Break-A-Scene[1] papers, published by Google researchers just prior to the Pixel 8 launch (and attempting to quickly implement the model ensembles within), I was expecting a leap in their Tensor G3 akin to either the 2013 Moto X's NLP and contextual awareness cores[2] (which provided better implementations of Active Display, gesture recognition, and voice recognition in loud environs than 95% of current mobile devices) or the Coral[3], the edge TPU they developed that got shockingly good inference performance (though hardware production was handed off to ASUS in 2022, thanks to the chip shortage, the generally arbitrary nature of the company, and their wholesale divestment from IoT). I expected more.

All that to say this: your assumptions of inference performance on >$1000 hardware are fundamentally flawed (the fact that you reach for the buzzy "generative" prefix suggests they're erroneously informed by Twitter influencers and by attempts to deploy current LLMs).

Custom hardware can and has been developed in the past (on mobile devices) that could've been tailored to the task at hand. If they failed to meet performance, power draw, or processing time requirements, they should've reframed their pitch instead of exposing themselves to what is likely going to be yet another class action suit focusing on their hardware.

[0] https://realfill.github.io/

[1] https://omriavrahami.com/break-a-scene/static/paper/Break-A-...

[2] https://en.wikipedia.org/wiki/Moto_X_(1st_generation)#Hardwa...

[3] https://coral.ai/

2 comments

> It's not fine when this "magic" is being advertised as on-device.

I can't help but notice that you included a lot of references, but none for this claim.

Neither of the features mentioned in the article is claimed to be on-device in the official Pixel 8 Pro announcement blog[0]. The only feature that the blog post claims is on-device is the Best Take feature, which the article does not say requires an internet connection.

But of course that's just one bit of marketing material, and I'm sure you've seen these features advertised as happening on-device. Maybe you could post a link?

[0] https://blog.google/products/pixel/google-pixel-8-pro/

> Custom hardware can and has been developed in the past (on mobile devices) that could've been tailored to the task at hand.

You think Google doesn't know that? Are you aware of what's inside Google's phones?

I'm not sure what performance benefit you expect out of custom hardware. How many orders of magnitude? You're probably going to need at least a few, likely more, to make generative AI work well in the palm of your hand.

Oh, and if you've figured that out, Apple, Google, OpenAI, and other AI companies would like a word.

I'm aware that what's in Google's phones isn't capable of doing the on-device ML inference they claim. You might want to actually read what both I and the article are addressing in particular, beyond the broad "generative AI" umbrella that you and other philistines new to the field imagine can't be performed on device.

> But of course, on-device generative AI is really complex: 150 times more complex than the most complex model on Pixel 7 just a year ago. Tensor G3 is up for the task, with its efficient architecture co-designed with Google Research.

This is a direct quote from an official press release [0]. They claimed Tensor G3 is "up for the task" that is "on-device generative AI".

I'd say "if you can't do it, simply don't promise it", but the fact is, this is the third time Tensor has been outright incapable of what was promised. People pointing that out are more than justified.

Prior to the launch of the Pixel 6, with their first generation of Tensor SOC, they made big promises concerning HDR video performance [1], implying heavily or outright stating (depending on how generous you want to be) that they'd finally manage to be on par with Apple. They weren't, by a lot. Pixel 6 video performance was neither on par with Apple nor did it exceed the Pixel 5 on an upper-mid SD765G. Still, first-gen hiccups and a bit of overhyping happen to the best of us.

During the Pixel 7 launch [2], they claimed Tensor G2 enabled users to finally get computational photography for high-quality videos. Spoiler alert: It didn't. Fool me once...

Now, on the Pixel 8 with their third generation of Tensor, they finally have a solution that gets their nighttime video processing results competitive with the current iPhone in the form of Video Boost. Instead of doing that processing on their amazing Tensor SOC though, they offload it to the cloud [3]. At least they didn't promise on-device processing improvements to video with the G3, only a ton of GenAI capabilities...

I have followed Tensor extensively, and I am happy to see that they are at least utilizing their control over the silicon to provide a longer update cycle. But few of their local processing promises have held water, and even fewer appear to be impossible on contemporary SOCs from competitors such as Qualcomm (who are by no means angels and need all the competition the market can provide).

If the Pixel team were more honest about their SOC's capabilities and proactively transparent about what they run locally vs. offload to datacenters, that'd be appreciated. With Video Boost they did just that, though I fear that was mainly because of the upload times...

[0] https://blog.google/products/pixel/google-tensor-g3-pixel-8/

[1] https://9to5google.com/2021/08/02/google-pixel-6-video-hdr-t...

[2] https://www.youtube.com/live/2NGjNQVbydc?si=2Gg1mPrdOkmu1L44...

[3] https://blog.google/products/pixel/google-pixel-8-pro/