by summerlight 1292 days ago
This is so true. Some folks in Ads also explored using large language models (one example: an LLM would be the ultimate solution for contextual targeting if done properly), but one of the major bottlenecks is always cost and latency. Even if you can afford the CPU/GPU/TPU costs, you always have to work within a finite latency budget. A large language model often adds latency on the order of seconds, not milliseconds! This is simply not acceptable.

I think Pathways is one approach to tackling this issue at scale: making the network sparsely activated so that the computation cost can be bounded based on the difficulty of each query. This effectively gives Google a knob for trading computational cost against result quality by limiting how much of the network is activated. If it turns out to work well, we might see it incorporated into Search in the foreseeable future.
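
To make the "knob" idea concrete, here is a rough sketch of sparse activation using generic top-k expert routing. This is not how Pathways is actually built (the comment above does not describe its internals); `make_expert`, `sparse_forward`, and the gate layout are hypothetical names for illustration only. The point is just that `k` bounds how many sub-networks run per query.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(seed, dim):
    """Hypothetical expert: a fixed random linear map from dim inputs to one output."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1) for _ in range(dim)]
    return lambda x: sum(w * xi for w, xi in zip(weights, x))


def sparse_forward(x, gate_weights, experts, k):
    """Evaluate only the k experts with the highest gate scores.

    Returns (output, number_of_experts_evaluated). Assumed design:
    one gate row per expert, scores are dot products with the input.
    """
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    # Pick the indices of the k best-scoring experts; the rest are never run.
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    # Renormalize the gate over the selected experts only.
    probs = softmax([scores[i] for i in top])
    output = sum(p * experts[i](x) for p, i in zip(probs, top))
    return output, len(top)


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n_experts = 4, 8
    gate = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_experts)]
    experts = [make_expert(i, dim) for i in range(n_experts)]
    query = [0.5, -1.0, 2.0, 0.1]
    for k in (1, 2, 8):
        out, used = sparse_forward(query, gate, experts, k)
        print(f"k={k}: evaluated {used} of {n_experts} experts, output={out:.4f}")
```

In this toy version, expert compute grows roughly linearly with `k`, so a serving system could in principle pick a small `k` for easy queries and a larger one for hard queries, staying inside a latency budget at some cost in quality.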


That's the thing though, Google doesn't have to release this with Search or in Chrome. It could be a separate product that they can gate access to (charging say $5/mo for 'x' queries a day)? Or, API the model behind GCP? But: Outside of DeepMind, there's nothing comparable from them (in terms of utility AI).
This is a problem for Google, and for almost every other big tech company. Their infrastructure, products, and businesses are designed to serve at least hundreds of millions of users. This works really well for established products but significantly raises the launch bar for new ones, even for seemingly easy projects like "why not put this up as a small experimental website?". I wouldn't be surprised if someone on the research team actually tried to bring up a small demo site but immediately hit a showstopper from product counsel or internal AI guidelines...

Launching a full-fledged paid product is even harder. I guess you'd need to secure at least 3~40 headcount just to integrate this into the many subsystems inside Google. It also needs some senior executives driving the project, since it is a cross-organization effort between research and product. This creates a structural problem: executives usually expect a bigger impact from this kind of project to justify the cost. It's possible to pursue it without involving top-down decision makers, but that kind of project usually fails to build consensus, since everyone has different priorities.

So "a separate small, experimental product" is not going to work unless 1. the model becomes fully productionized and generally available inside the company, so a single VP (or even a director) can quickly build a prototype to demonstrate, or 2. someone successfully proposes a convincing path to a major billion-user product to draw senior executives' attention, or 3. the research team decides to build its own product team from scratch and invests aggressively in that sub-team.

I concur, but I thought there's a reason Area 120 exists?
This is the first thing which came to my mind as well, pay per use!