Hacker News new | ask | show | jobs
by domenicd 47 days ago
I was formerly the design lead / spec editor for this API while I worked at Google. I retired in 2025-09, before it got shipped. The following contains no inside knowledge.

I am sympathetic to all of Mozilla's concerns here, even though on balance I believe Chromium's decision to ship was the right one.

---

On interoperability, I agree that this is a tough case. But I am more optimistic than Mozilla that developers will use this API in a way that can work across different models.

First, they will be somewhat forced to, because Chrome will change the model over time. (It already changed from Gemini Nano 2 to 3, and I suspect it'll change to 4 soon if it hasn't already.) Edge is already shipping a Phi-based version. A small number of users are using other models via extensions like https://aibrow.ai/. And it's very possible Safari might join the party, exposing the Apple Foundation Models that ship with iOS via this API. (When the Foundation Models API came out, we were struck by how similar it was to the prompt API designs that preceded it, and were hopeful that Apple was going to do a surprise announcement of shipping the prompt API. It hasn't happened yet, but I still think it might soon.)

Second, we designed the API to steer developers in that direction as much as possible, e.g. encouraging the use of structured output constraints. There are also lots of clear error paths, that almost force developers to use this as a progressive enhancement. (E.g., the existence of low-memory/disk space devices.) So it's very unlikely we'll see developers build sites that are gated on this API existing. It'll mostly be used to sprinkle some AI magic, or let users do cool things without entering some cloud API keys.

I made similar arguments for the writing assistance APIs at [1]. As I said there, the prompt AI is trickier than the writing assistance APIs. But I believe it's a difference of degree, not kind. The web has many nondeterministic APIs that access some underlying part of the system, from geolocation to speech recognition/synthesis, all the way up to these AI-based ones. The question is where you draw the line. Mozilla seems to be giving some signals (not yet definite) that translation is on the OK side of the line, but summarization/writing/rewriting/prompting is not. That's a very reasonable position for them to take on behalf of their users. I imagine the Chromium project is hoping that over time, in-the-wild experience with these APIs shows that the benefits outweigh the risks and costs, and so Mozilla (and Apple) follow in shipping them as well. That's definitely happened in other cases, e.g., Mozilla recently indicating interest [2] in implementing WebBluetooth, WebHID, WebNFC, WebSerial, and WebUSB after years of taking a wait-and-see attitude.

You can learn more about my general thoughts on this question of shipping APIs first, and how the Chromium project takes on first-mover risks, at [3], which I wrote during my time on the Chrome team.

---

On the prohibited use policy, I agree that this is just absurd on Chrome's part. This is not how web APIs should work. It smacks of lawyers trying to throw something out there to cover themselves, or of corporate policy being set at the top level for "all AI uses" and then applied even for web APIs where that makes less sense.

The only saving grace is that I suspect it won't actually trigger. Because, as Mozilla points out, it's quite impractical to enforce. But it's still wrong.

I hope Chrome changes this, although I'm not holding my breath.

I did find it interesting that Gemma seems to have a similar terms of use [4]. (Open-weights, not open-source!) As do the Apple Foundation Models in iOS [5]. So unfortunately if the Chrome team were to push for a no-TOS API, they might be forging new ground, which is always difficult in a large company.

---

On the issue of insubstantial developer signals, I think this is just a failure of the current Chrome team in terms of collecting and collating signals. If one pokes around and knows where to look in various threads, you can find a lot more positive signals than the outdated ones in [6]. I wouldn't have let that Intent to Ship get out the door without properly updating that section of the explainer, for sure. (But hey, not my job anymore!!)

[1]: https://github.com/mozilla/standards-positions/issues/1067#i... [2]: https://github.com/whatwg/sg/pull/264 [3]: https://www.chromium.org/blink/guidelines/web-platform-chang... [4]: https://ai.google.dev/gemma/terms [5]: https://developer.apple.com/apple-intelligence/acceptable-us... [6]: https://github.com/webmachinelearning/prompt-api/blob/main/R...

4 comments

Thank you for posting this.

On interoperability, time will tell I guess. I've only been working on Firefox for a few months, but general interop issues are way worse than I realised when we worked together at Chrome. Firefox frequently gets bug reports for not behaving like Chrome, even when Firefox is complying with the spec, and Chrome is not. We end up having to just behave like Chrome.

On developer signals… I'm sure there's better evidence of positive sentiment than Chrome provided, but there's a lot of negative sentiment too. I think it would be fair to call the developer signal "mixed", or maybe even "polarised".

Hey Domenic,

Sucks to be corresponding via The Second Worst Website In The World (TM), but here we are. Hope all's well on your end.

A minor correction from Edge's perspective: we've participated in OT using Phi models, but have not shipped to Stable, and are unlikely to given the current shape of things. Developers have not given us feedback that they're relaxed about compatibility, but I would obviously welcome that sort of data in case anyone has it to hand.

Best,

Alex

> So it's very unlikely we'll see developers build sites that are gated on this API existing.

I think this is an oddly optimistic outlook from someone who until recently worked at Google. A company that has shipped, and probably still ships, lots of sites and versions of sites gated behind "does your user agent say it is Chrome".

I just wonder how a highly non-deterministic API like the Prompt API can work in a system that heavily focuses on interop between new and old websites.

What's going to happen is that people build stuff with the current iteration and a few years later a model update will work entirely differently and break the existing implementations. I understand that every once in a while OpenAI also shuts off older models through API but that's a central process.

What if I have Firefox 150 users that haven't updated yet and Firefox 155 users that have different models, while Chrome 160 and Chrome 170 users also exist and have different models. Is it expected that I build entirely different implementations for every browser version out there? Don't the work groups try to prevent exactly that within HTML & CSS through feature gating?