| HN Mirror

by bensyverson 3 days ago

I would love to learn more about what's actually powering Apple Intelligence now. Are they using flagship Gemini models behind their own prompts? Fine-tuning? Pre-training their own models based on Gemini?

Is there a meaningful distinction between the Gemini-powered models and Apple Foundation Models? Does that distinction vary for on-device vs hosted models? Are some models running on Apple's Private Cloud Compute and others running on Google iron?

Edit: they elaborated significantly in a "keynote tech-talk": [0]

According to Apple, there are five models:

On-Device

- AFM Core: Dense architecture; the standard next-gen on-device model

- AFM Core Advanced: Sparse architecture, natively multimodal; enables features like image understanding and expressive voices

Private Cloud Compute

- AFM Cloud: Workhorse server model optimized for latency and cost

- AFM Cloud Image: Image generation and editing

- AFM Cloud Pro: Most capable model, Gemini frontier-level quality, for complex reasoning and agentic tasks; runs on NVIDIA GPUs in Google's cloud under Apple's PCC privacy guarantees

Everything except Cloud Pro is a custom model running on Apple Silicon, "refined" using Google Gemini. About Cloud Pro, they say "this is our most capable model with quality similar to Gemini frontier models." So I might read between the lines and say this is a wrapped Gemini.

  [0]: https://9to5mac.com/2026/06/08/craig-federighi-details-apples-collaboration-with-google-for-siri-ai-in-ios-27/

7 comments

kube-system 3 days ago

> what's actually powering Apple Intelligence now.

It's a 3B Apple Foundation model.

https://machinelearning.apple.com/research/introducing-apple...

If you've got a mac, you can use this to play around with it:

https://apfel.franzai.com/

djsjajah 3 days ago

I think what they mean by “now” is the stuff announced today.

bensyverson 3 days ago

It's more complicated than that (see my edit above).

Telemakhos 2 days ago

Thanks for that. I like being able to pipe output into a local LLM to get an explanation of the output.

pokstad 2 days ago

That’s really neat. I wonder if that model that shipped recently with Chrome is also accessible similarly?

whywhywhywhy 2 days ago

> Are they using flagship Gemini models behind their own prompts? Fine-tuning? Pre-training their own models based on Gemini?

Easiest way to tell: how much dancing around did we listen to, and how many diagrams did we have to look at? If they had their own tech we wouldn't be looking at diagrams; we'd just be told Siri AI, it's private, it's powerful, here's what it can do. Instead we had 10 minutes of talking around the tech and this diagram [1], which is a signal that it's a bunch of other people's stuff cobbled and wrapped together.

[1]: https://www.apple.com/newsroom/images/2026/06/apple-introduc...

Melatonic 3 days ago

Local is probably similar to the Gemma e4b you can get right now on Google Edge Gallery (the iOS and Android app). I'm guessing that the more powerful version, which will only work on the 12 GB RAM devices, will be something unreleased that is similar but a bit larger.

Google also announced a while back that you can run full Gemini by leasing or renting hardware in your own datacenters, so companies can train or access data without needing to send things to Google's datacenters. It's Nvidia-based. I'm guessing Private Cloud Compute might just be Apple leasing a ton of those?

NegativeLatency 2 days ago

A larger context window would be nice; the Apple model on devices now is almost too small to do cool stuff with.

Apple Private Cloud Compute is running on M2/M3 Ultra. I'm not sure if Gemini Flash can fit in that amount of RAM.
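A rough way to reason about the "does it fit in RAM" question is to estimate the weight footprint from parameter count and quantization. The sketch below is only back-of-envelope arithmetic: Gemini Flash's size is not public, so the parameter counts are hypothetical placeholders, and the 192 GB and 512 GB figures are the maximum unified-memory options for M2 Ultra and M3 Ultra, not confirmed PCC configurations. The 80% headroom factor (room for KV cache and runtime) is also an assumption.

```python
def weights_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    # 1e9 params * (bits / 8) bytes each, divided by 1e9 bytes per GB.
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: int,
         ram_gb: float, headroom: float = 0.8) -> bool:
    """True if the weights fit within a fraction of RAM.

    `headroom` is an assumed budget that leaves space for the KV cache,
    activations, and the OS; it is not a measured value.
    """
    return weights_gb(params_billions, bits_per_weight) <= ram_gb * headroom


if __name__ == "__main__":
    # Hypothetical model sizes, in billions of parameters.
    for size in (3, 30, 300):
        for bits in (16, 8, 4):
            for ram in (192, 512):  # max M2 Ultra / M3 Ultra unified memory
                verdict = "fits" if fits(size, bits, ram) else "does not fit"
                print(f"{size}B @ {bits}-bit in {ram} GB: "
                      f"{weights_gb(size, bits):.1f} GB of weights, {verdict}")
```

Under these assumptions the answer depends almost entirely on the unknown parameter count and the quantization level, which is why it is hard to say anything definite about Gemini Flash.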

nsagent 3 days ago

Am I reading this correctly? Their chosen cloud providers run the PCC stack on their hardware, so the compute provider is responsible for ensuring the privacy guarantees? I assume that would add to the potential security surface area.

Intel and Nvidia are responsible for enforcing their privacy features. The cloud operator (Google in this case) has no access to any data.

bensyverson 3 days ago

Yes, that seems to be the case, and is an evolution/deviation of the original PCC model, which relied on Apple Silicon exclusively.

ErneX 3 days ago

https://security.apple.com/blog/expanding-pcc/

cubefox 2 days ago

> Everything except Cloud Pro is a custom model running on Apple Silicon, "refined" using Google Gemini

What could "refined" mean here?

jorisw 2 days ago

I was thinking distilled?
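For anyone unfamiliar with the term, distillation usually means training a small "student" model to match the output distribution of a larger "teacher". Below is a minimal sketch of the classic temperature-scaled distillation loss. Nothing here is confirmed to be what Apple means by "refined"; the function names and the temperature value are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The T^2 factor is the usual convention for keeping gradient magnitudes
    comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]   # hypothetical teacher logits for one token
    student = [2.0, 2.0, 0.5]   # hypothetical student logits for the same token
    print(distillation_loss(teacher, student))
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, so minimizing it pulls the small model toward the large model's behavior.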

leokennis 2 days ago

Always appreciate people answering their own questions in great detail!

pishpash 3 days ago

Gemini (at least the public free version) hallucinates way too much. If it's like that, it can go very badly for Apple.

ComputerGuru 3 days ago

I used Gemini exclusively via the API but downloaded the app last week for something. Even on max settings, it is ridiculously nerfed!

hypfer 3 days ago

Unfortunately, even the API variant got RLHF'd pretty hard into being that dumb end-user assistant personality :(

But besides that, I feel like the app variant got worse the day they had that WWDC-style release event recently.

Previously it was a sparring partner that could actually keep up. But now it just doesn't.

Truly a shame. And nothing that could be fixed by local models any time soon, given that you need the size for the (cross-)domain knowledge.

t0mas88 3 days ago

The public version of Gemini is ridiculous. At least half their search "answers" are just wrong. If you then start a follow-up chat the answers change, but they are usually still half wrong.

Search would be better without the added AI hallucinations above it. If I want an AI answer I'll go and ask Claude; the quality difference is huge.

tonfa 3 days ago

> The public version of Gemini is ridiculous. At least half their search "answers" are just wrong.

That's not Gemini, that's AI Mode (in Search). They're different products built by fairly different parts of Google (actually one is built by DeepMind).

(I don't think it's very comparable to https://gemini.google.com/app; at least in the past you'd get very different results.)

trollbridge 3 days ago

And it's extremely poor marketing by Google to do this - the general perception people have is that Google AI is dumb due to this.

disgruntledphd2 1 day ago

To be fair, this is classic Google.

As I keep saying, they should win this AI stuff, but I have complete faith in their ability to snatch defeat from the jaws of victory.

dyauspitr 3 days ago

It really has to be, because think of how fast it has to come up with an answer (i.e. the time for a regular Google query) and the immense scale of billions of people querying it many times a day, all for free.

pishpash 3 days ago

Just like search itself, caching does wonders. What do 90% of the people ask anyway but mundane, totally predictable questions?
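To make the caching point concrete, here is a minimal sketch of an answer cache that normalizes queries and evicts the least recently used entry. It is only an illustration of the idea, not a description of how Google serves AI answers; `AnswerCache`, `generate`, and the normalization rule are all hypothetical.

```python
from collections import OrderedDict


class AnswerCache:
    """LRU cache in front of an expensive answer generator (hypothetical)."""

    def __init__(self, generate, max_entries=1000):
        self._generate = generate          # expensive call, e.g. a model
        self._max_entries = max_entries
        self._entries = OrderedDict()      # normalized query -> answer
        self.hits = 0
        self.misses = 0

    @staticmethod
    def normalize(query: str) -> str:
        # Collapse case and whitespace so trivially different phrasings
        # of the same question share one cache entry.
        return " ".join(query.lower().split())

    def answer(self, query: str) -> str:
        key = self.normalize(query)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        result = self._generate(query)
        self._entries[key] = result
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return result


if __name__ == "__main__":
    cache = AnswerCache(lambda q: f"answer to: {q}", max_entries=2)
    cache.answer("What time is it in Tokyo?")
    cache.answer("what time is it in   tokyo?")  # served from cache
    print(cache.hits, cache.misses)
```

If most traffic really is a small set of predictable questions, the hit rate of something like this would be high and the expensive generator would run only for the long tail.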

dgellow 2 days ago

If anyone knows about caching, it's Google engineers.