by Zinu 999 days ago
The example at the end made me wonder if Apple's model is actually better than GPT2 for text prediction. It generated garbage, but all that garbage made some sense in the context of only the word "Today".

Whereas GPT2 hallucinated random stuff about the US government. A text prediction model should predict what the user wanted to type, so if you evaluate the models based on that, GPT2 actually performed horribly, since the user showed zero intent in talking about the US.

5 comments

The example at the end sounds just like the predictions you get from normal phone keyboards in the last couple of years, which presumably don't use a modern GPT-style language model. A bit disappointing.
Seriously disappointing. I was expecting that it would not produce total gibberish. It acts like it's a Markov chain, and only considers the last 1-2 words. Identical to the currently-shipping thing that we've had for the past however-many years.
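To make the Markov chain comparison concrete, here is a minimal sketch of a first-order (bigram) predictor that looks only at the last word. The toy corpus and the function names `train_bigrams` and `complete` are made up for illustration; nothing here is claimed to be how Apple's keyboard or any shipping autocomplete works.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in a whitespace-tokenized corpus."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows


def complete(follows, start, max_words=5):
    """Extend `start` by repeatedly picking the most frequent follower of the last word."""
    out = [start.lower()]
    for _ in range(max_words):
        candidates = follows.get(out[-1])  # .get avoids creating empty entries
        if not candidates:
            break
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)


# Hypothetical toy corpus, chosen only to show the behaviour.
corpus = "today is a good day . a good day is a day to do it . do it today"
model = train_bigrams(corpus)
print(complete(model, "today", max_words=4))  # today is a good day
```

Because each step sees only the previous word, every adjacent pair looks plausible while the sentence as a whole can drift anywhere, which is the kind of locally sensible gibberish being described.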
People trying to draw this comparison prove that making good products is harder than it seems...

The default goal everyone is assuming is spitting out the longest correct sequence possible.

But in reality the mental cost of a wildly wrong prediction is much higher than the mental cost of a slightly wrong one, so what you'd train the model for is short sequences, a few words at most, predicted with higher confidence.

Most people can/will tune out slightly wrong words especially as they get a feel for what autocorrect is good and bad at.

If you unleash the full range of tokens GPT-2 can normally output, you'll constantly be blasting out words they didn't expect.

The fact that your long-sequence prediction got better doesn't matter, because the UI is autocomplete, not "auto-write": they're still expecting to drive, and a smart but noisy copilot is worse than a dumb but lazy one in that case.

I wouldn't be surprised if they trained the model to an effective context window of just a few hundred tokens with that in mind.
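A rough sketch of the "few words at most, with higher confidence" idea: keep extending a suggestion only while the joint probability stays above a threshold, and show nothing otherwise. The probability table `TOY`, the helper `toy_probs`, and the parameters `min_confidence` and `max_words` are all hypothetical stand-ins for whatever a real model and product would use.

```python
def suggest(next_word_probs, context, min_confidence=0.5, max_words=3):
    """Greedily extend a suggestion while its joint probability stays above the threshold."""
    suggestion = []
    joint = 1.0
    words = list(context)
    for _ in range(max_words):
        probs = next_word_probs(words)
        if not probs:
            break
        word, p = max(probs.items(), key=lambda kv: kv[1])
        if joint * p < min_confidence:
            break  # stay quiet rather than risk a wildly wrong word
        joint *= p
        suggestion.append(word)
        words.append(word)
    return suggestion, joint


# Hypothetical next-word distributions standing in for a real language model.
TOY = {
    ("thank",): {"you": 0.9, "goodness": 0.1},
    ("thank", "you"): {"so": 0.6, "for": 0.3, "very": 0.1},
    ("thank", "you", "so"): {"much": 0.95, "very": 0.05},
    ("today",): {"is": 0.3, "i": 0.25, "the": 0.25, "we": 0.2},
}


def toy_probs(words):
    return TOY.get(tuple(words), {})


print(suggest(toy_probs, ["thank"])[0])  # ['you', 'so', 'much']
print(suggest(toy_probs, ["today"])[0])  # []
```

With only "today" as context, no continuation clears the bar, so the sketch suggests nothing at all, which matches the "dumb but lazy copilot" trade-off described above.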

This comment is the summary of the difference between human driven design and technology driven design.

Too many people are focused on technology for technology’s sake.

GPT-2 saw "today," thought "this must be news copy," and generated more news copy. Given a few more words, it could have narrowed down the context. The Apple suggestions aren't even grammatically correct and are seemingly no different from the shallow statistical completion we've already had for years, so it's weird that they branded it in lofty AI terms.
Autocorrect doesn't suggest whole sentences so it is irrelevant if the remaining sentence is gibberish or not.
Or it could be the contrary. The new feature doesn't suggest whole sentences because the model they are using produces gibberish. It is quite possible that if the model were better, they would allow it to suggest longer phrases.
Maybe it should though, given we have the power to do so (at least, with just a lil more power)
I suspect someone (Craig even) was under some pressure from The Board to have >0 references to generative-AI in their presentation this year since every single company (even non-software) is now expected by Wall St to "be doing some AI". Even though Apple is at the top of the heap with ML in photography and many other domains, without some kind of LLM the tech news narrative will be "Apple is years behind".
Yes but it also looks no better than the existing autocomplete they have, in which case why use a battery-draining midLM?

"Today is a good day for you to be able to do it more than i just a few weeks to get a new."

It's hallucination if you hate it, and creativity if you like it.
It seems obvious to me that it's not, because if you asked a human to guess what comes after "today" in a text, they'd never say "probably some gibberish about a day a day".
Garbage in, garbage out? The preceding text is gibberish, so the prediction will be worse. Presumably they also only show completions with a much higher confidence threshold.
Maybe: "Today was fine. Since I've retired, I'm taking my life a day a day".

Or maybe I wanted to express myself in the timeless words of the poets:

"A day, a day of glory! A day that ends our woe! A day that tells of triumph. Against our vanquished foe!"

"Rose is a rose is a rose is a rose. Loveliness extreme. Extra gaiters. Loveliness extreme."

"A-well now, everybody's heard about the bird, everybody's heard about the bird, About the bird, the bird, bird bird bird, Haven't you heard about the bird? Don't you know that the bird's the word?"