Hacker News
by TuringTest 1483 days ago
How do you calculate the "most likely character to appear next", if not by memorizing lots and lots of existing sentences? ML is in essence a copycat that will regurgitate what it has seen before in a new context, no matter how hard you try to hide it under the mathematical shape of the probabilities of single characters in a sequence.

Now, there is the philosophical question of whether human creators simply do the same. (Which they don't; we have other mental processes for creating ideas than predicting the next letter we are going to utter.) But that doesn't change the fact that the likelihood of each emitted word is determined by what the model has seen most often in relation to the current context, and which it therefore considers most "valid".
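
To make the counting view concrete, here is a minimal sketch of a count-based next-character predictor. It is a toy n-gram model, not how a neural language model is actually trained; the function names, the `order` parameter and the tiny corpus are all made up for illustration.

```python
from collections import Counter, defaultdict

def train_char_model(text, order=2):
    """Count which character follows each `order`-character context."""
    counts = defaultdict(Counter)
    for i in range(len(text) - order):
        context = text[i:i + order]
        counts[context][text[i + order]] += 1
    return counts

def next_char_probs(counts, context):
    """Relative frequency of each character seen after `context`."""
    seen = counts.get(context)
    if not seen:
        return {}  # context never appeared in the training text
    total = sum(seen.values())
    return {ch: n / total for ch, n in seen.items()}

# Hypothetical toy corpus, only here to have something to count.
corpus = "the cat sat on the mat. the cat ate the rat."
model = train_char_model(corpus, order=2)

print(next_char_probs(model, "th"))  # {'e': 1.0}

# The "most likely next character" is just the most frequent follower.
probs_at = next_char_probs(model, "at")
most_likely = max(probs_at, key=probs_at.get)
print(repr(most_likely))
```

In this sketch the probabilities are nothing more than normalized counts, which is the sense of "what the model has seen most often in relation to the current context".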

1 comment

> How do you calculate the "most likely character to appear next", if not by memorizing lots and lots of existing sentences?

Well, that's how languages work, right? Words are the most common sequences of letters.

But that doesn't mean it's regurgitating parts of sentences it had previously seen any more than I'm regurgitating when I'm typing this.

Mechanically, it has learnt both the syntax of language and how concepts relate. So when it starts generating, it makes sentences that are syntactically valid but also make sense in terms of concepts.

That's really different to just combining bits of sentences, and it gives rise to abilities you wouldn't expect in something just cutting and pasting bits of sentences. For example, few-shot learning is mostly driven by its conceptual understanding and can't be done by something with no way to relate concepts.