by IIAOPSW
1182 days ago
Your argument is that maybe we can brute force with statistics sentences long enough for no one to notice we run out past a certain point? Everything you said applies to computers too. Real machines have physical memory constraints. Sure, the set of real sentences may be technically finite, but the growth per word is exponential and you don't have the compute resources to keep up. Information is not about what is said but about what could be said. It doesn't matter so much that not every valid permutation of words is uttered, but rather that for any set of circumstances there exist words to describe it. Each new word in the string carries information in the sense that it reduces the set of possibilities that were open before I relayed my message. A machine which picks the maximum likelihood message in all circumstances is by definition not conveying information. It's spewing entropy.
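To put rough numbers on the two claims above (exponential growth per word, and information as a reduction in possibilities), here is a minimal sketch. The vocabulary size, sentence length, and next-word probabilities are made-up illustrative values, and `sentence_count` and `entropy_bits` are hypothetical helper names, not anything from a real language model.

```python
import math

def sentence_count(vocab_size, length):
    """Number of distinct word strings of a given length (ignores grammar)."""
    return vocab_size ** length

def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Assumed toy numbers: a 10,000-word vocabulary and 5-word strings.
print(sentence_count(10_000, 5))  # 10**20 candidate strings

# Assumed toy distribution over four candidate next words.
next_word = [0.5, 0.25, 0.125, 0.125]

# A sender free to pick any of the four words: the receiver learns
# entropy_bits(next_word) bits on average from each word.
free_bits = entropy_bits(next_word)

# A sender that always emits the single most likely word is fully
# predictable to a receiver who knows the same distribution, so its
# output is modelled here as a distribution with one certain outcome.
argmax_bits = entropy_bits([1.0])

print(free_bits, argmax_bits)
```

Under these assumptions the free sender conveys 1.75 bits per word and the always-most-likely sender conveys none, which is the sense of "not conveying information" above. It only holds if the receiver could have predicted the most likely word themselves.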
>> Your argument is that maybe we can brute force with statistics sentences long enough for no one to notice we run out past a certain point?
No, I wasn't saying that. I was saying that we only need to model sentences short enough that nobody will notice that the plot is lost with longer ones.
To clarify, because it's late and I'm tired and probably not making a lot of sense and bothering you, I'm saying that statistics can capture some surface regularities of natural language, but not all of natural language, mainly because there's no way to display the entirety of natural language for its statistics to be captured.
Oh god, that's an even worse mess. I mean: statistics can only get you so far. But that might be good enough depending on what you're trying to do. I think that's what we're seeing with those GPT things.