Hacker News
by gmadsen 933 days ago
It's referring to the search space of valid segmentations. Set up as a classical problem, that would be some type of DP with backtracking out of dead-end paths. The full input is known in both cases; it's just that GPT's architecture doesn't need to search any segmentation space: its billions of parameters approximate the function needed to arrive at the correct answer.

You wouldn’t even be able to solve this in the standard leetcode DP problem way, because it’s ambiguous if all you know is which words are valid. For example THESEA could be either “THE SEA” or “THESE A”. You need to have a model of English grammar to realize that the former is much more likely to be part of a valid sentence than the latter.
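To make the ambiguity concrete, here is a minimal sketch of the dictionary-only approach: a memoized search that returns every valid split instead of stopping at the first. The four-word `WORDS` set and the `segmentations` helper are made up for illustration; a real word list would be far larger.

```python
from functools import lru_cache

# Hypothetical toy dictionary, just big enough to show the ambiguity.
WORDS = frozenset({"THE", "THESE", "SEA", "A"})

def segmentations(text, words=WORDS):
    """Return every way to split `text` into words from `words`."""
    @lru_cache(maxsize=None)
    def from_index(i):
        if i == len(text):
            return [[]]  # exactly one way to segment the empty suffix
        results = []
        for j in range(i + 1, len(text) + 1):
            piece = text[i:j]
            if piece in words:
                # Dead ends simply contribute nothing to `results`.
                for rest in from_index(j):
                    results.append([piece] + rest)
        return results
    return from_index(0)

print(segmentations("THESEA"))
# [['THE', 'SEA'], ['THESE', 'A']]
```

Both splits come back as equally valid, so knowing which words exist is not enough to pick between them.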
I think even a bigram model would provide enough information.
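A rough sketch of how a bigram model could break the tie, assuming add-k smoothing and a sentence-start token `<s>`. The counts below are invented for illustration and do not come from any corpus, so the outcome is true by construction; whether real counts would behave the same way is exactly the conjecture above.

```python
import math

# Invented counts, for illustration only (not from a real corpus).
BIGRAM_COUNTS = {
    ("<s>", "THE"): 50, ("THE", "SEA"): 8,
    ("<s>", "THESE"): 10, ("THESE", "A"): 0,
}
UNIGRAM_COUNTS = {"<s>": 100, "THE": 60, "THESE": 12, "SEA": 9, "A": 40}
VOCAB_SIZE = len(UNIGRAM_COUNTS)

def bigram_logprob(prev, word, k=1.0):
    """Add-k smoothed log P(word | prev)."""
    num = BIGRAM_COUNTS.get((prev, word), 0) + k
    den = UNIGRAM_COUNTS.get(prev, 0) + k * VOCAB_SIZE
    return math.log(num / den)

def score(words):
    """Sum of bigram log-probabilities, starting from the <s> token."""
    total = 0.0
    prev = "<s>"
    for w in words:
        total += bigram_logprob(prev, w)
        prev = w
    return total

candidates = [["THE", "SEA"], ["THESE", "A"]]
best = max(candidates, key=score)
print(" ".join(best))
```

With these made-up numbers "THE SEA" scores higher, because "A" right after "THESE" was never observed and only gets the smoothing mass.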
Is that "big-ram" or "bi-gram"?
It's big RAM now! (A bi-gram model gives the probability of a word given the previous one.)
Bi-gram aka pairs of words
I think it was a rhetorical question.