|
Interestingly, for the hard example of #2, it outperforms the project, though I credit this to memorization (given that it is able to reproduce the correct stanza and punctuation for "Spring and Fall, to a Young Child").

FWIW, the only reason you need DP to get it "right" is because, well, you want it right. A human can of course generally split words with just a language model in one pass, as long as the text isn't ambiguous. On the flip side, you absolutely need a language model to segment text correctly. "ilovesnails" can only be decoded correctly if you understand subject-verb agreement, because two segmentations pass a dictionary check: "I love snails" and "I loves nails".

FWIW, GPT-4 turbo is imperfect.

> Heenjoysgoingtotheparkswimmingdancingandlovesnails

produces

> He enjoys going to the parks, swimming, dancing, and loves snails.

Note how it added an extra "s", presumably because "snails" is so much more probable than "nails" after "love" (no idea why "park" also became "parks"). I found it hard to guide it to the correct solution without explicit prompting. Amusingly, even with guidance, it first broke its own grammar model by choosing:

> He enjoys going to the park, swimming, dancing, and love snails. |
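
To make the "ilovesnails" point concrete, here is a rough sketch, not the project's actual code. The vocabulary, the bigram log-probabilities, and the function names are all made up for illustration. It enumerates every dictionary-valid split (with memoization) and then ranks the candidates with a toy bigram score. A real segmenter would fold the scoring into the dynamic program instead of enumerating everything.

```python
from functools import lru_cache

# Hypothetical toy vocabulary, just enough to show the ambiguity.
VOCAB = {"i", "love", "loves", "snails", "nails"}

# Hypothetical bigram log-probabilities standing in for a real language model.
BIGRAM_LOGP = {
    ("<s>", "i"): -1.0,
    ("i", "love"): -1.0,
    ("i", "loves"): -6.0,  # subject-verb disagreement, so made unlikely
    ("love", "snails"): -3.0,
    ("loves", "nails"): -3.0,
}
UNSEEN_LOGP = -10.0  # flat penalty for any bigram not in the table


def segmentations(text):
    """Return every split of `text` into VOCAB words, as a list of tuples."""

    @lru_cache(maxsize=None)
    def split_from(start):
        if start == len(text):
            return [()]  # one way to segment the empty suffix
        results = []
        for end in range(start + 1, len(text) + 1):
            word = text[start:end]
            if word in VOCAB:
                results.extend((word,) + rest for rest in split_from(end))
        return results

    return split_from(0)


def score(words):
    """Sum the toy bigram log-probabilities over the word sequence."""
    total = 0.0
    prev = "<s>"
    for word in words:
        total += BIGRAM_LOGP.get((prev, word), UNSEEN_LOGP)
        prev = word
    return total


def best_segmentation(text):
    """Pick the highest-scoring dictionary-valid split, or None if there is none."""
    candidates = segmentations(text)
    return max(candidates, key=score) if candidates else None


print(segmentations("ilovesnails"))
# [('i', 'love', 'snails'), ('i', 'loves', 'nails')]
print(best_segmentation("ilovesnails"))
# ('i', 'love', 'snails')
```

The dictionary alone accepts both splits. Only the (made-up) bigram penalty on "i loves" breaks the tie, which is the subject-verb agreement knowledge the dictionary lacks.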