Hacker News
by e-dt 1694 days ago
A syllable should always have exactly one vowel (technically, one nucleus), which means that the bogus "syllables" the website is producing can't be real syllables - that might be the rule that you can't articulate. However, the website does not actually use GPT-2 to produce the syllables; it uses the library "pyhyphen".

This library provides a misleadingly named function that claims to return a list of a word's syllables. It does not actually do that; instead, it splits the word at every possible hyphenation point, and not every syllable break gets, or indeed should get, a hyphenation point. (For example, you do not want to break up prefixes or suffixes during hyphenation.)
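
To make the "one nucleus per syllable" test concrete, here is a minimal Python sketch. It does not call pyhyphen; it only checks a list of chunks, however they were produced. It assumes that a run of vowel letters is a rough stand-in for one nucleus, which is only a heuristic since English spelling and pronunciation diverge. The function names and the example splits are hypothetical, made up for illustration.

```python
import re

# Assumption: a run of vowel letters approximates one syllable nucleus.
# Spelling is only a rough proxy for pronunciation (silent "e", syllabic
# consonants), so treat the result as a hint, not a phonological analysis.
VOWEL_RUN = re.compile(r"[aeiouy]+", re.IGNORECASE)


def count_nuclei(chunk: str) -> int:
    """Return the number of vowel-letter runs in a chunk."""
    return len(VOWEL_RUN.findall(chunk))


def suspicious_chunks(chunks: list[str]) -> list[str]:
    """Return the chunks that do not contain exactly one vowel-letter run."""
    return [c for c in chunks if count_nuclei(c) != 1]


# Hypothetical splits written by hand, not the output of any library.
hyphenation_style = ["un", "believ", "able"]
syllable_style = ["un", "be", "liev", "a", "ble"]

print(suspicious_chunks(hyphenation_style))  # ['believ', 'able']
print(suspicious_chunks(syllable_style))     # []
```

Chunks flagged here have zero or several vowel runs, so they cannot each be a single syllable under this heuristic, which is the pattern you would expect from splitting at hyphenation points only.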

[0] https://github.com/turtlesoupy/this-word-does-not-exist/blob...

[1] https://github.com/dr-leo/PyHyphen/blob/master/src/hyphen/hy...


Would it be correct to say that syllables in English are made of nuclear vowel phonemes and valent phonemes which could be vowel or consonant in nature? I’m not a linguist, but I’m very interested in language and how words are constructed.
I'm not sure what you're referring to by "valent phonemes." The standard treatment of a syllable, not only in English but cross-linguistically, is that a syllable is composed of a cluster of consonants at the beginning (the onset), a single "nucleus" phoneme that is most often a vowel or, somewhat less often, a sonorant consonant like 'r' or 'l', and another cluster of consonants at the end (the coda). The specific ways that syllables can be formed in a particular language are governed by the "phonotactical rules" of that language. These are rules like the English rule that an "ng" sound may only occur immediately after the nucleus, or like the Japanese rule that a coda may only be "n" or "" (the null coda).
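
The onset / nucleus / coda structure can be sketched as a small Python data type. This is only an illustration under simplifying assumptions: phonemes are written as informal ASCII symbols rather than IPA, the vowel inventory is a toy one, and the names `Syllable`, `parse_syllable`, and `japanese_coda_ok` are hypothetical. The coda check encodes the Japanese rule exactly as stated above, nothing more.

```python
from typing import NamedTuple

# Toy inventories, assumed for illustration only.
VOWELS = {"a", "e", "i", "o", "u"}
SYLLABIC_SONORANTS = {"r", "l"}


class Syllable(NamedTuple):
    onset: tuple[str, ...]  # consonants before the nucleus (may be empty)
    nucleus: str            # exactly one phoneme
    coda: tuple[str, ...]   # consonants after the nucleus (may be empty)


def parse_syllable(phonemes: list[str]) -> Syllable:
    """Split the phonemes of ONE syllable into onset, nucleus, and coda."""
    # Prefer a vowel as the nucleus; fall back to a sonorant consonant.
    for candidates in (VOWELS, SYLLABIC_SONORANTS):
        for i, phoneme in enumerate(phonemes):
            if phoneme in candidates:
                return Syllable(
                    onset=tuple(phonemes[:i]),
                    nucleus=phoneme,
                    coda=tuple(phonemes[i + 1:]),
                )
    raise ValueError("no possible nucleus in this syllable")


def japanese_coda_ok(syllable: Syllable) -> bool:
    """Phonotactic check from the comment: the coda is "n" or empty."""
    return syllable.coda in ((), ("n",))


print(parse_syllable(["s", "t", "r", "i", "ng"]))
print(parse_syllable(["b", "l"]))  # sonorant nucleus, as in the end of "table"
print(japanese_coda_ok(parse_syllable(["k", "a", "n"])))  # True
print(japanese_coda_ok(parse_syllable(["k", "a", "t"])))  # False
```

Note that "r" lands in the onset of the first example because a vowel is available to serve as the nucleus; a sonorant is only promoted when no vowel is present.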
I was kind of referring to valence electrons - so think “edge phonemes” - but I don’t think that came across.

This is a really interesting explanation of something that I kind of understand, but want to have a deeper understanding of. Any good references you can think of offhand?