| HN Mirror



	by tcwc 4762 days ago
	Neat idea! It looks like the NLTK POS tagger is having trouble here, which might limit your recall when it is used as a filter. Instead, I wonder if it would be better to use the context of each token to mine significant n-grams from the rest of Shakespeare's work, and then filter for rhymes with a phonetic hash like Metaphone.
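
For illustration only, here is a rough sketch of the n-gram mining half of that suggestion. It assumes "significant" means a high pointwise mutual information (PMI) score for adjacent word pairs, which is one common choice and not necessarily what was meant. The function name `significant_bigrams`, the `min_count` threshold, and the toy corpus are all hypothetical, and the rhyme filter is not covered here.

```python
import math
from collections import Counter


def significant_bigrams(lines, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI)."""
    unigrams, bigrams = Counter(), Counter()
    for line in lines:
        tokens = line.lower().split()
        unigrams.update(tokens)
        # Pair each token with its right-hand neighbour within the same line.
        bigrams.update(zip(tokens, tokens[1:]))

    total_uni = sum(unigrams.values())
    total_bi = sum(bigrams.values())

    scores = {}
    for (a, b), n in bigrams.items():
        if n < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_ab = n / total_bi
        p_a = unigrams[a] / total_uni
        p_b = unigrams[b] / total_uni
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))

    # Highest-scoring pairs first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy corpus standing in for the rest of Shakespeare's work.
corpus = [
    "shall i compare thee to a summers day",
    "thou art more lovely and more temperate",
    "so long lives this and this gives life to thee",
    "shall i compare thee to a winters night",
]

for pair, score in significant_bigrams(corpus):
    print(pair, round(score, 2))
```

On a real corpus the whitespace tokenizer would probably need replacing with something that handles punctuation, and `min_count` would need tuning.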

1 comment

garysieling 4762 days ago

Interesting thought, thanks! I was thinking an approach like that would be good for non-dictionary words.

One thing I didn't go into detail on is the issue of words with multiple pronunciations. I was thinking the way to address that would be to compare pronunciations between lines, but looking at metaphones across Shakespeare's work overall might also help build a solution.
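
A minimal sketch of the "compare pronunciations between lines" idea, not the code from the original post. It assumes CMUdict-style pronunciations (ARPAbet phonemes with stress digits on vowels) and treats two words as rhyming when everything from the last stressed vowel onward matches. The small `PRONUNCIATIONS` table and the function names are hypothetical.

```python
from itertools import product

# Hypothetical CMUdict-style entries: ARPAbet phonemes, digits mark vowel stress.
PRONUNCIATIONS = {
    "wind": [["W", "IH1", "N", "D"], ["W", "AY1", "N", "D"]],
    "kind": [["K", "AY1", "N", "D"]],
    "sinned": [["S", "IH1", "N", "D"]],
}


def rhyme_part(phonemes):
    """Return the phonemes from the last stressed vowel to the end."""
    for i in range(len(phonemes) - 1, -1, -1):
        if phonemes[i][-1] in "12":  # primary or secondary stress marker
            return tuple(phonemes[i:])
    return tuple(phonemes)  # no stressed vowel found: compare the whole word


def rhyming_pronunciations(word_a, word_b, lexicon=PRONUNCIATIONS):
    """Return every pair of pronunciations whose rhyme parts match."""
    candidates = product(lexicon.get(word_a, []), lexicon.get(word_b, []))
    return [(pa, pb) for pa, pb in candidates if rhyme_part(pa) == rhyme_part(pb)]


# The line-ending partner decides which pronunciation of "wind" is in play.
print(rhyming_pronunciations("wind", "kind"))
print(rhyming_pronunciations("wind", "sinned"))
```

With NLTK installed and the corpus downloaded, `nltk.corpus.cmudict.dict()` should provide a mapping of the same shape (word to a list of phoneme lists), so it could be passed in as `lexicon`.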
