Hacker News
by thinkpad20 4435 days ago
I don't think that's what the poster meant. I take "average unique words per song" to mean that each word is counted only once within a song, but the same word can be counted again in other songs. So if song A had the words "I like cats" and song B had the words "I like dogs", then the average unique word count would be (3 + 3) / 2 = 3, not (3 + 1) / 2 = 2.
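To make the two readings concrete, here is a rough Python sketch. The function names are made up for illustration, and splitting on whitespace after lowercasing is an assumption about how words get counted, not something the original analysis specifies.

```python
def avg_unique_words_per_song(songs):
    """Mean number of distinct words per song; a word may count again in another song."""
    counts = [len(set(lyrics.lower().split())) for lyrics in songs]
    return sum(counts) / len(counts)


def avg_new_words_per_song(songs):
    """Other reading: a word counts only the first time it appears in any song."""
    seen = set()
    new_counts = []
    for lyrics in songs:
        words = set(lyrics.lower().split())
        new_counts.append(len(words - seen))  # words not seen in earlier songs
        seen |= words
    return sum(new_counts) / len(new_counts)


songs = ["I like cats", "I like dogs"]
print(avg_unique_words_per_song(songs))  # 3.0
print(avg_new_words_per_song(songs))     # 2.0
```

The first function gives the (3 + 3) / 2 figure and the second gives the (3 + 1) / 2 figure.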

That's definitely one solution, but it still wouldn't quite capture it. As an extreme example, if rapper A produced 100 songs, each with exactly the same lyrics, they should surely be penalized compared with rapper B producing 100 songs with no shared words, even if rapper A's average unique-words-per-song is higher than rapper B's.
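A quick sketch of that extreme case in Python, with entirely made-up lyrics and hypothetical helper names. It assumes whitespace-separated words, and simply compares the per-song average with the number of distinct words across the whole catalog.

```python
def per_song_average(songs):
    """Mean count of distinct words within each song."""
    return sum(len(set(s.split())) for s in songs) / len(songs)


def catalog_vocabulary(songs):
    """Count of distinct words across all songs combined."""
    return len({word for s in songs for word in s.split()})


# Made-up data: rapper A releases the same 5-word song 100 times.
rapper_a = ["one two three four five"] * 100
# Made-up data: rapper B releases 100 songs of 4 words each, none shared.
rapper_b = [" ".join(f"w{i}_{j}" for j in range(4)) for i in range(100)]

print(per_song_average(rapper_a), catalog_vocabulary(rapper_a))  # 5.0 5
print(per_song_average(rapper_b), catalog_vocabulary(rapper_b))  # 4.0 400
```

Rapper A wins on the per-song average (5 vs 4) but has a catalog-wide vocabulary of 5 words against rapper B's 400, which is the gap the per-song average hides.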