by haikuginger
3059 days ago
If you need to check against multiword tags, I'd suggest a utility function to expand a list of words into every possible phrase of one or more consecutive words. That should still be substantially faster than the current state, and you can improve it even more by limiting it to phrases with no more words than the longest tag.

    def get_all_phrases(descr):
        words = descr.split()
        if len(words) == 1:
            return words
        phrases = []
        # Single words are already in `words`; add phrases of 2..len(words) words.
        for i in range(2, len(words) + 1):
            phrases += get_phrases_of_len(i, words)
        return words + phrases

    def get_phrases_of_len(length, words):
        # Slide a window of `length` words across the list.
        return [' '.join(words[i:i+length]) for i in range((len(words) - length) + 1)]
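For the "limit it to the longest tag" improvement mentioned above, here's a rough sketch of how it might look. The names `get_phrases_up_to` and `find_tags` are made up for illustration, and it assumes tags are plain strings with words separated by single spaces, so adjust it to however your tags are actually stored.

```python
def get_phrases_up_to(descr, max_words):
    """Return every phrase of 1..max_words consecutive words in descr."""
    words = descr.split()
    # No phrase can be longer than the description itself.
    longest = min(max_words, len(words))
    return [
        ' '.join(words[i:i + length])
        for length in range(1, longest + 1)
        for i in range(len(words) - length + 1)
    ]


def find_tags(descr, tags):
    """Return the tags that occur as a run of consecutive words in descr.

    Assumption: each tag uses single spaces between its words.
    """
    tag_set = set(tags)
    if not tag_set:
        return set()
    # Only generate phrases as long as the longest tag.
    max_words = max(len(tag.split()) for tag in tag_set)
    return tag_set.intersection(get_phrases_up_to(descr, max_words))
```

Putting the tags in a set means each generated phrase is a hash lookup instead of a scan over every tag, and capping the phrase length keeps the number of generated phrases roughly proportional to (words in description) x (words in longest tag) rather than quadratic in the description length.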