by dsteinweg
5183 days ago
It looks like it's pulling characters from the paragraph to generate the "unique" paragraph ID. ID = the first letter of each of the first 3 words in the paragraph's first sentence + the first letter of each of the first 3 words in its last sentence. I wonder: given all the different articles on NYTimes, and all the ways words can be arranged across paragraphs, is this unique enough that you won't get duplicate paragraph IDs within any given article?
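To make the guess above concrete, here is a rough Python sketch of the scheme as I'm describing it. This is not NYTimes' actual code: the function names (`first_letters`, `paragraph_id`, `duplicate_ids`), the naive sentence splitting, and keeping the original letter case are all my own assumptions.

```python
import re


def first_letters(sentence, n=3):
    """Return the first character of each of the first n words in a sentence."""
    # Assumption: a "word" starts with a letter or digit; punctuation is ignored.
    words = re.findall(r"[A-Za-z0-9][A-Za-z0-9']*", sentence)
    return "".join(word[0] for word in words[:n])


def paragraph_id(paragraph):
    """Build an ID from the first and last sentences of a paragraph."""
    # Assumption: sentences end with ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]
    if not sentences:
        return ""
    # A one-sentence paragraph uses the same sentence for both halves.
    return first_letters(sentences[0]) + first_letters(sentences[-1])


def duplicate_ids(paragraphs):
    """Map each ID that occurs more than once to the paragraph indexes sharing it."""
    seen = {}
    for index, paragraph in enumerate(paragraphs):
        seen.setdefault(paragraph_id(paragraph), []).append(index)
    return {pid: idxs for pid, idxs in seen.items() if len(idxs) > 1}


article = [
    "The cat sat down. It was very tired.",
    "The car stopped dead. It was very late.",
    "Nothing else happened here.",
]
collisions = duplicate_ids(article)
```

In this made-up article the first two paragraphs both come out as `TcsIwv`, so collisions are at least possible. If only letters are used and case is ignored, six characters give at most 26^6 (about 309 million) IDs, and real text will cover far fewer than that because so many sentences open with the same handful of words.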
Not it!