by ncmncm
1757 days ago
It might be worth noting that all the other plays that scored anywhere close to the Scottish play (427) are much longer. You have to go down to 17 (344) to get to a shorter play; only 6 (397) and 15 (345) approach it. If we scale by length twice (count/length^2), the contrast becomes starker (retaining original order; the last column is count/length^2 times 10^8):
| 1 | macbeth | 724 | 16929 | 252 |
| 2 | henry-v | 1065 | 25577 | 162 |
| 3 | coriolanus | 1126 | 27294 | 151 |
| 4 | loves-labors-lost | 855 | 21093 | 192 |
| 5 | henry-viii | 962 | 24074 | 165 |
| 6 | the-merchant-of-venice | 834 | 20985 | 189 |
| 7 | henry-iv-part-2 | 1001 | 25762 | 150 |
| 8 | henry-vi-part-2 | 990 | 25597 | 151 |
| 9 | hamlet | 1142 | 30006 | 126 |
| 10 | henry-iv-part-1 | 856 | 24100 | 147 |
| 11 | henry-vi-part-3 | 866 | 24491 | 144 |
| 12 | antony-and-cleopatra | 861 | 24465 | 143 |
| 13 | king-lear | 898 | 25661 | 136 |
| 14 | the-winters-tale | 854 | 24568 | 141 |
| 15 | king-john | 717 | 20730 | 166 |
| 16 | cymbeline | 959 | 27738 | 124 |
| 17 | a-midsummer-nights-dream | 564 | 16377 | 210 |
| 18 | richard-iii | 985 | 28914 | 117 |
| 19 | richard-ii | 753 | 22224 | 152 |
| 20 | henry-vi-part-1 | 715 | 21575 | 153 |
| 21 | pericles | 605 | 18282 | 181 |
| 22 | troilus-and-cressida | 837 | 25810 | 125 |
| 23 | titus-andronicus | 659 | 20621 | 154 |
| 24 | alls-well-that-ends-well | 724 | 22683 | 140 |
| 25 | measure-for-measure | 693 | 21858 | 145 |
| 26 | the-tempest | 518 | 16489 | 190 |
| 27 | the-comedy-of-errors | 455 | 14552 | 214 |
| 28 | the-two-noble-kinsmen | 735 | 23751 | 130 |
| 29 | julius-caesar | 592 | 19251 | 159 |
| 30 | as-you-like-it | 664 | 21692 | 141 |
| 31 | twelfth-night | 573 | 19675 | 148 |
| 32 | othello | 737 | 25670 | 111 |
| 33 | much-ado-about-nothing | 591 | 20843 | 136 |
| 34 | romeo-and-juliet | 677 | 23948 | 118 |
| 35 | the-merry-wives-of-windsor | 604 | 21603 | 129 |
| 36 | timon-of-athens | 504 | 18262 | 151 |
| 37 | the-taming-of-the-shrew | 449 | 18709 | 128 |
| 38 | the-two-gentlemen-of-verona | 404 | 17010 | 139 |
with only 17 and 27 breaking 200, and both still well shy of 252.

But the real point of the article is that the oddity of "the" in the frequency table drew their attention to that word and led them to identify an actual peculiarity in its usage. To say henry-v demonstrates anything similar, you would need to check whether usage in that play is similarly peculiar (which I have not done either).

It seems odd to suggest (as some commenters have done) that the difference was subconscious. My null hypothesis is that peculiarities in usage by a professional wordsmith are deliberate. I'd expect to see actual evidence before accepting that the author didn't know what he was up to.
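
For anyone who wants to check the arithmetic, here is a minimal sketch of the two scalings used with the table above. The rows are copied from that table; the function names are made up for illustration, and the scale factors (10^4 and 10^8) and the truncation to an integer are assumptions that happen to reproduce the quoted figures.

```python
# A few (play, count, length) rows copied from the table above.
ROWS = [
    ("macbeth", 724, 16929),
    ("henry-v", 1065, 25577),
    ("a-midsummer-nights-dream", 564, 16377),
    ("the-comedy-of-errors", 455, 14552),
]


def per_length(count, length):
    """count/length, scaled by 10^4 and truncated (assumed convention)."""
    return count * 10**4 // length


def per_length_squared(count, length):
    """count/length^2, scaled by 10^8 and truncated (assumed convention)."""
    return count * 10**8 // length**2


if __name__ == "__main__":
    for play, count, length in ROWS:
        print(play, per_length(count, length), per_length_squared(count, length))
```

With these rows, macbeth comes out at 427 and 252, matching the numbers quoted in the comment.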
|
If we assume that the length of a play influences the frequency of stop words, shouldn't we compare equal-sized samples of each play (the first x pages, or y randomly sampled words)?
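
A rough sketch of what that comparison could look like, assuming each play is already tokenized into a list of words. The function names and the fixed seed are hypothetical, and tokenization itself is left out.

```python
import random


def the_rate(words):
    """Fraction of tokens equal to 'the' (case-insensitive)."""
    if not words:
        return 0.0
    return sum(w.lower() == "the" for w in words) / len(words)


def prefix_rate(words, sample_size):
    """Rate of 'the' in the first sample_size tokens of a play."""
    return the_rate(words[:sample_size])


def sampled_rate(words, sample_size, seed=0):
    """Rate of 'the' in sample_size tokens drawn at random without replacement."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    k = min(sample_size, len(words))  # never ask for more tokens than exist
    return the_rate(rng.sample(words, k))
```

Using the same sample_size for every play means each comparison is made over the same number of words, so play length drops out of the figure being compared.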