by jaimie
2896 days ago
This is a fun exercise. Back in 1963, Fred Mosteller and David L. Wallace published a paper in the Journal of the American Statistical Association titled "Inference in an Authorship Problem: A Comparative Study of Discrimination Methods Applied to the Authorship of the Disputed Federalist Papers" [0]. It describes another technique for attributing authorship, using a Bayesian model of word distributions. One interesting point is the claim that there is ground truth for all but 12 of the papers, which means supervised learning could also be used.

For the sake of discussion: I often think unsupervised methods are preferable to supervised ones, provided the unsupervised method has a reasonably low error rate, since it should generalize more readily.

[0] https://www.jstor.org/stable/2283270
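To make the supervised framing concrete, here is a minimal sketch of scoring a disputed text against two known authors by word frequencies. It is a plain multinomial naive Bayes with add-one smoothing, which is a simplification and not the model from the paper. The marker-word list, the function names, and the toy strings are all assumptions made up for illustration.

```python
import math
from collections import Counter

# Hypothetical marker words chosen for illustration only.
MARKER_WORDS = ["upon", "whilst", "while", "on", "by", "to"]


def word_counts(text):
    """Return the count of each marker word in the text."""
    tokens = (t.strip(".,;:!?\"'()") for t in text.lower().split())
    counts = Counter(tokens)
    return [counts[w] for w in MARKER_WORDS]


def train(labeled_texts, alpha=1.0):
    """Estimate smoothed log-probabilities of marker words per author.

    labeled_texts is an iterable of (author, text) pairs with known authorship.
    """
    totals = {}
    for author, text in labeled_texts:
        acc = totals.setdefault(author, [0] * len(MARKER_WORDS))
        for i, c in enumerate(word_counts(text)):
            acc[i] += c
    model = {}
    for author, acc in totals.items():
        denom = sum(acc) + alpha * len(MARKER_WORDS)
        model[author] = [math.log((c + alpha) / denom) for c in acc]
    return model


def log_odds(model, text, a, b):
    """Log-likelihood ratio of author a over author b, assuming equal priors.

    Positive values favor a, negative values favor b.
    """
    vec = word_counts(text)
    return sum(c * (model[a][i] - model[b][i]) for i, c in enumerate(vec))


# Toy, made-up training strings; real use would need the undisputed papers.
toy_training = [
    ("hamilton", "upon the whole, upon reflection, while we wait upon it"),
    ("madison", "whilst on the matter, whilst by design, on the whole"),
]
toy_model = train(toy_training)
```

Calling `log_odds(toy_model, disputed_text, "hamilton", "madison")` then gives a signed score for a disputed text. With real data, the papers of known authorship would be the training set and the 12 disputed papers would be scored this way.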