by ZoF
3778 days ago
This implies there aren't future episodes upon which this type of statistical analysis could be applied. This also strongly implies you think the author is a 'budding data scientist' out of his/her league. This is very much a 'sample' given the context that South Park is still releasing new episodes. FYI all elitist 'statisticians' ...
As far as I can tell, there are a lot of people out of their league going around with the title "data scientist".
This is not a sample. This is a census at this point in time. The fact that there will be another population tomorrow does not change the fact that you have the entire population of all words spoken by all characters up to today.
I am not a statistician. I am an economist who knows enough about statistics and econometrics to know when a significance test is applicable.
Also, note that R's CSV parsing is going to misattribute some characters' speech to others. GIGO speaks loud.
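
To illustrate the kind of failure I mean, here is a minimal Python sketch. It is not the author's R code, and I am only assuming the cause is quoted fields containing commas or line breaks; the actual cause in the original analysis may differ. The transcript rows and the function names `naive_parse` and `quoted_parse` are made up for the example.

```python
import csv
import io

# Hypothetical transcript data (not from the real dataset): the second
# spoken line sits in a quoted field that contains an embedded newline.
raw = (
    "Character,Line\n"
    'Stan,"Oh my God, they killed Kenny!"\n'
    'Kyle,"You\nCartman, stop it"\n'
)

def naive_parse(text):
    """Treat every physical line as a record and split on the first comma,
    ignoring CSV quoting rules."""
    rows = []
    for record in text.splitlines()[1:]:  # skip the header
        speaker, _, spoken = record.partition(",")
        rows.append((speaker, spoken))
    return rows

def quoted_parse(text):
    """Parse with the standard csv module, which honors quoted fields."""
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader)  # skip the header
    return [tuple(row) for row in reader]

for speaker, spoken in naive_parse(raw):
    print("naive :", repr(speaker), repr(spoken))
for speaker, spoken in quoted_parse(raw):
    print("quoted:", repr(speaker), repr(spoken))
```

The naive parser produces three records and credits the tail of Kyle's line to "Cartman"; the quoting-aware parser produces two records with the right speakers. Any word counts built on the first result are wrong before a single test is run.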