by rspeer
3405 days ago
Would it be possible to mirror just the data somewhere else, such as S3? I don't need the R code, but this sounds like it would make good companion data to my own wordfreq [1]. It would be interesting to see which words are learned early but relatively uncommon in corpora, and generally to be able to measure differences in register between child and adult language.
[1] https://github.com/LuminosoInsight/wordfreq
|
All our code is at http://github.com/langcog/wordbank, and you can access the database directly using the wordbankr R package (on CRAN).
A paper doing something similar to what you describe is in prep, with a conference version here:
http://langcog.stanford.edu/papers_new/braginsky-2016-underr...