All our code is at http://github.com/langcog/wordbank, and you can access the database directly using the wordbankr R package (on CRAN).
A paper doing something similar to what you describe is in prep, with a conference version here:
http://langcog.stanford.edu/papers_new/braginsky-2016-underr...