I used nltk to extract only nouns and to do word stemming (e.g. so that "time", "times" and "timing" are only counted as one word).
I also experimented a lot with various method of determining word size i.e. size proportional to frequency, size proportional to log(frequency), size proportional to sqrt(frequency).
It's funny, I did exactly the same, with my Hangouts Takeout extract, a couple of weeks ago, but didn't go as far, because I kept struggling with stopwords and some ways to filter out uninteresting stuff (my implementation was much more naive than yours). I'm still thinking about what other types of analysis I could perform on that interesting dataset though (because it's so personal after all).