by jzdziarski
1447 days ago
Fascinating study. It would carry more currency with me if the ngrams used were learned through ML and corpus training rather than heuristics. It is assumed these ngrams are particularly useful markers today. The significance of their absence doesn’t seem to be part of the research. When there is such dramatically “off the chart” data as this, you need to start looking under the hood at other factors. Studying individual authors would be a good start. Other factors may also include the editorial process. Publishers tend to have common nuances they like to conform to. Could the editorial process explain why some of these nuances are widespread? It’s also worth considering that technical language (computing, for example) made great leaps around this same time period and bled into common parlance. Social media, which is cited here only as a loose correlation, also altered the brevity of writing, which changed how we use and communicate language - but it’s a stretch to call these subtle changes distortions, at least beyond letting Jack Dorsey fuck up our use of language. The authors argue that the meaning of these ngrams hasn’t changed, but their application sure has. Overall it feels like a great area to study, and as good science does, presents more questions than it does answers. There is much more to explore here, though, before we can conclude the entire world is depressed. I am not an expert in linguistics, but I do feel as though there is a modern element missing from this research.
I would have significantly less confidence in that case. How would you learn such a set? You would need a corpus of texts clearly labeled as coming from people with and without cognitive distortions, and I don't think such a corpus exists. Also note that the ngrams have previously been tested on individuals (ref 17 in the paper).
Your post points to another interesting line of research (and maybe that is what you meant): can we find correlations with the language used in previous periods of unrest, e.g. in Germany during the WW2 period, and in other periods?
> It’s also worth considering that technical language (computing, for example) made great leaps around this same time period and bled into common parlance.
The authors specifically mention this, but it should bias the results in the other direction, i.e. technical writing has a lower prevalence of these ngrams according to the authors (I'm unsure whether they tested this).
> Overall it feels like a great area to study, and as good science does, presents more questions than it does answers. There is much more to explore here though before we can conclude the entire world is depressed.
Note that the authors are very cautious about making any such claims, and in fact acknowledge the question of whether applying these markers to whole societies is valid.
> I am not an expert in linguistics, but I do feel as though there is a modern element missing from this research.
I'm not sure I understand. To me it seems like quite solid research (although I admit I don't know much about CDS markers...) that doesn't reach for hyped methods like ML just for the sake of it.