by LunaSea
763 days ago
One of the reasons word vectors, sentence embeddings and LLMs won (for now) is that text, especially text found on the web, does not necessarily follow strict grammatical and lexical rules. Many sentences are incorrect but still understandable. Once you include leet speak, acronyms and short-form writing (SMS, tweets), handling it all with rules quickly becomes unmanageable.
|
As I understand it, the modern view is that these point to the failure of grammar as a prescriptive exercise ("This is how thou shalt speak"). Human speech is too complex for simple grammar rules to fully capture its variety. Strict grammatical and lexical rules were always fantasies of the grammar teacher anyway.
See, for example, the following article on double negatives and African American Vernacular English: https://daily.jstor.org/black-english-matters/.