by thaumasiotes
3560 days ago
"Formal rules", in the context you've chosen to speak in, are defined by this upstream comment:

> NLP is not very good with standard English yet and usually doesn't generalize from topic to topic. Dialects and other languages - especially those without formal rules - will come when we can deal with standard English.

The rules you're talking about, that get printed in books and studied, are not linguistic rules. Crucially, this means they are not widely observed in printed standard English, which in turn means they can't be relevant to training a language model to understand printed standard English.

The "formality" you seem to want to talk about has no place in this discussion. It is not relevant to any language. gordonguthrie is correct to point out that the assumption lqdc13 is trying to make is false. You are wrong to contradict him using a meaning of "formal rules" that you brought to the conversation yourself. It had a meaning -- a completely unrelated meaning -- before you showed up.
|
I agree that they're not widely observed in written English, but they are consistently observed in the WSJ, which was the origin of this entire debate.
As lqdc13 pointed out, NLP still isn't even consistently good at understanding standard English. One could reasonably posit that that's due to the inherent ambiguity and inconsistency of most writing, and that by focusing on a narrower, standardized document corpus (the WSJ) you could get better initial results. What, exactly, is controversial about that? Do you really think that the language of the WSJ is no more consistent and formalized than the language of Twitter users?