by morgante 3561 days ago
> It has to be in a book, codified and signed off on?

The point is that for rules to be formal, they must be formalized somehow and codified. This could be online or in a book, but there must be some clear delineation between when the rules are followed and when they are broken. Otherwise, what's the meaning of "formal" rules?

> If so, aren't these kind of antithetical to AAVE on the face of it?

Yes. My argument is that AAVE, almost by definition, is an informal dialect without formalized rules.

This is getting rather off topic, but I think it relates to the original point of why NLP might start with Standard English even if you are not biased. A large corpus of Standard English text (such as from the WSJ) will generally be very internally consistent precisely because it follows a set of formal rules codified into a style guide. As there is no such equivalent for AAVE, even gathering a large and internally consistent corpus of AAVE text seems prohibitively difficult. That being said, I do hope researchers are working on gathering text from Twitter to build up new training sets.

1 comment

> The point is for rules to be formal then they must be formalized somehow and codified

So every form of English is an "informal dialect" then? Because this ain't French with the Académie publishing strict rules for use of the language. Do you also say that the languages of remote Amazon tribes aren't "real languages" because they don't have a formal government body publishing written rules?

Or do you just want to bash on AAVE and are grasping at straws for reasons why?

> So every form of English is an "informal dialect" then?

Yes, the majority of spoken English does not follow the rules of Standard English. Pretending that such rules don't exist is willful ignorance, though: the WSJ obviously writes a more formalized version of English than teenagers do in text messages.

> Do you also say that the languages of remote Amazon tribes aren't "real languages" because they don't have a formal government body publishing written rules?

Nowhere did I say that AAVE is "not a real language" because it's less formalized than Standard English. Prior to spelling reforms, English itself was extremely inconsistent and informal, but I certainly don't pretend that it wasn't a language.

> Or do you just want to bash on AAVE and are grasping at straws for reasons why?

I'm not trying to bash AAVE. In fact, I'd even posit that the reason AAVE isn't more codified is perhaps racial bias, which treated it simply as "incorrect" English instead of a separate dialect worthy of formalization. Pretending that all languages are equally formalized is simply willful ignorance, though.