| That we artificially decompose the process of accepting a sentence as, eg, proper English into two phases: syntactic correctness and semantic correctness. However, that distinction is arbitrary -- there is only the question of if the sentence is accepted by an agent (eg, person) as a well-formed sentence. Any full accounting of the class of well-formed sentences must embed the semantic concerns; violating semantics is a syntax error (albeit, not usually a "first order" one). Similarly, even base syntax, such as the subject/verb/object distinction and ordering is carrying semantic information about word usage. The distinction between the two is non-existent: a full accounting of either must embed the other. So semantics is syntax -- if you write a system of rules that only accepts valid sentences, then the rules will end up carrying the semantic structure of the language in them. Ed: I suppose I left out why this might matter -- In the quest to build an AI that understands semantics (ie, that "understands meaning"), we can bypass attacking that problem directly by training it (eg, a NN) on the full acceptance task (joined syntax and semantics -- classifying a sentence as proper English or not), and then truncate the network away from "low level" features (and perhaps at the other side, focused on 'yes' or 'no') to extract a network that has (most of) the (abstract) semantic structure embedded. We could then utilize these "middle level" features as a sort of rosetta stone, to train low level networks to embed content for them to understand, and high level networks to utilize their output on decisions to repurpose "understanding" across tasks. I would argue that using things like Word2Vec (or the resulting vector space of words) is a similar idea. |
2) Me gizmo.
One of the above sentences is valid English, the other one is meaningful. That is the syntax/semantics distinction.
The distinction is a bit fuzzy in places (for instance, inflectional morphology), but does exist.