Hacker News new | ask | show | jobs
by weinzierl 182 days ago
In C, the token sequence

  T * x;
can be interpreted either as a declaration or as an expression (multiplication). Resolving this ambiguity requires context, namely whether T is a typedef name or an object name, which cannot be determined from syntax alone.

One could argue that it is acceptable to produce an ambiguous parse and defer the decision, and in practice compilers do rely on semantic information to resolve such cases.

However, this is not what context-free means in the Chomsky sense: a context-free grammar must be able to determine the syntactic structure without reference to prior declarations or semantic state.

1 comments

Exactly - I'd see this as an example of a language that is not context-free, because generating the AST requires information beyond the root node/production and the characters of the subtree.

In contrast, languages like Java, JavaScript or C# would be context-free: The string

  hello(foo + bar * 5);
is not a "valid" JavaScript program, because of the undefined identifiers, but it does have a non-ambiguous AST.
Thanks for the clarification. C has an easy to demonstrate example and languages like Java, JavaScript and C# are (deliberately in my opinion) closer to a true context-free grammar but they all have counter-examples and so are not context-free in the computer science sense.

https://stackoverflow.com/questions/898489/what-programming-...

(See also the answer by Dave with currently only the second most votes. If that is what you mean, I agree.)