Hacker News new | ask | show | jobs
by eru 176 days ago
His grammar classification is really useful for formal grammars of formal languages. Like what computers and programming languages do.

It's of rather limited use for natural languages.

3 comments

"BNF itself emerged when John Backus, a programming language designer at IBM, proposed a metalanguage of metalinguistic formulas ... Whether Backus was directly influenced by Chomsky's work is uncertain."

https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form

I'm not sure it required Chomsky's work.

Oh, lots of stuff gets invented multiple times, when it's "in the air". Nothing special about Chomsky here. And I wouldn't see that distracting from this particular achievement.
Don't you think people would have figured it out by themselves the moment programmers started writing parsers? I'm not sure his contribution was particularly needed.
Lots of things get invented / discovered multiple times when it's in the air. But just because Newton (or Leibnitz) existed, doesn't mean Leibnitz (or Newton) were any less visionary.

For your very specific question: have a look at the sorry state of what's called 'regular expressions' many programming languages and libraries to see what programmers left loose can do. (Most of these 'regular expressions' add things like back-references etc that make matching their franken-'xpressions take exponential time in the worst case; but they neglect to put in stuff like intersection or complement of expressions, which are matchable in linear time.

Just checked after reading your comment. Surprisingly to me, AFAs (Alternating Finite Automatons) do let you introduce the Complement operation into Regex while preserving the O(mn) complexity of running NFAs.

That's really subtle, because deciding Regex universality (i.e. whether a regex accepts every input) is PSPACE-COMPLETE. And since NFAs make it efficient to decide whether a regex matches NO inputs, any attempts to combine NFAs with regex Complement would trip on a massive landmine.

Actually, I've now checked more thoroughly, and RE+complement matching is O(n^2 m), which is not that good.
See https://en.wikipedia.org/wiki/Regular_language#Closure_prope...

The complement of a regular language is a regular language, and for any given regular language we can check whether a string is a member of that language in O(length of the string) time.

Yes, depending on how you represent your regular language, the complement operator might not work play nicely with that representation. But eg it's fairly trivial for finite state machines or when matching via Brzozowski derivatives. See https://en.wikipedia.org/wiki/Brzozowski_derivative

See also https://github.com/google/redgrep

Regular languages have a bunch of closure properties. In practice, intersection and complement are really useful. So you can say things like:

[regular_expression_for_matching_email_addresses] & ![my_list_of_naughty_words]

Exercise for the reader: expand the expression above to allow the town of 'Scunthorpe'.

It's incredibly useful for natural languages.
I'm a big Chomsky nerd, Chomsky fan, and card-carrying ex Chomskyan linguist. I hate to break it to you, but not even Chomsky thought that the Chomsky hierarchy had any very interesting application to natural languages. Amongst linguists who (unlike Chomsky) are still interested in formal language classes, the general consensus these days is that the relevant class is one of the so-called 'mildly context sensitive' ones (see e.g. https://www.kornai.com/MatLing/mcsfin.pdf for an overview).

(I suppose I have to state for the record that Chomsky's ties to Epstein are indefensible and that I'm not a fan of his on a personal level.)