by phaker 2020 days ago
> Do you think it is the responsibility of academia to teach "industry relevant stuff"?

I don't think that was what ibains was going for, though i don't fault you for seeing this in that comment. (Especially since i don't think this course suffers from that problem, and generally i think things are rapidly improving on this front.)

That problem, to me and i think to him too, is that quite a lot of what such courses teach as well-thought-out, well-working solutions and approaches just doesn't work very well, and i frequently find comments on what's a 'good taste' solution and what isn't that are completely at odds with my understanding of the problem space.

E.g. in parsing (as that's the topic he mentioned): First, lots of people just spend way too much time on it. And they focus on parts that are of zero use to beginners (like explaining all the various grammar families) and then use obtuse parser generators that save no work, sometimes in bizarre ways (like picking an LALR parser generator, then hand-editing the output to support a not-quite-LALR language). Meanwhile a recursive descent parser is easy to write, fast, and gives pretty good error messages with _very_ little work. You do need to know enough about grammars to be able to write one down and know if it describes an ambiguous language etc., so this should be taught, but you don't need to understand the language zoo well.
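
To make the "easy to write, decent error messages" claim concrete, here is a minimal sketch of a recursive descent parser for integer arithmetic. The grammar, the `Parser` class, and the error wording are assumptions chosen for illustration, not taken from any course or from the comment above. Each grammar rule becomes one method.

```python
import re

class ParseError(Exception):
    pass

def tokenize(text):
    """Split the input into (kind, value, column) triples."""
    tokens = []
    for m in re.finditer(r"\s*(?:(\d+)|(\S))", text):
        number, symbol = m.groups()
        if number is not None:
            tokens.append(("num", int(number), m.start(1)))
        else:
            tokens.append(("op", symbol, m.start(2)))
    tokens.append(("end", None, len(text)))
    return tokens

class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.i = 0

    def peek(self):
        return self.tokens[self.i]

    def advance(self):
        tok = self.tokens[self.i]
        self.i += 1
        return tok

    def expect(self, symbol):
        _, value, col = self.peek()
        if value != symbol:
            raise ParseError(f"expected {symbol!r} at column {col}, found {value!r}")
        self.advance()

    # expr := term (('+' | '-') term)*
    def expr(self):
        result = self.term()
        while self.peek()[1] in ("+", "-"):
            op = self.advance()[1]
            rhs = self.term()
            result = result + rhs if op == "+" else result - rhs
        return result

    # term := factor ('*' factor)*
    def term(self):
        result = self.factor()
        while self.peek()[1] == "*":
            self.advance()
            result = result * self.factor()
        return result

    # factor := NUMBER | '(' expr ')'
    def factor(self):
        kind, value, col = self.peek()
        if kind == "num":
            self.advance()
            return value
        if value == "(":
            self.advance()
            result = self.expr()
            self.expect(")")
            return result
        raise ParseError(f"expected a number or '(' at column {col}, found {value!r}")

def parse(text):
    """Evaluate an arithmetic expression, raising ParseError on bad input."""
    parser = Parser(text)
    result = parser.expr()
    _, value, col = parser.peek()
    if parser.peek()[0] != "end":
        raise ParseError(f"unexpected {value!r} at column {col}")
    return result
```

Because every rule is an ordinary function, the error message can name exactly what was expected and where, with no extra machinery.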

2 comments

>I don't think that was what ibains was going for, though i don't fault you for seeing this in that comment

Isn't that literally exactly what he was going for?

"I’d like to sit down all university professors who teach compiler courses and teach them a course on what’s relevant."

You might believe academia should not impart knowledge students require for their subsequent jobs. Ask the students why they go to university.

See what is taught at UC Berkeley, with Spark and all coming out of there. I took a systems course with Eric Brewer - totally amazing - the context I got.

I think the ideal is really that a student should be capable of going into the world and picking up a new technology within the same area without too much trouble — I may have used ANTLR for a class, but I should also understand it well enough to write my own parser, or start looking into other parsers.

But if the gap is such that knowing ANTLR doesn’t really help me use packrat parsers (haven’t used them myself, so I don’t know) then it’s a gap probably worth filling.

But ideally students should learn the fundamental architecture/algorithms, not the specific tools, popular algorithms & implementations, and all the accidental complexity that came with it.

Which is where the industry vs academia problem usually comes in — industry often only cares about knowledge of a particular tool, and only sometimes cares about knowledge-transferability (I think largely because HR exists, and operates independently of the actual engineer making the request, with the first pass review).

The company hires for Hadoop, the university teaches distributed systems, and ideally the student can be trained in Hadoop specifics in short order.

India also offers an example of the extreme alternatives — heavily industry driven courses focusing on specific tools, and you end up with students having practical knowledge for current tooling... but no ability to transfer it. Like the kind of resource who refuses to program C# because they’re a Java developer, and it’s all they know

Most languages are context sensitive.

Most language tools are context free.

How did we go so wrong?

Languages being context sensitive results from hacking in new language features. It's a constant struggle to keep D context free :-/

Strictly speaking, D is not context free, nor is any statically typed programming language. That said, most languages, including D, have supersets that are context free and can be used to build an AST, which can then be refined further with context-sensitive checks (i.e. variables are used after declaration) and then even further with type checks.

Many language tools don't need a full blown language parser and can get by with a context-free parser for the language's superset. Things like autocompletion, spell checking, documentation/highlighting can all be implemented using context-free parsers.
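
The split described above can be sketched in a few lines. The toy statement language (`let x = 1; use x;`) and the function names `parse_program` and `check_declared` are hypothetical, made up for this illustration. The first function builds an AST without consulting any symbol table; the second applies the context-sensitive "declared before use" rule afterwards.

```python
def parse_program(source):
    """Context-free step: turn 'let x = 1; use x' into a list of AST tuples.

    No symbol table is consulted here, so undeclared names still parse.
    """
    ast = []
    for stmt in source.split(";"):
        words = stmt.split()
        if not words:
            continue
        if words[0] == "let" and len(words) == 4 and words[2] == "=":
            ast.append(("let", words[1], words[3]))
        elif words[0] == "use" and len(words) == 2:
            ast.append(("use", words[1]))
        else:
            raise SyntaxError(f"cannot parse statement: {stmt.strip()!r}")
    return ast

def check_declared(ast):
    """Context-sensitive step: report names used before their declaration."""
    declared = set()
    errors = []
    for node in ast:
        if node[0] == "let":
            _, name, value = node
            if not value.isdigit() and value not in declared:
                errors.append(f"{value} used before declaration")
            declared.add(name)
        else:
            _, name = node
            if name not in declared:
                errors.append(f"{name} used before declaration")
    return errors
```

A tool such as a highlighter would only need `parse_program`; a compiler would run both passes.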

I suppose we could argue about the definition of context free, but the compiler lexes it without any reference to the parser, and the parser builds the AST without any reference to symbols or semantic meaning.

> spell checking ... using context-free parsers

Only up to a point. D's spell checker uses the symbols in scope as its "dictionary". This means it is very good at guessing what you meant instead of what is typed, as the dictionary is very targeted. I.e. it uses the context, and the spell checker is part of the semantic pass.
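
A rough sketch of the idea, not D's actual implementation: the candidate list is whatever symbols are in scope at the point of the error, and the closest one is suggested. The function name `suggest` and the 0.6 cutoff are assumptions; `difflib.get_close_matches` is from the Python standard library.

```python
import difflib

def suggest(misspelled, symbols_in_scope):
    """Return the in-scope symbol closest to the misspelled name, or None.

    The 'dictionary' is just the current scope, so suggestions stay targeted.
    """
    matches = difflib.get_close_matches(misspelled, symbols_in_scope, n=1, cutoff=0.6)
    return matches[0] if matches else None
```

Collecting `symbols_in_scope` requires name resolution, which is why this kind of checker has to live in the semantic pass rather than in the parser.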

A context free grammar has a formal and rigorous definition. One way to see that Dlang does not have a strictly context free grammar for its AST is the following snippet of code:

A[B] C;

That is parsed differently depending on whether B is a constant integer expression, in which case it parses to the declaration of an array of type A with a size of B, or whether B is a typename, in which case it parses into a hash map with keys of type B and values of type A.

This disambiguation cannot be specified by the context free grammar and instead must be delayed until the semantic analysis (which is usually defined by a context sensitive grammar).

Pretty much all statically typed languages work this way with various kinds of ambiguities. In practice, it's not a particularly big deal. But that's kind of my point, there's no need to work exceedingly hard to force a language construct to be context free, especially if making it context free results in a lack of readability or awkward syntax.

It's perfectly fine to specify a superset of the language that is context free, parse an AST for the superset of that language and then resolve AST ambiguities in later phases, such as the semantic analysis phase or the type checking phase. Almost any tool written to assist an IDE with autocompletion or other functionality will have no problem working with the language superset.
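
As a sketch of that approach applied to `A[B] C;` (an illustration only, not how the D compiler is written): the parser emits one neutral node, and a later pass decides what it means once it knows what `B` refers to. The names `IndexedDecl`, `parse_decl`, and `resolve` are hypothetical.

```python
import re
from dataclasses import dataclass

@dataclass
class IndexedDecl:
    """Neutral AST node for 'base[index] name;' before its meaning is known."""
    base: str
    index: str
    name: str

DECL = re.compile(r"\s*(\w+)\s*\[\s*(\w+)\s*\]\s*(\w+)\s*;\s*")

def parse_decl(source):
    """Syntax only: the same node is produced whatever B turns out to be."""
    m = DECL.fullmatch(source)
    if m is None:
        raise SyntaxError(f"not a declaration: {source!r}")
    return IndexedDecl(*m.groups())

def resolve(decl, type_names, constants):
    """Semantic step: pick a meaning using the symbols that are in scope."""
    if decl.index in type_names:
        return ("associative_array", decl.index, decl.base)  # key type, value type
    if decl.index in constants:
        return ("static_array", decl.base, constants[decl.index])  # element type, size
    raise NameError(f"undefined identifier {decl.index!r}")
```

With `B` registered as a type, `resolve` reports an associative array; with `B` registered as the constant 4, it reports a static array of 4 elements. The parse step is identical in both cases.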

> That is parsed differently

It isn't parsed differently. Its semantic meaning depends on what B is, but not the parse.

You realize the poster you are responding to is the creator of D? And has spent his career writing compilers? I'm not sure he needs to have someone explain what a context free grammar is.

I understand you wish to flatter WalterBright, but please note that your comment is fairly toxic and not well suited to Hacker News. Please stay on topic and feel free to contribute to the points being made. This kind of hero worship often just degrades discussion and is better left to other forms of social media.

Yours is the toxic comment. Appeal to authority is fine if the authority is an authority and WB has proven his. Suggestions of intention to flatter and accusations of hero worship are pretty gross. @kingaillas's point is entirely valid IMO.

> Most languages are context sensitive. Most language tools are context free. How did we go so wrong?

We never did. We always knew that any real-world useful grammar is a multi-level attribute grammar (well, it's constraint-based, though depending on your definition of "attribute grammar", those are equivalent). That's why we like recursive-descent parsers so much: adding multi-level constraints to them is so easy. You have a full Turing-complete language at your disposal, unlike the simplistic 1-dimensional DSLs of most parser generators.
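
One small illustration of such a constraint, under assumptions of my own choosing: in a toy tag language, a closing tag must repeat the name of its opening tag. With an unbounded set of names, a plain context-free grammar cannot express that, but a hand-written parser checks it with one comparison. The function `parse_element` and the tag syntax are hypothetical.

```python
import re

TAG = re.compile(r"<(/?)(\w+)>")

def parse_element(text, pos=0):
    """Parse one <name>...</name> element at pos; return (tree, next_pos)."""
    m = TAG.match(text, pos)
    if m is None or m.group(1):
        raise SyntaxError(f"expected an opening tag at offset {pos}")
    name = m.group(2)
    pos = m.end()
    children = []
    while True:
        m = TAG.match(text, pos)
        if m is None:
            raise SyntaxError(f"expected a tag at offset {pos}")
        if m.group(1):  # closing tag
            # The context-sensitive constraint, written as ordinary code.
            if m.group(2) != name:
                raise SyntaxError(
                    f"</{m.group(2)}> at offset {pos} does not close <{name}>"
                )
            return (name, children), m.end()
        child, pos = parse_element(text, pos)
        children.append(child)
```

The name check sits right where the closing tag is consumed, so no separate pass or side channel is needed.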

Could this be viewed as supply exceeding demand?

Context-free grammars are ripe for theoretical computer science work even if they're not practically relevant. On the flipside, I suppose the constraints of context-free grammars are seen as a price not worth paying when designing a language.

Can you give an example (or three) where that lets us down? That would be very helpful to me I suspect.

Examples of which bit?

Languages that are context-sensitive? C, C++, Java, JavaScript.

Examples of tools based on the starting expectation that languages are context-free? Yacc, Bison, Jay.

Examples of the problems this causes? Well we're using the wrong tool for the job, right from the start. Instead of using an appropriate tool the first thing we do is bend our tools out of shape. We use side-effect actions to subvert the tool's model in an uncontrolled way. We don't get the full benefits of the tool's original model and can't rely on its guarantees.
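
A well-known instance of this in C is the typedef ambiguity: `T * x;` declares a pointer if `T` names a type and is a multiplication otherwise. The sketch below is my own simplified illustration of the usual workaround, not code from Yacc or Bison: a parser action records typedef names in shared mutable state, and token classification reads that state back. The function name `classify` is hypothetical.

```python
import re

def classify(statements):
    """Classify statements such as 'typedef int T;' and 'T * x;'.

    typedef_names is shared mutable state: the parsing side writes it and
    the token-classification side reads it, which is the feedback loop
    that falls outside a context-free model.
    """
    typedef_names = set()
    results = []
    for stmt in statements:
        words = re.findall(r"\w+|\*", stmt)
        if not words:
            continue
        if words[0] == "typedef":
            typedef_names.add(words[-1])  # side effect of a parser action
            results.append("typedef")
        elif len(words) == 3 and words[1] == "*":
            # The kind of the first token depends on earlier declarations.
            is_type = words[0] in typedef_names
            results.append("declaration" if is_type else "multiplication")
        else:
            results.append("other")
    return results
```

The result for `T * x;` changes with statement order, which is exactly the kind of behaviour the grammar alone cannot describe or guarantee.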