by runlaszlorun 976 days ago
I def have a soft spot for Pascal. And I think Niklaus Wirth deserves more recognition in broader circles for his foundational work with pcode, compilers, Oberon, etc. I learned Pascal like many of us growing up in the early PC era and never could look at BASIC the same way again (or respect Gates for his love of it, lol). I think having such a highly structured language at a young age did wonders.

But these days folks are mostly used to the C style syntax. And I'm not even arguing that it is a better language than C or others. But the whole industry has largely come to believe that anything newly 'invented' is good and anything that's been around a while is passé. Ironically, this happens while the core technologies we use are based on decades-old tech like Unix, relational databases, TCP/IP, etc. And many others like Lisp and Smalltalk fell by the wayside at least partly due to performance issues that were made irrelevant by Moore's law long ago.

Oh humans... :)

Btw, Logo is another one that's underappreciated. Seymour Papert was brilliant in making programming more visual and intuitive for kids. And I didn't know until recently that it's actually a Lisp-based language with a lot of power. Who knew?

In some parallel universe, I'd love to see folks like those, along with many others from that era, as the ones we heap recognition on instead of our worship of current tech billionaires. Those guys generally understood the hardware, software, and core theory. Given the mess that is computing and the internet, it's a shame that we'll be losing them over the next few decades.


I'm surprised that you draw a sharp distinction between C and Pascal syntax. They have a shared lineage and are really very close to each other. Yeah, curly braces won over "begin" and "end", but that's not a matter of some huge conceptual rift, just convenience.

There are numerous languages today, including Haskell and OCaml, that are far more removed from the Algol lineage than these two. Heck, the differences between Rust and C are probably more pronounced than between C and Pascal.

One huge difference between C and Pascal grammars is that Pascal is LR(1), so it can be parsed easily, which helps one-pass translation. It also helps humans read it.

C, on the other hand, has needlessly complicated syntax; a function definition is hard to detect, and a pointer to a function is hard to interpret, because it's literally convoluted: https://c-faq.com/decl/spiral.anderson.html

Sadly, this is a general stylistic difference: where Pascal tries to go for clarity, C makes do with cleverness, which is more error-prone.

You misremember. Pascal's grammar is "easy" because it is LL(1), not LR(1)!

C is almost LR(1), if we allow prior declarations to decide how some tokens are classified, like whether an identifier is a variable or type name.
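A minimal sketch of why prior declarations matter, assuming a C99-or-later compiler. The function names are hypothetical. The token sequence `T * p` appears in both functions, and only the earlier declaration of `T` tells the parser whether it is a declaration or a multiplication.

```c
/* At file scope, T names a type. */
typedef int T;

/* Here `T * p` is a declaration: p is a pointer to T (that is, to int). */
int read_through_pointer(int value)
{
    T * p = &value;
    return *p;
}

/* Here an inner declaration makes T an ordinary variable, so the very
   same tokens `T * p` form a multiplication expression. */
int multiply(void)
{
    int T = 6, p = 7;
    return T * p;
}
```

This is the feedback from the symbol table into the lexer that keeps C from being plain LR(1) on its token stream.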

Declarations like

  void (*signal(int, void (*fp)(int)))(int);
are LR(1).
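As a hedged illustration of how that declaration reads, the sketch below declares the same function type twice: once spelled out in full and once through a typedef. The name `my_signal` is hypothetical (chosen so it does not clash with the real function in `<signal.h>`), and the bookkeeping is invented for the example; it does not install real signal handlers.

```c
#include <stddef.h>

/* A pointer to a function taking int and returning void. */
typedef void (*handler_fn)(int);

/* Spelled out in full, in the style quoted in the thread. */
void (*my_signal(int signo, void (*fp)(int)))(int);

/* The same function type via the typedef. The compiler accepts this
   as a compatible redeclaration, which shows the two are equivalent. */
handler_fn my_signal(int signo, handler_fn fp);

static handler_fn current_handler = NULL;

/* Stores the new handler and returns the previous one. */
handler_fn my_signal(int signo, handler_fn fp)
{
    handler_fn previous = current_handler;
    (void)signo;               /* unused in this sketch */
    current_handler = fp;
    return previous;
}

void noop_handler(int signo)
{
    (void)signo;
}
```

Read inside out: `my_signal` is a function taking `(int, handler)` that returns a pointer to a function taking `int` and returning `void`.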

LR(1) sentences are harder to read than LL(1) because you have to keep track of a long prefix of the input, looking for right reductions (if you follow certain LR algorithms). LR parsing algorithms use a stack which essentially provides unlimited lookahead, in comparison to LL(1). Both LL(1) and LR(1) have one symbol of lookahead, but qualitatively it's entirely different because the lookahead in LR is happening after an indefinitely long prefix of the sentence which has not been fully analyzed, and has been shunted into a stack, to be processed later. Many symbols can be pushed onto the stack before a decision is made to recognize a rule and reduce by it. Those pushed symbols represent a prefix of the input that is not yet reduced, while the reduction is happening on the right of that. So it is backwards in a sense; following what is going on in the grammar is a bit like understanding a stack language like Forth or PostScript.

An LL(1) grammar allows sentences to be parsed in a left to right scan without pushing anything into a stack to reduce later. Everything is decidable based on looking at the next symbol. Under LL(1), by looking at one symbol, you know what you are parsing; each subsequent symbol narrows it down to something more specific. Importantly, the syntax of symbols that have been processed already (material to the left) is settled; their syntax is not left undecided while we recognize some fragment on the right.

Under LR(1) it's possible for a long sequence of symbols to belong to entirely unrelated phrase structures, only to be decided when something finally appears on the right. A LALR(1) parser generator outputs a machine in which the states end up shared by unrelated rules. The state transitions then effectively track multiple parallel contexts.
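To make the "decide by looking at the next symbol" idea concrete, here is a minimal sketch of a predictive recursive-descent parser, the usual hand-written way to handle an LL(1) grammar. The grammar (digits, `+`, `-`, `*`, parentheses) and all function names are assumptions chosen for illustration. The input is assumed to be well formed with no whitespace, and there is no error reporting.

```c
#include <ctype.h>

/* The parser only ever inspects *cursor: one symbol of lookahead. */
static const char *cursor;

static long parse_expr(void);

/* factor -> '(' expr ')' | digit+ */
static long parse_factor(void)
{
    if (*cursor == '(') {          /* lookahead '(' selects this rule */
        cursor++;
        long inner = parse_expr();
        if (*cursor == ')')
            cursor++;
        return inner;
    }
    long value = 0;                /* otherwise we are reading a number */
    while (isdigit((unsigned char)*cursor))
        value = value * 10 + (*cursor++ - '0');
    return value;
}

/* term -> factor { '*' factor } */
static long parse_term(void)
{
    long value = parse_factor();
    while (*cursor == '*') {
        cursor++;
        value *= parse_factor();
    }
    return value;
}

/* expr -> term { ('+' | '-') term } */
static long parse_expr(void)
{
    long value = parse_term();
    while (*cursor == '+' || *cursor == '-') {
        char op = *cursor++;
        long rhs = parse_term();
        value = (op == '+') ? value + rhs : value - rhs;
    }
    return value;
}

/* Entry point: evaluates a whole expression string. */
long evaluate(const char *text)
{
    cursor = text;
    return parse_expr();
}
```

Every branch is chosen from the single character at `cursor`, and nothing already consumed is ever reinterpreted.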

> C is almost LR(1),

Does that include the C preprocessor?

Somehow, I recall someone here (maybe it was user walterbright) suggesting that implementing a C preprocessor was a lot of work - maybe months - so one might consider using Facebook's MIT licensed preprocessor:

https://github.com/facebookresearch/CParser

The C preprocessor is a purely functional language![1]

[1] https://web.archive.org/web/20230714010215/http://conal.net/...

This is correct, thanks!
I'm very much not a C programmer, but I've never understood why it seems far more common to write `float *foo` instead of `float* foo`. The "pointerness" is part of the type and to me the latter expresses that far more clearly.
Because the syntax is:

  <specifiers> <declarator> {, <declarator>, ...} ;
The star is a type-deriving operator that is part of the <declarator>, not part of the <specifiers>!

This declares two pointers to char:

  char *foo, *bar;
This declares foo as a pointer to char, and bar as a char:

  char* foo, bar;
We have created a trompe l'oeil by separating the * from the declarator to which it belongs and attaching it to the specifier to which it doesn't.
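A small sketch that lets the compiler confirm the point, assuming C11 for `_Generic`. The macro and function names are hypothetical.

```c
/* Evaluates to 1 when x has type `char *`, otherwise 0 (C11 _Generic). */
#define IS_CHAR_POINTER(x) _Generic((x), char *: 1, default: 0)

int foo_is_char_pointer(void)
{
    char* foo = 0, bar = 0;   /* the * belongs to foo's declarator only */
    (void)bar;
    return IS_CHAR_POINTER(foo);
}

int bar_is_char_pointer(void)
{
    char* foo = 0, bar = 0;   /* bar is a plain char */
    (void)foo;
    return IS_CHAR_POINTER(bar);
}
```

The first function returns 1 and the second returns 0, whatever the spacing around the star suggests.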
On the other hand, for those of us who agree with the GP on this, one way around the pitfall is to have your project's style guide ban multiple declarations on a single line, or at least ban them for non-trivial variables— so `int x, y, z;` is permitted, but nothing more than that.
That's fine if it isn't used as a pretext for writing nonsense like char* p; which should likewise be banned in the same coding style document.
>This declares foo as a pointer to char, and bar as a char:

  char* foo, bar;
So that's why I've had so many problems understanding C. I come from the Pascal world, where a type specification is straightforward.
Because it isn't - `float* foo, bar;` foo is a pointer, bar is not.

(There were suggestions back in the 90s that, to make C easier for humans to parse (and not coincidentally simplify the compiler grammar), this should be `foo, bar: float*;` and your model of pointerness could actually be true. It never got much more traction than some "huh, that would be better, too bad we've been using this for 10 years already and will never change it" comments :-) with an occasional side of "maybe use typedefs instead".)

Good news (kinda): C23 allows (and GCC has for a long long time allowed) you to write typeof(float *) foo, bar; and declare two pointers. Not that I’d advocate writing normal declarations that way, but at least now you can write macros (e.g. for allocation) that don’t choke on arbitrary type names.
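A hedged sketch of that declaration style. It uses `__typeof__`, the GCC/Clang spelling that is accepted in every language mode; under C23 the same operator is spelled `typeof`. The checking macro and the function name are hypothetical, and `_Generic` requires C11.

```c
/* Evaluates to 1 when x has type `float *`, otherwise 0 (C11 _Generic). */
#define IS_FLOAT_POINTER(x) _Generic((x), float *: 1, default: 0)

int both_are_pointers(void)
{
    float value = 1.5f;

    /* The whole pointer type sits in the specifier position,
       so it applies to every declarator in the list. */
    __typeof__(float *) foo = &value, bar = &value;

    return IS_FLOAT_POINTER(foo) && IS_FLOAT_POINTER(bar);
}
```

Compare with `float* foo, bar;`, where only `foo` would be a pointer.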
Which is why the convention is usually to not permit multiple declarations in one line.

If you value your codebase anyway.

Declaring X,Y and Z on separate lines for a graphics routine would just be silly, they're all the same type.

Defensive programming that extreme reminds me of the behavior I learned to avoid pissing off my drunk dad.

I think it is a matter of preference, both "float* foo" and "float *foo" are widely used.

Personally i used "float *foo" for years until at some point i found "float* foo" more natural (as the pointer is conceptually part of the type) so i switched to that, which i've also been using for years. I've worked on a bunch of codebases which used both though (both in C and C++) - in some cases even mixed because that's what you get with a codebase where a ton of programmers worked over many years :-P.

I do tend to put pointer variable declarations on their own lines though regardless of asterisk placement.

(and of course there is always "float foo[42]" to annoy you with the whole "part of the type" aspect :-P)

Here's how I understand it.

One important (and beautiful) thing to understand about C is that declarations and use in C mirror each other.

Consider the same type written in Go and C: array of ten pointers to functions from int to int.

Go: var funcs [10]*func(int) int

C: int (*funcs[10])(int)

Go's version reads left to right, clearly. C version is ugly.

But beautiful thing about C version is that it mirrors how funcs can be used:

(*funcs[0])(5)

See how it's just like the declaration.

Go's version doesn't have this property.
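A minimal sketch of the "declaration mirrors use" property. The helper functions `add_one` and `twice` are hypothetical, and only the first two array slots are filled; the rest are null pointers and must not be called.

```c
static int add_one(int x) { return x + 1; }
static int twice(int x)   { return x * 2; }

/* Array of ten pointers to functions from int to int. */
int (*funcs[10])(int) = { add_one, twice };

/* The call expression has the same shape as the declaration above. */
int call_like_declaration(int index, int arg)
{
    return (*funcs[index])(arg);
}

/* C also allows calling through the pointer without the explicit *. */
int call_shorthand(int index, int arg)
{
    return funcs[index](arg);
}
```

With these definitions, `call_like_declaration(0, 5)` yields 6 and `call_shorthand(1, 5)` yields 10.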

So, now about the *.

Usage of * doesn't require spaces.

If p is a pointer to int, you use it like this: *p

And not like this: * p

And since type declarations follow usage, therefore "int *p" makes more sense.

There is also a good argument about "int *p, i". In the end, these usages follow from how the C grammar works.

There are many more musings about that on the web, but here is one of my favourites: https://go.dev/blog/declaration-syntax.

Edit: HN formatting.

The * binds the name, not the type.

https://godbolt.org/z/GsoxrWdrG

For the same reason you don’t write x+y * z: because then the spacing contradicts the way the priorities work in the language.

We might wish for the C declaration syntax to be <type> <name>[, ...], but it’s not: it’s <specifier>[ ...] <declarator>[, ...], where int, long, unsigned, struct stat, union { uint64_t u; double d; }, and even typedef are all specifiers, and foo, (foo), (((foo))), *bar, baz[10], (*spam)(int), and even (*eggs)[STRIDE] are all declarators (the wisdom of using the last one is debatable, but it is genuinely useful if you can count on the future maintainer to know what it means).
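For the last declarator in that list, here is a hedged sketch of where a pointer to an array is useful. The value of `STRIDE` and the function names are assumptions made up for the example.

```c
#define STRIDE 4

/* eggs is a pointer to an array of STRIDE ints, so eggs[r] is a whole
   row and eggs[r][c] is one element of it. */
long sum_rows(const int (*eggs)[STRIDE], int rows)
{
    long total = 0;
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < STRIDE; c++)
            total += eggs[r][c];
    return total;
}

long demo_sum(void)
{
    /* A 2-D array decays to a pointer to its first row,
       which has exactly the type sum_rows expects. */
    const int grid[2][STRIDE] = { {1, 2, 3, 4}, {5, 6, 7, 8} };
    return sum_rows(grid, 2);
}
```

Incrementing such a pointer advances by a full row of `STRIDE` elements, which is why it suits row-by-row walks over contiguous data.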

Everybody is free to not like the syntax, but actively misleading the reader about its workings seems counterproductive.

I swear if I see one more "int* x, y" example I will flip a keyboard. This isn't the 90s anymore. Everyone and their mother knows about this one pitfall, repeated for decades by people who have yet to read the memo that we now declare one variable per line, so it is never an issue. Even if you make this mistake, it's the least of your worries when coding in C because the compiler will typically warn you when you try to use the integer as a pointer.

Get with the program: types on the left, names on the right, one declaration per line.

Humans can normally treat C as if it is LR and get away with it, except for a few places that they can often recognize and avoid. It is a bad shortcut, but still one that you can get by with taking in a lot of code.
Everywhere but typedef can be made LR(1) IIRC
Sure, but "get away with" is not something to strive for in a programming language. I've always hated C for the unnecessary brainpower it sometimes takes to parse a construct.
Historically, the point of writing 'begin' and 'end' instead of using curly braces was mostly support for non-ASCII character sets where the curly braces are not included. It's why C also has an alternate syntax using <% and %> and COBOL goes as far as writing out arithmetical operators as English text, such as DIVIDE x INTO y GIVING z.
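A tiny sketch of that alternate spelling, assuming a compiler in its default mode (digraphs were added to C by the 1995 amendment). `<%` and `%>` stand for the curly braces, and `<:` and `:>` for the square brackets. The function name is hypothetical.

```c
/* The same function could be written with { } and [ ]. */
int sum_three(void)
<%
    int values<:3:> = <% 1, 2, 3 %>;
    return values<:0:> + values<:1:> + values<:2:>;
%>
```

The compiler treats the digraphs as exact synonyms for the usual punctuators, so this compiles to the same code as the conventional spelling.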
Ada, at least, uses begin...end in part because it prevents certain kinds of errors. In its syntax you have to specify what you are ending, reducing the risk of invalid matches and increasing the likelihood of the error report system guessing correctly what you intended. E.g.:

    if X > 0 then
      Y := 0;
    end if;
Curly braces are shorter, but a close curly brace will match any open curly brace. Such is the nature of trade-offs.
Ngl, I think that's brilliant. Braces matching the wrong brace is like a daily occurrence. It's such a tedious small thing that constantly hounds me whenever I'm writing code
In languages without this feature (most of them), you sometimes see long blocks get labeled at the end anyway. On the other hand, you could argue that if your block is long enough it doesn't fit on the screen, then it should be its own function anyway.

Like this, except replace "..." with many lines of code.

  if (z.p == z.p.p.left) {

      ...

  } else { // z.p != z.p.p.left

      ...

  } // if
> In languages without this feature (most of them), you sometimes see long blocks get labeled at the end anyway.

True. However, in Ada at least, if the block types don't match then it's a syntax error detected at compile time by the compiler. Comments like those listed above are often not checked at compile time, and thus aren't very useful for preventing errors.

Most programming languages now support reformatters out of the box. Part of the point of those is to make mismatched closing braces more visible.
Rainbow brackets, so the brackets themselves are color coded to match with their corresponding partner, are also a godsend.
Even with all these helpers, there's too much cognitive overhead and not enough that IDEs or plugins can do to take that away. Rainbow braces are nice and all, but it's not enough when the underlying concept is broken.
Scala 3 actually has a brace-less mode that supports this.
You can do this with PHP, too, but it’s on the rare side except in some templating niches.

The reverse-reserved-word convention like ‘fi’ to end an if block in shell (and other?) languages seems like it functions this way too.

And I guess significant indentation also does this job, albeit with some of its own hazards.

I was under the impression that COBOL's English syntax was intended to be a more human-readable approach, not so much a workaround for character set limitations.
Maybe both? COBOL predates the first draft of ASCII by several years. Character sets were far from standardized in those days.
Lots (most?) of classic COBOL used EBCDIC[0]

[0] https://en.wikipedia.org/wiki/EBCDIC

Well some of the great ideas of yesteryear are having a bit of a rebirth. Many of the safety concerns of Ada are being brought into the limelight with Rust. And of course Erlang blew up a while back after spending forever in the (relative) shadows.

Overall we're still stuck in a bit of a near-monoculture of JS(TS) and Python, but it's a far cry from back in the day where there was very little openness to the sole blessed corporate stack (Typically Java or C#/CLR). I think we can only handle so many mainstream languages, but I do love all the experimentation and openness going on.

My impression from my limited experience with Pascal (more recently only playing around a bit with Free Pascal) is that much of the basic parts of the language, that a beginner like me is mostly exposed to, is very safe with strong typing and range-checking on all arrays. I do not know how far you can get in practical programming while sticking to those parts of the language, or how unsafe it gets once you start playing with pointers? I think the pointers are supposed to at least be safer than in C, unless you go out of your way to break things?
> But these days folks are mostly used to the C style syntax. And I'm not even arguing that it is a better language than C or others. But the whole industry has gone overall into believing that anything newly 'invented' is good and anything that's been around a while is passé

The problem with the Pascal syntax is that it prevents adoption of certain constructs, or makes them just not nice. A few examples

- lambda expression: `begin` ... `end`, say goodbye to nice one liners;

- binary assign: FPC has `+=` `-=` but obviously not `mod=`, `and=`, etc;

On top of that there are other things like

- shortened boolean evaluation (e.g. `if someInt` => `if someInt != 0`) is not possible because `and` is a two headed creature

- locals are still not default initialized

I used to like Pascal (actually Delphi then ObjFPC) a lot, but nowadays I think the only good parts are in certain semantics, e.g. no fallthrough in the `case`...`of` construct, manual memory management BUT ref counted arrays.

I would rather have a C-like syntax with certain semantics coming from the Pascal world.
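For readers who have not run into it, here is a minimal sketch of the C `switch` behavior that the `case`...`of` remark contrasts with. The function name is hypothetical.

```c
/* Counts how many case bodies run for a given n. */
int count_with_fallthrough(int n)
{
    int hits = 0;
    switch (n) {
    case 1:
        hits++;
        /* fall through */
    case 2:
        hits++;
        break;
    default:
        break;
    }
    return hits;
}
```

With `n == 1` both bodies run because nothing stops execution from continuing into `case 2`. In a Pascal `case` statement each branch ends on its own.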

The problem with "one liners" and other such coding is that it's generally too clever. One of the things I like about Python (despite not liking it that much) is that for certain constructs there is "the one true way"; for example, the required formatting trains people to all read the same code. With C and similar languages there are dozens of different but ultimately identical ways to express and format the same construct; it's crazy. And it creates unnecessary mental overhead, never mind the if..fi..else ambiguities that aren't even standardized behavior.

So, much of what you are complaining about is largely pointless syntactic sugar issues, like people complaining about the difficulty of typing "begin" vs "{" when any modern editor can autocomplete, and never mind that the difficult parts of programming are rarely limited by how fast one can type 5 characters vs 1. I might even go so far as to say that slowing down a bit probably increases the code quality.

(PS: I've programmed professionally in pretty much every mainstream language and quite a number that aren't mainstream. IMHO Object Pascal strikes a far better balance of performant code, ease of development and maintenance, and developer safety than most of the languages in modern use, maybe all of them. It's frankly a shame that more places don't take it more seriously and would rather invent yet another poor half-baked language that takes another few thousand man-years of effort for the compiler writers and the users to overcome its flaws as they are discovered.)

> - lambda expression: `begin` ... `end`, say goodbye to nice one liners;

If there's one thing I would eliminate from programming, despite its benefits, it is the one-liner lambda expression. It has turned clean, readable Python code into muddy statements I need to pause to compile in my head to understand.

I am not a fan.

BEGIN/END needn't be a showstopper for one-line lambdas. Ruby has nice one-line lambda expressions where you can substitute `{` ... `}` for `do` ... `end`. A Pascal implementation can't do exactly that, but it could use an alternative syntax.
elixir allows one-line lambdas with `fn ... end`. they don't look as nice as they would have with braces but the parser handles them fine.
Wirth's disdain for C++ sometimes leads to chuckles in the audience at public lectures ("C with 2 plusses")
> C with 2 plusses

I read it as "C with two pulses"... Which is how I feel about C++ - unnecessarily complicated.

I don't understand the joke you're making.

How does "pulse" equate to complicated?

Two redundant cardiovascular systems that don't work together, presumably in such a way that a heart attack or exsanguination in either is lethal (as opposed to modular redundancy for improved reliability, which mostly isn't relevant at the software level).
C++ isn't really one language, but a language family with dozens if not more dialects. If you read some codebases from different places, you will sometimes, or even often, barely recognise what's going on (unless you are an advanced user).

In C++, people very much tend to pick a subset, so different pulses beat concurrently, if you will.

same here
> But these days folks are mostly used to the C style syntax.

Mostly, but I'm told the new Austral[1] language has syntax very similar to Pascal's.

1: https://austral-lang.org/

Unfortunately, LOGO uses dynamic scope. It could easily be a viable language aside from that one misfeature.
So does ELisp, by default :-)
During home school my daughter was learning about how data structures are represented in memory, which gave us the opportunity to explore how to represent strings as arrays, and thus: Pascal strings vs C strings
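A hedged sketch of the two layouts in C, assuming the classic length-prefixed Pascal short string (one length byte followed by up to 255 characters). The type and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Pascal-style short string: the length is stored up front. */
typedef struct {
    unsigned char length;   /* 0..255 */
    char data[255];         /* not NUL-terminated */
} PascalString;

/* Builds a Pascal-style string from a C string, truncating at 255. */
PascalString pascal_from_c(const char *text)
{
    PascalString s;
    size_t n = strlen(text);
    if (n > 255)
        n = 255;
    s.length = (unsigned char)n;
    memcpy(s.data, text, n);
    return s;
}

/* Pascal-style length: a single read of the stored byte. */
size_t pascal_length(const PascalString *s)
{
    return s->length;
}

/* C-style length: scan until the terminating NUL byte. */
size_t c_length(const char *text)
{
    size_t n = 0;
    while (text[n] != '\0')
        n++;
    return n;
}
```

The trade-off shows up directly: the Pascal layout gives constant-time length and can hold embedded zero bytes but caps the size at 255, while the C layout has no built-in cap but needs a scan to find the end.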
>But the whole industry has gone overall into believing that anything newly 'invented' is good and anything that's been around a while is passé.

I think this is partially accepted to keep wages down. New languages allow fresh developers to be on a level playing field with more senior developers. Both have say 2 years experience in said new language. Fresh developers are cheaper and therefore push down wages.

It’s almost like there are improvements in PL design.

Also, at better places the language itself is a tool - someone will be a senior in any other language as well.

>It’s almost like there are improvements in PL design.

Eh, not really, at least not in the last 10+ years. I'm sure some obscure hotness does something neat, but mostly inconsequential for the vast, vast majority of shops.

>someone will be a senior in any other language as well.

While I agree with you, that often isn't the opinion of people hiring. If someone is looking for 2 years of Java, in most places, 10 years of C# isn't what they are willing to hire.