by whakim 1115 days ago
I think there's a piece of insight missing from the author's analysis of non-programmatic sigils. To wit, the sigils are only valuable when both parties deeply understand the information that the sigil is trying to convey. The "$framework at $dayjob" example illustrates this point. Programmers familiar with the use of sigils to indicate variables intrinsically grok this phrase, but it looks like gobbledygook to non-programmers. The email inbox example is similar. (I'd argue the hashtag/@-symbol example is a bit more complicated, because those symbols service important UX functions.)

I think this insight crystalizes the trade-off. I agree with the author that sigils are a powerful way of communicating useful information in a concise fashion. But does their inscrutability to non-expert users justify their existence? I'd argue it usually doesn't. Whenever I've had to pick up a language that uses a lot of sigils (or even just had to read source code in one of those languages if I don't use it daily), I always find the sigils require a bit of extra mental effort to process. It seems like other languages manage to express meaning in a way that is less burdensome to non-experts.

4 comments

> because those symbols service important UX functions

As I read the post, I was thinking that #tags and @mentions are primarily about input, not reading. It's easier to just whack some #random #tags in your #sentences than to switch to a separate tag list input. Similarly, highlighting some text in order to apply the "mention" brush like we might with bold or italics would be strictly worse.

I might agree with you if it weren't for the fact that some of the most popular beginner languages use sigils:

- BASIC

- Shell scripting

- PHP

It’s also worth noting that all languages have special tokens to identify properties of the code. Eg why does a string need to be wrapped in quotation marks but integers do not? Why do single and double quotation marks behave differently in some languages? Why do function names behave differently if you pass () vs not including parentheses in some languages?

At the end of the day, if you want to learn to program then you are always going to have some degree of syntax that you just have to learn. Sigils aren’t inherently hard but some languages make them more abstract than others.

Another thing that’s worth bearing in mind is that sigils solve a problem in languages that make heavy use of barewords, such as shells. Eg how do you know if foobar is a variable, function, keyword, parameter, etc if your syntax is

   echo foobar
This is why other languages then use quotation marks, parentheses, etc. But while that’s arguably more readable, it’s a pain in the arse for REPL work in a shell (I know because I’ve tried it).
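
To make the bareword problem concrete, here is a minimal Python sketch of how a leading `$` lets a shell-like parser tell a variable from a literal word. The `classify` function and the `variables` table are hypothetical names made up for illustration, not how any real shell is implemented.

```python
def classify(token, variables):
    """Classify one shell-like token by its leading `$` sigil (illustrative only)."""
    if token.startswith("$"):
        # The sigil marks the token as a variable reference.
        name = token[1:]
        return ("variable", variables.get(name, ""))
    # Without a sigil the token is a bareword and is taken literally.
    return ("bareword", token)


variables = {"foobar": "hello"}
print(classify("foobar", variables))   # ('bareword', 'foobar')
print(classify("$foobar", variables))  # ('variable', 'hello')
```

Without the sigil, the parser would need quoting or some other marker to decide which of the two meanings was intended.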

So there are always trade-offs.

> I might agree with you if it weren't for the fact that some of the most popular beginner languages use sigils

20 years ago I might've agreed with you. But I do not think that PHP, BASIC and shell scripting are popular beginner languages in 2023.

> It’s also worth noting that all languages have special tokens to identify properties of the code. Eg why does a string need to be wrapped in quotation marks but integers do not?

Quotation marks and especially parentheses after function calls don't fit TFA's definition of a sigil because they aren't at the beginning of the word and (arguably only in the latter case) don't communicate meta-information about the word.

> At the end of the day, if you want to learn to program then you are always going to have some degree of syntax that you just have to learn.

I'll agree with you that the line between sigils and general syntax/punctuation is a bit of a blurry one - where do you stop? Using my definition above, I think wrapping strings in quotation marks is a clear win because it fits our widely-held shared understanding that quotation marks demarcate and group a sequence of words. Single and double quotes behaving differently is unintuitive for the same reason while not conferring a corresponding benefit on experts.

> 20 years ago I might've agreed with you. But I do not think that PHP, BASIC and shell scripting are popular beginner languages in 2023.

PHP and shell scripting are still massively used in 2023 (eg https://madnight.github.io/githut/#/pull_requests/2023/1). You have a point about BASIC but it was the de facto standard for computers at a time when people didn't have the web to quickly look up problems and thus learning to code was much harder. Yet we (in fact I) managed just fine.

> Quotation marks and especially parentheses after function calls don't fit TFA's definition of a sigil because they aren't at the beginning of the word and (arguably only in the latter case) don't communicate meta-information about the word.

I didn't say they are sigils. I said they're tokens. My point was that removing sigils doesn't remove meta-information encoded in magic characters:

- You have `foobar()` where the braces denote calling the function rather than passing the function reference

- "" == string which allows escaping and/or infixing vs '' which doesn't (other languages have different tokens for denoting string literals, like `` in Go)

- # in C and C++ introduces a preprocessor directive (eg a macro)

- // is a line comment in some languages. Others use #, or --

- Some languages use any of the following for multi-line comments: ```, /* */, and even {}, whereas {} is an execution block in some other languages

My point is you have to learn what all of these tokens mean regardless of whether they sit as a prefix or not. The fact that they're a sigil doesn't change anything.
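
As a small illustration of the first bullet, this Python sketch shows the same name meaning two different things depending on whether the trailing parentheses are present. Python is just a convenient example here and `foobar` is a placeholder function.

```python
def foobar():
    return 42


reference = foobar   # no parentheses: the function object itself
result = foobar()    # parentheses: call the function and keep its return value

print(callable(reference))  # True
print(result)               # 42
```

Nothing about the name `foobar` changes between the two lines; the meaning is carried entirely by a token you have to learn.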

The real complaint people are making here is about specific languages, like Perl, overloading sigils to do magical things. That is a valid complaint but, in my opinion, it's a complaint against overloading tokens rather than sigils specifically. Much like a complaint about operator overloading doesn't lead to the natural conclusion that all operators are bad.

> don't communicate meta-information about the word.

We need to be careful about our assumption about whether a token effectively communicates meta-information because while I do agree that some tokens are more intuitive than others, there is also a hell of a lot of learned behaviour involved as well. And it's *really* hard to separate what is easier to understand from what we've just gotten so used to that we no longer give it a second thought.

This is a massive problem whenever topics about code readability come up :)

> I'll agree with you that the line between sigils and general syntax/punctuation is a bit of a blurry one - where do you stop?

shrugs...somewhere...? You can't really say there should be a hard line that a language designer shouldn't cross because it really depends on the purpose of that language. For example the language I'm currently working on makes heavy use of sigils but it also makes heavy use of barewords because its primary use is in interactive shells. So stricter C-like strings and function braces would be painful in a read once write many environment (and I know this because that was my original language design -- and I hated using the shell with those constraints).

In a REPL environment with heavy use of barewords, sigils add a lot to the readability of the code (hence why Perl originally adopted sigils, and why AWK, Bash, PowerShell, etc all use them).

However in lower level languages, those tokens can add noise. So they're generally only used to differentiate between passing values vs references.

But this is a decision each language needs to make on a case by case basis and for each sigil.

There also needs to be care not to overload sigils (like Perl does) because that can get super confusing super quick. If you cannot describe a sigil in one sentence, then it is probably worth reconsidering whether that sigil is adding more noise than legibility.

> Using my definition above, I think wrapping strings in quotation marks is a clear win because it fits our widely-held shared understanding that quotation marks demarcate and group a sequence of words. Single and double quotes behaving differently is unintuitive for the same reason while not conferring a corresponding benefit on experts.

Here lies the next problem for programming languages. For them to be useful, they need to be flexible. And as languages grow in age, experts in those languages keep asking for more and more features. Python is a great example of this:

- ''

- ""

- ''' '''

- """ """

- f""

...and lots of Python developers cannot even agree on when to use single and double quotes!
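
For anyone who doesn't write Python daily, a quick sketch of how those five forms relate. The variable names are made up for illustration.

```python
name = "world"

single = 'hello world'
double = "hello world"
triple_single = '''hello world'''
triple_double = """hello world"""
formatted = f"hello {name}"

# All five spellings produce the same str value here.
assert single == double == triple_single == triple_double == formatted

# Triple-quoted strings may span lines; only f-strings interpolate {name}.
multi = """hello
world"""
plain = "hello {name}"

print(repr(multi))  # 'hello\nworld'
print(plain)        # hello {name}
```

So unlike shells or PHP, single and double quotes behave identically in Python, which is probably why the choice between them ends up as a style argument.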

I tried to keep quoting simple in my own language but I ended up with three different ways to quote:

- '' (string literals)

- "" (strings with support for escaping and infixing)

- %() (string nesting. For when you need a string within a string within a string. Doesn't come up often but useful for dynamic code. A contrived example might look like: `tmux -c %(sh -c %(echo %(hello world)))`. There are certainly better ways you could write that specific code but you get the kind of edge case I'm hinting at.)
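
The reason a paired delimiter like `%( )` nests where `''` and `""` cannot is that the opener and closer are different characters, so a parser can count depth. A rough Python sketch of that idea follows; `read_nested` is a hypothetical helper written for this example and is an assumption about how such quoting could be read, not the actual parser of the language described above.

```python
def read_nested(text, start=0):
    """Return the contents of the first %(...) group at or after `start`.

    Hypothetical helper: every "(" raises the depth and every ")" lowers it,
    so inner groups are kept verbatim inside the outer string.
    """
    open_at = text.index("%(", start) + 2
    depth = 1
    for i in range(open_at, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return text[open_at:i]
    raise ValueError("unbalanced %( ... )")


command = "tmux -c %(sh -c %(echo %(hello world)))"
outer = read_nested(command)
middle = read_nested(outer)
inner = read_nested(middle)

print(outer)   # sh -c %(echo %(hello world))
print(middle)  # echo %(hello world)
print(inner)   # hello world
```

Each level peels off one layer of quoting without any escaping, which is the part that gets painful with identical open and close quote characters.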

As much as languages do need to be easy to learn, they shouldn't sacrifice usability in the process. So it is a constant balancing act trying to make something easy to learn, yet also powerful enough to actually have a practical use. Not to mention the constant push and pull between verbosity where some claim fewer characters (eg `fn` as a function keyword) improves readability because it declutters the screen from boilerplate, while others say terms like `function` are more readable because it is closer to executable pseudo-code. Ultimately you cannot please all of the people all of the time.

Ah, the classic trade-off between designing for the convenience of beginners or experts! I'm in camp expert, but I get that not all people are.

> Programmers familiar with the use of sigils to indicate variables intrinsically grok this phrase, but it looks like gobbledygook to non-programmers.

While I understand where you're coming from, I'd argue that programming-related concepts are all "gobbledygook to non-programmers", that's to be expected. Having something like (this is close to valid Raku but it's not)

    Positional[Any] ages = [42, 38, 25];
doesn't make it any easier than

    my @ages = [42, 38, 25];
unless you already have prior knowledge of arrays, assignments, types, etc.