Hacker News
by jerf 1116 days ago
I like to rate programming languages' features not by how much I use them when I'm in the given language, or how good they make me feel, but by how much I miss them when I'm in a different language, once I'm fluent in that language and writing in the native idiom. (This is important. If you're still trying to write X in Y, yes, you'll miss the features from X, but that's not a useful data point.)

By this metric, rather a lot of features turn out to be less important than they may seem at first. Many things are a zero on this scale that I think might surprise people still on their second or third language. From this perspective you start judging not whether a language has this or that exact feature that is a solution to a problem that you are used to, but whether it has a solution at all, and how good it is on its own terms.

So while sigils have a lot of company in this, they are also a flat zero for me on this scale. Never ever missed them. I did a decade+ of Perl as my main language, so it's not for lack of exposure.

(As an example of something that does pass this test: Closures. Hard to use anything lacking them, though as this seems to be a popular opinion nowadays, almost everything has them. But I'm old enough to remember them being a controversial feature. Also, at this point, static types. Despite my decades of dynamic typed languages, I hate going back to dynamic languages anymore. YMMV.)


> So while sigils have a lot of company in this, they are also a flat zero for me on this scale. Never ever missed them. I did a decade+ of Perl as my main language, so it's not for lack of exposure.

I tend to miss one specific sigil (or pair of sigils): the @ and @@ sigils in Ruby, that mean "instance variable" and "class variable" respectively. Having identifier shadowing between stack-locals, and what Java would call "members" and "statics", be literally impossible, is just so nice. Especially when you get it "for free" in terms of verbosity, rather than needing to type `self.class.` or something.

I also really quite like the interned-string-literal : sigils in Ruby/Elixir, though I'd be equally fine with the Prolog/Erlang approach of barewords being symbols and identifiers needing to be capitalized. As long as there's some concise syntax for interned strings, especially in the context of dictionary keys. Because otherwise people just won't use them, even when they're there in the language. (See: Java, Python, ECMA6.)

Speaking of Elixir, the "universal sigil" ~ is kind of amazing. Define a macro sigil_h/2, and you can suddenly write ~h/foo/bar (or ~h[foo]bar, or whatever other delimiter works to best avoid the need for escaping), and foo and bar will be passed to sigil_h/2 as un-evaluated AST nodes to do with as you please. The language gives you ~w by default (which works like Ruby %w); but more interestingly, Regex literals in Elixir are just sigil_r.

> I tend to miss one specific sigil (or pair of sigils): the @ and @@ sigils in Ruby, that mean "instance variable" and "class variable" respectively. Having identifier shadowing between stack-locals, and what Java would call "members" and "statics", be literally impossible, is just so nice. Especially when you get it "for free" in terms of verbosity, rather than needing to type `self.class.` or something.

When I went from C++ to Python, the explicit "self" felt weird but over time, I felt it was much better. This became a lot more obvious in Rust. In C++ you get an implicit `this` variable and you get weird trailing keywords on functions to modify the `this` variable. Granted, these kinds of use cases won't be needed in every language. However, I also feel like sigils for this would be less understandable for someone unfamiliar with the language than explicit `self`. Something I judge a language on is how easy the code is to casually maintain by a group that is trying to get other stuff done.

The thing about Ruby is that it has uniform syntax: a.b, for any a and b, means "send a message :b to a." So `self.foo` (and `self.foo = bar`, too!) are possible to write, but these are always interpreted as message sends (to the :foo and :foo= methods, respectively), not as direct field accesses. The "syntax-ness" of @ and @@ shows that you're specifically breaking out of† the paradigm of "everything is a message send", to instead "just" access a field. It's what makes this make sense:

    def foo # define a getter method
      @foo # in terms of a field access
    end
How would you write that, if the field access was spelled `self.foo`? The language wouldn't be able to tell that you're not just recursively calling the getter!

---

† Though, technically, you're not breaking out of the paradigm; @foo is short for self.instance_variable_get(:@foo). It's message-sends all the way down, until you hit natively-implemented methods.
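
For comparison, here is a minimal Python sketch of the same getter in a language where attribute reads and method calls share the `self.x` spelling. The class name `Widget` and the backing name `_foo` are illustrative assumptions, not anything from the thread; the leading underscore is only a naming convention standing in for what Ruby's `@` does syntactically.

```python
class Widget:
    def __init__(self, foo):
        # The underscore-prefixed name is the actual stored attribute.
        self._foo = foo

    @property
    def foo(self):
        # Getter defined in terms of the backing attribute. Writing
        # `return self.foo` here would invoke this property again and
        # recurse until Python raises RecursionError.
        return self._foo


w = Widget(42)
print(w.foo)  # prints 42
```

Without a sigil, the separation between "the field" and "the getter" has to be carried by a second name, which is roughly the shadowing risk the comment above describes.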

> How would you write that, if the field access was spelled `self.foo`? The language wouldn't be able to tell that you're not just recursively calling the getter!

You can require parentheses for method calls and put methods and fields in separate namespaces.

https://play.rust-lang.org/?version=stable&mode=debug&editio...

Elixir supports paren-free calls but the default linter and formatter won't let you use them except for a few whitelisted DSLs. I've never missed having them.

In languages that require parens for method calls, not using them usually gets you a method handle. Which is still in conflict with a field reference — usually because methods are just function-pointer-typed static fields.
That's why I said "you can" rather than "you must". And linked to an example of a language with the property I described.
As someone who infrequently had to touch Ruby code, this was maddening. Years later, I am only now finding out what was going wrong and getting a better sense of what search terms to use.

As I said, I'm a big proponent of languages being approachable for those infrequent one-off cases. I've been burned by the challenge of updating the "handful" of Perl and Ruby scripts (and Perl was my first language). This is why I advocate against Lua and 1-indexing when the target audience is programmers and it isn't a "primary" language.

I also have to touch Ruby code from time to time, so when I found out I don't quite understand what "@" and "@@" mean (other parts, even blocks, were kinda more or less apparent), I... went and read the docs. Took me an hour or two but now I know what "@" and "@@" mean and actually think they're a pretty ingenious solution.
I use explicit this.thing in C# as well. It started from inheriting some coding standards / projects whose designers came straight from C and didn't do the idiomatic _variable thing for instance variables.

Now it's quite an entrenched habit and at this stage I'd prefer if the implicit access wasn't possible.

Agree on Ruby ivar/cvar sigils. We ran into some nasty variable shadowing, especially with autowiring frameworks, in big Java projects over the years.
Yes, those are handy. I don't much care for all the other ones in Ruby. Maybe the regexp one.
Thanks for putting into words something I've started to feel over time, but never conceptualized clearly.

I agree that closures pass the test - and I too remember when they weren't popular. I also remember what I did before learning about the very idea of first-class functions and closures: I simulated them with some ad-hoc means (like function pointers in C/C++, or passing strings to be eval()-ed in PHP, etc.).

This, I think, is a useful heuristic: the things likely to pass your test are the ones which people who don't have and don't know about them still end up approximating anyway - meaning those things are natural solutions to some common problems.

I can think of a couple of other things that pass your test:

- Functions in general. It's the basic organizational primitive in code; working without them is Not Fun.

- Lisp-style macros. There are many problems that would be best solved with some surgical code generation, and having that option built into the language makes all the difference. Most languages don't have this type of macro - but that doesn't mean they aren't needed. Having done enough Lisp macrology, I saw that in those other languages I've always been coping. Missing them without knowing what they are.

Hell, look no further than webdev - these days, major frameworks like React, and every other minor library, and even the language evolution itself, all depend on running an external macro processor / code generation tool as part of your build pipeline.

A trivial but real one for me is being able to use non-alphanum chars freely in var and fn names. Being able to name a fn 'string->int' or especially something like 'valid?' seems very small but I really miss it in languages with more restrictions on names.
Although less encompassing than what you're talking about, I miss being able to name functions with a "?" at the end when they are a predicate; "isValid?".
That’s what the ‘is’ is for. If you have a ‘?’ character, you don’t need it. But tastes vary. I don’t even like the ‘is’.

1. if isValid()
2. if valid?()
3. if isValid?()
4. if valid()

Number 4 is nicest to my eyes. But I guess if the ‘?’ or ‘is’ (or both) is a promise to the user that the function is a true predicate, then I can see its utility.

'?' / 'is' get more useful with more complex predicate names than just 'valid', and they also help with certain corner cases in English. For example, what does the following code do:

  if(widget.free()) { ... }
Does it 1) check if the widget is "free" (whatever that means in the widget domain), or 2) frees the widget and checks the outcome of that operation?

If I saw something like this while reading code, I'd pause and carefully check what exactly is going on here.

In fact, I was going to write "Option 2) resembles resource management patterns, for example memory management in C", but then I checked and noticed that free() in C does not return a value, so this pattern would not exist with malloc()/free() - in other words, despite doing a bit of C and a lot of C++ in the past two decades, I still tripped over this.

Now compare with:

  if(widget.isFree()) { ... }
  if(widget.free?()) { ... }
Both resolve this ambiguity.

On that note, I'd love some kind of sigil for "asserting" functions - which are similar to checks, but instead of returning true/false, they ensure the argument is in the state described by the function name, or else they throw an exception. It's a pattern I've been using in exception-ful code to cut down on noise. For example:

  // Check if connected; if not, maybe run reconnection
  // logic or attempt some other form of recovery.
  // The only way control proceeds past this line is if
  // the session is connected; if it isn't and can't be,
  // exception is thrown.
  EnsureConnected(session);
   
  if(IsSomething(session, arg1)) {
    // ... some code requiring connected session
  }
  // ... more code requiring connected session
It's not a big deal, but in some cases, that "Ensure" or "Assert" looks weird, and I don't like inventing more synonyms for the same pattern.
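
A rough Python sketch of the "ensure" pattern described above, for readers who want to see it run. Everything here is hypothetical scaffolding: `Session`, `reconnect`, `can_reconnect` and `NotConnectedError` are made-up names, and the "recovery" is a stand-in for whatever real reconnection logic would apply.

```python
class NotConnectedError(Exception):
    """Raised when a session cannot be brought into the connected state."""


class Session:
    # Hypothetical stand-in for a real connection object.
    def __init__(self, can_reconnect=True):
        self.connected = False
        self.can_reconnect = can_reconnect

    def reconnect(self):
        # Pretend recovery logic: succeeds only if allowed to.
        self.connected = self.can_reconnect


def ensure_connected(session):
    """Return normally only if the session is connected; otherwise raise."""
    if not session.connected:
        session.reconnect()
    if not session.connected:
        raise NotConnectedError("session is not connected and could not recover")


session = Session()
ensure_connected(session)
# From here on, code may assume session.connected is True.
```

The point of the naming convention is that the call site reads as a statement of fact about everything below it, which is what a dedicated sigil would make visible at a glance.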
> Does it 1) check if the widget is "free" (whatever that means in the widget domain), or 2) frees the widget and checks the outcome of that operation?

Just program in Esperanto! So 1) would be "umo.libera()" and 2) be "umo.liberigu()".

I can't believe I still remember the grammar after what, 10 years of complete disuse?

Good points.

I’d avoid overloading a standard library function name like free().

Your request for syntax for assertions reminds me of the ‘guard’ keyword in Swift, which is good for making sure of preconditions.

Thanks for putting into words something that was on the edge of my mind, but never quite graspable.

Two more examples (for me?) of features that I find you really miss in a language even if you’re fluent in the local idioms: First-class functions and pattern matching.

Passing functions as values is so nice and afaik most modern languages have that feature nowadays. But I remember when it used to blow people’s minds.

Pattern matching is something I’ve missed ever since having it in Haskell. Such an elegant solution to a problem that you have just often enough that the typical native approach feels clunky.

Luckily pattern matching is also catching on.

Rust might be partially responsible for that, maybe? Python has also massively improved its pattern matching recently.

Dart 3 got very nice pattern matching [1]. And the next version of Java might introduce it (but likely it will still be behind a "preview" flag) as well.

[1] https://dart.dev/language/patterns

Perl's regex handling. That's the one I miss in whatever other language I program in. To be able to match AND get the matched substrings out in one line, makes this really succinct.
Indeed. I always need to look up the docs when I use Python regex, while in Perl it's so much more natural.
Isn't that the same in any expression-oriented language?

E.g. in ruby this could be

    puts $1 if /(\d+)/ =~ 'test 123' # perl-like
    
    if m = /(\d+)/.match('test 123') then puts m[1] end # perl-less
Perl's regexes do more things perhaps, but this is a relatively common thing, I believe.
    ($hour, $min, $sec) = $someTimeString =~ /(\d\d):(\d\d):(\d\d)/;

    if ($hour >= 12) {
       if ($min == 42) {
          doSomething();
       }
    }

Whereas in Python or Ruby, even though it's only a tiny extra step, there is a semantic distance in one's head. Parsing out and naming the parts of a regex on the same line is a convenience that, once you get used to it, you really miss.

    m = /(\d\d):(\d\d):(\d\d)/.match( someTimeString )
    # m[0] is the whole match; capture groups start at m[1]
    hour = m[1]
    min  = m[2]
    sec  = m[3]
    if hour >=  ...
but ruby, python etc.. can just extract the list too, it's just an extra method call on the same object (perhaps there could be destructuring too in modern ruby/python)

    hour, min, sec = /(\d\d):(\d\d):(\d\d)/.match('10:11:12').captures
or

    hour, min, sec = re.compile(r'(\d\d):(\d\d):(\d\d)').match('10:11:12').groups()
ruby & python will explode if there is no match rather than fail silently but I'm not convinced the difference is huge.
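
To make the "explode if there is no match" point concrete: in Python, `match()` and `search()` return `None` on failure, so the chained `.groups()` call raises `AttributeError`. A small sketch of one way to match and destructure while handling that case follows; the helper name `parse_time` is made up for illustration, and the walrus operator needs Python 3.8 or later.

```python
import re

TIME_RE = re.compile(r"(\d\d):(\d\d):(\d\d)")


def parse_time(text):
    """Return (hour, minute, sec) as ints, or None when the text does not match."""
    # The walrus operator matches and binds the result in one expression.
    if (m := TIME_RE.search(text)) is None:
        return None
    hour, minute, sec = (int(part) for part in m.groups())
    return hour, minute, sec


print(parse_time("10:11:12"))  # (10, 11, 12)
print(parse_time("no time"))   # None
```

This is still two steps more than the Perl one-liner, but it fails explicitly rather than leaving the variables undefined.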
Ruby is a very concise language.

Matching & getting in most languages typically devolves into matching a regexp to a string, getting a MatchResult object, and then getting/iterating/checking/... on it.

I switched from Perl to Python around 12 years ago. I do think the sigils make code a little faster to comprehend. A bare word in Python can be anything, whereas a variable in Perl will always start with a '$' or '@'.

It's not a huge win, but I do think it's better than nothing.

As for missing things, I do miss Perl a lot. I missed the curly braces when I first started and whitespace didn't feel right. Then after maybe 6 months I had to go back and do some Perl. Moved some blocks of code around, then got the dreaded missing brace problem. I realized that was something that I never got in Python and am a fan of whitespace since then.

I like this approach - We use python for our backend and the things I miss the most from other languages are the protected/public/private keywords (Java), single-line `if condition: return` statements (js, ruby, etc.), and npm/yarn/package.json (js). I miss types too but it feels unfair to complain about that with Python.
Fair point re: package management - but Python does support single-line conditionals, and reasonably robust type-checking with mypy.
MyPy isn't quite as nice as typescript, and it also has some trouble with Django. I know it CAN work, but does any language have a worse type system? Maybe Ruby.

For single line conditionals - looks like you're right from what I can tell. I mistakenly assumed the pep8 errors were actually interpreter errors. Thank you!

I believe that the reason you're having issues with Django is that Django lacks types on its external interfaces. This isn't an issue with MyPy. You're conflating language-level issues with code-level issues. Python's type hints are quite powerful, assuming you're using stuff that provides them. At this point, it has a better type system than Go (if only because it has sum types and option types), but it's opt-in.
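
As a small illustration of the "sum types and option types" mentioned above, here is a sketch using only the standard `typing` module. The `Circle`, `Rect` and `Shape` names are invented for the example, and the hints are not enforced at runtime; a separate checker such as mypy is what would be expected to report misuse.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Circle:
    radius: float


@dataclass
class Rect:
    width: float
    height: float


# A sum type: a Shape is exactly one of these alternatives.
Shape = Union[Circle, Rect]


def area(shape: Shape) -> float:
    if isinstance(shape, Circle):
        return math.pi * shape.radius ** 2
    # After the isinstance check, a type checker can narrow shape to Rect.
    return shape.width * shape.height


def first_area(shapes: List[Shape]) -> Optional[float]:
    # Optional[float] means "float or None"; callers are expected
    # to handle the None case before using the value as a number.
    return area(shapes[0]) if shapes else None
```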
Glad I could help :)
> modern IDEs and editors give us all the type information we could want, and these tools made sigils obsolete.

I like to rate a programming language by how dependent the language is on some bloated IDE ("editor"). If I need an Eclipse or a Pycharm just to edit a file, something has gone wrong syntactically and systemically.

Sigils are semantic information about the code. Sigils do not reduce readability, they increase expressivity and comprehensibility. It isn't the characters themselves that are the problem -- we see the same notations for different purposes entering Python and DSLs such as Pandas.

Bash, awk, sed, and Perl are solid tools.

> I like to rate a programming language by how dependent the language is on some bloated IDE ("editor")

I like to use the tools that make me and my team the most efficient.

sed and Perl scripts are very often extremely hard to decipher.

I've used both extensively a long time ago when my workstation had a Sun Microsystems logo on it (yeah I am that old) and I remember having trouble reading my own scripts a few months later.

I don't miss those.

Words are the better sigils.

I agree with this and in most languages I don’t miss sigils but one language I do wish supported them is plpgsql.

The reason is that column names and function arguments overlap a lot, which can cause ambiguities when performing updates or selects. To become productive at plpgsql it’s a problem that you have to solve.

There are several approaches but the one I settled on is just to prefix all formal parameters with underscores.

The wish I have with plpgsql is that I could use $ instead since underscore is already heavily used as a word separator.

> Despite my decades of dynamic typed languages, I hate going back to dynamic languages anymore. YMMV.

Mine does vary - while static typing is helpful, it still (even with more advanced type systems) leads to boilerplate code that I dislike writing. In a compiler written in OCaml that I worked on for a bit, there were hundreds of lines of code dedicated to just stringifying variants. It could have been generated by a syntax transform (the newer tools for this are actually quite good), but that's another dependency and another cognitive overhead. In Kotlin, lack of structural types means that the rabid "clean architecture" fans create 3 classes for each piece of data, with the same 10 fields (names and types), and methods to convert between those classes - it requires 10x as much code for very little gain. Lack of refinement types makes the type systems mostly unable to encode anything relating to the number values, other than min/max values for a given type. There's reflection in Kotlin (not in OCaml though) that you can use, but then we're back to everything being an Object/Any and having runtime downcasts everywhere.

I think gradual type systems are a good compromise, for now at least. I'd prefer Typed Racket approach of clearly delineating typed and untyped code while generating dynamic contracts based on static types when a value crosses the boundary. Unfortunately, that's not going to work for existing languages, so the next best thing is something like TypeScript or mypy.

Of course, convenient, hygienic, Turing-complete not by accident, compile time execution and macros would, to some extent, alleviate the problems a simplistic type system causes. Good examples are Haxe, Nim, Rust, Scala 3, etc. Without such features, though, I'm not willing to part with runtime reflection and metaprogramming facilities provided by dynamic languages - the alternative is a lot more lines of code that need to be written (or generated), and I don't like that.

---

More to the topic: logic variables. The `amb` operator from Scheme, for example, or what Mozart/Oz has, or Logtalk, or Prolog of course. They're a powerful, incredibly succinct way of solving constraints without writing a solver (just state the problem declaratively and done - as close to magic as it gets). No popular language offers an internal logic DSL, although there are some external DSLs out there.
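
For a flavor of the "state the problem and let the language search" style, here is a loose approximation in plain Python. It is only brute-force generate-and-test over a generator, not real logic variables, unification, or `amb`-style continuation backtracking; the function name `solve` and the Pythagorean-triple problem are chosen purely for illustration.

```python
from itertools import product


def solve(limit):
    """Yield Pythagorean triples (a, b, c) with 1 <= a <= b <= c <= limit."""
    candidates = range(1, limit + 1)
    for a, b, c in product(candidates, repeat=3):
        # The "constraints" are plain filters; rejected combinations are
        # simply skipped, which stands in for amb's backtracking.
        if a <= b <= c and a * a + b * b == c * c:
            yield a, b, c


print(next(solve(20)))  # (3, 4, 5)
```

A real logic system would prune the search using the constraints instead of enumerating every combination, which is much of what makes the built-in versions feel like magic.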

Also, coroutines. No more manual trampolining, no need for nested callbacks, the state of execution can be saved and resumed later mostly transparently. Lua has them built-in, Kotlin implements CPS transform in the compiler. Nowadays almost all popular languages provide them, mostly exposed as async/await primitives. Scheme and Smalltalk can implement them natively inside the language and did so for ages; it's nice to see mainstream languages catch up.

REPLs. Not a language feature per se, but an implementation decision that has a lot of impact on productivity. It's relatively commonplace now - even Java has jshell - but most of the REPLs are pretty bad at executing "in context" of a project or module. Racket, Clojure, Common Lisp, Erlang, Elixir are gold standards, still unmatched, but you can get pretty far with Jupyter Notebooks.

Destructuring/pattern matching. It was carefully added in some simplified cases (mostly simply destructuring sequences) in many languages, then the support for wildcard and splicing was added, then support for hashes/dicts was added, and now finally Python has a proper `match` statement. I think more languages will implement it in the near future.

Some sort of coroutine solution is definitely on my list too. I'm actually not too passionate about which one it is, except for a distaste for async/await on the grounds the compiler ought to be able to do it for me. But generators, threads or actors cheap enough to use freely, coroutines, something that allows me to break out of the strictly hierarchical structured programming system and retain some degree of state within a function when I need to. It's possible to hack something together in a language lacking this, by moving all function state into a struct/object but all the manual scaffolding is painful and error prone.
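
To illustrate the difference between manual scaffolding and language-supported state retention, here is the same running-average computation written both ways in Python. The names are illustrative; the generator version relies on `yield` and `send()`, which are standard generator features.

```python
class RunningAverageState:
    """Manual scaffolding: every piece of function state lives in fields."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def send(self, value):
        self.total += value
        self.count += 1
        return self.total / self.count


def running_average():
    """Generator version: total and count are ordinary local variables."""
    total, count = 0.0, 0
    average = None
    while True:
        value = yield average  # pause here; resume when a value is sent
        total += value
        count += 1
        average = total / count


manual = RunningAverageState()
gen = running_average()
next(gen)  # advance to the first yield before sending values
```

Both objects accept values through `send()` and return the average so far; the generator simply lets the language keep the locals alive between calls.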
Well articulated. Thanks.