Hacker News
by lamontcg 1534 days ago
In the early days, symbols weren't garbage collected, while strings were mutable, wasted memory and were slow. So there were tradeoffs.

Now you can use frozen string literals and there's no benefit to symbols. Throwing "# frozen_string_literal: true" at the top of the memory benchmark script, I get:

    Calculating -------------------------------------
             strings     0.000  memsize (     0.000  retained)
                         0.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
             symbols     0.000  memsize (     0.000  retained)
                         0.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)

    Comparison:
             strings:          0 allocated
             symbols:          0 allocated - same

At this point, with no practical difference between them, STRINGS AND SYMBOLS BEING DIFFERENT IS A MISTAKE. When you serialize to something like JSON you lose the distinction (the operation is singular: it has no inverse transform) and you have to pick either symbols or strings to get back. On a long enough timescale this causes enormous confusion and leads to the creation of hashes with indifferent access (which helps with the problem, but doesn't fix it).
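
A minimal sketch of the round-trip problem, using only the standard `json` library. The `original` hash and its keys are made up for illustration. Neither parse mode gets the mixed hash back:

```ruby
require "json"

# Hypothetical data: one symbol key, one string key, one symbol value.
original = { name: "widget", "kind" => :gadget }

encoded = JSON.generate(original)

# Default parse: every key and every value comes back as a String.
as_strings = JSON.parse(encoded)

# symbolize_names: every key comes back as a Symbol, values stay Strings.
as_symbols = JSON.parse(encoded, symbolize_names: true)

puts as_strings.inspect
puts as_symbols.inspect
puts as_strings == original   # false
puts as_symbols == original   # false
```

Whichever option you pick, the information about which keys and values were symbols is gone after serialization.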

Ideally it would be good at this point to make symbols and frozen strings completely equivalent ("foo".freeze == :foo being true) but that would likely break too much existing code. The distinction between strings and symbols, though, only causes bugs (mostly biting new and intermediate programmers). It is just syntactic sugar with a footgun.
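
To make the footgun concrete, here is a small sketch of the kind of bug meant here. The `settings` hash is a hypothetical example, not something from the benchmark:

```ruby
# Hypothetical config hash keyed by symbols.
settings = { timeout: 30 }

# A frozen string and a symbol with the same characters are not equal.
same = ("timeout".freeze == :timeout)

# Looking up with the "wrong" key type silently returns nil.
string_lookup = settings["timeout"]
symbol_lookup = settings[:timeout]

# An explicit conversion is needed to bridge the two.
converted_lookup = settings["timeout".to_sym]

puts same.inspect             # false
puts string_lookup.inspect    # nil
puts symbol_lookup.inspect    # 30
puts converted_lookup.inspect # 30
```

Nothing raises here; the string lookup simply misses, which is why this tends to surface late.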

Designing a language from scratch these days, it should have immutable strings by default from the start and should not introduce symbols, unless they are purely syntactic sugar around creating an immutable string.

2 comments

> At this point with no practical difference between them

Gah no! This isn't true! Even if you turn on frozen string literals, comparing two strings is slower, because the comparison still has to handle a non-frozen, non-interned string that happens to have the same contents.

https://twitter.com/ChrisGSeaton/status/1514603665801109508

There's a pointer comparison, but behind it, on the failure side, is a full byte comparison. Atrocious for cache even if the strings are tiny. If they aren't, you're checking every byte!
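
A rough sketch of the identity-versus-content distinction, assuming CRuby 2.5 or later for the `String#-@` deduplication. The variable names are made up. It only shows object identity, not timings:

```ruby
# Two symbols with the same name are always the same object,
# so comparing them never needs to look at the characters.
sym_a = :status
sym_b = ("sta" + "tus").to_sym
symbols_identical = sym_a.equal?(sym_b)

# Two strings built at runtime are distinct objects, so == cannot
# stop at the pointer check and has to compare contents.
str_a = "sta" + "tus"
str_b = "status".dup
strings_equal     = (str_a == str_b)
strings_identical = str_a.equal?(str_b)

# Explicit interning with unary minus returns a frozen, deduplicated
# string; on CRuby 2.5+ both results are the same object.
interned_a = -str_a
interned_b = -str_b
interned_identical = interned_a.equal?(interned_b)

puts symbols_identical   # true
puts strings_equal       # true
puts strings_identical   # false
puts interned_identical  # true (assumes CRuby 2.5+)
```

The point is that a frozen string is not automatically an interned one, so string equality in general still needs the content comparison as a fallback.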

I'm a big fan of explicit symbols, but some syntactic sugar around immutable strings and type inference should let you use "symbols" with little performance penalty. Of course by then you might as well use explicit symbols anyway, but I guess there's some additional flexibility. Of course that hurts macro-writing a bit, but not much.

Okay, mutable non-frozen strings shouldn't exist either, and people should use string builders.
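
Ruby has no class literally called a string builder, but `StringIO` from the standard library can play that role. A minimal sketch, where `build_line` is a hypothetical helper invented for this example:

```ruby
require "stringio"

# Hypothetical helper: accumulate pieces in a StringIO buffer,
# then hand back an immutable String.
def build_line(pairs)
  builder = StringIO.new
  pairs.each_with_index do |(key, value), index|
    builder << ";" unless index.zero?
    builder << key.to_s << "=" << value.to_s
  end
  builder.string.freeze
end

line = build_line({ id: 42, name: "widget" })
puts line          # id=42;name=widget
puts line.frozen?  # true
```

All mutation stays inside the builder, and callers only ever see the frozen result.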

Really both features (symbols and mutable strings) aren't worth the literally endless bugs that they cause.

I both like and hate how Rust has `String` and `&str`. Constant juggling between the two (which really is a sign I'm not doing it right). Yet knowing and using the difference is important and powerful.

I somewhat miss this when I go back to Ruby, but then realize that symbols often can be used for `&str`. Often. Not always.