by ambulancechaser 2166 days ago
can you explain?

    ((juxt type identity) (clojure.edn/read-string "x"))
    [clojure.lang.Symbol x]

It seems that the reader returns symbols just fine.

I don't use Clojure; I have no idea. If two occurrences of "x" are mapped to the same object, that is interning; maybe read-string uses its own package-like namespace, separately allocated for each call.

The documentation for EDN says that "nil, booleans, strings, characters, and symbols are equal to values of the same type with the same edn representation." The only way two symbol values can be equal is if they are actually the same symbol, I would hope.

> The only way two symbol values can be equal is if they are actually the same symbol, I would hope.

Why is this important? Specifically, why do symbols need to be interned?

In Clojure, "Two symbols are equal if they have the same namespace and symbol name." In general, "Clojure’s = is true when comparing immutable values that represent the same value, or when comparing mutable objects that are the identical object." [1]

[1] https://clojure.org/guides/equality

If we read two symbol tokens from a stream, and the lowest-level equality function that is available to us does not distinguish them, then they are interned.

Because symbols are used to refer to things, whether or not they are mutable can be blurry. You can make symbols as immutable as you want, but as soon as you make one of those symbols a key which refers to a mutable object, such as a global environment, then effectively, the symbol appears as a gateway to something mutable, and you can't necessarily tell whether the mutability is in the symbol itself or something beyond it.

For instance, let's consider global variables. The definition of a global variable has an effect which we can inspect if we have a boundp function:

    (boundp 'x) -> nil
    (defvar x)
    (boundp 'x) -> t

That can be made to work by mutating the symbol (the global binding information can be right inside the symbol). Or it could be working by keeping the symbol immutable, but mutating some hash table of bindings.

Either way, the symbol looks interned, because we have mentioned it several times, and those mentions seem to be connected. The (defvar x) has an effect on (boundp 'x) and so they are referring to an x which is somehow the same.

It could work with x actually being a kind of character string, which got separately allocated three times. As long as we can't show any property of the system indicated by x to be different based on which copy of x we are using to enquire (e.g. boundp reports true for one x and false for another), then x looks interned.
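The two strategies described above can be sketched side by side. This is a rough illustration in Python rather than any Lisp, and every name in it (Symbol, intern_symbol, defvar_a, boundp_a, defvar_b, boundp_b) is made up for the example; it is not how any particular implementation does it.

```python
# Hypothetical sketch: all names here are illustrative, not from a real Lisp.

class Symbol:
    """A mutable symbol object that carries its own global-binding flag."""
    def __init__(self, name):
        self.name = name
        self.bound = False

_symbol_table = {}  # name -> the one Symbol object for that name


def intern_symbol(name):
    """Return the unique Symbol for this name, creating it on first use."""
    if name not in _symbol_table:
        _symbol_table[name] = Symbol(name)
    return _symbol_table[name]


# Strategy A: the binding lives inside the (interned, mutable) symbol.
def defvar_a(name):
    intern_symbol(name).bound = True


def boundp_a(name):
    return intern_symbol(name).bound


# Strategy B: symbols are plain immutable strings; a separate table is mutated.
_bindings = set()


def defvar_b(name):
    _bindings.add(name)


def boundp_b(name):
    return name in _bindings
```

With either pair, calling boundp before and after defvar on "x" gives False and then True, even when each mention of "x" is a freshly built string. From the outside there is no way to tell which strategy is in use, which is the point being made.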

In Clojure land, equal but not identical? symbols don't cause any issues; they can be used interchangeably as map keys, etc. It won't impact code correctness, just potentially cause slowdowns.

With that said, I always thought symbols would intern, but that's not the case. It is true with keywords, however.

    (identical? (clojure.edn/read-string "x") 'x) => false
    (= (clojure.edn/read-string "x") 'x) => true
    (identical? (clojure.edn/read-string ":x") :x) => true
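To illustrate the map-key claim without a Clojure REPL, here is a loose analogue in Python. Sym is a hypothetical stand-in for a non-interned symbol, compared by namespace and name; it is an assumption for the sketch and not Clojure's actual Symbol class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sym:
    """Hypothetical value-compared symbol: equal when ns and name match."""
    ns: Optional[str]
    name: str


a = Sym(None, "x")
b = Sym(None, "x")  # separately allocated, same namespace and name

table = {a: 1}

print(a == b)    # True: value equality
print(a is b)    # False: two distinct objects
print(table[b])  # 1: the other copy finds the same entry
```

Because hashing and equality are both derived from the fields, either copy works as a key; the only cost is comparing fields instead of pointers.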

Objects that can be equal but not identical are not symbols.

They are, at best, cargo culted symbols: character strings with a tag bit which says "read/print me without quotes, so I visually look like something out of Lisp".

You don't use Clojure, but you're willing to jump in and criticize one part of it that, given the context of the rest of the system, could not be less important?

Whether Clojure's object model and equality semantics as a whole make sense is certainly up for debate. It's highly opinionated and no silver bullet.

But once it's in place, the decision of whether to intern symbols is a trivial implementation detail.

I incorrectly assumed they were interned for seven years of using Clojure professionally. It has never made a difference, and I can't come up with a scenario where it plausibly would.

Common Lisp Elitism is a real thing. This person appears to be from the "Clojure is not Lisp" clan of gatekeepers.

Almost every language which has Lisp in its name is using some form of symbol tables for interning, from McCarthy's Lisp 1 implementation onwards. That's one of the defining features of the Lisp s-expression reader.

If the 'reader' reads an s-expression like (EMACS LISP IS A LISP DIALECT) then both occurrences of LISP are the same identical Lisp object, both are the same symbol.
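A toy version of that reader behaviour, sketched in Python rather than Lisp. The Symbol class and read_symbols function are invented for this example, and the tokenizer only handles a flat list of symbols, so treat it as an illustration of the symbol-table idea and nothing more.

```python
class Symbol:
    """Hypothetical symbol object; identity is what matters here."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


def read_symbols(text, table):
    """Read a flat s-expression, interning each token in `table`."""
    tokens = text.replace("(", " ").replace(")", " ").split()
    result = []
    for tok in tokens:
        if tok not in table:
            table[tok] = Symbol(tok)  # first occurrence creates the symbol
        result.append(table[tok])     # later occurrences reuse it
    return result


symbol_table = {}
syms = read_symbols("(EMACS LISP IS A LISP DIALECT)", symbol_table)

print(syms[1] is syms[4])  # True: both LISP tokens are the same object
print(len(symbol_table))   # 5 distinct symbols for 6 tokens
```

Reading a second expression with the same table would hand back the same LISP object again, which is what makes pointer comparison enough to tell symbols apart.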

If your language is doing something different, then it's not using symbols like Lisp-like languages usually do since the dawn of time.

I think the characterisation of Clojure is not that unfair here. It's Clojure's keywords that have the role that Lisp's symbols have (and I think they have better ergonomics), and symbols are mostly only used for source code representation.

In other Lisps the detailed semantics of symbols are more important, including the identity/interning thing.

Rich Hickey was a Common Lisp user before making Clojure, so there's a fair chance he knew how symbols worked there; the cargo culting characterisation should be applied only lightheartedly :)

I thought I was one of these gatekeepers; and that was before I found out that Clojure doesn't actually have symbols, but just a string type with a quote-free read syntax.

Even AutoCAD's AutoLisp (the old one from the 1980s) has interned symbols.

How symbols work goes back all the way to the original McCarthy work, and all of its actual (not cargo-culted) descendants.

It is not "Common Lisp" elitism.