If you're counting tokens, you should count all of them: all symbols and punctuation, as well as any whitespace outside of a string literal that cannot be replaced by a single space character without affecting the syntax.
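To make that rule concrete, here is a rough sketch for Python source, using the standard `tokenize` module. The function name `count_tokens` and the choice of which token types to ignore are my own assumptions, not an agreed metric. Python is a handy test case because its NEWLINE, INDENT and DEDENT tokens are exactly the kind of whitespace that cannot be collapsed to a single space.

```python
import io
import tokenize

# Token types skipped by this sketch (an assumption): comments, blank or
# continuation line breaks, and the bookkeeping tokens the tokenizer emits.
IGNORED = {tokenize.COMMENT, tokenize.NL, tokenize.ENCODING, tokenize.ENDMARKER}


def count_tokens(source: str) -> int:
    """Count syntactically significant tokens in a piece of Python source.

    Names, numbers, string literals, operators and punctuation each count
    once. NEWLINE, INDENT and DEDENT also count, because that whitespace
    cannot be replaced by a single space without changing the syntax.
    """
    readline = io.StringIO(source).readline
    return sum(
        1
        for tok in tokenize.generate_tokens(readline)
        if tok.type not in IGNORED
    )
```

With this, `count_tokens("x = 1\n")` and `count_tokens("x=1\n")` both give 4 (name, operator, number, end of statement), so optional spacing costs nothing, while an indented block costs more than the same statement written on one line.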
You should somehow measure cognitive load. Of course it should be measured on someone fluent in that language - but there should be an adjustment to factor in the cost of becoming fluent. :)
APL is about being obsessive about character count. We can turn any language into "APL" by giving the core functions one-letter names and giving the absence of whitespace between them some semantic role, like chained application or whatever.
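A minimal sketch of that trick in Python, just to show how little it takes. The one-letter vocabulary in `OPS` and the `run` helper are invented for illustration and are not taken from APL or any real language; juxtaposition is read here as "apply each function in turn, left to right".

```python
from functools import reduce

# Hypothetical one-letter vocabulary, made up for this example.
OPS = {
    "F": lambda xs: xs[0],                # first item
    "R": lambda xs: list(reversed(xs)),   # reverse
    "S": lambda xs: sorted(xs),           # sort ascending
}


def run(program: str, value):
    """Treat adjacent letters as chained application, left to right."""
    return reduce(lambda acc, name: OPS[name](acc), program, value)
```

So `run("SRF", [3, 1, 2])` sorts, reverses, then takes the first item, which is the maximum. The program is three characters long, and that says nothing about how hard it is to read.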
Hey look, "FUBAR". Take the first item, unwrap the list, bind it to function A as the first argument, then reverse! I'm not interested in this kind of character-level reduction at all; it's computer science puberty.
I never said "way too long" but rather "way too cumbersome". What is cumbersome in the Nimrod code is the awkward encapsulation. For example, we have to create a special kind of list of statements with a special constructor. This list is a "bag-like" container with an .add method. Yuck!