by Stratoscope 2423 days ago
It's not just case insensitive, it's underscore insensitive.

So MyNimName, mynimname, MY_NIM_NAME, My_Nim_Name, __MYNIM___name_, and any other variation you can think of are all the same name!

Most search tools have the option of case sensitive or insensitive search, but a search that does that and ignores underscores? Not too many of those outside the Nim world.

An interesting contrast is Nim's policy on tabs and spaces for indentation.

Spaces are fine. You can use as many as you want. Two, four, three, whatever. Nim doesn't care.

But Tabs? They are forbidden!

Unless you use this magic line at the top of each source file:

  #? replace(sub = "\t", by = " ")

Now you get to use tabs!

I thought this feature was a bit odd when I first started using Nim. But by now I'm a huge fan, and I'll explain why. First, I'll just say that yes, it is slightly harder to search for identifiers. But libraries will stick to either snake_case or camelCase for their identifiers, so as long as you know which one it is, it's not that big of a deal.

The benefit of this style insensitivity, though, is really nice. I programmed in Python and C for many years before trying out Nim, and in both languages I've run into libraries that do the opposite of what the style guide recommends. Some people just prefer one style over another and will write their libraries in their preferred style. The problem is that my code ended up as a hodge-podge of the different styles. This was definitely more prevalent with Python code, but I've run into it with C as well, especially with micro-controller programming.

With style insensitivity, however, this is a complete non-issue. You can write your code in whatever style you prefer, and not care about what dubious stylistic choices the library maintainer sticks to. To me this far outweighs the small inconvenience that it brings (especially if you take into account editor tools).

I feel most people who read about style insensitivity imagine all Nim code being written as a crazy mix of styles, while in reality Nim code is often more consistent in style.

And the tabs vs. spaces issue is simply to alleviate the issues with "how many spaces are a tab" which is impossible to guess when compiling code. The replace filter you mention above essentially just explicitly specifies how many spaces you intend a tab to be. That being said I would've liked it better if it forced tabs, that way you'd have one tab for one indentation level, and everyone could choose their own preference of indentation.
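To make the point about the replace filter concrete, here is a minimal Python sketch of the text substitution it performs. This only mimics the effect on the source text and is not Nim's actual filter machinery; the function name `apply_replace_filter` and the sample snippet are made up for illustration.

```python
def apply_replace_filter(source: str, sub: str = "\t", by: str = " ") -> str:
    """Mimic the substitution step: swap every occurrence of `sub` for `by`."""
    return source.replace(sub, by)


# A hypothetical tab-indented snippet (tabs written as \t).
tabbed = "if x:\n\tif y:\n\t\tprint(x, y)\n"

# With by=" " each tab becomes one space; with by="  " each tab becomes two.
one_space = apply_replace_filter(tabbed)
two_spaces = apply_replace_filter(tabbed, by="  ")

print(one_space)
print(two_spaces)
```

Whatever string is passed as `by` is, in effect, the declared width of a tab, so the compiler never has to guess.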

> I would've liked it better if it forced tabs, that way you'd have one tab for one indentation level, and everyone could choose their own preference of indentation.

The problem with this is that in many code bases, you occasionally need spaces to align code/data in different lines to make them more readable and easily inspectable, e.g. within a long parenthesized expression spanning multiple lines. In that case, indentation tabs require nontrivial gymnastics, whereas spaces are consistent.

The files should always contain spaces - because behaving as if spaces are tabs is round-trippable even when every user uses a different setting. An editor could en-tab on load, de-tab on save, and everyone is happy (though I'm not aware of any editor that does this). If the file only has tabs, the converse "de-tab on load and en-tab on save" does not give you the same alignment flexibility.

Nim is actually case sensitive on the first character only, so the identifiers in your list are not all equivalent. Just thought I'd point that out for correctness' sake.

Oh, thank you for the correction, and sorry for the misinformation!

In any case (pun intended?) it's the underscore insensitivity that makes it difficult to use standard search tools.

Again just for correctness your last identifier isn't legal in Nim. You aren't allowed to start an identifier with an underscore, and you can't have two underscores together.
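The rules described in these corrections can be sketched in a few lines of Python. This is an illustration, not Nim's implementation: the helper names are invented, and the legality check is a simplified ASCII-only reading of the identifier grammar (a letter, then letters or digits, each optionally preceded by a single underscore).

```python
import re

# Simplified, ASCII-only shape of a legal identifier (an assumption for this
# sketch): no leading underscore, no two underscores in a row.
_LEGAL = re.compile(r"[A-Za-z](?:_?[A-Za-z0-9])*")


def is_legal(name: str) -> bool:
    """True if `name` fits the simplified identifier shape above."""
    return _LEGAL.fullmatch(name) is not None


def normalize(name: str) -> str:
    """Keep the first character as-is; drop underscores and case in the rest."""
    return name[0] + name[1:].replace("_", "").lower()


def same_identifier(a: str, b: str) -> bool:
    """Two names are equivalent when their normalized forms match."""
    return normalize(a) == normalize(b)


print(same_identifier("MyNimName", "My_Nim_Name"))  # True
print(same_identifier("MyNimName", "myNimName"))    # False: first letter differs
print(is_legal("__MYNIM___name_"))                  # False
```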
> it's the underscore insensitivity that makes it difficult to use standard search tools.

I personally don't like the feature, but what do you find difficult about using /my_?[vV]ariable/ to search for any of myvariable, my_variable or myVariable?
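A pattern like the one above can also be generated mechanically. The following Python sketch builds a fully style-insensitive regex from any identifier; the function name is hypothetical, and it assumes a non-empty identifier that does not start with an underscore.

```python
import re


def style_insensitive_pattern(name: str) -> str:
    """Build a regex matching every style variant of `name`.

    The first character is matched exactly; every later character may be
    preceded by one underscore, and letters match in either case.
    """
    chars = name.replace("_", "")
    parts = [re.escape(chars[0])]
    for ch in chars[1:]:
        if ch.isalpha():
            parts.append(f"_?[{ch.lower()}{ch.upper()}]")
        else:
            parts.append("_?" + re.escape(ch))
    return r"\b" + "".join(parts) + r"\b"


pattern = re.compile(style_insensitive_pattern("myVariable"))
for candidate in ["myvariable", "my_variable", "myVariable", "Myvariable"]:
    print(candidate, bool(pattern.fullmatch(candidate)))
```

The generated pattern is noisier than the hand-written one because it allows an underscore or a case change at every position, not just at the word boundary.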

Sounds like a nightmare.
Sounds like a pointless misfeature that will only generate mandatory "do not @#$$ do this" entries in future Nim coding style documents, and Nim linting programs that find and flag abuses.

The motivation is good, if the intent is to get rid of ___unwanted___crap___ like this.

But if I, as a language designer, wanted to ban such identifiers, I would just go ahead and ban them, rather than making them equivalent to ones that do not have repeated underscores.

The principle is: don't make unwanted/undesirable forms equivalent to acceptable forms in hopes that people will then just stick to the acceptable forms when they discover that the unwanted forms don't bring about the difference they were hoping for. People won't. People will go to town with the equivalence to make problems for other people.

People who don't know about the equivalences, or have forgotten, will be tripped up. Someone might think that a variable called __foo in an inner scope is different from an outer (or global) _foo, yet their definition will shadow _foo, with some behavior-altering consequences.

It's possible to argue about this endlessly from a theoretical perspective, but Nim is not a new language, there are substantial code bases (not as substantial as C, Java or Python, of course, but still substantial), and these rules have been worked out through real world cases, and are very effective in practice.

From a theoretical perspective, your argument can equally be applied to C's case sensitivity or Pascal's case insensitivity when someone "doesn't know" or "forgets" about the equivalence - isn't it absurd that FOO and foo are different (C) / equivalent (Pascal) when you are coming from the other one? Similarly, Lisp-1 vs. Lisp-2. In practice, it's just one more convention -- among several others.

The motivation is not to get rid of __unwanted___crap___ (Nim bans multiple sequential underscores and leading underscores, so those are already gone), but rather: Nim arguably has the best built-in FFI of any modern language with nontrivial use, and this FFI has been a factor in the language design. Nim's rules allow you to keep a mostly uniform coding style in _your_ parts, yet integrate it naturally into projects using snake_case, CamelCase, javaCase and ALLCAPSCASE and SHOUTING_MATCH_CASE, or all of the above in the same project.

Nim started out case-insensitive (which is a less popular, but definitely common choice, made e.g. by Pascal, Excel and others). IIRC, the "first letter's case does matter" rule is a relatively recent addition (as in, 2 years out of the project's 9) to simplify FFI to conventions like OpenGL, which have the same identifier in both lower and upper case.

In theory, everything could go wrong. In practice, Nim gets it exceptionally right.

Are you saying that the FFI transparently renames identifiers, so you think you're calling foo_bar, but the actual foreign function is FooBar, with no traces of FooBar in the program (like in some definition which indicates that the two are mapped together)?

If so, that's an incredibly bad idea.

If a program calls some foreign function called FooBar, the identifier FooBar better appear somewhere in it, if you know what's good for the maintainer seven years from now.

The “real” name and argument types of an FFI must appear at least once, but other than that the identifier follows Nim equivalence rules.

Again, the theory can be argued endlessly. In practice, it just works.

Ah, but does the real name appear together with the Nim name in some line of code that binds them together, which your editor can jump to if you want to know where that Nim name is defined?

If it's just some construct that defines a call to FooBar in the foreign library, with no mention of foo_bar, but elsewhere in the Nim code we call it as foo_bar, I'm afraid I cannot agree with this being a good technical decision in language design.

I wouldn't sign off on such a concealment ruse even if it were someone's macro, not being upstreamed into a language implementation at all.

This is incorrect.

Nim does not allow variables starting with underscore.

Also, the compiler cannot tell whether any of "useHTTP", "usehttp" or "use_http" is acceptable or unwanted, but it can error out with a clear message if you are trying to define different variables with those names in the same scope.
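As a rough illustration of that kind of check, the Python sketch below groups the names declared in one scope by their normalized form and reports any group with more than one spelling. It is a toy model of the idea, not the Nim compiler's logic; `normalize` and `style_collisions` are invented names.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Keep the first character; drop underscores and case in the rest."""
    return name[0] + name[1:].replace("_", "").lower()


def style_collisions(names):
    """Return groups of distinct spellings that normalize to the same name."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize(name)].add(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical declarations in a single scope.
scope = ["useHTTP", "usehttp", "use_http", "UseHttp", "port"]
print(style_collisions(scope))
```

Note that `UseHttp` is not part of the reported group, since the first character is compared case-sensitively.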

It's a misfeature, but let's not be melodramatic... just use a formatter.

It's a pretty bad misfeature though, since it's a feature you might accidentally use without realizing it if you don't use a linter.

I disagree with this claim.

You shouldn't modify code in a programming language you don't have sufficient familiarity with - at least not code that you depend on.

C has shortcut boolean ops, so "if (have_peace_treaty || launch_missiles() == LM_SUCCESS) cross_border()" would not launch missiles if you have a peace treaty.

C++ has shortcut boolean ops as well... unless you have overloaded "operator&&". So the same line in C++, depending on the types involved, might not short-circuit and start a war even if you have a peace treaty.
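The same contrast can be shown in Python, used here only as an analogy. The built-in `or` short-circuits, while a user-defined function standing in for an overloaded operator receives both arguments already evaluated. All names below are made up for the illustration.

```python
calls = []


def launch_missiles() -> bool:
    """Record that the side effect happened."""
    calls.append("launch")
    return True


def eager_or(a: bool, b: bool) -> bool:
    """Stand-in for an overloaded operator: both arguments are evaluated
    before this function body ever runs."""
    return a or b


have_peace_treaty = True

# Built-in `or`: the right-hand side is skipped when the left is true.
builtin_result = have_peace_treaty or launch_missiles()
launches_after_builtin = len(calls)

# Function call: launch_missiles() runs before eager_or is entered.
function_result = eager_or(have_peace_treaty, launch_missiles())
launches_after_function = len(calls)

print(launches_after_builtin, launches_after_function)  # 0 1
```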

In Pascal and Python 3, evaluating 1/(1/2) gives 2. In Python 2 and C, the inner 1/2 is integer division and yields 0, so you get a division by zero (with differing semantics). But 1.0/(1.0/2) gives 2 in all cases, even though 1==1.0 is true in all languages mentioned.
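A short Python 3 sketch of the difference, using `//` to stand in for the integer division that Python 2 and C apply to integer operands (for positive operands, floor division and truncating division agree):

```python
# True division, as in Python 3 and Pascal: the inner result is 0.5.
true_division = 1 / (1 / 2)

# Float operands behave the same way regardless of the division flavour.
float_division = 1.0 / (1.0 / 2)

# Integer division: the inner 1 // 2 is 0, so the outer division fails.
try:
    1 // (1 // 2)
    integer_division_failed = False
except ZeroDivisionError:
    integer_division_failed = True

print(true_division, float_division, integer_division_failed)  # 2.0 2.0 True
```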

And "if (0.3*3 == 0.9) all_is_well(); else apocalypse_go_to_bunker()" might also surprise you if you are not aware of how FP math is implemented (in APL/K/J, all_is_well(), but not in any other language I'm familiar with).
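In Python, the comparison plays out as follows. The `math.isclose` call is shown as one common way to compare with a tolerance; it is not meant as a model of how any particular array language does it.

```python
import math

product = 0.3 * 3

# Exact comparison fails: neither 0.3 nor 0.9 is exactly representable
# in binary floating point, and the rounded results differ.
exact_match = (product == 0.9)

# Comparing within a relative tolerance succeeds.
tolerant_match = math.isclose(product, 0.9)

print(exact_match, tolerant_match)  # False True
```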

The only example I can think of that could bite you is when shadowing an outer scope -- which is just as much a problem in C, Python etc. Some compilers warn about it, some don't, I don't know if Nim does.

In practice, Nim choices put it in an amazingly sweet spot.

Tabs are fine and spaces are fine, but mixing tabs and spaces for indentation is a recipe for disaster, because different environments and tools have a different tab width setting -- as a result, "printing a file and OCRing it" (which people do, e.g. when typing code shown somewhere else) can result in legal but semantically very different programs looking the same. That's why you shouldn't use tabs on code that is ever edited/viewed by more than one person[0], and why you should always run Python with the -tt option.
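The tab width problem is easy to reproduce. In this Python sketch (helper name and sample lines invented for illustration), one line is indented with a tab and another with eight spaces; whether they appear to sit at the same level depends entirely on the viewer's tab width.

```python
def visual_indent(line: str, tabsize: int) -> int:
    """Width of the leading whitespace as displayed with the given tab size."""
    expanded = line.expandtabs(tabsize)
    return len(expanded) - len(expanded.lstrip(" "))


tab_line = "\tcross_border()"           # indented with one tab
space_line = "        cross_border()"   # indented with eight spaces

# At tab width 8 the two lines look aligned; at tab width 4 they do not.
same_at_8 = visual_indent(tab_line, 8) == visual_indent(space_line, 8)
same_at_4 = visual_indent(tab_line, 4) == visual_indent(space_line, 4)

print(same_at_8, same_at_4)  # True False
```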

Nim takes a practical approach by banning tabs altogether. The "#?" hack you suggest just shows another great Nim feature: source code filters are standardized. You don't need a preprocessor/lex/yacc/re2c/swig with its own driver/makefile; it's all well documented and tracked within your Nim environment.

[0] especially in languages in which indentation changes program semantics, but even in e.g. C - where mismatch between indentation and curly brackets can let a bug like "goto fail;" hide in plain sight.

No tabs? Guess I won't be using nim.
Banned tabs? Good to know. From now on I'll just ignore everything about Nim. I will also applaud the Go team's decision to include a formatter instead of such nightmare restrictions.
Well, while it's your right to avoid Nim for its syntax choices (I strongly dislike the case insensitivity too), the tone of the comment makes one wanna reply "Just don't let the door hit you on your way out".

It's not as if some non-user of Nim announcing they'll avoid the language is any great loss.

Reminds me of all those "Cancel my subscription" letters to the editor in days past. Yeah, I'm sure "Time" magazine or whatever would tremble to know someone is cancelling...

Gofmt is pretty great, I really hope more languages take such an opinionated approach!
Nim has a formatter.