by BelenusMordred 1832 days ago
> Identifiers can now contain non-ascii characters. All valid identifier characters in Unicode as defined in UAX #31 can now be used. That includes characters from many different scripts and languages, but does not include emoji.

Don't despair fellow emoji devs, our time will come soon enough.
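For a rough feel of what the UAX #31 rule admits, here is a minimal sketch in Python. Python is only a stand-in: its identifier syntax is also derived from UAX #31, but the language in the quoted release may differ in details such as normalization. The candidate names are made up for illustration.

```python
# str.isidentifier() applies Python's UAX #31-based identifier rule.
# This approximates the quoted rule; it is not the other language's own check.
candidates = ["size", "größe", "λ", "大", "🦀", "2fast"]

results = {name: name.isidentifier() for name in candidates}

for name, ok in results.items():
    print(f"{name!r}: {'valid' if ok else 'not valid'} identifier")
```

Letters from many scripts pass, while the emoji and the name starting with a digit are rejected.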

5 comments

In the English speaking world, I have a hard time justifying untypeable identifiers. Maybe, and this is a fat maybe, Greek symbols could make the odd write-once math or scientific code easier to read. It would be bonkers to use any non-ascii codepoints in a public API.
Your statement here has so many qualifiers that it's hard to treat it as a general rule.

"English speaking world"

"untypeable"

"public API"

Even then you exempt Greek symbols. I suppose you might be willing to accept ¢ or €. Of course, "untypeable" is very ambiguous.

Still, not all code is "public API" and not all code is intended for the English speaking world. So I think it's great to support non-ASCII symbols.

よろしく

I was not trying to argue against its general support, only that for this English-speaking audience (HN) there are very limited use cases.
I accept your caution. But it is fun for me to consider the differences in these:

enum Size { P, XS, S, M, L, XL, XXL }

enum Size { Petit, Extra_Small, Small, Medium, Large, Extra_Large, Extra_Extra_Large }

enum Size { 小小小, 小小, 小, 中, 大, 大大, 大大大 }

Which do you prefer: P, Petit, Petite, 小小小, 3小, XXS, 2XS?

English is chalk full of non-English words. And lots of symbols are useful. Sometimes it's worth it to learn the symbol. And often not much harder than learning the English abbreviation.

enum ⏻ { ⭘, ⏽, ⏾ }

Again, "typeable" is ambiguous but obviously an important consideration.

None of this is to argue against your point. Just musing.
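For what it's worth, the third spelling is already legal in languages whose identifiers follow UAX #31. A minimal Python sketch (Python is just my choice for illustration, and the values assigned to the members are arbitrary assumptions):

```python
from enum import Enum


class Size(Enum):
    # Member names are ordinary identifiers; the string values are arbitrary.
    小 = "S"
    中 = "M"
    大 = "L"


# Members can be reached by attribute or looked up by name at runtime.
print(Size.中.value)
print(Size["大"] is Size.大)
print([member.name for member in Size])
```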

That's not standard Chinese. Is it Japanese?

You could write SSS, SS, LL and LLL in English too.

By the way, it's 'chock-full' not 'chalk-full'.

Thanks for the correction. TIL

Are you distinguishing Chinese or Japanese on visual appearance? Or by some other method?

I think 小中大 have an advantage over SML because SML have no meaning by themselves. 小中大 have the exact meaning we need. Granted, 中 kinda ruins that argument. Size::中 is very clear, but perhaps not more so than Size::M.

I'm actually curious. Which do you like better in code? P, SSS, XXS, 小小小, 3S, 3小, whole words, something else?

Even if English speaking, it is still ambiguous. For example in my keyboard the micro sign (µ; not to be confused with the Greek small letter mu) is easy to type (AltGr + M). I bet this is the same for the majority of HN users. The US keyboard doesn’t use a third-level shift (usually AltGr), so most people who use the US keyboard exclusively are unaware of this. Whenever I see people using the Latin small letter U (u) instead of the micro sign, all I think about is how restrictive the US keyboard actually is, and what a shame it is that the culture which predominantly uses this restrictive keyboard design came to dominate the design of computer interfaces.
> For example in my keyboard the micro sign (µ; not to be confused with the Greek small letter mu)

The Greek letter mu IS the micro sign; where do you think it comes from?

Unicode has both MICRO SIGN (U+00B5) and GREEK SMALL LETTER MU (U+03BC). The former is the one on (most) people's keyboards, and it shouldn't be used to type actual Greek.
They look the same, but they are different code points.

MICRO SIGN (U+00B5)[1] vs. GREEK SMALL LETTER MU (U+03BC)[2]

1: https://www.fileformat.info/info/unicode/char/00b5/index.htm

2: https://www.fileformat.info/info/unicode/char/03bc/index.htm
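The difference is easy to check from code. A small sketch using Python's standard `unicodedata` module (the variable names are mine):

```python
import unicodedata

micro = "\u00b5"  # MICRO SIGN
mu = "\u03bc"     # GREEK SMALL LETTER MU

# The two characters have distinct names and compare unequal as strings.
print(unicodedata.name(micro), "|", unicodedata.name(mu))
print(micro == mu)

# MICRO SIGN has a compatibility decomposition to the Greek letter,
# so NFKC normalization folds the two together.
print(unicodedata.normalize("NFKC", micro) == mu)
```

So whether the two count as "the same identifier" depends on which normalization form, if any, a language applies to identifiers. Python, for example, normalizes identifiers with NFKC.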

There are lots of non-English codebases out there; supporting symbols for their languages is a big deal.

In a way, it's in part a job for the dev tools. Autocomplete engines should be able to complete these symbols from an ASCII-only hint. 'a' should offer "α" as one of the completion options, and 'bla' should offer "BLÅHAJ".
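A rough sketch of how such a completion could work, in Python. Everything here is a hypothetical heuristic, not how any real editor does it: `ascii_keys` and `complete` are made-up names, and the two matching rules (strip diacritics, or fall back to the last word of a character's Unicode name) are assumptions chosen to cover the two examples.

```python
import unicodedata


def ascii_keys(identifier):
    """Return ASCII search keys for an identifier (hypothetical heuristic)."""
    keys = set()
    # Rule 1: decompose, drop combining marks, lowercase (Å -> a).
    decomposed = unicodedata.normalize("NFKD", identifier)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    if stripped.isascii():
        keys.add(stripped.lower())
    # Rule 2: for a single character, use the last word of its Unicode
    # name (GREEK SMALL LETTER ALPHA -> "alpha").
    if len(identifier) == 1:
        name = unicodedata.name(identifier, "")
        if name:
            keys.add(name.split()[-1].lower())
    return keys


def complete(hint, identifiers):
    """Return the identifiers that have an ASCII key starting with the hint."""
    hint = hint.lower()
    return [
        ident for ident in identifiers
        if any(key.startswith(hint) for key in ascii_keys(ident))
    ]


symbols = ["α", "BLÅHAJ", "beta"]
print(complete("a", symbols))
print(complete("bla", symbols))
```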

I like to use the poop emoji in the names of methods that are dangerous hacks that we want to (eventually) get rid of.

Being harder to type is part of the point.

Can't you just use poop emoji in comments and search for them like TODOs?
But then others won't feel dirty when they use the method. The goal is to make people feel bad when they add more usages of it, since every new usage makes it harder to remove. This matters when the hack is easier and faster than the proper way. At crunch time it's hard to push your lead to reject the PR, so this is soft pressure, and it's apparent at the call site, whereas a TODO in the method definition is not.
There is a famous story of a codebase that used `sleep` calls for this purpose. Start by sleeping 1 ms every time that the undesired function is called. Next release, up it to 10 ms, then 100 ms, and so on.
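A minimal sketch of that trick as a Python decorator. The names `discouraged` and `legacy_hack` are made up for illustration and are not from the story; the delay is the knob that would be raised each release.

```python
import functools
import time


def discouraged(delay_seconds):
    """Make every call to the wrapped function pay a fixed time penalty."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            time.sleep(delay_seconds)  # the penalty; bump this each release
            return func(*args, **kwargs)
        return wrapper
    return decorator


@discouraged(delay_seconds=0.001)  # 1 ms now, 10 ms next release, and so on
def legacy_hack(x):
    return x * 2


print(legacy_hack(21))
```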
But that's method shaming! /s

I guess you have a point. It's clearly a "let's meet in the back alley" API.

Reminds me of Haskell's tendency to prepend "unsafe" (or, in one case where that was insufficiently dire, "accursedUnutterable"), although the use case is a little different.
I assume making the name of the function they want to eventually get rid of more annoying to type is part of the reason to do this rather than just tracking ones needing replacement.
It's really not hard at all to make them typeable, it just requires some minor setup (use an IME). Still doesn't necessarily mean it's a good idea, but it's not a total blocker.
They're hard to type if you're a Windows user. Other OSes have better input systems.
> Maybe, and this is a fat maybe, Greek symbols could make the odd write-once math or scientific code easier to read.

The Julia community has been trying really hard to convince people that this is a good idea and I'm still not on board with it.

Scientific computing is actually a use-case I see this working well in; IMO, it's perfectly fine to have terse code that exactly matches up to equations in a longer paper that justifies correctness, etc. (as long as the paper is easy to find from the code!)
It helps here to have good shortcuts (i.e. TeX-like commands) for all the symbols. Otherwise writing them is a PITA and nobody will do it (e.g. Go supports full unicode identifiers, when was the last time you saw one in the wild?)
> Go supports full unicode identifiers, when was the last time you saw one in the wild?

Back when someone used them to add pseudo-generics https://old.reddit.com/r/rust/comments/5penft/parallelizing_...

This is the one thing I used to love about Maple back in college. You can compute directly with mathematical symbols, draw matrices and rational numbers the same way you would on paper, and the language runtime understands it. Too bad it's a $5000 license or whatever.
Maybe we can have some keyboard plugin/"mode" that merges letters to form symbols in a special way?

Korean Windows has a "special-emoji" keyboard that can be invoked like this: 1. type a character 2. press the 'hanja' key 3. (a wild special character selection menu appears)

example: § (from 'ㅁ') / ㈜ (from 'ㅁ') / ㎖ (from 'ㄹ')

Most Linux environments already have this, it's called "compose key". You may need to enable it in your desktop environment's settings, but then after pressing the key you can press !? to type ‽, 12 to type ½, <= to type ≤, n~ to type ñ, <3 to type ♥, etc. Here's GTK's list: https://help.ubuntu.com/community/GtkComposeTable
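Conceptually a compose key is just a lookup from a short key sequence to one character. A toy sketch in Python, using only the sequences listed above; this is an illustration, not how GTK or any desktop environment actually implements it, and `COMPOSE_TABLE` and `compose` are made-up names.

```python
# Hypothetical compose table holding only the sequences mentioned above.
COMPOSE_TABLE = {
    "!?": "‽",
    "12": "½",
    "<=": "≤",
    "n~": "ñ",
    "<3": "♥",
}


def compose(sequence):
    """Return the composed character, or the raw keys if no rule matches."""
    return COMPOSE_TABLE.get(sequence, sequence)


print(compose("<="))
print(compose("zz"))  # no rule, so the keys come back unchanged
```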
It would be nice if this were built into Windows, but in lieu of that, I have been using WinCompose[0] for several years for typing diacritics (as part of my learning French). For example:

* ALT e' = é

* ALT u " = ü

* ALT n ~ = ñ

It's not limited to diacritics; you can type ligatures (ALT ae = æ), extended characters (ALT [/] = ), I assume the majority of UTF (ALT #G = 𝄞) and so on. And yes, even emoji (ALT ALT alembic = )

edit: apparently, HN won't show the checkbox (U+2611) or alembic[1].

[0] http://wincompose.info/

[1] https://emojipedia.org/alembic/

> It would be nice if this were built into Windows, but in lieu of that

Something that's basically equivalent is built in. Just switch your keyboard from EN-US to EN-INTL, and now several accent characters become dead keys, so ' e = é, etc.

For Windows, the ENG-INTL keyboard layout is pretty good for simple accented characters, covering Latin-based European languages [1].

Windows+. also allows you to just type to search for emojis, so e.g. typing WINDOWS+. sad ENTER gives you a sad emoji. Sadly this doesn't work for symbols, even though the Windows+. menu has good coverage of symbols.

I guess you can always switch your keyboard to Korean input.

1: https://community.windows.com/en-us/stories/keyboard-shortcu...

Julia does this in a pretty simple way: you type \lambda then press tab and it becomes λ. Emojis and many other things work like this, so typing them is very easy.
I tried doing this once, but it's not worth the pain of having to switch keyboards or copy-paste symbols
Here's our new feature. Please, never use it.
Soon we'll be able to do the equivalent of the C++

  typedef shared_ptr<Foo> Foo STAR_EMOJI;
EDIT: My star emoji doesn't seem to be showing up in HN posts, so imagine that STAR_EMOJI is actually https://emojipedia.org/star/
I cannot understand how this new "feature" could be considered a good idea.
Animated GIF meme identifier devs of the world unite! We will overthrow this repressive violence against our kind. Baby dance will be the next "Hello, world!"