Hacker News
by ghego1 1739 days ago
First, the readme is simply hilarious!

Jokes aside, the concept underlying this project is actually interesting. It wouldn't be bad at all if programming languages were localizable.

I think it would help many if it was possible to choose the (human) language in which to use a programming language. Ideally, the same source code could be viewed in different languages depending on the preferred idiom of the developer.

7 comments

> It wouldn't be bad at all if programming languages were localizable.

It wouldn't be bad indeed, it would be terrible.

I wouldn't be against some IDE add-ons allowing you to see the keywords in your language if you wish, but the underlying names should stay in english. And the function names as well. Otherwise:

- you duplicate the documentation effort, which is already a burden;

- you make googling things extra hard;

- people will use their language features, which means non ascii chars. Good luck typing "La leçon du père noël à l'école de la forêt" with something else than my french keyboard.

- IT is nothing but thousands of conventions glued together. And names are a hell of a shortcut to describe conventions. Break that and you destroy trust, reliability and productivity.

- you split the community. FOSS works so well because we can collaborate so well: we have one rosetta stone that lets us do so. It has a basic alphabet, few rules, and is quite easy to learn.

I'm a French Python dev, and Python 3 does allow you to write variable names with French accents. I would never do that, and really hope nobody ever does.
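For anyone who hasn't seen it, here is a minimal sketch of what Python 3 accepts. The names are made up purely to show the syntax; whether anyone should write code like this is exactly what is being argued here.

```python
# Python 3 accepts non-ASCII letters in identifiers (PEP 3131).
# The names below are hypothetical, chosen only to show the syntax.

def année_suivante(année: int) -> int:
    """Return the year after `année`."""
    return année + 1

leçon_du_père_noël = "à l'école de la forêt"

# str.isidentifier() reports whether a string is a valid identifier.
print("école".isidentifier())      # accented letters are allowed
print("père noël".isidentifier())  # a space still makes it invalid
print(année_suivante(2020))
```

Typing `année_suivante` on a keyboard without dead keys or a Compose key is where the friction described above shows up.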

I'm also a French developer, and though I've been very pedantic in the past against non-English code, I must say today that I really prefer well-expressed "French" code over "English" code full of errors, "faux amis" everywhere and wrongly expressed intentions.

The kind of code you're talking about is usually internal only. People don't write libs, frameworks or mainstream languages in those situations.

In that case, then it's ok to bite the bullet, to avoid the worst case scenario. But you are already in a bad place to start with. Because if you put in prod some code by somebody who can't write english properly, then it means your team doesn't have access to most information resources in the world.

So your problem here is damage control, hardly a situation to generalize from.

There’s a lot (maybe a majority where I live) of people who can perfectly read and understand English but not write a single sentence without making errors.
Badly written code is badly written code in any language; that's a false comparison.

What I meant is that the same non-native English speaker can write perfectly well-expressed code in $native_lang but nonsense-filled code in English.
In that case I'll make an exception, but this is actually very rare. I can see a clear correlation between terrible english and terrible code.
I almost entirely agree, though I think your third point (typing non-ascii characters) could be less severe with one extra lesson in the typing course most kids are forced to take. When I first took French in 7th grade somehow I learned online you could enter ascii codes (whatever those were) on Windows with alt+numpad, and I still have memorized that alt+130 gives é. Later I moved to Linux where we have a great Compose key system, so I can just type <compose> + e + ' and get é, <compose> + c + , to get ç, and so on, with <compose> mapped to whatever I like (currently right-alt). Supposedly (haven't tried it) this system now has a Windows port: https://github.com/samhocevar/wincompose

Asian languages are harder. But if you're told about IME, then at least if you know what you want to type, how to actually type it isn't a big burden. IME can also help with rarer math symbols like ⋂ (\bigcap) ≅ (\cong) or ⊵ (\unrhd), or is another way to get something like the compose key system.
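A small Python sketch of the two input paths mentioned above, for illustration only. It assumes the Alt+130 code is looked up in the old OEM code page 437 (so not strictly ASCII), and it uses Unicode normalization as a stand-in for what a Compose sequence ends up producing; neither is a description of how a keyboard driver works internally.

```python
import unicodedata

# Alt+130 on Windows: byte 130 interpreted in an OEM code page.
# Assumption: code page 437 (code page 850 maps 130 the same way).
alt_130 = bytes([130]).decode("cp437")
assert alt_130 == "\u00e9"  # é

# Compose-style input: a base letter plus a combining mark collapses
# to a single precomposed character under NFC normalization.
e_acute = unicodedata.normalize("NFC", "e\u0301")    # e + combining acute
c_cedilla = unicodedata.normalize("NFC", "c\u0327")  # c + combining cedilla
assert (e_acute, c_cedilla) == ("\u00e9", "\u00e7")  # é, ç

# The math symbols mentioned above are ordinary code points too.
for symbol in "⋂≅⊵":
    print(f"U+{ord(symbol):04X}", unicodedata.name(symbol))
```

Either way the source file ends up with the same code point, so the only open question is how much effort the typing takes.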

That's a recipe for combinatorial explosion: each language has special chars, or is even made entirely of glyphs. Are you going to learn them all? Do you want to impose that burden on all devs in the world?

And now your typing is not at all fluid: even if you didn't have to stop and think about every single special character (which most people do), you would have to enter a combination to get them every time completion can't help.

In fact, even with an AZERTY keyboard, typing French, my native language, is slower than typing English.

Instead of having one simple common ground, you also now have an infinite number of variations to care about.

Not to mention having to understand one lib in Spanish, another one in Russian and a last one in Hindi or Arabic.

I don't need to learn them all, I just need to know the ones I expect I need to type in for the foreseeable future. I might even forget them later (as I've forgotten a lot of LaTeX mappings, fortunately there's a cheat sheet).

You're just repeating your other points about the problems that come from having so many languages, which I mostly agree with, but typing at least is not really an issue, which is all my point is. The issue is: "do I know Arabic?" not "I know Arabic, can I type it?"

As for your experience with French, maybe the AZERTY layout is just inherently bad, but how much slower is it really? Typing your example sentence with my compose key takes me about 3x longer than ignoring the accented characters entirely, which is more than I expected, though I'm very unpracticed.

With Japanese, which I'm even more of a noob at, the overhead is nevertheless smaller. If I wanted to type 'purple', I hit a chord to enter JP IME mode if I'm not in it already, type 'murasaki' (the romanized version of the word) at full speed, and leave it as むらさき or select from kanji completions to get 紫, or maybe I needed it to be in katakana so I hit F7; finally I commit it and move on. It may seem like a lot, but whatever the case it's a really tiny overhead over just typing 'murasaki'.

In practice though, looking at source code from Japanese developers, I see a mix of full English (maybe some Japanese comments, maybe not), some mix of Japanese but always using romanized symbol names, and full-on variable and function names in actual Japanese. Because the overhead of typing is so small though, I don't have a preference for the second or third option, they're both fine, though of course selfishly I'd love for the whole world to understand and use English exclusively. For the French case, even with an overhead of 3x in the worst case, I wouldn't object to a variable with an é in it.

I know a lot of devs that don't even type with 5 fingers; you are apparently well trained with your keyboard and you still take a 3x hit.
> Good luck typing "La leçon du père noël à l'école de la forêt" with something else than my french keyboard.

There are key bindings for a regular qwerty keyboard that let you compose letters and symbols to make accents. It's very easy to write French (with all the accents) on a US keyboard (on Gnome Intl. Alt keyboard with dead keys I believe).

French here. I have an American keyboard, not only because my company language is English, but especially because writing code on a US layout is by far the best. [], {}, () can be typed directly or by pressing shift. On a French keyboard, you have the additional Alt-Gr key (second Alt key) which must be used, which makes some keys awkward to type efficiently - if you have a MacBook, the Alt-Gr key doesn't even exist, which makes writing code a nightmare ([ and { just don't exist!!).

Then, numbers, which are very common to type when writing code, must be typed by pressing shift. A French keyboard is made to write French efficiently, arguably English too, but definitely not to write code.

Québécois here. I have to switch between English and French all the time, and pressing Alt for ~{}[] and Shift for <>()| has become second nature. I just don't think about it anymore, and I have to say that when I have to use a US keyboard I just feel so limited.
The problem with US keyboard is that it's most of the time an ANSI layout, with a missing key between Z and left-shift. I love the CSA keyboard because letters with accents are in direct access (a lot better than French Azerty) and {}[]() are positioned logically.
I very much long for a French keyboard designed for coding ...
A French equivalent of the "Polish Hacking keyboard"? Never used it (not even remotely Polish), but I know several people who speak well of it.
I had a computer with a US ANSI keyboard for a few years and I don't remember it being very easy. It's annoying to compose accents, and it's worse when you have to switch to another OS or another computer. On Windows, it's a mess when you have more than one keyboard layout activated. And it's terrible when you have to type « » or æ œ, like: « J'ai mangé des œufs pour Noël »
I just set the input language to French and let autocorrect add the accents. 99% of the time it gets it right, as there are not that many words where the accents change a word into a completely different word. The only one I can think of is marche as in walking vs marché as in a market.
Easy doesn't mean productive. My guess is that you take 2 to 3 times longer to type such a sentence.
Versus using an azerty keyboard or another French-specific layout? Yeah, it's a tad slower. But it's a hell of a lot quicker than trying to look up/remember alt-codes or something. Writing in 2 languages with one keyboard is never going to be optimal.

And 2-3 times longer? Most sentences in French don't have that many accents, the example picked was a rare one.

Yes but we are talking about putting them in code symbols in the first place.
French here using US keyboard with Compose key. Works like a charm.
> It wouldn't be bad at all if programming languages were localizable.

Go switch Excel or Google Sheets into a language you don't know. Then try to use formulas with them.

Come back and tell me if localized programming languages are a good thing after that. :D

> Go switch Excel or Google Sheets into a language you don't know.

I don't believe that point supports your claim. Switching excel to a language you don't know is a good example of the benefits people who don't speak english get from localized languages, not the other way around.

> Come back and tell me if localized programming languages are a good thing after that. :D

The truth is that localized excel has enabled a generation of people speaking poor english to still do pretty dope stuff for their business.

Likewise, localizable languages for children (e.g. Hedy) work pretty well at helping children discover computers before they have a firm grasp of English.

And it makes support, reusable code, common understanding, and readability (in a team) fly out the window, and increases the amount of development overhead of whomever has to maintain the language.
> it makes support

Support is already a problem currently, because personalized error messages are less searchable but more useful than generic ones. Moving to error codes is a good example of an orthogonal concern that would incidentally help i18n. Supporting error formats that can be processed by machines is a further unrelated step that would also benefit i18n.

So I would say the stuff we already do tends to make this concern less relevant.
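To make the error-code point concrete, a minimal Python sketch, assuming a hypothetical compiler that emits structured diagnostics. The code `E0001`, the `Diagnostic` class and the message catalogs are all invented for illustration and not taken from any real toolchain.

```python
from dataclasses import dataclass, field

# Hypothetical message catalogs keyed by a stable error code.
CATALOGS = {
    "en": {"E0001": "undefined name '{name}'"},
    "fr": {"E0001": "nom non défini « {name} »"},
}

@dataclass
class Diagnostic:
    code: str                                   # stable, searchable in any language
    params: dict = field(default_factory=dict)  # values filled into the template

def render(diag: Diagnostic, locale: str) -> str:
    """Format a diagnostic in `locale`, falling back to English."""
    catalog = CATALOGS.get(locale, CATALOGS["en"])
    template = catalog.get(diag.code, CATALOGS["en"][diag.code])
    return f"{diag.code}: {template.format(**diag.params)}"

diag = Diagnostic("E0001", {"name": "total"})
print(render(diag, "en"))
print(render(diag, "fr"))
print(render(diag, "de"))  # no German catalog here, so English is used
```

The code stays identical across locales, so it remains searchable whatever language the message is shown in.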

> reusable code

That's an orthogonal concern. You don't write new code when you i18n something, you just create dictionaries.

> common understanding, and readability (in a team)

I don't believe anyone is making the point that one team should write in separate languages. However, a team able to choose its own language (possibly the one used by the rest of the business) is a plus.

In non-English countries, it's very common to have code written in English to describe a business that is done in the local language, with really bad translations of business terms. This already causes issues.

> increases the amount of development overhead of whomever has to maintain the language.

Why would the language maintainer do the i18n themselves?

> You don't write new code when you i18n something, you just create dictionaries.

No. i18n is not that easy, sorry. I wish it were.

To be frank, this answer reads as though it was written by someone who has not done any extensive i18n work in their life. Languages are not 1:1 translatable. This goes beyond words and phrases - numbers, math, time, dates, names, grammar, etc. are all completely unrecognizable between certain languages.

It's not a problem that even regular software has solved generically and elegantly, and then going on to apply it to programming language design is a completely different beast, with its own set of problems.

> this answer reads as though it was written by someone who has not done any extensive i18n work in their life. Languages are not 1:1 translatable.

I argue that i18n is not translation.

It seems like you're making a point for solving problems beyond the scope of PL internationalization, including problems that are not solved when it comes to how english code translates to english natural language.

My original point is that foreign excel versions already did help people. From there, improving compilers to make that work easier and less costly is a reasonable goal, or at least a tractable one. Some of that work is already being done for unrelated reasons (e.g. modern compilers offer interfaces for LSPs as well as error messages in formats that allow processing beyond reading for CIs — both of these changes benefit potential i18n efforts).

> orthogonal concern

Sorry to be blunt, but you have no idea of what you are talking about.

Only if the said support and code are not localized too. I'm using all my devices in English because it's easier to find help online but I can see why people less proficient in it would prefer localized software down to formulae and documentation.
> The truth is that localized excel has enabled a generation of people speaking poor english to still do pretty dope stuff for their business.

I got paid a lot to transform projects that used abusively Excel sheets. All of them had absurdly long delays and cost explosions caused by this.

Excel sheets are for small business and small projects by non-programmers. Excel sheets in a non-English language are for small businesses and small projects by non-programmers that are doomed to stay this way. This is because having programmers that want to unfuck a mess of macro AND speaking your foreign language is simply very rare, AND the chances of doing so before the business implodes are very slim.

> I got paid a lot to transform projects that used abusively Excel sheets.

I wonder where all this money came from.

I also got paid a lot to transform projects that abused excel, among other things. The usual truth was that, without excel (or, on other projects, without their tortured python scripts, or without the off-the-shelf software modded to death), these companies would never have built a business successful enough to pay entire teams for months or years before any returns.

> having programmers that want to unfuck a mess of macro AND speaking your foreign language

I mean, at which point did the idea "Oh, that language is so weird that very few can possibly speak it _and_ learn to program" start to make sense to you?

Did it ever occur to you that my foreign language was the first language of a whole lot of people living in my country, and that maybe there are a few good universities here?

Switch back to a language you know?
I'm curious, do you speak another language? Anyway, I think the history of programming languages indicates that it wouldn't be all that helpful, simply because nothing's really caught on, despite many opportunities. What does seem useful is being able to have a way of representing non-ascii characters in source code, at least with comments but hopefully with symbols and of course with data, but even in languages that let you trivially define your own names for everything including the basic keywords and standard functions, you don't really see localization attempts or non-English speakers caring. It's more than good enough to just have native-language documentation that explains the concepts behind the English tokens; you need this in English too since it's not like there's always a clear direct correspondence between the English dictionary meaning(s) and the programming language's meaning(s) of the same word/abbreviation.
Early VBA did this, apparently - see https://ericlippert.com/2021/02/17/life-part-38/#comment-119... on Eric Lippert's blog.
> It wouldn't be bad at all if programming languages were localizable

That is an admirable dream to have, and one that I have sometimes had myself. Unfortunately, as with most things reality soon puts an end to it.

First, the dream is only really a dream you can have with the luxury of a Western language. It is unlikely to be viable once you start looking further East in the direction of e.g. Kartvelian languages (i.e. low number of speakers worldwide and even fewer with relevant specialisms in programming to write compilers) or Asian languages (e.g. Japanese and its particles and other complexities).

Second, when you are writing code you don't really want to have to worry about the possibility of debugging your compiler. Sticking to English means the likelihood of a more robust compiler because of more widespread adoption. You also remove technicalities such as UTF-8 or whatever else that might cause indirect issues.

It's a bit like science really. Sure you can write your papers in your native language, but the reality is if you're serious about your work being discovered and are interested in being offered bigger and better jobs, you'll want to publish in English sooner rather than later.

> Unfortunately, as with most things reality soon puts an end to it.

Usually, when a thing gets labeled as a dream, people don't try very hard either.

> Sticking to English means the likelihood of a more robust compiler because of more widespread adoption.

The comment you quote specifically talks about localizable software, not having one compiler per language. When you I18n a website, you don't write n sites, for instance.

> Usually, when a thing gets labeled as a dream, people don't try very hard either.

Some things are hard, but worth trying.

Other things are a dream because the reality is they are unachievable either due to the realities of practical constraints or because the amount of work involved would be hugely disproportionate to the benefit.

For example, I mentioned Kartvelian languages. How about we specifically look at Laz which according to Wikipedia at the last count in 1980 had only 22,000 native speakers.

Writing a Laz compiler ? I'd say if we're being totally honest its a dream, not just "hard work".

> the amount of work involved would be hugely disproportionate to the benefit.

And you can trust english speakers to properly assess the benefits of I18N, right?

> Writing a Laz compiler ? I'd say if we're being totally honest its a dream, not just "hard work".

As I said in the message you quote, internationalizing a compiler is not the same as rewriting it. If you consider that making languages accessible to other languages means each natural language should have its own programming language and compiler, of course it's going to be hard.

But that's moving the goalposts.

First off, you don't need to port a programming language to every natural language. Some smaller goals would already help. For instance, making it easy for third parties to i18n the language and libraries would be a significant enabler.

> Writing a Laz compiler ? I'd say if we're being totally honest its a dream, not just "hard work".

This seems like letting the perfect be the enemy of the good. "We can't possibly localize for every possible language ever" is not much of an argument for why localization itself isn't worth it.

IIRC, TI Basic dynamically translated keywords (and probably function names) depending on the current language.
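A rough Python sketch of that kind of keyword swapping. It is not TI Basic's actual mechanism (which I don't know). It assumes a made-up French keyword table and works purely at the token level, so library names, error messages and grammar differences are untouched.

```python
import io
import tokenize

# Hypothetical French spellings for a few Python keywords.
FR_TO_EN = {"définir": "def", "si": "if", "sinon": "else", "retourner": "return"}

def translate_keywords(source: str, mapping: dict) -> str:
    """Swap keyword spellings token by token; strings and comments are left alone."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Caveat: any identifier that happens to match a key is swapped too.
        if tok.type == tokenize.NAME and tok.string in mapping:
            tok = tok._replace(string=mapping[tok.string])
        tokens.append(tok)
    return tokenize.untokenize(tokens)

source_fr = (
    "définir absolu(x):\n"
    "    si x < 0:\n"
    "        retourner -x\n"
    "    sinon:\n"
    "        retourner x\n"
)

source_en = translate_keywords(source_fr, FR_TO_EN)
namespace = {}
exec(source_en, namespace)  # the translated text is ordinary Python
print(namespace["absolu"](-3))

# The reverse table renders the same program for a French reader.
EN_TO_FR = {en: fr for fr, en in FR_TO_EN.items()}
source_back = translate_keywords(source_en, EN_TO_FR)
```

The function name `absolu` survives both directions untranslated, which is the library problem in miniature.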

That's less interesting if you have lots of libraries and you don't translate them.

> It wouldn't be bad at all if programming languages were localizable.

What a terrible idea.

AppleScript had this! But they ended up not shipping it.