by samuelstros 1452 days ago
For months I have been working on an open source localization solution that tackles both developer-facing and translator-facing problems. Treating translations as code completely leaves out translators, who in most cases cannot code.

I am working on making localization effortless via dev tools and a dedicated editor for translators. Both pillars have one common denominator: translations as data in source code. Treating translations as code would break that denominator and prevent a coherent end-to-end solution.

Take a look at the repository https://github.com/inlang/inlang. The IDE extension already solves type safety, inline annotations, and (partially) extraction of hardcoded strings.

5 comments

As someone who has dealt with localization pipelines before, I totally agree with your sentiment. People doing translation work should not need to deal with code. I like that you opted for translation IDs, though that can get messy, switching back and forth to know what the English (or whatever the base language is) actually says. The IDs are somewhat worth it, though, since you can try to reuse old translation files, unlike the gettext methodology, which looks for 1-1 string matches.
IDs vs base language string as ID is a common debate. I opted for translation IDs since Mozilla's Fluent (https://projectfluent.org/) uses translation IDs. I can't find their list of reasons. I do remember having problems myself by changing the base language string and thereby losing the connection to all translations.

The argument against IDs is reduced readability, something the IDE extension can solve.
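
To make the trade-off concrete, here is a minimal sketch of both keying schemes. The catalogs, the `save-button` ID, and the `lookup` helper are made up for illustration; this is not Fluent's or gettext's actual API, and it ignores things like fuzzy matching that real tooling may offer.

```typescript
// Hypothetical message catalogs; all names and strings are illustrative only.
type Catalog = Map<string, string>;

// Source-string keys: the base-language string itself is the lookup key.
const deBySource: Catalog = new Map([["Save changes", "Änderungen speichern"]]);

// ID keys: a stable identifier is the key, independent of the wording.
const enById: Catalog = new Map([["save-button", "Save changes"]]);
const deById: Catalog = new Map([["save-button", "Änderungen speichern"]]);

// Look up a key and fall back to the given text when nothing is found.
function lookup(catalog: Catalog, key: string, fallback: string): string {
  return catalog.get(key) ?? fallback;
}

// Someone rewords the English copy.
const reworded = "Save your changes";
enById.set("save-button", reworded);

// Source-string key: the reworded string no longer matches any entry,
// so the German UI falls back to English.
const viaSource = lookup(deBySource, reworded, reworded);

// ID key: the ID did not change, so the existing German translation is still found.
const viaId = lookup(deById, "save-button", reworded);

console.log(viaSource); // Save your changes
console.log(viaId); // Änderungen speichern
```

The flip side is visible too: with IDs, the German text may now be stale relative to the new English wording, and nothing in the lookup flags that.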

Is there something else that bothers you in localization pipelines?

> Treating translations as code completely leaves out translators, who in most cases can not code.

It's a big lift to extract all hardcoded strings for a future state where localization will be 'required', especially for large companies. There's no question non-technical teams need the ability to edit strings/translations but if it means changing your infra or the way eng prefers to build it's a tough argument.

We've been building https://www.flycode.com as a platform to make strings/translations and static assets (hardcoded or in resource files) editable by connecting existing repos.

Anyone can learn how to call a function, just like they can learn how to splice a parameter into a string. And translators already have to know basic HTML anyway.

MessageFormat is code. It's just not a very powerful language. And it knows nothing about rich text, just plain strings, which means that you have to deal with manual HTML decoding in the application, ensure all translations are actually producing valid HTML and absolutely not forget to encode all string params that could be user input.

Using tagged template literals and JSX in the translations avoids all those problems.
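
A rough sketch of the tagged template half of that idea, assuming a hypothetical `html` tag and `escapeHtml` helper (neither comes from a specific library): the literal parts are markup written by the translator, and every interpolated parameter is escaped automatically.

```typescript
// Escape the characters that are significant in HTML text and attribute values.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical `html` tag: literal parts are kept as-is,
// interpolated values are treated as untrusted and always escaped.
function html(strings: TemplateStringsArray, ...values: unknown[]): string {
  let out = "";
  strings.forEach((part, i) => {
    out += part;
    if (i < values.length) out += escapeHtml(String(values[i]));
  });
  return out;
}

// "Translations as code": one function per locale for the same message.
const greeting = {
  en: (name: string) => html`Welcome back, <b>${name}</b>!`,
  de: (name: string) => html`Willkommen zurück, <b>${name}</b>!`,
};

// User input containing markup is neutralized without the caller doing anything.
const rendered = greeting.de('<img src=x onerror="alert(1)">');
console.log(rendered);
// Willkommen zurück, <b>&lt;img src=x onerror=&quot;alert(1)&quot;&gt;</b>!
```

The sketch only covers parameter escaping. Whether the translator-written markup is itself valid HTML is not checked here; that is where something like JSX would come in.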

It's a tempting argument. Interviews with hundreds of people revealed a different pattern, though: translators don't know how to code. Some companies manually removed quotation marks (") from strings because they confused translators.

What do you think about Mozilla's Fluent format/syntax https://projectfluent.org/?

BTW feel free to reach out to me via email. Look at my profile to find it.

I honestly don't understand why translation libraries work with JSON of all things. In a typical pipeline translators work with Excel sheets and Word documents, not code.

https://www.npmjs.com/package/csv-translations
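
For illustration only, this is roughly what a spreadsheet-shaped translation source could look like once loaded. It is a naive sketch and not the API of the package linked above: the sheet contents and `parseSheet` are hypothetical, and it assumes no quoted fields or commas inside cells.

```typescript
// Hypothetical sheet as a translator might export it: one row per message ID,
// one column per locale. Assumes no quoted fields or embedded commas.
const sheet = [
  "id,en,de",
  "greeting,Hello,Hallo",
  "farewell,Goodbye,Auf Wiedersehen",
].join("\n");

// Turn the sheet into a map of locale -> (message ID -> text).
function parseSheet(csv: string): Map<string, Map<string, string>> {
  const rows = csv.trim().split("\n").map((line) => line.split(","));
  const header = rows[0] ?? [];
  const locales = header.slice(1); // everything after the "id" column
  const catalogs = new Map<string, Map<string, string>>();
  for (const locale of locales) catalogs.set(locale, new Map<string, string>());
  for (const row of rows.slice(1)) {
    const id = row[0];
    if (id === undefined || id === "") continue;
    locales.forEach((locale, i) => {
      const text = row[i + 1];
      // Skip empty cells so missing translations stay missing.
      if (text !== undefined && text !== "") catalogs.get(locale)?.set(id, text);
    });
  }
  return catalogs;
}

const catalogs = parseSheet(sheet);
const german = catalogs.get("de")?.get("farewell");
console.log(german); // Auf Wiedersehen
```

A real sheet would need a proper CSV parser, since translated text routinely contains commas and quotes.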

Are you using CSVs to store your translations in source code?
Wow, love the interface. I have been working with Weblate for the past year and while it has a lot to offer, it feels very heavy.
The interface shown in the GitHub readme is an old version. The new version goes in the same direction but will look a bit different. One thing is sure though: strive for simplicity. I haven't come across a good editor for translators yet.