by leeoniya 1128 days ago

ICU :(

https://github.com/oven-sh/bun/issues/2541

https://twitter.com/jarredsumner/status/1616597667408404481

edflsafoiewq 1128 days ago

QuickJS has a custom unicode library for this reason.

mananaysiempre 1128 days ago

It’s slow as molasses in some parts though. (Deliberately, it’s a trade-off.)

The Unicode part of ICU shouldn’t be that large, however (on the order of hundreds of kilobytes); it’s the locale data that’s big[1]. Does Bun implement ECMA-402 internationalization? Even without locales, some of the largest parts of the ICU data are the normalization and casing tables, which I think bare ECMAScript does not require. (It does mean bare ECMAScript cannot adequately process Unicode text, but meh, you get what you pay for.)

[1] https://unicode-org.github.io/icu/userguide/icu_data/buildto...

sjrd 1128 days ago

Bare ECMAScript does require normalization [1] and case conversions (for the root locale only) [2].

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

[2] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
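
A minimal sketch of what those two requirements look like from user code, using only String.prototype.normalize() and String.prototype.toUpperCase(). The sample strings and variable names are illustrative assumptions, not taken from the linked pages, and this says nothing about how any particular engine stores its tables.

```typescript
// "é" written two ways: precomposed U+00E9, and "e" followed by
// the combining acute accent U+0301.
const precomposed: string = "\u00E9";
const decomposed: string = "e\u0301";

// Compared code unit by code unit, the two strings differ.
const rawEqual: boolean = precomposed === decomposed;

// After normalizing both to the same form (NFC here), they compare equal.
// The engine needs Unicode composition/decomposition data to do this.
const nfcEqual: boolean =
  precomposed.normalize("NFC") === decomposed.normalize("NFC");

// NFD goes the other way and splits the precomposed character in two.
const nfdLength: number = precomposed.normalize("NFD").length;

// Locale-independent case conversion still uses the full Unicode case
// mappings, so a single character may expand: "ß" uppercases to "SS".
const upper: string = "straße".toUpperCase();

console.log(rawEqual, nfcEqual, nfdLength, upper);
```

Neither call touches Intl, so both have to work even in an engine that ships without ECMA-402 locale data.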

mananaysiempre 1128 days ago

Ah, I see, String.prototype.normalize() is in ES6. I remembered that Duktape (which has its own, even slower Unicode library) can’t do it[1], but then it doesn’t try for ES6 either.

[1] https://github.com/svaarala/duktape/issues/1718
