It’s slow as molasses in some parts though. (Deliberately, it’s a trade-off.)
The Unicode part of ICU shouldn’t be that large, though (on the order of hundreds of kilobytes); it’s the locale data that’s big[1]. Does Bun implement ECMA-402 internationalization? Even without locales, some of the largest parts of the ICU data are the normalization and casing tables, which I think bare ECMAScript does not require. (It does mean bare ECMAScript cannot adequately process Unicode text, but meh, you get what you pay for.)
Ah, I see, String.prototype.normalize() is in ES6. I remembered that Duktape (which has its own, even slower Unicode library) can’t do it[1], but then it doesn’t try for ES6 either.
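For anyone wondering why that method drags normalization tables into the engine, here is a minimal sketch, assuming an ES2015+ runtime that ships that data. The variable names are made up for illustration; the only API used is String.prototype.normalize() itself.

```typescript
// "é" written two ways: one precomposed code point vs. "e" + combining acute accent.
const precomposed: string = "\u00E9";
const decomposed: string = "e\u0301";

// Without normalization the two strings compare as different.
const equalRaw: boolean = precomposed === decomposed;

// After normalizing both to NFC they compare as equal.
const equalNfc: boolean =
  precomposed.normalize("NFC") === decomposed.normalize("NFC");

// NFD splits the precomposed form into base letter + combining mark (2 code units).
const nfdLength: number = precomposed.normalize("NFD").length;

console.log(equalRaw, equalNfc, nfdLength); // prints: false true 2
```

The engine cannot compute these mappings on the fly; it needs the decomposition and composition data for every affected code point, which is the table cost discussed above.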
[1] https://unicode-org.github.io/icu/userguide/icu_data/buildto...