by jlarocco
4660 days ago
From the article:

> The exceptions that were crashing us were caused by people using String.prototype.substr. That function works perfectly on strings that only contain Unicode 1.0 data, but as soon as you're storing UTF-16 in your UCS-2 string there's a possibility that when you take a slice you'll split a valid surrogate pair into two invalid lonely surrogates.

To me, it seems like it'd be nearly impossible for somebody to trigger, but there's always Murphy's law...
|
Suppose you receive a long piece of text wrapped in JSON, unpack it into a JS String, then start processing it in fixed-size chunks. If a significant percentage of the characters in your source text are represented by surrogate pairs, you'll eventually split one across a chunk boundary.
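
Here's a rough sketch of that failure and one possible way around it. The function names (naiveChunks, safeChunks, hasLoneSurrogate) are made up for illustration and are not from the article; the only assumption is that chunk sizes are counted in UTF-16 code units, which is how JS string indices work.

```javascript
// Hypothetical helpers for illustration; none of these names come from the article.

// True if the UTF-16 code unit at index i is a high (lead) surrogate.
function isHighSurrogate(text, i) {
  const unit = text.charCodeAt(i);
  return unit >= 0xd800 && unit <= 0xdbff;
}

// True if the UTF-16 code unit at index i is a low (trail) surrogate.
function isLowSurrogate(text, i) {
  const unit = text.charCodeAt(i);
  return unit >= 0xdc00 && unit <= 0xdfff;
}

// True if s contains a surrogate that is not part of a valid high+low pair.
function hasLoneSurrogate(s) {
  for (let i = 0; i < s.length; i++) {
    if (isHighSurrogate(s, i)) {
      if (i + 1 < s.length && isLowSurrogate(s, i + 1)) {
        i++; // skip the second half of a valid pair
        continue;
      }
      return true;
    }
    if (isLowSurrogate(s, i)) return true;
  }
  return false;
}

// Naive chunking: cuts every `size` code units, wherever that lands.
function naiveChunks(text, size) {
  const chunks = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.substr(i, size));
  }
  return chunks;
}

// Pair-aware chunking: if a cut would fall between a high and a low
// surrogate, extend the chunk by one code unit to keep the pair together.
// Chunks can therefore be up to size + 1 code units long.
function safeChunks(text, size) {
  const chunks = [];
  let i = 0;
  while (i < text.length) {
    let end = Math.min(i + size, text.length);
    if (end < text.length && isHighSurrogate(text, end - 1) && isLowSurrogate(text, end)) {
      end += 1;
    }
    chunks.push(text.slice(i, end));
    i = end;
  }
  return chunks;
}

// U+1D11E (musical G clef) takes two code units, at indices 2 and 3 here.
const sample = "ab\u{1D11E}cd";
const broken = naiveChunks(sample, 3); // the cut at index 3 splits the pair
const intact = safeChunks(sample, 3);  // the first chunk grows to 4 code units
```

With a chunk size of 3, the naive version cuts right through the clef, so both of its chunks end up holding a lone surrogate; the pair-aware version lets the first chunk run one code unit long instead. Extending rather than shortening the chunk is just one choice, but it does guarantee progress even with a chunk size of 1.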