I think you have a bug in codepoint_read(). The last argument is a pointer to variable unicode_t named decoded. If the first if statement in that function is true, then the a value is never assigned to that variable, the function returns, and immediately outside the function the value of that variable is read. But that variable was never initialized.
If a NUL parameter is passed to utf8toutf16 and friends, they'll return UTF8_ERR_INVALID_DATA.
However, you are correct about codepoint_read's behavior. I actually have a test for passing NUL to the function, but the value of decoded isn't checked:
TEST(CodepointRead, InvalidData)
{
const char* i = nullptr;
size_t il = 0;
unicode_t o;
EXPECT_EQ(0, codepoint_read(i, il, &o));
}
It's definitely a bug, but relatively harmless. The function is only used internally and the calling sites guard against that kind of attack.
Nevertheless, I thank you for the extra set of eyeballs. I have been looking at these functions far too long, you tend to gloss over obvious problems after a while.
I'll be sure to get a fix in for the next release. :)
However, you are correct about codepoint_read's behavior. I actually have a test for passing NUL to the function, but the value of decoded isn't checked:
It's definitely a bug, but relatively harmless. The function is only used internally and the calling sites guard against that kind of attack.Nevertheless, I thank you for the extra set of eyeballs. I have been looking at these functions far too long, you tend to gloss over obvious problems after a while.
I'll be sure to get a fix in for the next release. :)