Hacker News
by joosters 3960 days ago
Do they use any fuzzing as part of their testing? UTF-8 parsing seems like an ideal candidate for this kind of bug hunting.

Author here.

The library currently doesn't employ any kind of fuzzing. Instead, I rely on multiple tests for every kind of input. I've found that to be a pretty reliable way to test for "unknown unknowns", even if it's extremely time-consuming.

Adding test-case fuzzing is definitely something to consider, though, because it would most likely have found the very issues this release fixes.

I recommend trying fuzz testing; it can be very effective (see my experiences here: http://forwardscattering.org/post/21). Utf8-rewind looks like a nice library; I have looked into it before when researching Unicode support. If I need to do more complex Unicode stuff in my language, I will probably use it.
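
To make the suggestion concrete, here is a rough sketch of what differential fuzzing of UTF-8 handling could look like. It is not the library's code or API: `is_valid_utf8` is a hypothetical hand-written validator standing in for whatever is under test, `fuzz` and its parameters are made up for illustration, and Python's built-in strict decoder is assumed to be a trustworthy oracle.

```python
import random


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical hand-written validator, standing in for the code under test."""
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:  # ASCII, single byte
            i += 1
            continue
        # need = number of continuation bytes; lo..hi = allowed range of the first one.
        if 0xC2 <= b <= 0xDF:
            need, lo, hi = 1, 0x80, 0xBF
        elif b == 0xE0:
            need, lo, hi = 2, 0xA0, 0xBF  # rejects overlong 3-byte forms
        elif b == 0xED:
            need, lo, hi = 2, 0x80, 0x9F  # rejects UTF-16 surrogates
        elif 0xE1 <= b <= 0xEF:
            need, lo, hi = 2, 0x80, 0xBF
        elif b == 0xF0:
            need, lo, hi = 3, 0x90, 0xBF  # rejects overlong 4-byte forms
        elif 0xF1 <= b <= 0xF3:
            need, lo, hi = 3, 0x80, 0xBF
        elif b == 0xF4:
            need, lo, hi = 3, 0x80, 0x8F  # rejects code points above U+10FFFF
        else:
            return False  # stray continuation byte or invalid lead byte
        if i + need >= n:
            return False  # sequence is truncated
        if not lo <= data[i + 1] <= hi:
            return False
        for k in range(2, need + 1):
            if not 0x80 <= data[i + k] <= 0xBF:
                return False
        i += need + 1
    return True


def oracle(data: bytes) -> bool:
    """Reference answer: does Python's strict UTF-8 decoder accept the input?"""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def fuzz(iterations: int = 20000, seed: int = 0, max_len: int = 8) -> list:
    """Feed random byte strings to both validators and collect disagreements."""
    rng = random.Random(seed)  # fixed seed so any failure is reproducible
    # Bias half of the bytes toward values on the edges of the UTF-8 ranges.
    edges = bytes([0x00, 0x7F, 0x80, 0x8F, 0x90, 0x9F, 0xA0, 0xBF, 0xC0, 0xC1,
                   0xC2, 0xDF, 0xE0, 0xED, 0xEF, 0xF0, 0xF4, 0xF5, 0xFF])
    failures = []
    for _ in range(iterations):
        length = rng.randint(0, max_len)
        data = bytes(
            rng.choice(edges) if rng.random() < 0.5 else rng.randrange(256)
            for _ in range(length)
        )
        if is_valid_utf8(data) != oracle(data):
            failures.append(data)
    return failures


if __name__ == "__main__":
    bad = fuzz()
    print(f"{len(bad)} disagreements", bad[:5])
```

Purely random bytes rarely form long valid sequences, which is why the sketch mixes in boundary values; a coverage-guided fuzzer would probably do a better job of reaching the deeper branches.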