https://github.com/coreutils/coreutils/blob/master/src/wc.c
The word-counting algorithm in wc.c does seem to be much more complex.
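
For comparison, here is a minimal sketch of the naive approach, assuming plain single-byte input and `isspace()` as the only notion of a separator. This is not the code from wc.c; `count_words` is a hypothetical helper written for illustration. As far as I can tell, the real implementation also has to deal with multibyte characters and locale-dependent whitespace, which is probably where a good part of the extra complexity comes from.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not taken from wc.c: counts words in a
   NUL-terminated byte string, treating every isspace() byte as a
   separator. Assumes single-byte characters only. */
size_t count_words(const char *s)
{
    size_t words = 0;
    bool in_word = false;   /* true while scanning inside a word */

    for (; *s != '\0'; s++) {
        if (isspace((unsigned char)*s)) {
            in_word = false;            /* separator ends the current word */
        } else if (!in_word) {
            in_word = true;             /* first byte of a new word */
            words++;
        }
    }
    return words;
}
```

Calling `count_words("  hello \t world\n")` counts two words, since runs of whitespace collapse into a single separator.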