Hacker News new | ask | show | jobs
by colanderman 2432 days ago
Not one byte, but a number of bytes equal to the length of the CRC (typically 4).

But yes, people saying "CRC" when they really mean "checksum" or "signature" is a pet peeve of mine, and I treat it as a code smell. CRC has a precise defined technical meaning.

For the curious: CRC is linear with respect to XOR. This means that if you XOR two equal-length strings, the CRC will be the individual CRCs XORed together. It's also a bijection for messages of length equal to the polynomial (typically 32 bits): every input maps to a distinct output and vice-versa. Together, these mean it's trivially invertible for a message of length equal to the CRC: the CRC function can be represented as a square matrix over GF-2 (i.e., bits + XOR), which can be inverted with standard Gaussian elimination to produce the inverse function: generate a (short) string from any CRC.

More fun CRC tidbits: it's not a given that two arbitrary CRC functions, or a CRC and another linear(ish) checksum (e.g. ones' complement), are linearly independent. This can bite you when you try to use two different CRCs to get "more" error protection, or when you use "independent" CRCs to route work at two different points in a system. Again this is easy to verify using linear algebra (just check that the GF-2 matrix formed by the concatenation of your two functions is of full rank).

1 comments

I've done it for people with any level of programming. People have already got used to that "CRC" is "checksum". But I will think about it.