
Yes, the algorithm zip uses in this case relies on backrefs: "LZ77 algorithms achieve compression by replacing repeated occurrences of data with references to a single copy of that data existing earlier in the uncompressed data stream. A match is encoded by a pair of numbers called a length-distance pair, which is equivalent to the statement 'each of the next length characters is equal to the characters exactly distance characters behind it in the uncompressed stream'. (The distance is sometimes called the offset instead.)"
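To make the length-distance idea concrete, here is a minimal decoding sketch. The token format (a plain string for a literal, a `(length, distance)` tuple for a backref) and the name `lz77_decode` are made up for illustration; real zip/DEFLATE streams pack these into bits quite differently.

```python
def lz77_decode(tokens):
    """Expand a list of literals and (length, distance) backrefs into a string."""
    out = []
    for tok in tokens:
        if isinstance(tok, str):
            # Literal: copy the character through unchanged.
            out.append(tok)
        else:
            length, distance = tok
            # Backref: copy one character at a time, so a match may overlap
            # the output it is producing (distance < length is allowed).
            for _ in range(length):
                out.append(out[-distance])
    return "".join(out)


# Three literals, then "repeat the last 3 characters for 6 more characters".
decoded = lz77_decode(["a", "b", "c", (6, 3)])
print(decoded)
```

Note the overlapping copy: a backref with distance 1 and a long length is how a run of one repeated character gets encoded.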

This is a different technique from entropy coding, which gets its improvements by allocating fewer bits to more frequent symbols. But most modern compressors use a mix: for example, gzip uses DEFLATE, which combines LZ77-style literals and backrefs with Huffman coding, using either static or dynamic Huffman tables.
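For the entropy-coding half, a rough sketch of how Huffman coding ends up giving frequent symbols shorter codes. This only computes code lengths with the textbook greedy merge; the function name `huffman_code_lengths` is hypothetical, and DEFLATE's actual table construction has extra constraints (length limits, canonical codes) that are not modeled here.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data):
    """Return {symbol: code length in bits} for a plain Huffman code."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tiebreak, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol in them gets one bit deeper.
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


lengths = huffman_code_lengths("aaaaaaaabbbc")
print(lengths)
```

Here the common symbol `a` gets a shorter code than the rarer `b` and `c`, which is the whole trick.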

And yes, JSON is usually absurdly compressible.
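If you want to see it for yourself, here is a quick check using Python's `zlib` module, which implements DEFLATE. The records are synthetic and made up for this example, so the ratio you get says nothing precise about real-world JSON; the repeated keys are the point.

```python
import json
import zlib

# Synthetic data (an assumption for illustration): the same keys repeat in every record.
records = [
    {"id": i, "name": f"user{i}", "active": i % 2 == 0}
    for i in range(1000)
]

raw = json.dumps(records).encode("utf-8")
packed = zlib.compress(raw, 9)  # level 9 = maximum compression

ratio = len(raw) / len(packed)
print(f"{len(raw)} bytes -> {len(packed)} bytes ({ratio:.1f}x smaller)")
```

Most of the win comes from the backrefs: every `"name": "user` after the first is just a short pointer to an earlier copy.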