by codetrotter 2971 days ago
> Some people mentioned that this would "only affect RAR files" and it would be safe to extract 7z files with 7-Zip prior to version 18.05. This is wrong, because 7-Zip detects the file type from the magic numbers at the beginning of the file. So the exploit can be renamed to 'exploit.7z' and it works just as well.

But the contents of a file are what actually determine its type.

So if you decide to block RAR files, you should do that by looking at the magic number. Then you are good to go. And it's not like you have to write a lot of code to do this. The standard 'file' utility in Unix can determine the type of a file -- often by looking at exactly the magic number.

Just like you would inspect the actual contents of a file if you were running an image hosting site -- and not trust whatever arbitrary filename the uploader gave it.


Of course, but I would still strongly advise against this.

If you really cannot avoid implementing something like this, you should inspect the 7-Zip code in order to be 100% sure that the magic number detection in your filter is identical (or matches a superset) to the one from 7-Zip.

> you should inspect the 7-Zip code in order to be 100% sure that the magic number detection in your filter is identical (or matches a superset) to the one from 7-Zip

CPP/7zip/Archive/Rar/RarHandler.cpp:

    #define SIGNATURE { 0x52 , 0x61, 0x72, 0x21, 0x1a, 0x07, 0x00 }

CPP/7zip/UI/Common/OpenArchive.cpp:

    const Byte kRarHeader[] = { 0x52 , 0x61, 0x72, 0x21, 0x1a, 0x07, 0x00 };

CPP/7zip/Archive/Rar/Rar5Handler.cpp:

    #define SIGNATURE { 0x52 , 0x61, 0x72, 0x21, 0x1a, 0x07, 0x01, 0 }

Those are the two magic numbers for RAR archive version 1.50 onwards and RAR archive version 5.0 onwards, respectively, along with the places where they are referenced in the 7-Zip source code. I looked at the source archive of version 18.05, downloaded from https://www.7-zip.org/a/7z1805-src.7z. To be really rigorous you would look at previous versions as well. If the 7-Zip project uses a version control system that supports the equivalent of git blame, that should not be too difficult for anyone who wants to go that far.

And here is where one copy of the 'file' command identifies the same magic numbers as RAR archives:

https://github.com/file/file/blob/f0a725a13fe0c1b046d8e07057...

    0 string Rar!\x1a\7\0 RAR archive data

https://github.com/file/file/blob/f0a725a13fe0c1b046d8e07057...

    0 string Rar!\x1a\7\1\0 RAR archive data, v5
:)
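
For illustration, those two 'file' rules amount to a byte comparison at offset 0. A minimal C++ sketch of that comparison, assuming the whole upload is already in memory; the names HasPrefix and StartsWithRarMagic are made up for this example and come from neither 7-Zip nor 'file':

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Signatures quoted above: RAR 1.50 onwards and RAR 5.0 onwards.
constexpr std::array<std::uint8_t, 7> kRar4Magic = {
    0x52, 0x61, 0x72, 0x21, 0x1a, 0x07, 0x00};
constexpr std::array<std::uint8_t, 8> kRar5Magic = {
    0x52, 0x61, 0x72, 0x21, 0x1a, 0x07, 0x01, 0x00};

// Hypothetical helper: true if `data` begins with `magic`.
template <std::size_t N>
bool HasPrefix(const std::vector<std::uint8_t>& data,
               const std::array<std::uint8_t, N>& magic) {
  return data.size() >= N &&
         std::memcmp(data.data(), magic.data(), N) == 0;
}

// Hypothetical filter: only inspects offset 0, like the two magic rules.
bool StartsWithRarMagic(const std::vector<std::uint8_t>& data) {
  return HasPrefix(data, kRar4Magic) || HasPrefix(data, kRar5Magic);
}
```

Note that this only ever looks at the very first bytes of the file; whether that is the same test 7-Zip applies is a separate question.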

You missed my point. It is trivial to find out what the magic number is. What is more important though: How exactly is the magic number matched? From what you have written, one might be tempted to simply check whether a file begins with this magic number. And this would be wrong. If you take a look at the matching in CPP/7zip/Archive/Rar/RarHandler.cpp:

    Byte marker[NHeader::kMarkerSize];
    RINOK(ReadStream_FALSE(stream, marker, NHeader::kMarkerSize));
    if (memcmp(marker, kMarker, NHeader::kMarkerSize) == 0)
      m_Position += NHeader::kMarkerSize;
    else
    {
      if (searchHeaderSizeLimit && *searchHeaderSizeLimit == 0)
        return S_FALSE;
      RINOK(stream->Seek(m_StreamStartPosition, STREAM_SEEK_SET, NULL));
      RINOK(FindSignatureInStream(stream, kMarker, NHeader::kMarkerSize,
          searchHeaderSizeLimit, arcStartPos));
      m_Position = arcStartPos + NHeader::kMarkerSize;
      RINOK(stream->Seek(m_Position, STREAM_SEEK_SET, NULL));
    }

7-Zip finds the magic number if it appears anywhere within the range allowed by searchHeaderSizeLimit, i.e., the file does not need to start (at offset 0) with the magic number. For example, 7-Zip will extract a RAR file which begins with [00 52 61 72 21 1A 07 00] (instead of [52 61 72 21 1A 07 00]) just fine.
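
A filter that wants to be at least as permissive as this has to scan for the signature instead of comparing at offset 0. The following is a rough sketch of that idea, not 7-Zip's actual FindSignatureInStream: the name ContainsRarMagic and the search_limit parameter are assumptions made for illustration, and the limit you would need in practice depends on what 7-Zip passes in. It matches the six bytes shared by both signatures, so it deliberately accepts a superset of them.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of 7-Zip: returns true if the six bytes
// common to both RAR signatures ("Rar!" 0x1a 0x07) start at any offset
// from 0 to `search_limit` (inclusive) in `data`.
bool ContainsRarMagic(const std::vector<std::uint8_t>& data,
                      std::size_t search_limit) {
  static const std::array<std::uint8_t, 6> kPrefix = {
      0x52, 0x61, 0x72, 0x21, 0x1a, 0x07};

  // Restrict the scanned window so a match cannot start past the limit.
  std::size_t window = data.size();
  if (search_limit < data.size()) {
    window = std::min(data.size(), search_limit + kPrefix.size());
  }

  const auto end = data.begin() + static_cast<std::ptrdiff_t>(window);
  return std::search(data.begin(), end, kPrefix.begin(), kPrefix.end()) != end;
}
```

With the example above, ContainsRarMagic on [00 52 61 72 21 1A 07 00] is true for a limit of 1 but false for a limit of 0, which is exactly the case a plain offset-0 comparison misses.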

Oh, I did not expect that. Yes, you are right, one must be careful about these things.