by landave 2971 days ago
There were some misunderstandings that I want to clear up (maybe I will add them in an update to the blog post):

1. Some people mentioned that this would "only affect RAR files" and it would be safe to extract 7z files with 7-Zip prior to version 18.05. This is wrong, because 7-Zip detects the file type from the magic numbers at the beginning of the file. So the exploit can be renamed to 'exploit.7z' and it works just as well.

On /r/sysadmin, someone even mentioned that a temporary solution might be to block RAR files. By the same argument, this is unlikely to be effective.

2. Almost all versions prior to 18.05 are affected. I manually checked versions 15.05 and 17.01, and they are definitely affected.

3. Not only 7-Zip itself is affected, but essentially all software that uses 7z.dll as a library to extract files. This includes various anti-virus software. However, exploitation may be more difficult (though not impossible) if ASLR and DEP are properly enabled (on all modules).

> This includes various anti-virus software.

It's fascinating that this category of equipment, which searches for viruses by running untrusted code, is still regularly installed in all corners of valuable networks.

As far as I understand the bug, this is not about running untrusted code, but about a parsing error that results in state corruption. Unless you are referring to the third-party library (7z) that virus scanners use to analyse RAR files. But typically "untrusted code" means code that was supplied at runtime, not at compile time (like a library): for example, a virus scanner actually executing an .exe to evaluate its effects, or running JavaScript found in a web page.

To be fair to virus scanner vendors, the only way to mitigate this kind of bug is NIH: don't use 3rd party libs, implement everything yourself. But then, of course, without bugs yourself, as well :)

> To be fair to virus scanner vendors, the only way to mitigate this kind of bug is NIH: don't use 3rd party libs, implement everything yourself. But then, of course, without bugs yourself, as well :)

That is not the only way to mitigate such vulnerabilities. AV vendors have had plenty of time to work on sandboxing parts of their scan engines that have repeatedly been found to have vulnerabilities like this one. Somehow we've arrived at a point where no one would recommend using a browser that doesn't utilize sandboxing to some degree, but when it comes to security products, you can count yourself lucky if they don't just run that code as SYSTEM. That should really tell you all you need to know about the state of the AV industry.

> no one

Plenty of people recommend Firefox despite it sharing processes between tabs. More than 10 years after MS sandboxed IE.

> the only way to mitigate this kind of bug is NIH

Or, you know, conduct audits of open source libraries they use and contribute fixes back.

It is a very blurry line between running strange code and running library code against strange data. Mostly though, I'm trying to say that AV boxes should be outside the firewall.

They are not intentionally running untrusted code. But yes, anti-virus software can reduce security.

What AVs use this? I keep hearing horrible things about AVs but little proof. I know Tavis from Google finds bugs and AV in rare cases can reduce security, but that's a far cry from them embedding FOSS projects to save time. For AV companies writing a RAR parser is a single-day activity, so it's hard to believe they need this app.

The few places I've seen 7z used are installers where the input is known (an installer archive in 7z format) and, I'm assuming, signed in many cases, so you can't feed it random inputs. I find it hard to believe Sophos and Symantec and Trend are copying and pasting 7zip.dll into their apps.

I don't know about 7-Zip specifically, but AV vendors use plenty of FOSS code. Here are some findings just from Google's P0 showing that Symantec[1], Bitdefender[2], Microsoft[3] and Avast[4] all use unrar in their products. It wouldn't be far-fetched to assume some might use 7-Zip for other archive formats.

[1]: https://bugs.chromium.org/p/project-zero/issues/detail?id=81...

[2]: https://bugs.chromium.org/p/project-zero/issues/detail?id=12...

[3]: https://bugs.chromium.org/p/project-zero/issues/detail?id=15...

[4]: https://bugs.chromium.org/p/project-zero/issues/detail?id=57...

Maybe I'm just way overestimating how complex RAR archives are (I admit, I have not looked into this), but I think you're out of your mind if you think that someone could write a parser to analyze RAR archives in a day, let alone one that is better tested, debugged, and more secure than a tool that has probably seen more widespread use than any single AV.

I have previously read that for some reason the author disables most of the compiler options for things like ASLR and DEP.

I never managed to find out why

edit: just found this: https://sourceforge.net/p/sevenzip/feature-requests/1270/ -- seems rather questionable considering MS give away the latest compilers for free

DEP was previously disabled because Igor used to compile 7-Zip with VC6, which doesn't support the /NXCOMPAT flag. I convinced him back in January to enable it for 7-Zip 18.01. Note, however, that 64-bit versions of Windows enforce DEP even if the /NXCOMPAT flag is missing. Since Windows 10, the 32-bit version does this as well.

ASLR was primarily disabled because Igor wanted to strip the relocation information from the binaries in order to save about 0.5-1% in file size. I have discussed this with him, and convinced him to enable full ASLR for 7-Zip 18.05.

So now we have 7-Zip 18.05 with full ASLR and DEP. Stack canaries (/GS) are still disabled though.
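For anyone who wants to check a given 7z.dll themselves: a binary's opt-in to DEP and ASLR shows up as bits in the DllCharacteristics field of the PE optional header. Below is a minimal C++ sketch, assuming the whole file has already been read into memory. `Mitigations`, `readLE` and `readMitigations` are hypothetical names made up for illustration; the bit values are the documented IMAGE_DLLCHARACTERISTICS_* constants. Note that the DYNAMIC_BASE bit on its own does not show whether relocation data is still present in the binary, so treat the result as a first hint only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: which opt-in bits the binary carries.
struct Mitigations {
    bool aslr = false;            // IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE    (0x0040)
    bool highEntropyAslr = false; // IMAGE_DLLCHARACTERISTICS_HIGH_ENTROPY_VA (0x0020)
    bool dep = false;             // IMAGE_DLLCHARACTERISTICS_NX_COMPAT       (0x0100)
    bool cfg = false;             // IMAGE_DLLCHARACTERISTICS_GUARD_CF        (0x4000)
};

// Read a little-endian integer of `count` bytes (count <= 4).
static std::uint32_t readLE(const std::vector<unsigned char>& bytes,
                            std::size_t offset, std::size_t count)
{
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < count; ++i)
        value |= static_cast<std::uint32_t>(bytes[offset + i]) << (8 * i);
    return value;
}

// Returns false if the buffer does not look like a PE image.
bool readMitigations(const std::vector<unsigned char>& pe, Mitigations& out)
{
    // DOS header: "MZ" magic, e_lfanew (offset of the PE header) at 0x3C.
    if (pe.size() < 0x40 || pe[0] != 'M' || pe[1] != 'Z')
        return false;
    const std::size_t peOffset = readLE(pe, 0x3C, 4);

    // "PE\0\0" (4 bytes) + COFF header (20 bytes) + optional header up to
    // and including DllCharacteristics (offset 70, 2 bytes, PE32 and PE32+).
    const std::size_t needed = 4 + 20 + 72;
    if (peOffset > pe.size() || pe.size() - peOffset < needed)
        return false;
    if (pe[peOffset] != 'P' || pe[peOffset + 1] != 'E' ||
        pe[peOffset + 2] != 0 || pe[peOffset + 3] != 0)
        return false;

    const std::uint32_t flags = readLE(pe, peOffset + 4 + 20 + 70, 2);
    out.highEntropyAslr = (flags & 0x0020) != 0;
    out.aslr            = (flags & 0x0040) != 0;
    out.dep             = (flags & 0x0100) != 0;
    out.cfg             = (flags & 0x4000) != 0;
    return true;
}
```

This only reads header bits. It says nothing about /GS, which leaves no such flag in the header.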

If you can convince him to use Control Flow Guard, Stack Canaries, and HE-ASLR then you should be nominated for whatever the security community has as an equivalent to a Nobel prize.

If you can convince him to get rid of his custom garbage Stdlib replacements and use ISO C++ then you're a hero to maintainability (and would probably improve the performance because the stdlib has move support).

HE-ASLR I am discussing with him right now, and I think we will get this.

But honestly, I don't think we will ever see a 7-Zip with /GS or CFG. Not only would this cost about 1% in binary size, it would also cost about 1% in runtime performance. Additionally, it would require compiling 7-Zip with a modern compiler like VS2017. You're just asking for too much.

Like blibble says, this is an absurd tradeoff given that 7zip's primary use case is unarchiving files downloaded from the public internet. Between that, the willful ignorance and dismissiveness of security measures displayed at https://sourceforge.net/p/sevenzip/feature-requests/1270/, and the apparent need for you to keep pressuring him to enable each countermeasure one-by-one, I question the prudence of using 7zip at all. He is clearly a great developer, and I have no reason to suspect ulterior motives, but his actions don't engender trust.

> You're just asking for too much.

I know it's not you saying this, but it's very strange given almost all files 7-zip will ever see are untrusted files downloaded from the internet

I'd rather have it be 1% slower than be compromised!

I know this probably isn't the answer you're looking for, but it might be worthwhile to maintain a parallel fork to enable these options for yourself/others. That way, folks that feel the same way you do can have their cake and eat it too. I realize that it's not an ideal solution though.

For virtually everybody even 500% slower would be acceptable. 1% is a 10-second, one-time cost, so trading security for it is objectively a fool's bargain.

> Additionally, it would require compiling 7-Zip with a modern compiler like VS2017

Ahhh, but does say VS2017 produce a smaller executable file, or a faster executable?

I am pleased Igor cares about individual 1% improvements - they stack up to significant savings. However I agree for our work usage security is more important.

> does say VS2017 produce a smaller executable file, or a faster executable?

If I recall correctly, Igor once said that he tested the new VS compiler and it produced neither smaller nor faster executables. I believe there was almost no difference.

Why is that asking too much? I realize paying for software is not something everyone wants, but don't the free versions of VS work for compiling 7-zip?

> Why is that asking too much?

Dunno, just as with FB/CA, users have agreed to what befell them. Relevant extract from LGPL 2.1: "THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH YOU."

EDIT: /s, ...

Interesting insight -- thanks!

I agree with others here on the security vs. performance and security vs. binary size trade-offs. As a thought experiment I wondered at what point it would become a hard decision if I were in his shoes. I think a 10% performance hit would make it a harder decision for me, but at that point I would probably still err on the side of caution and prioritize security.

Of course it's possible this is exactly one of the reasons Igor owns something like 7-Zip, and I don't :-)

Why is this such an uphill battle?

Developer cares about efficiency. It seems strange in today's world of Electron apps and text editors that take up a GB of ram and have hundreds of ms of latency, but that's clearly where his priorities lie and I, for one, can respect that.

If people are so up in arms about the security problems of 7-Zip, they're free to fork it.

Comparing javascript to ASLR is nonsensical.

> Some people mentioned that this would "only affect RAR files" and it would be safe to extract 7z files with 7-Zip prior to version 18.05. This is wrong, because 7-Zip detects the file type from the magic numbers at the beginning of the file. So the exploit can be renamed to 'exploit.7z' and it works just as well.

But the contents of a file are what really determine the actual file type, though.

So if you decide to block RAR files, you should do that by looking at the magic number. Then you are good to go. And it's not like you have to write a lot of code or anything to do this. The standard 'file' utility in Unix is able to determine the type of a file -- often by looking exactly at the magic number.

Just like you would inspect the actual contents of a file if you were running an image hosting site -- and not trust whatever arbitrary filename the uploader told you that the file had.

Of course, but I would still strongly advise against this.

If you really cannot avoid implementing something like this, you should inspect the 7-Zip code in order to be 100% sure that the magic number detection in your filter is identical (or matches a superset) to the one from 7-Zip.

> you should inspect the 7-Zip code in order to be 100% sure that the magic number detection in your filter is identical (or matches a superset) to the one from 7-Zip

CPP/7zip/Archive/Rar/RarHandler.cpp:

    #define SIGNATURE { 0x52 , 0x61, 0x72, 0x21, 0x1a, 0x07, 0x00 }

CPP/7zip/UI/Common/OpenArchive.cpp:

    const Byte kRarHeader[] = { 0x52 , 0x61, 0x72, 0x21, 0x1a, 0x07, 0x00 };

CPP/7zip/Archive/Rar/Rar5Handler.cpp:

    #define SIGNATURE { 0x52 , 0x61, 0x72, 0x21, 0x1a, 0x07, 0x01, 0 }

Those are the two magic numbers for RAR archive version 1.50 onwards and RAR archive version 5.0 onwards respectively, and those are the places they are referenced in the 7-zip source code. I looked at the source archive of the 18.05 version, downloaded from https://www.7-zip.org/a/7z1805-src.7z. I guess if you wanted to be really rigorous you'd look at previous versions as well. If the 7-zip project makes use of a version control system and that vcs supports the equivalent of git blame then that should not be too difficult of a task for whoever wanted to go to that extent of investigation.

And here is where one copy of the 'file' command identifies the same magic numbers as RAR archives:

https://github.com/file/file/blob/f0a725a13fe0c1b046d8e07057...

    0 string Rar!\x1a\7\0 RAR archive data

https://github.com/file/file/blob/f0a725a13fe0c1b046d8e07057...

    0 string Rar!\x1a\7\1\0 RAR archive data, v5

:)

You missed my point. It is trivial to find out what the magic number is. What is more important though: How exactly is the magic number matched? From what you have written, one might be tempted to simply check whether a file begins with this magic number. And this would be wrong. If you take a look at the matching in CPP/7zip/Archive/Rar/RarHandler.cpp:

    Byte marker[NHeader::kMarkerSize];
    RINOK(ReadStream_FALSE(stream, marker, NHeader::kMarkerSize));
    if (memcmp(marker, kMarker, NHeader::kMarkerSize) == 0)
      m_Position += NHeader::kMarkerSize;
    else
    {
      if (searchHeaderSizeLimit && *searchHeaderSizeLimit == 0)
        return S_FALSE;
      RINOK(stream->Seek(m_StreamStartPosition, STREAM_SEEK_SET, NULL));
      RINOK(FindSignatureInStream(stream, kMarker, NHeader::kMarkerSize,
          searchHeaderSizeLimit, arcStartPos));
      m_Position = arcStartPos + NHeader::kMarkerSize;
      RINOK(stream->Seek(m_Position, STREAM_SEEK_SET, NULL));
    }

7-Zip finds the magic number if it appears within some searchHeaderSizeLimit, i.e., the file does not need to start (at offset 0) with the magic number. For example, 7-Zip will extract a RAR file which begins with [00 52 61 72 21 1A 07 00] (instead of [52 61 72 21 1A 07 00]) just fine.

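To make that concrete, here is a minimal C++ sketch of a filter that searches for the marker instead of only comparing at offset 0. It is not 7-Zip's code: `containsRarMarker` and its `searchLimit` parameter are hypothetical names for illustration, and the limit 7-Zip actually applies is not stated in this thread, so a real filter would need to pick a value at least as large. The sketch looks for the six bytes shared by both signatures quoted earlier ("Rar!" 0x1A 0x07), which makes it match a superset of both.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Bytes common to the RAR 1.5+ and RAR 5 signatures: "Rar!" 0x1A 0x07.
constexpr std::array<unsigned char, 6> kRarPrefix = {0x52, 0x61, 0x72, 0x21, 0x1A, 0x07};

// Hypothetical helper: true if the marker starts at any offset < searchLimit.
// searchLimit is an assumed parameter, not a value taken from 7-Zip.
bool containsRarMarker(const std::vector<unsigned char>& data, std::size_t searchLimit)
{
    assert(searchLimit > 0);
    // A marker starting at offset searchLimit - 1 still needs its remaining
    // bytes, so the scanned window is searchLimit + prefix length - 1 bytes.
    const std::size_t window =
        std::min(data.size(), searchLimit + kRarPrefix.size() - 1);
    const auto end = data.begin() + static_cast<std::ptrdiff_t>(window);
    return std::search(data.begin(), end, kRarPrefix.begin(), kRarPrefix.end()) != end;
}
```

With `searchLimit` set to 1 this degenerates into the naive "file starts with the magic number" check, which is exactly the check that the [00 52 61 72 21 1A 07 00] example slips past.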
Oh, I did not expect that. Yes you are right, one must be careful about these things.

Hmm, is there any way for an end user to know which versions of the DLL are affected?

The enterprise deployment of Trend Micro Officescan I have has the 7z.dll (7za.dll) version 4.57

To me that looks like quite an old version... Probably open to this exploit.