https://github.com/upx/upx/blob/devel/src/stub/src/arch/amd6...
In that case it takes ~2500 bytes, while mine is ~500 (static version). From what I've noticed, though, the larger code should decompress faster.