| What is the MIME type of a .tar file; and what are the MIME types of the constituent concatenated files within an archive format like e.g. tar? hachoir/subfile/main.py:
https://github.com/vstinner/hachoir/blob/main/hachoir/subfil... File signature:
https://en.wikipedia.org/wiki/File_signature PhotoRec:
https://en.wikipedia.org/wiki/PhotoRec "File Format Gallery for Kaitai Struct"; 185+ binary file format specifications:
https://formats.kaitai.io/ Table of ':
https://formats.kaitai.io/xref.html AntiVirus software >
Identification methods > Signature-based detection, Heuristics, and ML/AI data mining:
https://en.wikipedia.org/wiki/Antivirus_software#Identificat... Executable compression; packer/loader:
https://en.wikipedia.org/wiki/Executable_compression Shellcode database > MSF:
https://en.wikipedia.org/wiki/Shellcode_database sigtool.c:
https://github.com/Cisco-Talos/clamav/blob/main/sigtool/sigt... clamav sigtool: https://www.google.com/search?q=clamav+sigtool https://blog.didierstevens.com/2017/07/14/clamav-sigtool-dec... : `sigtool --find-sigs "$name" | sigtool --decode-sigs`
List of file signatures: https://en.wikipedia.org/wiki/List_of_file_signatures
And then also: clusterfuzz/oss-fuzz scans .txt source files with (sandboxed) Static and Dynamic Analysis tools; `debsums`/`rpm -Va` verify that files on disk have the same (GPG-signed) checksums as the package they are supposed to have been installed from; a file-based HIDS builds a database of file hashes and compares what's on disk in a later scan with what was presumed good; ~gdesktop LLM tools scan every file; and there are extended filesystem attributes for label-based MAC systems like SELinux, and NTFS ADS. A sufficient cryptographic hash function yields random bits with uniform probability.
Deterministic Random Bit Generators (DRBGs) need high-entropy random bits in order to continuously re-seed the random number generator (RNG).
Is it safe to assume that hashing (1) every file on disk, or (2) any given file on disk at random, will yield random bits with uniform probability; and (3) why Argon2 instead of e.g. only two rounds of SHA256? https://github.com/google/osv.dev/blob/master/README.md#usin... : > We provide a Go based tool that will scan your dependencies, and check them against the OSV database for known vulnerabilities via the OSV API. ... That works from package metadata, not from a (file hash, package) database that could be generated from OSV and the actual package files instead of their manifest of already-calculated checksums. Might as well be heating a pool on the roof with all of this waste heat from hashing binaries built from code of unknown static and dynamic quality. Add'l useful formats: > Currently it is able to scan various lockfiles, debian docker containers, SPDX and CycloneDX SBOMs, and git repositories. Things like bittorrent magnet URIs, Named Data Networking, and IPFS are (file-hash based) "Content addressable storage": https://en.wikipedia.org/wiki/Content-addressable_storage |
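
Returning to the opening question: a tar archive is commonly labeled `application/x-tar`, while each member keeps whatever type its own bytes imply. Here is a minimal Python sketch of that idea, assuming only the standard library: it walks an archive, guesses each member's type from its leading signature bytes, and keys each member by its SHA-256 digest, the way a (file hash, package) database or a content-addressable store would. The `SIGNATURES` table, `sniff`, `index_tar`, and `make_demo_tar` are hypothetical names for illustration, not part of hachoir, ClamAV, or OSV, and the four-entry table is nowhere near a real signature list.

```python
import hashlib
import io
import tarfile

# Hypothetical, deliberately tiny signature table (assumption for illustration);
# real tools such as hachoir or PhotoRec recognize far more formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"%PDF-", "application/pdf"),
    (b"\x1f\x8b", "application/gzip"),
    (b"PK\x03\x04", "application/zip"),
]


def sniff(head: bytes) -> str:
    """Guess a MIME type from the first bytes of a file."""
    for magic, mime in SIGNATURES:
        if head.startswith(magic):
            return mime
    return "application/octet-stream"  # unknown content


def index_tar(fileobj) -> dict:
    """Map sha256 hex digest -> (member name, guessed MIME) for regular files."""
    index = {}
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, devices
            data = tar.extractfile(member).read()
            digest = hashlib.sha256(data).hexdigest()
            index[digest] = (member.name, sniff(data[:16]))
    return index


def make_demo_tar() -> io.BytesIO:
    """Build a small in-memory tar whose member names hide their real types."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in [
            ("a.bin", b"%PDF-1.7 demo"),
            ("b.bin", b"\x89PNG\r\n\x1a\n demo"),
        ]:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for digest, (name, mime) in index_tar(make_demo_tar()).items():
        print(digest[:12], name, mime)
```

Keying by digest means two members with identical bytes collapse to one entry, which is the deduplication property content-addressable stores rely on. Signature sniffing only inspects a prefix, so it cannot tell a well-formed file from a truncated or deliberately mislabeled one.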