Also image formats are fairly stable, so they are a good candidate for a verified F* parser. Not sure how easy it is for pdf, maybe start with a reasonable subset of it.
Good luck doing that for stuff like PDF which can include JavaScript in the PDF file itself, or SVG+PDF which both can include raw font files, another reliable source of exploits.
PDF/A-1 and PDF/UA require unicode mapping, so embedded font files might be not as critical, you can just render unicode text, and javascript is routinely banned for being inaccessible.