Hacker News
by mpasternacki 4100 days ago
If you have a text version, at least preprocessing and proposing changes could be doable (especially since, as far as I know, formal French is quite uniform; but then, that's only what I've heard from others, and all the French I actually know comes from a Dexter's Laboratory episode). In Poland, though, the original acts are published as PDFs, and the text extraction itself needs assistance, if it's even possible without OCR: international agreements that are set side by side in multiple languages (Polish, the other party's language, sometimes also English) are usually image scans saved as PDFs.