| If you upload a PDF to Google Drive and download it 10 minutes later, it will magically contain BY FAR the best OCR results. Note that my test PDFs were fairly clean, so your experience may differ. I have used Google's fine OCR results to simulate a hacker: - Download a YouTube video that shows how to attack a server on the website hackthebox.eu. - Run ffmpeg to convert the video to images. - Run a JPEG-to-PDF tool. - Upload the PDF to Google Drive. - Download the PDF from Google Drive. - Grep for the command-line prompt markers "$" and "#". - Connect to the hackthebox.eu VPN. - Attack the same machine as in the video. |
By the way, why do you wait 10 minutes? Is there a signal that the PDF is done processing?
Or is it just some kind of voodoo magic that happens to take about 10 minutes?
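
For anyone curious about the grep step in the quoted pipeline, here is a rough sketch of how it might look in Python. It assumes the OCR text layer has already been dumped from the downloaded PDF to plain text by some other tool. The function name `extract_commands`, the regex, and the sample text are all made up for illustration, not taken from the original comment, and the matching is only a heuristic.

```python
import re
from typing import List

# Heuristic: a line counts as a command if it starts with an optional
# whitespace-free prompt prefix (e.g. "root@kali:~"), then "$" or "#",
# then whitespace, then the command itself.
PROMPT_RE = re.compile(r"^\s*(\S*?)[$#]\s+(\S.*)$")


def extract_commands(ocr_text: str) -> List[str]:
    """Return the text after a "$" or "#" prompt marker on each matching line."""
    commands = []
    for line in ocr_text.splitlines():
        match = PROMPT_RE.match(line)
        if match:
            commands.append(match.group(2).strip())
    return commands


if __name__ == "__main__":
    # Hypothetical OCR output from a terminal recording (invented sample data).
    sample = (
        "root@kali:~# nmap -sV 10.10.10.3\n"
        "Starting Nmap 7.80\n"
        "PORT   STATE SERVICE\n"
        "$ ls -la\n"
    )
    for command in extract_commands(sample):
        print(command)
```

Expect false positives (a comment line starting with "# " looks just like a root prompt) and misses (prompts containing spaces), so the output still needs a human eye, much like raw grep output would.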