Hacker News
by hnick 1425 days ago
Because Acrobat will open these files, there is considerable pressure for Ghostscript to do so as well, though we do try to at least flag warnings to the user when something is found to be incorrect, giving the user a chance to intervene.

Anyone who has done PDF composition for a "print ready" job (what a lie) from a client has run into this many times. All we have to do is rearrange the pages into the right sorted order, add some barcodes, and print, right? Acrobat can open the file, so why is your printer crashing? Ironically, some of those printers used an Adobe RIP in the toolchain, and the PDF-to-PS conversion on the printer was where things went wrong. I once tracked down a crash where a font's glyph name definition in the dict was OK in PDF but invalid syntax in PS, because a // turned it into an immediately evaluated name that didn't exist. That's not something a technician could help with.

It was so bad that Ghostscript was one of many tools - we'd throw a PDF through various toolchains in the hope that one of them would save it in a format that was well behaved. Anyway, I'm almost sad I've moved on from that job now, so I can't try it out with some real-world files. But in the end most of the issues came down to fonts, and to people using workflows that generate single-document PDFs and then merge them, resulting in things like 1000 nearly identical subset fonts consuming all the printer memory, so I'm not sure how much this would help.
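To make the subset-font problem concrete, here is a minimal sketch of how you might spot it. It relies on the PDF convention that a subset font's name is a six-uppercase-letter tag, a plus sign, then the real font name. The list of names is made up for illustration; in practice you would pull the font names out of the file with whatever PDF library you have on hand.

```python
import re
from collections import Counter

# Subset font names look like "ABCDEF+Helvetica": six uppercase letters, "+", base name.
SUBSET_TAG = re.compile(r"^[A-Z]{6}\+(.+)$")


def count_subsets(font_names):
    """Count how many subset copies of each base font appear in a list of names."""
    counts = Counter()
    for name in font_names:
        match = SUBSET_TAG.match(name)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical font names, as they might appear in a merged PDF.
names = ["ABCDEF+Helvetica", "GHIJKL+Helvetica", "MNOPQR+Helvetica-Bold", "Courier"]
print(count_subsets(names).most_common())
# → [('Helvetica', 2), ('Helvetica-Bold', 1)]
```

A base font with hundreds of subset copies is a strong hint that the file was built by merging many small documents.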


Many years ago I worked in print (mostly RGB to CMYK stuff, small run) and the very expensive RIP software choked on what seemed like every PDF a customer supplied.

I ended up with a fairly large set of shell scripts over Ghostscript to convert them into high-DPI TIFFs so I could print them reliably. It worked remarkably well, considering that one was open source and free and the other cost thousands per license.
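A rough sketch of that kind of wrapper, in Python rather than shell. The device, resolution, and file naming here are assumptions for illustration, not the original scripts. `tiff32nc` is Ghostscript's 32-bit CMYK TIFF device; `tiff24nc` would give RGB instead.

```python
import shutil
import subprocess
from pathlib import Path


def build_gs_command(pdf_path, out_dir, dpi=600, device="tiff32nc"):
    """Build a Ghostscript command that rasterizes each page to its own TIFF."""
    # %04d is expanded by Ghostscript to the page number, e.g. job-0001.tif
    out_pattern = Path(out_dir) / (Path(pdf_path).stem + "-%04d.tif")
    return [
        "gs",                 # binary name is an assumption; it differs on Windows
        "-dSAFER",
        "-dBATCH",
        "-dNOPAUSE",
        f"-sDEVICE={device}",
        f"-r{dpi}",           # output resolution in DPI
        f"-sOutputFile={out_pattern}",
        str(pdf_path),
    ]


def rasterize(pdf_path, out_dir, dpi=600):
    """Run Ghostscript; raises if gs is missing or exits with an error."""
    if shutil.which("gs") is None:
        raise RuntimeError("Ghostscript (gs) not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_gs_command(pdf_path, out_dir, dpi), check=True)
```

Calling `rasterize("job.pdf", "out")` would write one TIFF per page into `out/`, assuming Ghostscript can open the file at all.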

Yeah, you just moved the RIP upstream: rasterize before the rasterizer :) We did that for a few jobs that caused trouble.

I haven't worked on the innards of those machines, but my suspicion is that it's a combination of 1) not much RAM, to keep costs down, 2) an inability to handle a large number of resources, i.e. no swapping out to slow storage on a least-recently-used principle or similar, and 3) extremely strict conformance to avoid surprises in output.
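For what it's worth, the least-recently-used idea in point 2 is simple to sketch. This is only an illustration of the principle with made-up names and sizes, not a claim about how any real printer firmware works.

```python
from collections import OrderedDict


class ResourceCache:
    """Keeps resources in memory up to a byte budget, evicting least recently used."""

    def __init__(self, budget_bytes):
        self.budget = budget_bytes
        self.used = 0
        self.items = OrderedDict()  # key -> size in bytes, least recently used first
        self.evicted = []           # keys that would be swapped out to slow storage

    def touch(self, key, size):
        """Record a use of a resource, loading it (and evicting others) if needed."""
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            return
        if size > self.budget:
            raise ValueError("resource is larger than the whole budget")
        # Evict from the least recently used end until the new resource fits.
        while self.used + size > self.budget:
            old_key, old_size = self.items.popitem(last=False)
            self.used -= old_size
            self.evicted.append(old_key)
        self.items[key] = size
        self.used += size


cache = ResourceCache(budget_bytes=100)
cache.touch("font-a", 40)
cache.touch("font-b", 40)
cache.touch("font-a", 40)  # font-a is now the most recently used
cache.touch("font-c", 40)  # does not fit, so font-b is evicted
print(cache.evicted)
# → ['font-b']
```

Without something like this, 1000 subset fonts simply pile up until the memory runs out.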