| Opinion from 10 years ago, I suspect still valid: There are a million Python libraries and tools to do some overlapping subset of the things you'd want to do with a PDF. There are no doubt another million in other languages. These are each basically bundles of some of the transformations you'd want to make to the same underlying data structure. So, complex PDF scripts often need two or three different libraries to get their thing done, which is wasteful in both dev effort and computation. The ecosystem would be greatly improved if someone made a great (probably Rust-based) in-memory, low-level data structure for reading and writing PDFs. PDF libraries in any language could switch to using that structure and library internally, with the carrot that the switch would result in needing less code, and likely being some combination of faster and safer. And then if they just exposed get_structure_pointer() and set_structure_pointer(), they could all interoperate for free. (Another carrot for joining: small libraries could usefully add features and be adopted without needing to pick an existing popular library to glom onto.) Not sure what would economically cause this to happen, but it would be great. |
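
To make the interop idea concrete, here is a rough Python sketch. Everything in it is hypothetical: `PdfStructure`, `RotatorLib`, and `MetadataLib` are made-up stand-ins, not real libraries, and a real shared structure would model the full PDF object graph rather than a list of dicts. The only point is that two independent libraries can operate on the same in-memory object if each exposes `get_structure_pointer()` and `set_structure_pointer()`.

```python
from dataclasses import dataclass, field


@dataclass
class PdfStructure:
    """Hypothetical shared in-memory model (assumption: pages + metadata only)."""
    pages: list = field(default_factory=list)  # each page is a plain dict
    info: dict = field(default_factory=dict)   # document-level metadata


class RotatorLib:
    """Made-up library that only knows how to rotate pages."""

    def __init__(self):
        self._doc = PdfStructure()

    def get_structure_pointer(self):
        return self._doc

    def set_structure_pointer(self, doc):
        self._doc = doc

    def rotate_all(self, degrees):
        for page in self._doc.pages:
            page["rotate"] = (page.get("rotate", 0) + degrees) % 360


class MetadataLib:
    """Made-up library that only knows how to edit metadata."""

    def __init__(self):
        self._doc = PdfStructure()

    def get_structure_pointer(self):
        return self._doc

    def set_structure_pointer(self, doc):
        self._doc = doc

    def set_title(self, title):
        self._doc.info["Title"] = title


# Library 1 works on the document.
rotator = RotatorLib()
rotator.set_structure_pointer(PdfStructure(pages=[{"rotate": 0}, {"rotate": 90}]))
rotator.rotate_all(90)

# Library 2 picks up the same object: no serialize/re-parse round trip.
meta = MetadataLib()
meta.set_structure_pointer(rotator.get_structure_pointer())
meta.set_title("Report")

shared = meta.get_structure_pointer()
```

In Python the "pointer" is just an object reference; across languages it would presumably be a handle into the shared (e.g. Rust) library, which is where the real design work would be.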
[0] https://dev.to/gosukiwi/software-design-deep-modules-2on9