Hacker News
by sanmon3186 693 days ago
>I had to do a heuristic layer on top to break up the PDFs into small chunks so the output didn’t overflow

How do you stitch the outputs of all chunks without losing the overall context?

1 comment

The output is just individual line items from the invoices, so all you have to do is concatenate the outputs of the chunks in order. If any data had crossed a page boundary, it would have been harder!
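
To make that concrete, here's a rough sketch of the chunk-then-concatenate idea. This is not the actual pipeline: the names (chunk_pages, extract_all, fake_extract), the "description,amount" page format, and the fixed pages-per-chunk split are all assumptions for illustration. In practice extract_chunk would be whatever call turns a chunk of pages into line items.

```python
from typing import Callable, Dict, List

# Hypothetical line-item shape; the real output fields are not given in the thread.
LineItem = Dict[str, object]


def chunk_pages(pages: List[str], pages_per_chunk: int) -> List[List[str]]:
    """Split the page list into consecutive chunks of at most pages_per_chunk pages."""
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    return [pages[i:i + pages_per_chunk] for i in range(0, len(pages), pages_per_chunk)]


def extract_all(
    pages: List[str],
    extract_chunk: Callable[[List[str]], List[LineItem]],
    pages_per_chunk: int = 1,
) -> List[LineItem]:
    """Run the extractor on each chunk and concatenate the results in page order.

    This only works because every line item is assumed to live entirely
    inside one chunk, so no cross-chunk context is needed.
    """
    items: List[LineItem] = []
    for chunk in chunk_pages(pages, pages_per_chunk):
        items.extend(extract_chunk(chunk))
    return items


def fake_extract(chunk: List[str]) -> List[LineItem]:
    """Stand-in extractor (assumption): parses 'description,amount' lines."""
    items: List[LineItem] = []
    for page in chunk:
        for line in page.splitlines():
            if not line.strip():
                continue
            description, amount = line.rsplit(",", 1)
            items.append({"description": description.strip(), "amount": float(amount)})
    return items
```

Calling extract_all(pages, fake_extract, pages_per_chunk=2) returns one flat list of line items in the original page order. If an item could span two chunks, plain concatenation would split or drop it, which is the harder case mentioned above.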