by nradov 128 days ago
There has been a standard X12 EDI format for invoices for decades. It's kind of a hassle to work with but it can at least be reliably parsed. A lot of huge businesses like Walmart use it successfully, and even require their suppliers to submit all invoices that way.
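To illustrate the "reliably parsed" point: X12 is a delimiter-based format, so a rough sketch of reading an 810 (invoice) transaction set might look like the following. The sample string is hypothetical and hand-written, the `~` and `*` delimiters are assumed rather than read from the ISA envelope, and only a handful of segments (BIG, IT1, TDS) are handled, so treat it as a sketch rather than a complete parser.

```python
from decimal import Decimal

# Hypothetical, hand-written sample (not real data). Real interchanges also
# carry ISA/GS envelopes, which declare the delimiters actually in use.
SAMPLE_810 = (
    "ST*810*0001~"
    "BIG*20240115*INV-1001~"
    "IT1*1*10*EA*2.50~"
    "IT1*2*4*EA*5.00~"
    "TDS*4500~"
    "SE*6*0001~"
)


def parse_segments(raw, seg_term="~", elem_sep="*"):
    """Split a raw X12 string into one list of elements per segment."""
    return [
        seg.strip().split(elem_sep)
        for seg in raw.split(seg_term)
        if seg.strip()
    ]


def summarize_invoice(segments):
    """Pull the invoice number, date, line items, and total out of the segments."""
    invoice = {"number": None, "date": None, "lines": [], "total": None}
    for seg in segments:
        tag = seg[0]
        if tag == "BIG":
            # BIG01 is the invoice date, BIG02 the invoice number.
            invoice["date"], invoice["number"] = seg[1], seg[2]
        elif tag == "IT1":
            # IT102 quantity, IT103 unit of measure, IT104 unit price.
            invoice["lines"].append(
                {
                    "qty": Decimal(seg[2]),
                    "unit": seg[3],
                    "unit_price": Decimal(seg[4]),
                }
            )
        elif tag == "TDS":
            # TDS01 carries the total with two implied decimal places.
            invoice["total"] = Decimal(seg[1]) / 100
    return invoice


invoice = summarize_invoice(parse_segments(SAMPLE_810))
```

Because every field sits at a fixed position in a named segment, the line items can be checked against the stated total without any guessing about layout.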

I don't object to using LLMs to parse PDFs but over the long run it's going to be less efficient and reliable than other options.

1 comment

Yes, there has been a standard format for invoices for decades, but it was only ever used if both companies were using an ERP system (and, as you say, large enough purchasers could force their suppliers to use it). We have to deal with small businesses that don't use the standard format, which is the vast majority of them.

Please go ahead and try parsing non-standard invoices without an LLM. I spent 20+ years on and off dealing with this problem. It's not as simple as it looks. And then LLMs came along and made it simple.