by tobyhinloopen 335 days ago
I've done something similar. I wanted to scan every invoice that came into my mail, so I exported ALL ATTACHMENTS from my mailbox and used a script to upload them one by one, forcing a tool call to extract "is invoice: yes / no" plus a set of fields: invoice lines, company name, date, invoice number, etc.
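
For anyone wanting to try the forced tool call described above, here is a minimal sketch of what the schema and the parsing side might look like. The tool name `record_invoice`, the field names, and the `Invoice` class are all made up for illustration, not what was actually used. How the schema gets passed to the model depends on the provider and is deliberately left out.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal

# Hypothetical tool definition. Name and field names are assumptions.
EXTRACT_INVOICE_TOOL = {
    "name": "record_invoice",
    "description": "Record whether the attachment is an invoice and, if so, its fields.",
    "input_schema": {
        "type": "object",
        "properties": {
            "is_invoice": {"type": "boolean"},
            "company_name": {"type": "string"},
            "invoice_number": {"type": "string"},
            "invoice_date": {"type": "string", "description": "ISO 8601 date"},
            "lines": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "description": {"type": "string"},
                        "amount": {"type": "string", "description": "decimal as a string"},
                    },
                    "required": ["description", "amount"],
                },
            },
        },
        "required": ["is_invoice"],
    },
}


@dataclass
class Invoice:
    company_name: str
    invoice_number: str
    invoice_date: date
    lines: list[tuple[str, Decimal]] = field(default_factory=list)

    @property
    def total(self) -> Decimal:
        # Decimal avoids float rounding noise when summing money.
        return sum((amount for _, amount in self.lines), Decimal("0"))


def parse_tool_input(raw: str) -> Invoice | None:
    """Turn the JSON arguments of one tool call into an Invoice.

    Returns None when the model flagged the attachment as not an invoice.
    """
    data = json.loads(raw)
    if not data["is_invoice"]:
        return None
    return Invoice(
        company_name=data["company_name"],
        invoice_number=data["invoice_number"],
        invoice_date=date.fromisoformat(data["invoice_date"]),
        lines=[(l["description"], Decimal(l["amount"])) for l in data.get("lines", [])],
    )
```

Forcing the model to answer through a single tool means every attachment yields the same JSON shape, so the "yes / no" decision and the fields can be stored and compared without parsing free text.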

It had a surprisingly high hit rate. It took over 3 hours of LLM calls, but who cares - it was completely hands-off. I then compared the invoices to my bank statements (aka I asked an LLM to do it), and the extraction had only missed a few invoices that weren't included as attachments (like those "click to download" mails). The LLM did a pretty poor job of matching invoices to bank statements, though (like "oh this invoice is a few dollars off but I'm sure it's this statement"), so I'm afraid I still need an accountant for a while.

"What did it cost?" I don't know. I used a cheap-ish model, Claude 3.7 I think.

1 comment

For your use case, I think the simple data matching that it gets wrong would be better handled by having the LLM write code that processes the input files (the raw text it produced from the images, plus the bank statements), rather than having the LLM try to match up the data in the files itself.
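
A rough sketch of the kind of matching code I mean, assuming the invoices and bank transactions have already been parsed into simple records. The class names, field names, and the 45-day window are my own assumptions. The point is that an amount either matches exactly or the invoice goes on a "needs a human" list, so nothing gets waved through for being "a few dollars off".

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal


@dataclass(frozen=True)
class Invoice:
    number: str
    issued: date
    total: Decimal


@dataclass(frozen=True)
class Transaction:
    booked: date
    amount: Decimal  # assumed positive for money paid out
    description: str


def match_invoices(invoices, transactions, window_days=45):
    """Pair each invoice with at most one transaction of exactly the same amount.

    Returns (matched pairs, unmatched invoices, leftover transactions).
    """
    window = timedelta(days=window_days)
    unused = list(transactions)
    matched, unmatched = [], []
    for inv in sorted(invoices, key=lambda i: i.issued):
        candidates = [
            t for t in unused
            if t.amount == inv.total and abs(t.booked - inv.issued) <= window
        ]
        if not candidates:
            unmatched.append(inv)  # no guessing: leave it for manual review
            continue
        # Prefer a description that mentions the invoice number, then the closest date.
        best = min(
            candidates,
            key=lambda t: (inv.number not in t.description, abs(t.booked - inv.issued)),
        )
        unused.remove(best)  # each transaction can settle only one invoice
        matched.append((inv, best))
    return matched, unmatched, unused
```

The unmatched invoices and leftover transactions are the short list that still needs a person (or an accountant) to look at.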