by wrs
694 days ago
The big news for me here is the 16k output token limit. Models keep raising the input limit to outrageous sizes, but output has been stuck at 4k. I did a project to summarize complex PDF invoices (not “unstructured” data, but “idiosyncratically structured” data, as each vendor has a completely different format). GPT-4o did an amazing job of extracting line items, but I had to add a heuristic layer on top to break the PDFs into small chunks so the output didn’t overflow.
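
For anyone curious what such a chunking layer might look like, here is a minimal sketch, assuming the PDF text has already been extracted page by page. The names `estimate_output_tokens` and `chunk_pages`, the 40-tokens-per-line guess, and the 0.8 safety margin are all hypothetical and not what the project actually used. The idea is just to greedily group pages so each request's expected output stays under the model's output limit.

```python
from typing import Callable, List


def estimate_output_tokens(page_text: str, tokens_per_line: int = 40) -> int:
    """Hypothetical heuristic: treat every non-empty line as a potential
    line item that costs roughly `tokens_per_line` output tokens."""
    non_empty = [ln for ln in page_text.splitlines() if ln.strip()]
    return len(non_empty) * tokens_per_line


def chunk_pages(
    pages: List[str],
    max_output_tokens: int = 4096,
    safety_margin: float = 0.8,
    estimate: Callable[[str], int] = estimate_output_tokens,
) -> List[List[int]]:
    """Greedily group consecutive page indices so that the estimated
    output of each group stays within a fraction of the output limit."""
    budget = int(max_output_tokens * safety_margin)
    chunks: List[List[int]] = []
    current: List[int] = []
    used = 0
    for index, page in enumerate(pages):
        cost = estimate(page)
        # Start a new chunk when adding this page would exceed the budget.
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        # A single page that is over budget still gets its own chunk.
        current.append(index)
        used += cost
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # Five fake pages with 30 candidate line items each.
    fake_pages = ["\n".join(f"item {i}" for i in range(30))] * 5
    print(chunk_pages(fake_pages))
```

Each chunk would then be sent as its own request and the extracted line items concatenated afterwards. With a 16k output limit the same function just takes a larger `max_output_tokens`, which in many cases means one chunk per invoice.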