Hacker News new | ask | show | jobs
by pks016 501 days ago
Most of them are experimental studies. So it would be text extraction of something like title, authors, species of the study, sample size etc. And classify based on the content of the pdfs.

I tried the GPT-4o, it's good but it'll cost a lot if I want to process all the documents.

1 comments

1. You can get a 50% discount via batching.

2. Give a few Sonnet or 4o input/output examples to haiku, 4o-mini, or any other smaller model. Giving good examples to smaller models can bring the output quality closer to (or on par with) the better model.