Hacker News
by PaulHoule 561 days ago
Eval is a good place to start because nobody wants to do it. Back when I was working on search engines, and between jobs, I talked with many companies in the full-text search space and found it was unusual for them to do any eval work.

Even though I had lived through using eval to make a search engine that was much better than its competitors, it turns out that enterprise search customers are more concerned about having 500 "integrations" to load stuff from different data sources (each of which is like 5-50 lines of code, exclusive of initializing connections to the various products) than they are about quality results. It's not like the people who buy it use it.

---

Listen. If you want to make the kind of query systems that people are dreaming about, you are going to need some hybrid of RAG plus ordinary databases. For instance, if you are going to be asking questions with filters like "sample size of over 2000", you can build some system that extracts the sample sizes out of papers and puts them in a database column. DONE. On the other hand, you could screw around with vectors and get to 50%, 75%, 81%, 83.5%, 83.7% and such accuracies with increasing effort. Don't be that guy.
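
To make the "extract it into a column" idea concrete, here is a minimal sketch. Everything named in it is an assumption for illustration: the regex, the `papers` table layout, and the helper names are hypothetical, and a real extractor would need far more than one pattern (or an LLM pass) to cope with how papers actually report sample sizes.

```python
import re
import sqlite3

# Hypothetical pattern: catches phrasings like "n = 2,431" or
# "sample size of 2431". Real papers would need many more patterns.
SAMPLE_SIZE_RE = re.compile(
    r"(?:\bn\s*=\s*|sample size of\s+)(\d[\d,]*)", re.IGNORECASE
)


def extract_sample_size(text):
    """Return the first sample size found in text, or None if there is none."""
    match = SAMPLE_SIZE_RE.search(text)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def build_index(papers):
    """Load {paper_id: body} into SQLite with an extracted sample_size column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE papers (id TEXT PRIMARY KEY, body TEXT, sample_size INTEGER)"
    )
    conn.executemany(
        "INSERT INTO papers VALUES (?, ?, ?)",
        [(pid, body, extract_sample_size(body)) for pid, body in papers.items()],
    )
    return conn


def papers_with_sample_size_over(conn, minimum):
    """An ordinary SQL filter; rows with no extracted value (NULL) are excluded."""
    rows = conn.execute(
        "SELECT id FROM papers WHERE sample_size > ? ORDER BY id", (minimum,)
    )
    return [row[0] for row in rows]
```

Once the column exists, the "over 2000" question is a plain `WHERE` clause, and it can be combined with whatever retrieval step picks the candidate papers.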

1 comment

Yes, a step where you do a structured extraction into a database column would be a potential solution. But it requires a preprocessing step.

It all depends on the use case. Sometimes you get a query whose filter you couldn't have predicted beforehand. In those cases, what you usually have to do is open up a spreadsheet and categorize every document by hand. LLMs and modern AI are great ways to automate this.

A really good solution might be to have a system that computes these filters on-the-fly, but also caches them for later reuse if a query asks for that filter again.
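
A rough sketch of what that could look like, under stated assumptions: `CachedFilter` and `keyword_classifier` are hypothetical names, and the keyword check is only a stand-in for whatever model call would actually decide whether a document satisfies a filter.

```python
from typing import Callable, Dict, List, Tuple


def keyword_classifier(filter_name: str, text: str) -> bool:
    """Stand-in for an LLM call: a document matches if it mentions the filter."""
    return filter_name.lower() in text.lower()


class CachedFilter:
    """Compute a filter per document on first use, then reuse the stored answer."""

    def __init__(self, classify: Callable[[str, str], bool]):
        self._classify = classify
        # Keyed by (filter_name, doc_id), so each pair is classified only once.
        self._cache: Dict[Tuple[str, str], bool] = {}

    def matching(self, filter_name: str, docs: Dict[str, str]) -> List[str]:
        """Return the ids of the documents in docs that satisfy filter_name."""
        hits = []
        for doc_id, text in docs.items():
            key = (filter_name, doc_id)
            if key not in self._cache:
                self._cache[key] = self._classify(filter_name, text)
            if self._cache[key]:
                hits.append(doc_id)
        return hits
```

The first query for a filter pays the classification cost for every document; a repeat of the same filter is answered from the cache. In practice the dict would probably be a database column, which brings it back to the structured-extraction approach, just filled in lazily.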