by qiller 887 days ago
Our primary issue is that our DB uses a dynamic Entity-Attribute-Value schema, and a fairly denormalized one at that. The model has to remember to run subqueries to retrieve the "attributes" each query needs and then combine them correctly.

NLQ is a fairly new feature for us, so we don't have a great library of examples to pull from for RAG. Experimenting, I found that sprinkling in a few few-shot examples with some CoT (showing how to chain attribute retrievals) helped a lot.
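To make the few-shot + CoT idea concrete, here is a minimal sketch of how such a prompt could be assembled. Everything in it (the `Example` class, `build_prompt`, the `entity_attr` table, the attribute names) is hypothetical and only for illustration; it is not our actual prompt or schema.

```python
from dataclasses import dataclass


@dataclass
class Example:
    question: str
    reasoning: list[str]  # CoT steps: which attributes to fetch, and in what order
    sql: str


def build_prompt(schema_notes: str, examples: list[Example], question: str) -> str:
    """Assemble schema notes, worked examples, and the new question into one prompt."""
    parts = [schema_notes.strip(), ""]
    for ex in examples:
        parts.append(f"Question: {ex.question}")
        parts.append("Reasoning:")
        parts.extend(f"  {i}. {step}" for i, step in enumerate(ex.reasoning, 1))
        parts.append(f"SQL: {ex.sql}")
        parts.append("")
    # End on an open "Reasoning:" so the model writes its steps before the SQL.
    parts.append(f"Question: {question}")
    parts.append("Reasoning:")
    return "\n".join(parts)


# Hypothetical EAV table and attributes, used only to show the shape of an example.
EXAMPLES = [
    Example(
        question="Which products cost more than 10?",
        reasoning=[
            "Price is stored as an attribute row, so select rows where attr = 'price'.",
            "Name is a separate attribute row, so join back on entity_id where attr = 'name'.",
            "Filter on the price value after casting it to a number.",
        ],
        sql=(
            "SELECT n.value FROM entity_attr p "
            "JOIN entity_attr n ON n.entity_id = p.entity_id AND n.attr = 'name' "
            "WHERE p.attr = 'price' AND CAST(p.value AS REAL) > 10"
        ),
    )
]

prompt = build_prompt(
    "Table entity_attr(entity_id, attr, value) stores one row per attribute.",
    EXAMPLES,
    "Which products are out of stock?",
)
```

The reasoning steps are where the attribute chaining gets spelled out, which is the part the model tends to forget on its own.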

Even so, some queries come out quite ugly, though still functional. I'm thankful that DuckDB is a beast when tackling those :D

> Did you folks experiment with building an Agent-like interface that asks more questions before the LLM finally answers?

That's something I want to figure out next:

1) try to check whether a generated query would run but return absolute junk (because the model forgot to check something), and ask the user to rephrase

2) or show the results (which may look "real" enough) but give the user a way to tweak the prompt. A good example is something like "top 5 products on Cyber Monday", which returns 0 products because Cyber Monday 2024 hasn't happened yet, and should trigger a follow-up.
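A rough sketch of the simplest version of this, covering only the "query fails" and "query returns nothing" cases. Detecting junk that merely looks plausible would need more than this. The `run_with_followup` helper and the `sales` table are made up for illustration, and `sqlite3` stands in for DuckDB only because it ships with Python.

```python
import sqlite3


def run_with_followup(conn, sql, params=()):
    """Run a generated query and return (rows, follow_up).

    follow_up is a question to send back to the user, or None if the
    result looks usable.
    """
    try:
        rows = conn.execute(sql, params).fetchall()
    except sqlite3.Error as exc:
        return [], f"The query failed ({exc}). Could you rephrase the question?"
    if not rows:
        return rows, "That returned no rows. Is the date or filter what you meant?"
    return rows, None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, sold_on TEXT, qty INTEGER)")
conn.execute("INSERT INTO sales VALUES ('widget', '2023-11-27', 3)")

TOP_PRODUCTS = (
    "SELECT product, SUM(qty) AS total FROM sales "
    "WHERE sold_on = ? GROUP BY product ORDER BY total DESC LIMIT 5"
)

# A date with no data yet: empty result, so a follow-up is produced.
empty_rows, empty_follow_up = run_with_followup(conn, TOP_PRODUCTS, ("2024-12-02",))

# A date that has data: rows come back and no follow-up is needed.
ok_rows, ok_follow_up = run_with_followup(conn, TOP_PRODUCTS, ("2023-11-27",))
```

In practice the follow-up text would probably be fed back to the LLM along with the original question, rather than shown to the user verbatim.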

1 comment

Maybe you could utilize views to make your EAV schema more friendly for the LLM? Whether that's realistic depends on the specifics of your situation of course.
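For what it's worth, a minimal sketch of what I mean, with a made-up `entity_attr` table and a `products` view that pivots two attributes into ordinary columns. It uses `sqlite3` only so it runs without extra installs; I'd expect the same `CREATE VIEW` to port to DuckDB with little or no change, but I haven't checked it against your setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical EAV table: one row per (entity, attribute) pair.
    CREATE TABLE entity_attr (entity_id INTEGER, attr TEXT, value TEXT);
    INSERT INTO entity_attr VALUES
        (1, 'name', 'widget'), (1, 'price', '9.50'),
        (2, 'name', 'gadget'), (2, 'price', '20.00');

    -- Pivot the attributes into columns so the LLM sees a plain table.
    CREATE VIEW products AS
    SELECT entity_id,
           MAX(CASE WHEN attr = 'name' THEN value END) AS name,
           CAST(MAX(CASE WHEN attr = 'price' THEN value END) AS REAL) AS price
    FROM entity_attr
    GROUP BY entity_id;
    """
)

# The generated query no longer needs per-attribute subqueries.
rows = conn.execute(
    "SELECT name, price FROM products WHERE price > 10 ORDER BY name"
).fetchall()
print(rows)
```

The catch is that a fully dynamic schema means the set of attributes changes, so the views would have to be generated (and regenerated) from whatever attributes currently exist.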