| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by larve 1080 days ago

I approach things a bit obliquely. I create a custom made DSL (starting from scratch in each conversation, often) that allows me to model my query the way I want. Then, I write a traditional SQL builder on that DSL (or more like, ask GPT to do it for me). Then, I generate DSL statements that match my current domain, and more importantly, modify existing ones.

So, at each step, I do almost trivial transformations.

One key ingredient is that the DSL should include many "description" fields that incorporate english language, because that helps the model "understand" what the terser DSL fields are for.

Straight SQL is a crapshoot, and as you said, more often than not, either obviously or subtly broken or for another database. Which makes sense, considering how much different flavors of SQL it has in its training corpus and how much crappy SQL is out there anyway.

Another thing that helps is use extremely specific "jargon" for the domain you want to write queries for. Asking for "accrual revenue" and "yoy avg customer value" (yes, yoy, not year over year) often tends to bring back much higher quality than just asking for "revenue" or "customer value".