|
|
|
|
|
by mritchie712
804 days ago
|
|
text-to-sql is a dead end. There's no way for a model to correctly interpret the meaning of every column in a real world database using the `information_schema` alone. Most cloud warehouses (e.g. Snowflake) don't use foreign keys, so you don't even know the joins. Imagine you hire a highly skilled data analyst (e.g. 9 out of 10 proficiency in SQL) and start asking them questions about your database. They won't answer them, they'll ask you more questions. The conversation would go something like: you: what is our churn rate by channel? new analyst: where do we store "channel"? what do we use to process payments? where is that data stored? do we include discounts in MRR / churn? etc. If a human can't do it, an LLM can't either. An LLM isn't able to write the SQL from scratch get the right answers without a ton of additional context. We're working on an approach using a semantic layer at https://www.definite.app/ if you're interested in this sort of thing. |
|
Here it seems MQL isn't a query language as much as it's a text-to-SQL translator and you're right... without a bit more understanding of the data's role and purpose and intent it's a hard job for anyone, human or AI.
It strikes me that as I write an sql statement I'm not only using knowledge of sql but also knowledge of domain and database structure that I don't even think about until I need to show someone else how to do the query.