Hacker News new | ask | show | jobs
by metalspot 1147 days ago
the concern i would have here is that if you don't know SQL then you have no way of knowing whether or not the results are correct and if the results are wrong you have no way to debug the query.

to construct an SQL query you need to know the database schema, you need to be able to specify the relationships between tables, and you need to specify what data you are trying to retrieve.

the only way using plain written language can work is if you can infer the correct table names and relationships from the input text, which can only work if the database follows very strict naming conventions and/or proper foreign key usage and is small enough that none of the table or column names are ambiguous.

plain written language is much more ambiguous than a formal language like SQL.

how do you resolve ambiguities in the naming of tables and columns in the schema and in the input text for queries? if the output is wrong, is there an iterative process that can help the user refine the query?

3 comments

>plain written language is much more ambiguous than a formal language like SQL.

This, by the way, is why programmers won't be out of a job due to AI any time soon... using an AI for non trivial generation of code just moves the code-run-debug loop from the computer language to AI prompts in English... which can be much harder to debug than e.g. Python.

Programmers will work hand in hand with large language models, but they're not going to be replaced with AI powered low code solutions any time soon.

Correct, you probably should know SQL.

In my experience though it is often a significant time saver if I only have to review the output of the AI vs. coming up with everything myself.

Re: ambiguities: You are right, the AI won't always be able to infer the correct fields to use. In such cases though it is often enough to help the AI out a little. E.g. by saying "information X is stored in field Y of table Z".

Even if you have to do that, you can still save time and more importantly mental effort by letting AI help you, compared to writing all the SQL yourself.

Re: iterative process: You can simply send a follow up message saying "You did X wrong". It is usually happy to correct itself.

I 100% agree. This is why I think that LLM's will help encourage better coding practices, not worse. The more that you can write "clean code" (things which are named correctly, follow the single responsibility pricinciple, don't violate the law of demeter, build the correct data-models/data-marts), then the easier it will be for LLM's to assist senior engineers in writing code/sql faster.

And if there is an interface given to people who will struggle to evaluate the correct-ness of something, ie: a Business Intelligence tool for people who don't know/care about the data model, then the question becomes "How can the tool facilitate getting the people who can verify the correctness of the sql/code do so in an efficient manner?" I'm thinking like a short-term pull-request, where the "code committer" is the LLM and the reviewer is a human. Said in another way: the chatbot is going in reverse, where chatgpt is asking the questions and the human gives the answers...... haha

>> the concern i would have here is that if you don't know SQL then you have no way of knowing whether or not the results are correct and if the results are wrong you have no way to debug the query.

> Correct, you probably should know SQL.

If your customers need to know SQL, presumably they also need to know the data architecture in order to verify correctness and/or fitness of purpose.

Assuming both of these preconditions are true, why would someone not just write the requisite SQL themself?

> Even if you have to do that, you can still save time and more importantly mental effort by letting AI help you, compared to writing all the SQL yourself.

Not really, at least in my experience. Staying "in the flow" is easier when writing SQL queries instead of having to:

1. Take time to think of a "good" ChatGPT request.

2. Review/test what was generated.

3. Take more time to refine the ChatGPT request to make it "better."

4. Goto 2 until a satisfactory SQL query is generated.

Contrast the above steps with:

1. Take time to determine what SQL query is needed.

2. Write the query.

3. Test the query.

> iterative process: You can simply send a follow up message saying "You did X wrong". It is usually happy to correct itself.

cool. i am interested to try it out.

This is the problem with these models in general, right?

I think it can help you write the queries faster but you have to know these things already. For example, at work, I can recall I need to join 5 tables to perform a certain lookup. It'd be cool if I didn't have to type all that, but without that knowledge, it would be useless.

Thoughts on a view?