Hacker News
by internet101010 871 days ago
> What kind of applications would this be useful for? What can you build with an AI data science intern that's right 75% of the time?

Yeah this is the issue I have with all of the SQL generation stuff. Not only should the SQL be valid, a prompt like "generate a query that pulls sales for the last quarter" should generate the same output for everyone without fail. Vanna's business logic embedding is a good first step but even then it is only correct like 90% of the time with GPT-4.

Even then, it will only work if there are strong standards and data governance structures in place that everyone within an organization is aligned on. For example, "sales" can mean different things to different people and all of that needs to be buttoned up as well.

2 comments

As someone who works with lots of analysts, I can guarantee that they also don't make the correct interpretations all the time, and that you have to sense-check the results back against reality.

In either case, validation is the key step: you can't just trust that your SQL query is correct, regardless of whether you wrote it manually. You still have to go through the data and check it.
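To make the sense-check concrete, here is a minimal sketch using Python's built-in sqlite3. The `orders` table, its columns, and the `validate` helper are all made up for illustration; the idea is only that the candidate query's result gets reconciled against totals computed a different way, whoever wrote the query.

```python
import sqlite3

# Hypothetical toy data; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, status TEXT);
    INSERT INTO orders (amount, status) VALUES
        (100.0, 'paid'), (50.0, 'paid'), (25.0, 'refunded');
""")

def run_scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

# The query under review (hand-written or LLM-generated, it doesn't matter).
candidate_sql = "SELECT SUM(amount) FROM orders WHERE status = 'paid'"

def validate(candidate, total_all, total_not_paid, tol=1e-9):
    """Return a list of failed checks; an empty list means it reconciles."""
    problems = []
    if candidate is None:
        problems.append("query returned NULL (no matching rows?)")
        return problems
    if candidate > total_all:
        problems.append("paid sales exceed the overall total")
    if abs(candidate + total_not_paid - total_all) > tol:
        problems.append("paid + not-paid does not add up to the overall total")
    return problems

# Independent figures computed a different way from the candidate query.
total_all = run_scalar("SELECT SUM(amount) FROM orders")
total_not_paid = run_scalar("SELECT SUM(amount) FROM orders WHERE status <> 'paid'")
candidate = run_scalar(candidate_sql)

problems = validate(candidate, total_all, total_not_paid)
```

Checks like these don't prove the query matches what the business means by "sales", but they catch the cheap mistakes before anyone reads the number.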

That's where the SQL generation stuff can save time - if 50% of the time you can get to an answer in half the time, then it's great! Normally, in my experience with current-gen LLMs, when they fail they fail quickly, so the other 50% of queries don't take twice as long to write manually.
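A back-of-the-envelope version of that arithmetic, as a rough sketch. The model is an assumption: a failed LLM attempt costs some fraction of the manual writing time before you fall back to writing the query by hand, and the parameter names are made up.

```python
def expected_time(manual_time, p_success, success_fraction, wasted_fraction):
    """Expected time per query when you try the LLM first.

    manual_time:      time to write the query by hand
    p_success:        probability the LLM attempt works
    success_fraction: time taken on success, as a fraction of manual_time
    wasted_fraction:  time lost on a failed attempt, as a fraction of
                      manual_time, before writing the query by hand anyway
    """
    on_success = success_fraction * manual_time
    on_failure = (1.0 + wasted_fraction) * manual_time
    return p_success * on_success + (1.0 - p_success) * on_failure

# 60-minute query, LLM works half the time and halves the time when it does.
quick_fail = expected_time(60, 0.5, 0.5, 0.25)  # failures waste 15 minutes
break_even = expected_time(60, 0.5, 0.5, 0.5)   # failures waste 30 minutes
slow_fail = expected_time(60, 0.5, 0.5, 1.0)    # failures double the time
```

Under these assumptions, quick failures give an average of 52.5 minutes against 60 by hand. The gain disappears once a failed attempt wastes half the manual time, and at double the time you average 75 minutes, so failing fast is what makes the numbers work.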

Then there is the other use case - if you aren't sure why a particular SQL query is erroring, these LLMs are great at telling you why and fixing your code.

Having an LLM be in charge of business logic is madness.

There cannot be any AI involved when processing the definition of a KPI. Otherwise you'll never be able to roll it out to thousands of users when there's always a 10% (or even 1%) chance that the business logic will not get applied correctly.

Check out what we do at Veezoo (https://www.veezoo.com) with the Knowledge Graph / Semantic Layer to mitigate that.
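A minimal sketch of the general idea of keeping KPI definitions out of the model's hands. This is not Veezoo's implementation; the `KPI_DEFINITIONS` registry, the `build_kpi_sql` helper, and the table and column names are all hypothetical. The model (or the user) only picks a KPI name and a date range, and the business logic comes from one reviewed definition.

```python
# Hypothetical registry: each KPI has exactly one reviewed definition.
KPI_DEFINITIONS = {
    "sales": {
        "expression": "SUM(amount)",
        "table": "orders",
        "filter": "status = 'paid'",
    },
}

def build_kpi_sql(kpi_name, start_date, end_date):
    """Build parameterized SQL for a KPI from its governed definition.

    Only kpi_name and the date range vary between calls; the expression,
    table, and filter are looked up, never generated.
    """
    if kpi_name not in KPI_DEFINITIONS:
        raise KeyError(f"unknown KPI: {kpi_name!r}")
    d = KPI_DEFINITIONS[kpi_name]
    sql = (
        f"SELECT {d['expression']} AS {kpi_name} "
        f"FROM {d['table']} "
        f"WHERE {d['filter']} AND order_date >= ? AND order_date < ?"
    )
    return sql, (start_date, end_date)

# Two different users asking for last quarter's sales get identical SQL.
sql_a, params_a = build_kpi_sql("sales", "2023-07-01", "2023-10-01")
sql_b, params_b = build_kpi_sql("sales", "2023-07-01", "2023-10-01")
```

With this split, the worst an LLM can do is pick the wrong KPI name or date range, which is easy to show back to the user for confirmation. It cannot quietly redefine what "sales" means.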