by getmeinrn 1071 days ago
The hard part is the SQL query, because you need to make sure the SQL query is safe to execute. Collecting data is far easier by comparison, but you absolutely could use an LLM for that too.
2 comments

I’m not saying that the SQL query is at all easy, but you have pretty much accomplished it in a short period of time, while Google, Yelp, etc. have still not completely solved the problem of store hours after decades of working on it. So I’m going to lean towards that being the harder problem of the two.
The OP said "This is a pretty low bar request IMO", suggesting that the problem they expect an LLM to be able to handle is not the hard problem you're saying Google and Yelp have not solved. It's a different problem.
Can you give an example of how you'd define safe operations?

I think a lot of use cases could just be 1) set up a database with only public data and 2) use a read-only user.

The much trickier use cases are those where you want to allow inserts and updates, but only on specific tables or rows.

That's mostly safe, but even then, a user could execute "SELECT SLEEP(100000000)" thousands of times and DoS your database. There are other unsafe functions that a read-only user can execute as well. I've written extensively on some of the attack surface here: https://docs.heimdallm.ai/en/latest/attack_surface/sql.html

HeimdaLLM can allowlist functions and constrain queries to ensure that required conditions exist. This gives LLM + database usage far more utility. For example, a user can be restricted to only the data in their account. Support for INSERT and UPDATE is coming very soon.
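To make the function-allowlist idea concrete, here is a rough sketch. It is not HeimdaLLM's API or its implementation. It uses SQLite's authorizer hook, exposed in Python as `set_authorizer`, and the allowlist contents, the table, and the `is_permitted` helper are assumptions for illustration. SQLite has no SLEEP, so `randomblob` stands in as the expensive function to reject.

```python
import sqlite3

# Fall back to the numeric action code from sqlite3.h if the constant is not exported.
SQLITE_FUNCTION = getattr(sqlite3, "SQLITE_FUNCTION", 31)

# Hypothetical allowlist; any function not listed here is rejected.
ALLOWED_FUNCTIONS = {"count", "lower", "upper"}


def authorizer(action, arg1, arg2, db_name, source):
    """Fail closed: permit plain reads and allowlisted functions only."""
    if action in (sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ):
        return sqlite3.SQLITE_OK
    # For function calls, SQLite passes the function name as the second argument.
    if action == SQLITE_FUNCTION and (arg2 or "").lower() in ALLOWED_FUNCTIONS:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (account_id INTEGER, body TEXT)")
conn.execute("INSERT INTO notes VALUES (1, 'hello')")
conn.commit()
conn.set_authorizer(authorizer)  # installed after setup, before any untrusted SQL


def is_permitted(sql):
    """Return True if the statement compiles and runs under the authorizer."""
    try:
        conn.execute(sql).fetchall()
        return True
    except sqlite3.DatabaseError:
        return False


print(is_permitted("SELECT upper(body) FROM notes"))
print(is_permitted("SELECT randomblob(1000000000)"))
print(is_permitted("DELETE FROM notes"))
```

The check happens when the statement is compiled, so a rejected function never runs. Restricting rows to a single account, as described above, needs analysis of the query's conditions and is not covered by this sketch.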