Hacker News
by _akhe 784 days ago
RAG (retrieval-augmented generation) is the indexing and querying of info inside documents. The text goes into a vector database, for example pgvector - a PostgreSQL extension that lets you store and search data as numerical vectors (embeddings) - and then you can query it using natural language (via the LLM).

Errors are possible in regular SQL querying too, for example through a user-facing search input. I'm not saying language interfaces are foolproof, but they're not generally wrong when you ask for specific things like a person's age, blood pressure, criminal history, etc. and you're querying against a vector DB of that exact info.
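
To make the index-then-query flow above concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory `index` list stands in for a vector database such as pgvector, and the records are made up. A real setup would pass the retrieved text to an LLM along with the question.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical records; a real system would chunk actual documents.
documents = [
    "patient alice age 42 blood pressure 120/80",
    "patient bob age 57 blood pressure 140/90",
    "case cr-0001 defendant carol plea guilty",
]

# "Indexing": store each document next to its vector.
index = [(doc, embed(doc)) for doc in documents]

def retrieve(question, k=1):
    # "Querying": rank stored documents by similarity to the question.
    q = embed(question)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

question = "what is the blood pressure of bob"
context = retrieve(question)
# The retrieved text would then be placed in the LLM prompt as context.
prompt = f"Answer using only this context:\n{context[0]}\n\nQuestion: {question}"
```

The model never has to "remember" the record; it only has to read the retrieved row and answer from it, which is why specific lookups tend to come back right when the underlying data is right.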


There's a reason attorneys don't put the facts from cases into SQL databases to query. I think you are missing the point completely.
Not true. How would people look up cases online if that was the case?

I built Checkr's background check ETA in Ruby/React, and had to get background check certified to work there. Part of onboarding was a trip down to the courthouse, where they showed us how it was done before APIs. While it's true some records are still offline in some courthouses, almost all of it is online, and in some states some is even sold to 3rd parties like mugshot websites, background check sites, etc. While others are on-prem servers the state/county runs. But they definitely use databases and computers lol.

I think you're missing the point - you act like I'm suggesting AI replace the entire legal system when I'm talking about a tool people would use instead of older tech like a SQL database and UI.

Courthouses that run their SQL on-prem for security reasons could do the same with models - they don't even need access to the internet. So if you wanted to be inaccessible to the public you could (though some states/counties require they make it public).

Nothing will satisfy the neo-luddite take, just watch from the sidelines I guess!

>Not true. How would people look up cases online if that was the case?

Have you ever used LexisNexis or WestLaw? It's not an SQL database of facts from a case. It's literally just string searching. Do you have any experience with the legal industry at all as you repeatedly make statements about what lawyers would/should/could do?

>While others are on-prem servers the state/county runs. But they definitely use databases and computers lol.

The assertion wasn't that lawyers don't use technology, the assertion was that lawyers do not abstract the facts from a legal case into a database for querying. That you suddenly do not distinguish that from the general use of databases at all is asinine and not conducive to conversation because it's such a ridiculous stretch of what anyone could have meant, let alone what was actually written.

>I think you're missing the point - you act like I'm suggesting AI replace the entire legal system when I'm talking about a tool people would use instead of older tech like a SQL database and UI.

I'm not suggesting that at all. I'm suggesting that the limited utility you think is there, isn't.

>Nothing will satisfy the neo-luddite take, just watch from the sidelines I guess!

Rude and unnecessary.

> no reason at all to believe this at all

> Do you have any experience with the legal industry at all

> Rude

Your repeated use of "at all" also comes across as slightly rude FYI :)

As stated, yes I built background check software for a major background check company (they're yc, now worth billions) - in particular I developed their background check ETA and built their React app which is used millions of times per year by Uber, DoorDash, and others, for background checks. I'm familiar with the space and had to become a background investigator to work there. What you say just isn't true.

> they do not abstract facts from a legal case into a database for querying

Again wrong - yes they do. How would courts operate if they didn't? Think about it for 2 seconds.

>As stated, yes I built background check software for a major background check company (they're yc, now worth billions) - in particular I developed their background check ETA and built their React app which is used millions of times per year by Uber, DoorDash, and others, for background checks. I'm familiar with the space and had to become a background investigator to work there. What you say just isn't true.

What does this have to do with the legal industry? Nothing? Got it.

>Again wrong - yes they do. How would courts operate if they didn't? Think about it for 2 seconds.

No, they don't. I repeat my previous question: have you ever actually used LexisNexis or WestLaw? They do not index specific facts about any cases.

>Your repeated use of "at all" also comes across as slightly rude FYI :)

I can see why you would think that given your insistence on discussing something you clearly know nothing about.

Do you know what is on a criminal background report? It's exactly their criminal history. You claimed that courts do not store documents about cases in SQL databases (e.g. case number, defendant name, their plea, etc.) but that's wrong, they do.
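
For anyone following along, here is a rough sketch of the distinction being argued over: querying a structured field (like the plea) versus string searching the text of a filing. Everything in it is hypothetical, including the `cases` table, its columns, and the two rows; it uses Python's built-in `sqlite3` purely for illustration and does not describe how any real court system or legal search product is built.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cases ("
    "case_number TEXT PRIMARY KEY, defendant TEXT, plea TEXT, filing_text TEXT)"
)
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO cases VALUES (?, ?, ?, ?)",
    [
        ("CR-0001", "Jane Doe", "guilty",
         "The defendant pleaded guilty to the charge."),
        ("CR-0002", "John Roe", "not guilty",
         "The defendant pleaded not guilty and requested a trial."),
    ],
)

def cases_with_plea(plea):
    # Structured query: match the plea column exactly.
    rows = conn.execute(
        "SELECT case_number FROM cases WHERE plea = ? ORDER BY case_number",
        (plea,),
    )
    return [row[0] for row in rows]

def text_search(phrase):
    # String search: match the phrase anywhere in the filing text.
    rows = conn.execute(
        "SELECT case_number FROM cases WHERE filing_text LIKE ? "
        "ORDER BY case_number",
        (f"%{phrase}%",),
    )
    return [row[0] for row in rows]

structured = cases_with_plea("guilty")   # only the guilty plea
full_text = text_search("guilty")        # also matches "not guilty"
```

The structured query only works if someone has already extracted the plea into its own column, while the string search works on raw text but also matches "not guilty".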

> you clearly know nothing about

I have more direct experience than you do - and startups already exist that do this very thing with LLMs. But go ahead, have fun on the wrong side of history making false claims and strawmanning arguments.