by stult 774 days ago
> Preload a city's, county's etc. entire set of laws

You would also need to load an enormous amount of precedential case law, at least in the US and other common law jurisdictions. Synthesizing case law into rules of law applicable to a specific case requires complex analysis that is frequently sensitive to details of the factual context, where LLMs' lack of common sense can lead them to draw false conclusions, particularly where on-point case law is thin on the ground and directly analogous cases are therefore not available.

I don't see the utility at the current performance level of LLMs, though, as the OP article seems to confirm. LLMs may excel in restating or summarizing black letter or well-established law under narrow circumstances, but that's a vanishingly small percentage of the actual work involved in practicing law. Most cases are unremarkable, and the lawyers and judges involved do not need to conduct any research that would require something like consulting an AI assistant to resolve all the important questions. It's just routine; there's nothing special about any given DUI case, for example. Where actual research is required, the question is typically extremely nuanced, and that is precisely where LLMs tend to struggle the most to produce useful outputs. LLMs are also unlikely to identify such issues, because they are issues for which sufficient precedent does not exist, and therefore the LLM will by definition have to engage in extrapolative, creative analysis rather than simply reproducing ideas or language from its training set.


> You would also need to load an enormous amount of precedential case law

Very easily done. Is that it?

> lack of common sense, false conclusions

The AI tool doesn't replace the judge/DA/etc.; it's just a very useful tool for them to use. Check out the "RAG-based learning" section of this app I built (https://github.com/bennyschmidt/ragdoll-studio). There's a video that shows how you can effectively load new knowledge into it (I use LlamaIndex for RAG), for example past cases that set legal precedents and other information you want to be considered. It creates a database of the files you load in, so it's not making those assumptions like an LLM without RAG would. I think a human would be more error-prone than an LLM with a vector DB of specific data + a querying engine.

> I don't see the utility

Then you are not paying attention or haven't used LLMs that much. Maybe you're unfamiliar with the kind of work it's good at.

> actual work involved in practicing law

This is what it's best at, and what people are already using RAG for: reading patient medical docs, technical documentation, etc. This is precisely what humans are bad at and will offload to technology.

> actual research is required

You have not tried RAG.

> LLMs struggle to produce useful outputs

You have not tried RAG.

> LLMs are unlikely to identify issues

You have not tried RAG.

> the LLM by definition is creative analysis

You have not tried RAG.

You can load an entire product catalog into LlamaIndex and the LLM will have perfect knowledge of pricing, inventory, etc. This specific domain knowledge of inventory allows you to have the accurate, transactional conversations that a regular LLM isn't designed for.

>You can load an entire product catalog into LlamaIndex and the LLM will have perfect knowledge of pricing, inventory, etc. This specific domain knowledge of inventory allows you to have the accurate, transactional conversations that a regular LLM isn't designed for.

Aren't we talking about caselaw? You didn't really respond to the point, which distinguished caselaw from information like a product catalog. And rather rudely at that.

Rudely? Ha - they misrepresented my point about RAG tooling not replacing lawyers into a straw man about replacing lawyers - I never said that, said the opposite.

Secondly, it's obvious they have not used RAG, or they wouldn't say things like "inaccurate responses" etc. RAG is as accurate as any database (because it is a database). It puts all the information from your uploaded files into a database and reads from that. The commenter fundamentally misunderstands the technology and likely hasn't even used it - yet feels the need to comment on it like an expert. It's not like using ChatGPT, and in any case it's not in lieu of a lawyer anyway, that was just a straw man argument that goes counter to my actual post.

I did respond to the points about accuracy and legal precedents. Unlike the other false statements that were made, these are legitimate concerns a lot of people share about whether or not LLM tooling should be used by legal professionals.

Is ChatGPT sufficient to replace a lawyer? No.

Is ChatGPT sufficient as a legal advice tool that a lawyer might use on a case-by-case basis or generally? No.

Could the same LLM technology be used except on a body of specific case documents to surface information through a convenient language interface to a legal expert? Yes. It's as safe as SQL.

The point about pricing and inventory is that, unlike an LLM, RAG involves retrieval of specific facts from a document (or collection of documents) - the language is more for handling your query and matching it to that information. None of the points he made about inaccuracies and insufficient answers, etc. or replacing lawyers apply.

>Could the same LLM technology be used except on a body of specific case documents to surface information through a convenient language interface to a legal expert? Yes. It's as safe as SQL.

I see no reason at all to believe this.

RAG is the indexing and querying of info inside documents. It puts that info in a vector database, for example pgvector (a PostgreSQL extension that lets you store data in numerical form, as embedding vectors), and then you can query it using natural language (via the LLM).
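To make the index-then-query flow described above concrete, here is a minimal sketch in plain Python. It is not how LlamaIndex or pgvector are implemented: the `embed` function is a toy bag-of-words stand-in for a learned embedding model, the in-memory list stands in for the vector database, and the documents and all function names are made up for illustration.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: term counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_index(chunks):
    """Store each chunk next to its vector (the 'vector database' step)."""
    return [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(index, query, k=2):
    """Return the k chunks whose vectors are most similar to the query vector."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def build_prompt(query, passages):
    """Put the retrieved passages in front of the question for the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

# Hypothetical documents, invented for this example.
chunks = [
    "Patient record: age 54, blood pressure 130 over 85.",
    "Case summary: defendant charged with DUI, no prior criminal history.",
    "Product catalog: widget price 19.99, inventory 40 units.",
]
index = build_index(chunks)
question = "What is the patient's blood pressure?"
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Note that retrieval here is a nearest-match ranking rather than an exact lookup: `retrieve` always returns the k most similar chunks, whether or not any of them actually answers the question.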

There's a possibility for errors in regular SQL querying too, like a user-facing search input. I'm not saying language interfaces are foolproof, but it's not generally wrong when you ask specific things like a person's age, blood pressure, criminal history, etc. if querying against a vector DB of that exact info.

There's a reason attorneys don't put the facts from cases into SQL databases to query. I think you are missing the point completely.

I have tried a lot of RAG and can tell you that no LLM, including Gemini 1.5 with its 1.5 million context, will be anywhere near as good at longer context lengths as at shorter context lengths.

Appending huge numbers of tokens to the prompt often leads to the system prompt or user instructions being ignored. And since API-based LLM authors are terrified of jailbreaks, they won't give you the ability to "emphasize" or "upweight" tokens (despite this being perfectly possible), since you can easily upweight a token to overwhelm the DPO alignment lobotomization that most models go through. So there's no easy fix for this coming from OpenAI/Anthropic et al.
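The mechanics behind this are simple: every retrieved passage is appended to the prompt, so the prompt grows with the amount of retrieved text. One common way to keep it short, which is not something the comment above proposes, is to cap the retrieved passages at a token budget. The sketch below assumes the passages are already ranked by relevance; `count_tokens`, `pack_context`, `assemble`, and the budget value are hypothetical, and the word count is only a rough stand-in for a real tokenizer.

```python
def count_tokens(text):
    """Rough stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())

def pack_context(ranked_passages, budget):
    """Keep the highest-ranked passages, in order, until the budget is hit."""
    kept, used = [], 0
    for passage in ranked_passages:
        cost = count_tokens(passage)
        if used + cost > budget:
            break  # stop at the first passage that would overflow the budget
        kept.append(passage)
        used += cost
    return kept, used

def assemble(system, passages, question):
    """Join the instructions, the retrieved context, and the user question."""
    context = "\n".join(passages)
    return f"{system}\n\n{context}\n\n{question}"

# Hypothetical passages, already ranked best-first.
ranked = ["alpha beta gamma", "delta epsilon", "zeta eta theta iota kappa"]
kept, used = pack_context(ranked, budget=6)
prompt = assemble("Answer from the context only.", kept, "What is alpha?")
print(used, len(kept))
```

A smaller budget trades recall for a shorter prompt; whether that actually helps instruction-following depends on the model and would need to be measured.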

I'm not so sure human judgement is as comparable to medical terminology or technical manuals as you think it is.

How did you come to this conclusion?

Maybe I wasn't that clear, but I did say in my original post:

> I used to think AI would replace doctors before nurses, and lawyers before court clerks - now I think it's the other way around. The doctor, the lawyer - like the software engineer - will simply be more powerful than ever and have lower overhead. The lower-down jobs will get eaten, never the knowledge work.

Yet you and a few other people insist I'm saying "AI will replace human judgment" - why? I'm saying the doctor isn't replaced, the lawyer, the software engineer, etc. aren't replaced. It's more like the technician just got a better technical manual, not like they are replaced by it.

I did not. I pointed out that you assumed a similarity between human judgement in courts and technical documentation or medical diagnostics, and asked on what grounds you make this assumption.

It can't be that engineering and biology are so similar to jurisprudence, because they aren't. There has to be another reason for you to lump them together.

> human judgement

Again the human judgment is not replaced in either scenario, I'm talking about a tool the lawyer, the doctor, etc. would use.

Lawyer and doctor are often listed as comparable examples because both involve sensitive info you can't afford to get wrong, unlike creative use cases for AI like image or song generation.

Not sure why you keep bringing that up instead of answering my question.

Lawyers and doctors get it wrong all the time.