Hacker News
by lbill 484 days ago
The only way to check whether an LLM output is true is to do the work (that is, to have it done by a real person).

For tasks that are trivial to verify, it's OK: you can simply compile and run the code written by an LLM. Or: ask an LLM to help you during the example mapping phase of BDD, and you'll quickly be able to tell what's good and what isn't.
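To make the "trivial to verify" case concrete, here is a minimal sketch, assuming you already know a few input/output pairs for the function you asked for. The helper name `passes_checks` and the three sample outputs are made up for illustration; they are not from any real tool.

```python
def passes_checks(generated_source: str, func_name: str, cases) -> bool:
    """Return True if the generated code compiles and matches every known case."""
    namespace = {}
    try:
        code = compile(generated_source, "<llm-output>", "exec")
        # Caution: only execute untrusted code inside a sandbox.
        exec(code, namespace)
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in cases)
    except Exception:
        # Syntax errors, missing functions and runtime errors all count as failures.
        return False


# Hypothetical LLM outputs for "write add(a, b)".
good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b):\n    return a - b\n"
broken = "def add(a, b) return a + b\n"

# Known (arguments, expected result) pairs that a human wrote.
cases = [((1, 2), 3), ((0, 0), 0)]

for label, source in [("good", good), ("bad", bad), ("broken", broken)]:
    print(label, passes_checks(source, "add", cases))
```

The check is cheap only because a person supplied the expected results up front. That is exactly what is missing in the cases where the output can't be verified.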

But for the following tasks, there is a risk:
- You ask an LLM to summarize an email you didn't read. You can't trust the result.
- You're a car mechanic. You dump your thoughts into a voice recorder and use AI to turn them into a structured written report. You'd better triple-check the output!
- You're a medical doctor attempting the same trick: you'd have to be extra careful with the result!

And don't count on software testing to make an AI tool robust: LLMs are non-deterministic, so a test that passes once can fail on the next run.
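A rough sketch of why a single passing test says little here. `fake_llm` is a stand-in I made up (a seeded random choice, not a real model or API), and the 3-in-4 odds are an arbitrary assumption; the point is only that the same prompt can give different answers, so the best you can measure is a pass rate over many runs.

```python
import random


def fake_llm(prompt: str, rng: random.Random) -> str:
    # Stand-in for a sampled model: same prompt, different possible outputs.
    # The 3-in-4 chance of the right answer is an arbitrary assumption.
    return rng.choice(["Paris", "Paris", "Paris", "Lyon"])


def pass_rate(prompt: str, expected: str, runs: int, rng: random.Random) -> float:
    # Fraction of runs whose output exactly matches the expected answer.
    hits = sum(fake_llm(prompt, rng) == expected for _ in range(runs))
    return hits / runs


rng = random.Random(0)  # seeded so the sketch itself is repeatable
rate = pass_rate("Capital of France?", "Paris", 1000, rng)
print(f"pass rate over 1000 runs: {rate:.2f}")
```

A classic assert-once unit test on `fake_llm` would be green on some runs and red on others, which is the robustness problem in a nutshell.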