by remram 697 days ago
Acknowledging that AI is unreliable, the solution is to layer another AI to hopefully let you know about it. Of course, brilliant, why did I expect anything different from the AI industry?
2 comments

But who is monitoring the AI layer that monitors the AI that produced the original output??

openai audited by claudeai which is then audited by gemini ai...

then to close the loop, gemini ai is then audited by openai

I had read the OP's comment as sarcastic, but you never know these days lol

Your concern would be exactly mine as well, and why I assumed "brilliant" was sarcasm, cause it feels like handing over the problem to the same solution that got you the problem in the first place?

It's the same logic as saying you don't want to use a computer to monitor or test your code since it will mean that a computer will monitor a computer. AI is a broad term, I agree you can use GPT (or any LLM) to grade an LLM in an accurate way but that’s not the only way you can monitor.
> computer to monitor or test your code since it will mean that a computer will monitor a computer

I mean... you don't trust the computer in that case, you trust the person who wrote the test code. Computers do what they're told to do, so there's no trust required of the computer itself. If you swap out the person (that you're trusting) writing that code with an AI writing that test code, then it's closer to your analogy - and in that case, I (and the guy above me, it seems) wouldn't trust it for anything impactful.

Even if you're not using an LLM specifically (which no one in this chain even said you were), an AI built off some training set to eliminate hallucinations is still just an AI. So you're still using an AI to keep an AI in check, which begs the question (posed above) of: what keeps your AI in check?

Poking fun at a chain of AIs all keeping each other in check isn't really a dig at you or your company. It's more of a comment on the current industry moment.

Best of luck to you in your endeavor anyway, by the way!

Thanks! I wasn’t offended or anything, don’t get the wrong impression.

What strikes me as odd is the fact that an AI that checks AI is an issue. Because AI can mean a lot of things - from an encoder architecture, to a neural network, to a simple regression function. And at the end of the day, similar to what you said - there was a human building and fine-tuning that AI.

Anyway, this feels like more of a philosophical question than an engineering one.

(it was sarcastic. Too late to edit in a /s)
people are lazy, we're more than happy to not be in the loop
I'm sorry but this is not what we do. We don't use LLMs to grade your LLM calls.