

by giancarlostoro 343 days ago

What's really funny to me is that sometimes it fixes itself if you just ask "are you SURE ABOUT THIS ANSWER?" Myself and others often wonder why the heck they don't run a 2nd model to "proofread" the output or spot-check it. Like, did you actually answer the question, or are you going off on a really weird tangent?

I asked Perplexity for sample UI code for Rust / Slint, and it gave me a beautiful web UI. I think it got confused because I wanted to make a UI for an API that has its own web UI. I told it "you did NOT give me code for Slint", even though some of its output made references to "ui.slint" and other Rust files. It realized its mistake and gave me exactly what I wanted to see.

tl;dr: why don't LLMs just vet themselves with a new context window to see if they actually answered the question? The "reasoning" models don't always reason.
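For what it's worth, the "vet it in a new context window" idea can be sketched in a few lines. This is only an illustration under assumptions: `ask_model` is a hypothetical stateless callable (one call = one fresh context) standing in for whatever LLM client you use, and the prompts and the YES/NO protocol are made up for the example, not any vendor's API.

```python
from typing import Callable

# Hypothetical interface: each call is stateless, i.e. a fresh context window.
AskModel = Callable[[str], str]


def answer_with_check(question: str, ask_model: AskModel, max_retries: int = 1) -> str:
    """Answer a question, then ask a fresh context whether the answer is on-topic."""
    answer = ask_model(question)
    for _ in range(max_retries):
        # The checker sees only the question and the answer, not the first chat.
        verdict = ask_model(
            "Does the ANSWER actually address the QUESTION? Reply YES or NO.\n"
            f"QUESTION: {question}\nANSWER: {answer}"
        )
        if verdict.strip().upper().startswith("YES"):
            break
        # Checker rejected the answer: try again with an explicit nudge.
        answer = ask_model(
            f"{question}\n\nA previous attempt was off-topic. Answer the question as asked."
        )
    return answer
```

Note that every check is at least one extra model call, and the last retry is returned without being checked again, so this is a spot check rather than a guarantee.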

2 comments

asadotzler 343 days ago

I've asked that question on accurate answers and had the bot say oops and change the answer to an inaccurate one. This seems to happen with about the same frequency on both sides so I'm not sure how helpful it will ultimately be.
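A quick back-of-the-envelope version of this, as a sketch: assume the challenge flips a wrong answer to right with probability `fix_rate` and a right answer to wrong with probability `break_rate`. Both names and all the numbers are made up for illustration, not measured from any model.

```python
def accuracy_after_challenge(accuracy: float, fix_rate: float, break_rate: float) -> float:
    """Expected accuracy after asking "are you sure?" on every answer.

    accuracy   -- fraction of answers that were right before the challenge
    fix_rate   -- assumed chance a wrong answer gets corrected
    break_rate -- assumed chance a right answer gets changed to a wrong one
    """
    still_right = accuracy * (1.0 - break_rate)
    newly_right = (1.0 - accuracy) * fix_rate
    return still_right + newly_right


if __name__ == "__main__":
    # Equal flip rates in both directions, three illustrative starting accuracies.
    for start in (0.3, 0.5, 0.8):
        print(start, round(accuracy_after_challenge(start, 0.3, 0.3), 2))
```

With equal flip rates, the challenge pulls accuracy toward 50%: a model that starts at 80% and flips 30% of the time in each direction lands at about 62%, so blanket second-guessing only helps when the model was wrong more often than right to begin with.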


giancarlostoro 342 days ago

Interesting! I haven't tried that.


ACCount37 343 days ago

Because that would be twice as computationally intensive.

"Reasoning" models integrate some of that natively. In a way, they're trained to double check themselves - which does improve accuracy at the cost of compute.
