by ceejayoz 502 days ago
Let an expert in the field examine its reasoning first. LLMs are great at putting out reasonable-sounding bullshit.

Just because it came up with a roughly equal result doesn’t mean its explanation is sound.

2 comments

This is not a wrong sentiment. I'd be super interested in hearing an analysis of its CoT. To my ears it was a nearly schizophrenic roundabout, with a lot of contextual remarks like "that's high", but I don't know why it thinks that.
What I think is incredible about this example is how adept it is at mixing precise calculations with rough heuristics. That's exactly what I was taught: to validate the numbers by asking, "Does this answer seem reasonable?"

ChatGPT doesn't do that at all, as far as I'm aware.