I don't think that's the point of the article, but I'm happy to read the reasoning that led to that conclusion.

To me, the point is not about usefulness but about reliability. You could get correct answers out of the agent, but how often do you get correct data versus gibberish? It's an extremely important metric to consider, and it's the same reason you wouldn't hop into a self-driving car in the real world if it drove flawlessly in a straight line but turned the wrong way at one in every three intersections.