| > No one longed for a level of AI where you have to double-check everything. This has basically been why it's a non-starter in a lot of (most?) business applications. If your dishwasher failed to clean anything 20% of the time, would you rely on it? No, you'd just wash the dishes by hand, because you'd at least have a consistent result. That's been the result of the AI experimentation I've seen: it works ~80% of the time, which sounds great... except there are surprisingly few tasks where a 20% failure rate is acceptable. Even "prompt engineering" your way down to a 5% failure/inaccuracy rate is unacceptable for a fully automated solution. So now we're moving to workflows where the AI generates stuff and a human double-checks it, or where the AI parses human text into a call to a well-defined gRPC method with known behavior. That can definitely be helpful, but it's a far cry from the AI fantasized about in sci-fi literature. |