Hacker News
by brokencode 1177 days ago
30% accuracy rate in what exactly? Take a look at the GPT-4 announcement page for graphs showing the accuracy on different standardized tests. It’s not perfect, but it is improving with each release.

One big area where it does poorly right now is math. But they just announced a ChatGPT plugin for Wolfram, which I expect will make it very good at math. Wolfram also has a large database of curated information to draw on.

Technology improves over time. GPT is still new and improving quickly. What it does now isn’t perfect, but it is still incredible.

There's a post on /r/askhistorians where somebody asked ChatGPT for book recommendations on various historical topics. Some of the recommended books didn't exist. It actually took an expert reader to identify which books were made up, misattributed, and so on. That's much worse than nothing: it's a horrific waste of time.

My guess is that areas like math, where you can fairly easily verify the factuality of ChatGPT's answers, are where you could certainly see progress. In more general areas like history, where it's important to have a really firm grasp of facts, intuition, and nuance, ChatGPT will likely be hard to improve and, worse, much harder to verify. These errors can also be insidious: if you've learned something straightforwardly wrong, it corrupts future conclusions drawn from that erroneous premise.
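
To make the "easy to verify" point concrete, here is a minimal Python sketch. It assumes the model's answer is a claimed prime factorization, which is my own made-up example and not something from the /r/askhistorians post. The function names are hypothetical. Checking the claim takes only multiplication and a primality test, so you never have to trust the model:

```python
def is_prime(n: int) -> bool:
    """Trial division; fine for small illustrative inputs."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def verify_factorization(n: int, claimed_factors: list[int]) -> bool:
    """Accept the claim only if every factor is prime and the product is n."""
    product = 1
    for factor in claimed_factors:
        if not is_prime(factor):
            return False
        product *= factor
    return product == n


# Hypothetical model outputs for "factor 8051":
print(verify_factorization(8051, [83, 97]))   # True
print(verify_factorization(8051, [7, 1150]))  # False
```

There is no equivalent ten-line check for "does this history book exist and say what the model claims", which is the asymmetry I mean.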

I think the plugin system will ultimately help in most areas where LLMs are weak today.

Need to do math? Use the Wolfram plugin.

Need hard facts from reliable and citable sources? Use a plugin that queries databases like arXiv. The LLM could give you links to sources and provide quotes from those sources to support its reasoning.