> But honestly, if there are enough materials on the internet for me to learn Sakha, there is absolutely no reason in principle why a machine should not also be able to draw on these. The only reason in practice as to why it does not do so is that our current idea of what would count as passing some variant of the Turing test is basically that the machine that passes need only be conversant in the sort of matters that the corporate interests shaping our use of the internet would prefer to keep us focused on: the Academy Awards and other such presentist illusions, always in English, always limited to the sort of information you might expect to find in your search engine’s top hits.
And it is indeed true that LLMs kinda suck outside of English and they suck more the less online material there is.
I don't even need to do anything as contrived as the author to get it to fail on a tangentially language-related task: ask ChatGPT-4 to give you instructions on how to write a Chinese character stroke by stroke and it will fail miserably (because each Chinese character is a separate codepoint and because there isn't enough discussion online on how to write each Chinese character).
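To make the codepoint point concrete, here is a minimal Python sketch using only the standard library. The helper name `describe` and the choice of 永 as the example character are mine, purely for illustration. It shows what the text encoding records about a character: a number, a generic name, and a few bytes, with nothing about strokes or stroke order. How a given model actually tokenizes those bytes is a separate question that this sketch says nothing about.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Return what the text encoding itself records about one character."""
    return {
        # The character's single Unicode code point, e.g. "U+6C38".
        "codepoint": f"U+{ord(ch):04X}",
        # The official Unicode name; for CJK ideographs it is just the number again.
        "name": unicodedata.name(ch),
        # The raw UTF-8 bytes as they would appear in a text file.
        "utf8_bytes": ch.encode("utf-8").hex(" "),
    }


# Hypothetical example character; any CJK ideograph behaves the same way.
info = describe("永")
print(info)
```

None of the fields carries information about how the character is drawn, so anything a model knows about stroke order has to come from prose written about the character elsewhere.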
The author is confusing knowledge with intelligence. Most intelligent people wouldn't be able to answer the questions in the article at all, because they lack context. For example, they probably wouldn't have a clue who the author is, and he's effectively "googling himself" here, which is a bit narcissistic.
GPT-4 is trained on a vast amount of knowledge. So there are a vast number of topics where you can ask it questions and get reasonable answers.
I know because I've done so. Like world+dog. There's very little need to argue that at this point. Several times now I've actually forced myself to stop using Google and just ask the question in Bing Chat. I get perfectly reasonable results almost every time. It's kind of impressive and scary at the same time.
Most people know a lot less. So, are they dumber than GPT-4? No, of course not. They just know a lot less. Most of them have not even read a tiny fraction of what GPT-4 was trained on. And given that we have GPT-4 now, they can save themselves the trouble and just use that as an extended brain. That would actually be intelligent behavior. Most scientists distinguish themselves not by knowing a lot of things but by asking interesting questions and doing a lot of hard work to get answers.
Cornering ChatGPT on things it doesn't know about proves nothing other than that there is knowledge out there that it clearly wasn't trained on. Which, it will happily tell you, is a lot of knowledge. It's more interesting to corner it in logical contradictions about stuff it actually does know about, which isn't all that hard. But all that proves is that it is not quite an AGI yet. It will happily tell you that as well. In fact, it will tell you about its limitations to the extent that it gets really annoying. Unless you ask it not to.
It's actually quite good at different languages. I've engaged with it in my native language, Dutch, and it seems about as fluent in that as in English. And it translates between the two as well. If you stick to the same topics, you get comparable answers.
I don't quite agree with their conclusion though.