I easily broke Copilot by asking it to make lists of radioactive isotopes in order of half lives. It can put the U.S. states in alphabetical or reverse alphabetical order but for any other order I would bet against it. If I ask it what the probability is that it can correctly complete a sorting task, however, it insists that it is almost certain to get it right.
I had a good conversation with it about the theory of partial orderings; it even corrected my mistakes. I asked it to make a textbook problem determining if a graph was cyclic or not and it made a clean and beautiful example where the partial ordering was realized with a total ordering and everything was written out in a linear order that was easy to follow.
If I wrote a script that made up a bunch of "is this graph cyclic?" problems that are well randomized I am sure there is some size where it just falls down the same way it falls down with sorting.
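A rough sketch of what such a script could look like, assuming the problems are directed graphs (which is what the partial-ordering framing suggests). The function names, the edge-list representation, and the size parameters are all made up for illustration. The ground truth comes from Kahn's algorithm, so the answer key does not depend on the model being tested.

```python
import random
from collections import deque


def random_digraph(n, m, rng):
    """Return m distinct directed edges (u, v) with u != v on nodes 0..n-1."""
    if m > n * (n - 1):
        raise ValueError("too many edges requested for this many nodes")
    edges = set()
    while len(edges) < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v:
            edges.add((u, v))
    return sorted(edges)


def has_cycle(n, edges):
    """Kahn's algorithm: the graph is acyclic iff every node can be peeled off
    in topological order. Any node left over sits on or behind a cycle."""
    indegree = [0] * n
    successors = [[] for _ in range(n)]
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    queue = deque(i for i in range(n) if indegree[i] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return removed < n


def make_problems(count, n, m, seed=0):
    """Generate (edge_list, is_cyclic) pairs with the edges in shuffled order,
    so the presentation gives no hint of a topological order."""
    rng = random.Random(seed)
    problems = []
    for _ in range(count):
        edges = random_digraph(n, m, rng)
        rng.shuffle(edges)
        problems.append((edges, has_cycle(n, edges)))
    return problems


if __name__ == "__main__":
    for edges, cyclic in make_problems(count=3, n=6, m=7):
        print(edges, "cyclic" if cyclic else "acyclic")
```

Growing `n` and `m` until the model's answers stop matching the second element of each pair would give a crude estimate of the size where it falls down.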
The obvious answer is that the LLM should pick an algorithm or write some code to do the thing which ordinary algorithms can do such as arithmetic, sorting, SAT solving, etc.
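For the isotope case, the code it would need to write is tiny. A minimal sketch, assuming approximate textbook half-lives for five isotopes I picked myself (they are not the list I gave Copilot), and a hypothetical `sort_by_half_life` helper:

```python
SECONDS_PER_DAY = 86_400.0
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY
UNIT_SECONDS = {"d": SECONDS_PER_DAY, "y": SECONDS_PER_YEAR}

# (name, half-life value, unit). Approximate textbook values, chosen only
# for illustration.
HALF_LIVES = [
    ("uranium-238", 4.468e9, "y"),
    ("carbon-14", 5730.0, "y"),
    ("iodine-131", 8.02, "d"),
    ("cobalt-60", 5.27, "y"),
    ("tritium", 12.32, "y"),
]


def half_life_seconds(entry):
    """Convert an entry to seconds so mixed units compare correctly."""
    _name, value, unit = entry
    return value * UNIT_SECONDS[unit]


def sort_by_half_life(entries, longest_first=False):
    """Return isotope names ordered by half-life."""
    ordered = sorted(entries, key=half_life_seconds, reverse=longest_first)
    return [name for name, _value, _unit in ordered]


if __name__ == "__main__":
    print(sort_by_half_life(HALF_LIVES))
```

The only part that needs any thought is normalizing the units before comparing; the ordering itself is left to `sorted`, which is the whole point.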
There's the deeper issue that it doesn't know what it doesn't know. It can't sort a list of radioactive isotopes any more than it can help you make an atom bomb. In the second case it will say that it won't help you; in the first case it will try to help you anyway when it really should be saying "I can't do that, Dave" because it just can't.
> lists of radioactive isotopes in order of half lives
ChatGPT-4o does just fine with that. Basing your opinion of a whole technology on a poor implementation instead of the best one doesn't seem like sound analysis.
You'd think with all those billions spent on the software and the hardware it would be a walk in the park to convert a single book on algebraic topology into a formalized Coq, Lean, or Isabelle module. Seems like a very obvious test case for the intelligence capabilities of these systems. I know that it is possible because Kevin Buzzard is going to formalize Fermat's last theorem for less than £934,043 but no commercial AI lab has yet managed to build an AI that can do basic arithmetic. [0] Mira Murati is on the record about their next AI model and that it will have the intelligence of a PhD student so let's see if their next model can actually formalize basic algebraic topology into a logical calculus. [1]
Why would you think that's a walk in the park? Have you actually tried formalising stuff in Lean/Coq? I have, and even with a postgraduate maths degree behind me it's hard as hell!
The fact that Kevin and his team are formalising FLT is incredible, but they all have decades of experience with this stuff (!!).
Transformers can do arithmetic (and many other things) just fine. Do a bit of searching on arXiv and you'll find papers from 2023 showing that nano-scale transformer models suffice. It really is a data problem, not a fundamental limitation of the technology.
Perfection is not the problem. An obvious test case of intelligence is to formally model something like algebraic topology in a formal logical calculus like intensional type theory with identity types. Even though all the commercial labs have ingested all of nLab, there isn't a single commercial model that can use logic to perform arithmetic operations.
I asked it to compute the simplicial homology of RP^2 and not only was it spot on with the result, it gave me a detailed and essentially correct computation. This definitely appears in its training set, but nevertheless you should have some humility =P
How do you know it's correct? The only simplicial triangulation I know of is by splitting up the sphere into an icosahedron and then identifying all the opposite faces to get the proper antipodal action for the quotient.
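One way to check a claimed answer mechanically, sketched under some assumptions: the ten triangles below are one labelling of the 6-vertex triangulation you get from that antipodal quotient (6 vertices, 15 edges, 10 faces), and the labelling and all function names are mine. The code computes Betti numbers over Z/2 and Z/3 only. A full integer computation would need Smith normal form, so this is a consistency check rather than a proof.

```python
from itertools import combinations

# Assumed labelling of the 6-vertex triangulation of RP^2 (icosahedron with
# antipodal points identified). Every edge lies in exactly two triangles.
TRIANGLES = [
    (1, 2, 4), (1, 2, 6), (1, 3, 4), (1, 3, 5), (1, 5, 6),
    (2, 3, 5), (2, 3, 6), (2, 4, 5), (3, 4, 6), (4, 5, 6),
]


def faces(simplex):
    """Codimension-1 faces, in the order used for the alternating sign."""
    return [simplex[:i] + simplex[i + 1:] for i in range(len(simplex))]


def boundary_matrix(simplices, lower):
    """Boundary matrix with rows indexed by `lower` and columns by `simplices`."""
    index = {s: i for i, s in enumerate(lower)}
    rows = [[0] * len(simplices) for _ in lower]
    for j, s in enumerate(simplices):
        for i, f in enumerate(faces(s)):
            rows[index[f]][j] = (-1) ** i
    return rows


def rank_mod_p(matrix, p):
    """Rank over the field Z/p by Gaussian elimination (p must be prime)."""
    rows = [[x % p for x in r] for r in matrix]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse, Python 3.8+
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p
                           for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def betti_mod_p(p):
    """Dimensions of H_0, H_1, H_2 with coefficients in Z/p."""
    vertices = sorted({(v,) for t in TRIANGLES for v in t})
    edges = sorted({e for t in TRIANGLES for e in combinations(t, 2)})
    d1 = rank_mod_p(boundary_matrix(edges, vertices), p)
    d2 = rank_mod_p(boundary_matrix(TRIANGLES, edges), p)
    return (len(vertices) - d1, len(edges) - d1 - d2, len(TRIANGLES) - d2)


if __name__ == "__main__":
    print("mod 2:", betti_mod_p(2))
    print("mod 3:", betti_mod_p(3))
```

An extra class in degrees 1 and 2 over Z/2 that vanishes over Z/3 is what you would expect from H_1 = Z/2 and H_2 = 0 over the integers, which is the standard result for RP^2.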
I'm not interested in engaging with you further on this topic after you devolved into ad hominems against me in the other thread. I'm here to argue in good faith. Have a good day.
You made an incorrect assessment of a basic calculation in algebraic topology and claimed that it was correct. You didn't even look at what it was computing and simply looked at the final answer, which lined up with the answer on Wikipedia. Simplicial calculations for projective planes are not simple. The usual calculations are done with cellular decomposition, and that's why the LLM gives the wrong answer: the actual answer is not in the dataset and requires reasoning.
lmao. you're totally right. RP^2 can be triangulated with a single triangle with all of its vertices identified. that's totally how you compute the simplicial decomposition of RP^2