Hacker News new | ask | show | jobs
by FranchuFranchu 888 days ago
As soon as ChatGPT got released, I tried to make it solve IMO-style problems. It failed.

If this is legit, this is huge. Finding geometric proofs means proving things, and proving propositions is what an intelligence does. In my opinion, this is the closest we have gotten so far to AGI. Really excited to see what the future holds.

3 comments

Proving propositions has been computer work for several decades.

This is not a pure LLM. Search algorithms are obviously incredibly powerful at solving search problems. The LLM contribution is an "intuitive" way to optimize the search.

This is basically semantic pruning.
From the paper:

> When producing full natural-language proofs on IMO-AG-30, however, GPT-4 has a success rate of 0% ...

As soon as ChatGPT got released, I tried to make it solve IMO-style problems. It failed.

Have you tried the same questions with ChatGPT 4? It is a transformational change (no pun intended) over the earlier releases, and over all open-source models.

Just today, I needed to interpret some awkwardly-timestamped log data. I asked it a few questions along the lines of "What time it was 10,000 seconds before xx:yy:zz?" It didn't give me the answer, but it wrote and executed a Python program that did.

From the paper:

> When producing full natural-language proofs on IMO-AG-30, however, GPT-4 has a success rate of 0% ...

"Also, the C++ code this talking dog wrote for me is full of buffer overflow vulnerabilities, and my Commodore 64 still can't run Crysis. There's nothing but hype here, move along."

News flash: either AI gains the ability to postulate theorems, generate proofs, and validate them, or mathematics is as dead as Latin. Humans have reached their limit.