by zardo 22 days ago
This is something that could be demonstrated rather than just argued.

Train an LLM only on texts dated prior to Newton and see if it can create calculus, derive the equations of motion, etc.

If you ask it about the nature of light and it directs you to do experiments with a prism I'd say we're really getting somewhere.

1 comments

We tried this experiment with humans, back in the 17th century, and only a few[1] out of millions managed it given a whole human lifetime each.

[1] Obviously Newton counts as one. Leibniz, like Newton, figured out calculus. Other people did important work in dynamics, though no one else's was as impressive as Newton's. But the vast majority of human-level intelligences trained on texts prior to Newton did not create calculus or derive the equations of motion, or come close to doing either of those things.

Newton did it at 23, and there would have been very few people with mathematical training. The LLM would be trained on the entirety of recorded human knowledge and mathematics up to that point, and would get to use a lot more energy, so it would still have a massive material advantage over young Isaac. Yet I don't believe calculus would magically appear in its response.
A good way to look at it is to compare it to today: LLMs are already trained on, and are operationalizing, a lot more mathematical knowledge than any human, including experts.

Why are they not coming up with paradigm shifts in knowledge expression/discovery like humans did back then?

Are we just not prompting them right?

LLMs have been trained on a lot more data than any single human (text-wise at least) for years now, and these sorts of results have only been possible for the latest crop of models in the past few months. Models get better as they get better.
The argument is whether models of today, suitably trained on pre-17th century data (if a comparable quantity were available), would be able to "invent" calculus et cetera.

If we believe today's models are sufficiently capable to have been able to do so, why are we not getting these types of results today, when they are trained on the entirety of the world's knowledge, and especially its math?

Are research mathematicians simply not prompting LLMs in the right way?