Hacker News
by turzmo 12 days ago
Much of math (or science) research has the strange quality of being mostly curiosity-driven, but having giant benefits that occasionally spin out to the public.

Some questions are more urgent and practical. My feeling is that the more directly practical a question is, the more likely the research community is to support AI usage in that question.

The annoying thing about recent AI advances is that they target questions on the wrong end of the spectrum: Erdos problems are exactly the sort of "useless" questions that people might answer purely for the love of the game. The sort of questions that a young person might cut their teeth on and gain confidence.

Solving questions like these automatically, I think, is not good for the long-term health of research. At least for the foreseeable future you still would like people to become interested and develop skills in these fields. These developments, and especially how they are presented, directly discourage that.

8 comments

To me, the most interesting feature of the OpenAI solution of the Unit Distance (Erdős) Problem is that the solution - using deep algebraic number theory as a source of extremal combinatorial/geometric constructions - is much more interesting than the problem’s elementary statement might lead one to expect.

Writing off Erdős’s problems as random, useless, or meaningless dismisses his mathematical intuition, which was second to none, and strikes me as somewhat uncharitable.

Finally, I agree that AI threatens mathematical training by rendering an entire class of acolyte-level research problems solvable by prompt. But the Unit Distance Problem is not of this class.

> much more interesting than the problem’s elementary statement might lead one to expect

This is reinforced by the immediate (human) use of the idea to resolve in the negative another significant problem, the sum-product conjecture on reals.

Explanation of what was involved: https://www.erdosproblems.com/forum/thread/blog:6

I don't think Erdos problems are useless myself, I put "useless" in quotes to emphasize that they are the sort of research that doesn't have an immediate application, and so their automated resolution should be weighed against the sociological cost.

As opposed to, say, drug discovery.

I am not a mathematician and did not read the unit distance solution too carefully, but my impression was that it used a variation of a known technique to solve the problem. And that makes perfect sense to me: there are a lot of techniques and a lot of less relevant problems, so I am not surprised that one can solve some of them with known techniques that just nobody has tried [hard enough] before. I am much more sceptical when it comes to the important unsolved problems where every known technique has probably been tried several times over. In those instances it will probably take a true leap in understanding to solve them, and I am sceptical that large language models are well suited for that because of the way they work.
We're very fortunate to have had some very eminent mathematicians backfill the OpenAI proof with history, context, and a literature review [1]. Ideas behind the proof seem to have been "in the air". Indeed, looked at from a certain point of view, the OpenAI construction can be viewed as a high-dimensional generalization of a known low-dimensional one. In this vein see the remarks of Gowers, Sawin and Tsimerman in [1]. Are LLMs capable of "true leap[s] in understanding"? I have absolutely no idea. But LLMs keep surprising me.

[1] https://arxiv.org/html/2605.20695v1

>> At least for the foreseeable future you still would like people to become interested and develop skills in these fields. These developments, and especially how they are presented, directly discourage that.

This assumption may well turn out to be correct, but it is not self-evident.

Nearly everyone who has ever got interested in mathematics got discouraged at some point and they left the field. Mathematics is very hard. Those very few that remained certainly have talent, but they also have characteristics that are necessary for success in a competitive field, which are perhaps less valuable per se. Such characteristics as may be over-represented in males for instance. This is not a point about gender differences, but about the intrinsic merit of different success factors.

It seems equally possible that the above assumption will turn out to be diametrically incorrect. People that would have been discouraged before LLMs will now retain their curiosity longer. Democratisation is surely a possible outcome.

Arguably, chess has never been as popular and accessible. And that discipline fell to AI three decades ago.

I've been spending 3 weeks, as a non-mathematician, chasing down a particular, very simply stated but secretly quite complex problem, and AI has been _so incredibly helpful_, not just in making progress on it and doing obvious stuff like formalizing in Lean, doing literature searches, and reading through 10 or 15 papers and summarizing the results for me and how they apply to what I'm doing, but also in giving me enough of an introduction to _entire fields_ that I can talk intelligently about it (I've had email correspondence with a couple of professional mathematicians in a few different fields about it, who agreed that it's an interesting, simple, but difficult problem). I've gone from "this should be easy", to "okay, I've almost got a proof", to "this is impossible", to literally just nailing down a few remaining sub-cases out of an infinite family.

I don't want to call anyone out, but I emailed one fairly famous mathematician, and he literally said: "This is very interesting, I thought about it for a while, couldn't figure it out, but I thought ChatGPT had an interesting response..." and he linked me to his ChatGPT transcript... (which was actually helpful, because he asked it a better question than I was asking).

I have a suspicion that math will quite soon be exactly like programming and fall to the same machinery that coding has.

One thing that I noticed is that a common workflow I had was isolating hard subquestions in a self-contained way and then "surveying" multiple different LLMs in a totally clean context. They would often say: "Oh, this is an obvious example of such-and-such" and immediately clear the barrier.

I'd be very cautious about "AI psychosis" here, or at the very least becoming a "crank". I've read too many stories of people convincing themselves they're on the verge of some great discovery to not hear "3 weeks to become conversational in mathematical fields" and not see all kinds of red flags.

I studied math at MIT and have several friends who are professors now and they deal with cranks all the time and since they're very kind and conflict averse people they tend to respond with perfunctory emails when they get inbounds like that.

So just be wary. Your external validation may not be as strong as you think it is, though kudos to you for at least trying to step out of the AI vortex to attempt to ground yourself.

> I'd be very cautious about "AI psychosis" here, or at the very least becoming a "crank". I've read too many stories of people convincing themselves they're on the verge of some great discovery to not hear "3 weeks to become conversational in mathematical fields" and not see all kinds of red flags.

---

It's not a great discovery, it's a pretty minor question that I thought would be easy and it's not -- I've just been poking at it off and on for weeks, and I'm relying on Lean to verify everything. It's actually a quite specific CS-adjacent problem that I came up with while trying to write code, that is just hard to solve, and that nobody in the literature that I could find has looked at directly. The end result of it will have exactly zero consequences other than proving an interesting lower bound for a question that, as far as I can tell, nobody has bothered even looking at but me. The reason it touches on multiple fields is that it's sort of both an algebra problem and a CS problem, so I keep having to flip between them to understand what I'm looking at, and there are a lot of sub-fields that span both that have different tools, and it took me a while to find the right one.

Having been in academia for a bit, I find it somewhat hard to believe that multiple professional mathematicians in different fields gave meaningful replies to a random email solicitation from an internet stranger within three weeks, simply because those people's inboxes are absolutely bombarded every minute.

In reality people would be thrilled to get such a response even with a finished preprint on arXiv. Anyway, if you really hit the jackpot, I hope working out the details goes smoothly and you get it published!

Well, I was emailing specific people who were working on very closely related things, and had recently published papers about it and I had very small, concrete questions about their results and not much about my question, except for context.
Do you not think that solutions to Erdos problems might end up as stepping stones to other important problems?

Either by introducing new tools, or by proving things that were previously unproven that end up helping in unexpected ways?

That's often how math goes, isn't it?

This is, indeed, how math often goes.
Sounds like yet another example of how AI is kneecapping industries from the bottom by "removing the barrier to entry" but really just removing the training path by doing the work itself with no guidance for juniors.
Yep, and if history is any guide the only way to play it is to take part and get rich while you can, or play the super long game and be positioned for the collapse.

Businesses will not adapt until they are incentivized to do so, and very few businesses have a multi-decade outlook. Even before AI, the senior 10x employee who retired and took all his domain knowledge with him because there was never any funding to train his replacement was a problem.

We are on tiny 1-5T parameter models with local power stations.

We can reach Q models just by throwing resources at it. That’s a million times current B models.

Is this a known or quantifiable thing? I thought that the limit had already been determined i.e. the existing models top out and at some point it doesn't matter how much time or energy you let the model consume, it won't improve the result. And with regards to training parameters, I thought we were equally limited there, e.g. the existing models can't benefit from a larger parameter space.

I was under the impression that improvements are arriving via how the models are trained and how model prompting context is constructed, rather than just by how much data or how much energy is spent searching over the model space for a particular prompt.

Is there some evidence that we have not reached a plateau with just resource consumption on existing models?

The existing models "top out" not because they don't get better, but because it is uneconomical.

What we do know is that a model "tops out" wrt training data - that is, for a model of a given size, there's only so much training data you can usefully feed it before you stop seeing gains. But conversely it means that if you already have a model of say 1T parameters that is "trained to capacity", then a model of 2T parameters needs roughly twice as much training data to fully utilize all those weights. Which means that the cost of training it is not 2x but 4x (twice as many params x twice as many tokens). And then of course serving it is 2x more expensive, but even with optimal training the gains aren't 2x. So it very quickly becomes uneconomical.
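To make the 2x-params, 2x-tokens arithmetic above concrete, here is a rough sketch. It assumes the common rule of thumb that training compute is about 6 x parameters x tokens; that formula, the 20-tokens-per-parameter baseline, and the names `training_flops`, `base_params`, etc. are illustrative assumptions, not something stated in the comment. Only the ratios matter, and they don't depend on the baseline chosen.

```python
# Rough sketch: training compute ~ 6 * params * tokens.
# This rule of thumb is an assumption brought in for illustration.

def training_flops(params: float, tokens: float) -> float:
    """Approximate training FLOPs for a dense model."""
    return 6.0 * params * tokens

# Hypothetical baseline: 1T parameters "trained to capacity".
base_params = 1e12
base_tokens = 20 * base_params  # assumed tokens-per-parameter ratio, illustrative only

# Double the parameters, and double the data so the extra weights are fully used.
big_params = 2 * base_params
big_tokens = 2 * base_tokens

train_ratio = training_flops(big_params, big_tokens) / training_flops(base_params, base_tokens)
serve_ratio = big_params / base_params  # per-token serving cost scales roughly with params

print(f"training cost ratio: {train_ratio:.1f}x")
print(f"serving cost ratio:  {serve_ratio:.1f}x")
```

Under these assumptions training cost grows 4x while serving cost grows 2x, which matches the argument that scaling up becomes uneconomical unless quality improves at least as fast.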

A good example of that kind of model is (was) GPT-4.5. The prices and the consequent lack of demand show why companies don't really do that sort of thing anymore.

But no, there's no evidence of a plateau as such. I'm not sure what "evidence that we have not reached a plateau" would even look like.

what is a B model vs. a Q model? what do these letters mean?
B Billion parameter, T trillion, Q Quadrillion.
You cannot think fast enough when your wires are kilometers long. The only way up is in, and silicon transistors just cannot compete on density with biological brains; ergo, superintelligence is a pipe dream.
Baseless assertions. Fab tech continues to improve. There's no reason ML model internals have to be strictly serial - in fact we're already seeing some shifts away from that.
It seems to me that when you have a tool that automates part of the work, it doesn't make the curiosity go away, it changes the landscape of what problems humans find interesting. Maybe Erdos problems are no longer a good entry-level benchmark for a researcher, but that's going to drive young researchers to explore other areas that might have been out of reach before AI-human collaboration.
I think nuance gets lost in these conversations.

Your distinction between the practical and the theoretical is important. Practicality is important - everything we do is a matter of practicality of means or method, even how we pursue theoretical ends - but two points.

First, there is more to life than the practical. Some truths are known for their own sake, even if they also tell us about still more profound truths (also known for their own sake) or may have incidental practical relevance and consequences in some other context.

Second, while the theoretical terminus is the truth for its own sake, the practical terminus is always something other than itself. Well, what is that "something else"? You can't have an infinite regress of practicality. The meaning of a proximate, practical end is always other than itself. The practical requires an end beyond itself to justify it.

I agree that most people don't seem to inquire much about such ultimate ends. Their thoughts are confined to the proximate. Of course, how have they determined what the proximate should be? Something for people to contemplate.

Where science is concerned, it depends. On the one hand, there are fields that are certainly more theoretically oriented. It's not "the game" that motivates theory - that would make it mere recreation, with the truth taking a backseat - but the truth. (For this reason, I hesitate to call Erdos theoretically motivated. AFAICT, he was motivated by the challenge of problem solving and not the truth, insight, and understanding to be gained which would have been merely incidental and instrumental for him.)

However, I would also say a good chunk of science is driven by a background motivation of technology production and the mastery of nature. Think Francis Bacon, who viewed science as an instrument of power and showed a preference for the "how" over the "what" (τόδε τι) or the "why" (τὸ διότι). This set the tone for a great deal of modern science. A great deal of it does less explaining and more predictive modeling, because predictive modeling can be sufficient for control. Indeed, a truly theoretical causal account and understanding of a thing's nature can be less useful as a practical instrument than a merely predictive model.

Now, AI models are practical tools. I think they can be enormously useful as research aids, even in theoretical contexts, provided that one

1. understands their nature;

2. understands the purpose of the theoretical activity undertaken.

What is their nature? Well, they're statistical models that can unearth interesting and useful correlations and patterns. But they are not reasoning and knowing things. Their results are generated mechanically and mindlessly. Knowing this means taking their results with a healthy skepticism and a critical eye.

What about the purpose of theory? By analogy, think of a student in school who uses AI to complete all his assignments. Has he satisfied the purpose of those assignments? No, because the purpose of the assignments isn't to produce the effect - the solutions - per se, but to learn something. Theoretical work is like that; its purpose is to understand and to grasp some truth. An AI can be used to assist this process, just as a calculator or a search engine can, but if you use it in a manner that circumvents that purpose instead of supporting it, then you're not achieving that purpose and are wasting your time. What's the point?

I'm not sure why you're being downvoted, but I agree with all of your points. Those aren't things that lend themselves well to mathematical modelling. But... there is a marginal field of math that does apply to this: statistics.

The first two cases are somewhat special:

- It may be plainly obvious that an API is terrible, and that the replacement is not. If API 1 takes 1 sec to call, and API 2 takes 100ms to call, that's a straightforward choice without stats.

- Provisioning can be dangerous. While not really a stats problem, you do need to have a quite elegant model of what is getting refactored, and how to know when to invalidate those cache entries.

For the rest of the examples you provided, you're making changes that may make the problem better, may have no effect, or may make the problem worse. You need to use statistics to determine whether or not changes like those are actually having an effect. Performance analysis is part math and part art, and without the math background, you're likely going to be spinning your wheels a bunch. Beyond stats, fields like queuing theory are going to make a massive difference when you're doing performance work in distributed systems.
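As one possible way to check whether a change "is having an effect", here is a minimal sketch of a two-sided permutation test on mean latency. The comment does not name a specific test, so this choice is an assumption, and the function name `permutation_test` and the latency numbers are made up for illustration.

```python
import random
from statistics import fmean

def permutation_test(before, after, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in mean latency.

    Returns (observed_diff, p_value), where
    observed_diff = mean(after) - mean(before).
    """
    rng = random.Random(seed)
    observed = fmean(after) - fmean(before)
    pooled = list(before) + list(after)
    n_before = len(before)
    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the before/after labels are interchangeable,
        # so shuffle the pooled data and re-split it at random.
        rng.shuffle(pooled)
        diff = fmean(pooled[n_before:]) - fmean(pooled[:n_before])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value

# Hypothetical latency samples in milliseconds (invented for this example).
before_ms = [102, 98, 110, 105, 99, 101, 107, 103, 100, 104]
after_ms = [97, 95, 101, 99, 94, 98, 100, 96, 93, 99]

diff, p = permutation_test(before_ms, after_ms)
print(f"mean change: {diff:.1f} ms, p = {p:.4f}")
```

A small p-value suggests the change in mean latency is unlikely to be noise; a large one means the measurements can't distinguish "better" from "no effect". Real benchmarks usually need more samples and some care about warm-up and outliers.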
That's an interesting perspective and I wholly disagree with the conclusion

You are saying that tough problems with no applicability are useful because people that you happen to respect got good by their curiosity and pursuit of trying to solve these kinds of problems and failing, but branching off into other cognitive areas as mathematicians

Now if I know anything about math for the sake of math, and academics, these are the same people that lament the idea of intelligent people going to the finance sector or any other trade they just happen not to respect as much

The similarity is that their exact criticism, aimed at something they don't respect and view as having little utility, is the same reasoning presented here now that AI can solve their pointless problems

What I'm seeing is that human mathematicians have a laundry list of problems they have failed to solve for decades, centuries, which is what they are funded and employed to do. "Computer" used to be a human job title too.

This leads me to being excited about AI one-shotting these problems; let's move on to something else.

> Now if I know anything about math for the sake of math, and academics, these are the same people that lament the idea of intelligent people going to the finance sector or any other trade they just happen not to respect as much

IME a vastly more common sentiment among mathematicians regarding mathematical talent leaving the nest to apply their skills in other fields is that those other fields are lucky to get them!

I think you've slightly strawmanned the lamentation there. Not that I agree with the lamentation, but using your talent to make the rich richer (which is what quants do, they are paid a fixed amount to provide a larger value up the chain), as opposed to advancing human knowledge, is the reason for the lament, not some sort of respectability issue.
Quants benefit from substantial bonus structure as part of their compensation.
Well exactly.

One way to make money is to create a great product that solves a problem people have and market it effectively. That's a sort of idealised situation, an aspiration, but it's what I would call socially valuable. Another way is to help people who are already rich move their money around so that they become richer, in return for a fraction of the increase in wealth. I personally have no problem with that, everybody has to make a living, but that is not socially valuable. It is debatable how socially valuable pure mathematics research is. But you take a fixed fee and all your best work is public domain.

> but that is not socially valuable.

Liquidity is extremely socially valuable from my perspective

I view the concept of "the market" as a construction project that isn't finished. It's not finished until every device around you can be traded instantly, with fractional shares even, with high liquidity and a capability for to-the-second price discovery, and there is a liquid options market on top of that.

The price of anything is not really resolved, its ability to be collateral and access a more liquid form of exchange at any time is not resolved.

Time is valuable, all of this reduces the time. There are still people waiting 90 days to access cash tied up in their home's equity, when another part of the market has split second collateralized lending. All of the market for anything should be that way.

The ability to exchange is valuable and wealthy people have liquidity issues. Poor people have liquidity issues. Everyone has a liquidity issue and doesn't know it.

Anything that slows things down slows down the whole construction project. A market with five 8-hour trading sessions a week with settlement the next day moves far slower than a market with three times as many trading sessions during the same time frame, where the trade is the settlement. The opportunities become endless for people aiming to accumulate more, and the liquidity available to traders to do actual business and negotiation, acquire goods and services, and raise capital becomes vastly greater and far faster. Proving a new venture all the way to an exit becomes far faster, and results in wealth distribution to the employees, vendors, and everyone else that is far faster and far greater.

That's what I see and look forward to. That has extremely high social value. Promoting liquidity and promoting velocity of transactions helps solve the actual reservations people have about the market at all. More paths for people on the poorer side of the bell curve to afford things.