by robertk 839 days ago

Very cool result, but the title oversells the "AI" contribution. It seems they trained a few standard binary classifiers (Naive Bayes, decision trees, kNN). The novelty is in the variables. The independent variable is an attribute precomputed for many known elliptic curves in the LMFDB database, namely the Dirichlet coefficients of the associated L-function. The dependent variable is whether or not the elliptic curve has complex multiplication (CM), an important theoretical property: lots of flashy theorems begin by assuming that the curve does or does not have CM. They go on to train another binary classifier (and a separate size k classifier) to determine a curve's Sato-Tate identity component using the Euler coefficients and group-theoretic information about the Sato-Tate group (constructed by randomly sampling elements and representing the two non-trivial coefficients of their characteristic polynomials as independent variables in the classifier). They also run a PCA: https://arxiv.org/pdf/2010.01213.pdf
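
To make the CM signal concrete, here is a minimal sketch in plain Python. It is not the paper's pipeline: instead of a trained classifier on LMFDB data, it brute-forces a_p for a curve y^2 = x^3 + ax + b and uses one hand-picked feature, the fraction of good primes with a_p = 0. The function names, the prime bound of 200, and the two example curves (y^2 = x^3 - x, which has CM by Z[i], and y^2 = x^3 + x + 1, which does not have CM) are choices made for illustration only.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def a_p(a, b, p):
    """a_p = p + 1 - #E(F_p) for y^2 = x^3 + a*x + b, with p an odd prime of good reduction.

    Uses a_p = -sum over x of the Legendre symbol of (x^3 + a*x + b) mod p.
    """
    total = 0
    for x in range(p):
        v = (x * x * x + a * x + b) % p
        if v:
            # Euler's criterion: v^((p-1)/2) is 1 for squares, p-1 for non-squares.
            total += 1 if pow(v, (p - 1) // 2, p) == 1 else -1
    return -total


def good_odd_primes(a, b, bound):
    """Odd primes <= bound that do not divide the discriminant of the model."""
    disc = -16 * (4 * a ** 3 + 27 * b ** 2)
    return [p for p in primes_up_to(bound) if p > 2 and disc % p != 0]


def zero_fraction(a, b, bound=200):
    """Hand-picked feature (an assumption of this sketch): share of good primes with a_p = 0."""
    ps = good_odd_primes(a, b, bound)
    return sum(1 for p in ps if a_p(a, b, p) == 0) / len(ps)


cm_score = zero_fraction(-1, 0)      # y^2 = x^3 - x, CM by Z[i]
non_cm_score = zero_fraction(1, 1)   # y^2 = x^3 + x + 1, no CM
print(f"CM curve: {cm_score:.2f}, non-CM curve: {non_cm_score:.2f}")
```

For the CM curve, a_p vanishes at every prime p ≡ 3 (mod 4), so roughly half of the coefficients are zero, while a non-CM curve has a_p = 0 only rarely. A gap that large is the sort of thing even a simple classifier can pick up.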

The cool part is that they then stepped back and scratched their heads, wondering why the classifier was so good at separating these dependent variables in the first place, and plotting the points showed them to be (non-linearly) separable due to a visually clear pattern! The punchline, and the reason it's so important to understand these data points (the Euler coefficients of elliptic curves), is that they contain all the relevant number-theoretic information about the curve. With some major handwaving, understanding them perfectly would lead to things like the Langlands program (and some analogues of the Riemann hypothesis) getting resolved. These wide-reaching conjectures are ultimately structural assertions about L-functions, and L-functions are uniquely specified by their Euler coefficients (the a_p terms in their Euler factors). Will murmurations help with that? Who knows, but the more patterns the better for forming precise conjectures.
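
For readers who have not seen it written out, the standard Euler product for the L-function of an elliptic curve E over Q shows where the a_p sit. The notation here (N for the conductor, s for the complex variable) is the usual textbook convention, not something taken from the comment above.

```latex
L(E, s) \;=\; \prod_{p \nmid N} \left(1 - a_p\, p^{-s} + p^{1-2s}\right)^{-1}
        \prod_{p \mid N} \left(1 - a_p\, p^{-s}\right)^{-1},
\qquad
a_p = p + 1 - \#E(\mathbb{F}_p) \quad \text{for } p \nmid N .
```

At the finitely many primes dividing N, a_p is 0 or ±1 depending on the type of bad reduction, so the whole function is determined by the sequence of a_p.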

Relevant intersectional credentials: I have led ML engineering teams in industry and also did my doctoral work in this area of math, including using the LMFDB database referenced in the article for my research (it was much smaller back then and has grown a lot, so it's very neat to see it's still a force for empirical findings!).

4 comments

frakt0x90 839 days ago

This is something I've been thinking about a lot lately. Especially in combinatorics and number theory, there are databases like OEIS, LMFDB, etc. that contain tons of data, with the ability to generate more algorithmically (sometimes easier said than done). Using ML to get heuristics and really good guesses about where the next opportunities lie, and then formalizing them once you have a good guess, would be SO cool.

Is there a name for that? Or groups working on that stuff that I could follow?

For my own little pet project, I scraped OEIS and built a graph of sequences in which two were connected if one mentioned the other in its related sequences section. You got these huge clusters around prime powers and other important sequences. Then I thought maybe you could use a GNN to do link prediction, giving an estimate of a relationship that should exist but hasn't been discovered yet.
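
As a rough idea of what link prediction on such a graph looks like, here is a small sketch in plain Python. It swaps the GNN for a much simpler baseline, Jaccard similarity of neighbor sets, and it runs on a made-up toy graph: the node labels S1 to S5 and their edges are hypothetical placeholders, not real OEIS cross-references.

```python
from itertools import combinations


def build_graph(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def jaccard(graph, u, v):
    """Shared neighbors divided by all neighbors of either node."""
    union = graph[u] | graph[v]
    return len(graph[u] & graph[v]) / len(union) if union else 0.0


def predict_links(graph, top=3):
    """Rank currently unconnected pairs by Jaccard score, highest first."""
    candidates = [
        (jaccard(graph, u, v), u, v)
        for u, v in combinations(sorted(graph), 2)
        if v not in graph[u]
    ]
    candidates.sort(key=lambda t: (-t[0], t[1], t[2]))
    return candidates[:top]


# Hypothetical toy data standing in for "sequence X lists sequence Y as related".
toy_edges = [
    ("S1", "S2"), ("S1", "S3"), ("S2", "S3"),
    ("S2", "S4"), ("S3", "S4"), ("S4", "S5"),
]
graph = build_graph(toy_edges)
for score, u, v in predict_links(graph):
    print(f"{u} -- {v}: {score:.2f}")
```

On the toy graph, the top suggestion is the pair S1 and S4, which share two neighbors but are not directly linked. A GNN would replace the hand-written score with a learned one, but a baseline like this is a cheap way to check whether the graph carries any signal at all.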