| HN Mirror

[0]: Léo Grinsztajn et al. "Why do tree-based models still outperform deep learning on tabular data?" https://arxiv.org/abs/2207.08815 [1]: Jordan Tigani "Big Data is Dead" https://motherduck.com/blog/big-data-is-dead/ [2]: Zongyi Li et al. "Fourier Neural Operator for Parametric Partial Differential Equations" https://arxiv.org/abs/2010.08895 [3]: Jaideep Pathak et al. "FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators" https://arxiv.org/abs/2202.11214 [4]: Huaiqian You et al. "Learning Deep Implicit Fourier Neural Operators with Applications to Heterogeneous Material Modeling" https://arxiv.org/abs/2203.08205 [5]: Arthur Jacot et al. "Neural Tangent Kernel: Convergence and Generalization in Neural Networks" https://arxiv.org/abs/1806.07572 [6]: Pedro Domingos "Every Model Learned by Gradient Descent Is Approximately a Kernel Machine" https://arxiv.org/abs/2012.00152 [7]: Alexander Atanasov et al. "Neural Networks as Kernel Learners: The Silent Alignment Effect" https://arxiv.org/abs/2111.00034 [8]: Yilan Chen et al. "On the Equivalence between Neural Network and Support Vector Machine" https://arxiv.org/abs/2111.06063 [9]: Adityanarayanan Radhakrishnan et al. "Mechanism of feature learning in deep fully connected networks and kernel machines that recursively learn features" https://arxiv.org/abs/2212.13881 [10]: Mukund Sundararajan et al. "Axiomatic Attribution for Deep Networks" https://arxiv.org/abs/1703.01365 [11]: Pramod Mudrakarta "Did the model understand the questions?" https://arxiv.org/abs/1805.05492 [12]: META FAIR Diplomacy Team et al. "Human-level play in the game of Diplomacy by combining language models with strategic reasoning" https://www.science.org/doi/10.1126/science.ade9097 [13]: DeepMind et al. "Mastering the game of Go with deep neural networks and tree search" https://www.nature.com/articles/nature16961 [14]: DeepMind et al. "Mastering the game of Go without human knowledge" https://www.nature.com/articles/nature24270 [15]: Sebastian Borgeaud et al. "Improving language models by retrieving from trillions of tokens" https://arxiv.org/abs/2112.04426 [16]: Umang Bhatt et al. "Explainable Machine Learning in Deployment" https://dl.acm.org/doi/10.1145/3351095.3375624 [17]: M. F. Kasim et al. "Building high accuracy emulators for scientific simulations with deep neural architecture search" https://arxiv.org/abs/2001.08055

I think the situation with regulations will be similar to that with interpretability and explanations. There's a popular phrase that gets thrown around, that "there is no silver bullet" (perhaps most poignantly in AIX360's initial paper [0]), as no single explanation suffices (otherwise, would we not simply use that instead?) and no single static selection of them would either. What we need is to have flexible, adaptable approaches that can interactively meet the moment, likely backed by a large selection of well understood, diverse, and disparate approaches that cover for one other in a totality. It needs to interactively adapt, as the issue with the "dashboards" people have put forward to provide such coverage is that there are simply too many options and typical humans cannot process it all in parallel.

So, it's an interesting unsolved area for how to put forward approaches that aren't quite one-size fits all, since that doesn't work, but also makes tailoring it to the domain and moment tractable (otherwise we lose what ground we gain and people don't use it again!)... which is precisely the issue that regulation will have to tackle too! Having spoken with some people involved with the AI HLEG [1] that contributed towards the AI Act currently processing through the EU, there's going to have to be some specific tailoring within regulations that fit the domain, so classically the higher-stakes and time-sensitive domains (like, say, healthcare) will need more stringent requirements to ensure compliance means it delivers as intended/promised, but that it's not simply going to be a sliding scale from there, and too much complexity may prevent the very flexibility we actually desire; it's harder to standardise something fully general purpose than something fitted to a specific problem.

But perhaps that's where things go hand in hand. An issue currently is the lack of standardisation, in general, it's unreasonable to expect people re-implement these things on their own given the mathematical nuance, yet many of my colleagues agree it's usually the most reliable way. Things like scikit had an opportunity, sitting as a de facto interface for the basics, but niche competitors then grew and grew, many of which simply ignored it. Especially with things like [0], there are a bunch of wholly different "frameworks" that cannot intercommunicate except by someone knuckling down and fudging some dataframes or ndarrays, and that's just within Python, let alone those in R (and there are many) or C++ (fewer, but notable). I'm simplifying somewhat, but it means that plenty of isolated approaches simply can't worth together, meaning model developers may not have much chance but to use whatever batteries are available! Unlike, say, Matplotlib, I don't see much chance for declarative/semi-declarative layers to take over here, such as pyplot and seaborn could, which enabled people to empower everything backed by Matplotlib "for free" with downstream benefits such as enabling intervals or live interaction with a lower-level plugin or upgrade. After all, scikit was meant to be exactly this for SciPy! Everything else like that is generally focused on either models (e.g. Keras) or explanations/interpretability (e.g. Captum or Alibi).

So it's going to be a real challenge figuring out how to get regulations that aren't so toothless that people don't bother or are easily satisfied by some token measure, but also don't leave us open to other layers of issues, such as adversarial attacks on explanations or developer malfeasance. Naturally, we don't want something easily gamed that the ones causing the most trouble and harm can just bypass! So I think there's going to have to be a bit of give and take on this one, the regulators must step up while industry must step down, since there's been far too much "oh, you simply must regulate us, here, we'll help draft it" going around lately for my liking. There will be a time for industry to come back to the fore, when we actually need to figure out how to build something that satisfies, and ideally, it's something we could engage in mutually, prototyping and developing both the regulations and the compliant implementations such that there are no moats, there's a clearly better way to do things that ultimately would probably be more popular anyway even without any of the regulatory overhead; when has a clean break and freshening up of the air not benefited? We've got a lot of cruft in the way that's making everyone's jobs harder, to which we're only adding more and more layers, which is why so many are pursuing clean-ish breaks (bypass, say, PyTorch or Jax, and go straight to new, vectorised, Python-ese dialects). The issue is, of course, the 14 standards problem, and now so many are competing that the number only grows, preventing the very thing all these intended to do: refresh things so we can get back to the actual task! So I think a regulatory push can help with that, and that industry then has the once-in-a-lifetime chance to then ride that through to the actual thing we need to get this stuff out there to millions, if not billions, of people.

A saying keeps coming back to mind for me, all models are wrong, some are useful. (Interpretable) AI, explanations, regulations, they're all models, so of course they won't be perfect... if they were, we wouldn't have this problem to begin with. What it all comes back to is usefulness. Clearly, we find these things useful, or we wouldn't have them, necessity being the mother of invention and all, but then we must actually make sure what we do is useful. Spinning wheels inventing one new framework after the next doesn't seem like that to me. Building tools that people can make their own, but know that no matter what, a hammer is still a hammer, and someone else can still use it? That seems much more meaningful of an investment, if we're talking the tooling/framework side of things. Regulation will be much the same, and I do think there are some quite positive directions, and things like [1] seem promising, even if only as a stop-gap measure until we solve the hard problems and have no need for it any more -- though they're not solved yet, so I wouldn't hold out for such a thing either. Regulations also have the nice benefit that, unlike much of the software we seem to write these days, they're actually vertically and horizontally composable, and different places and domains at different levels have a fascinating interplay and cross-pollination of ideas, sometimes we see nation-states following in the footsteps of municipalities or towns, other times a federal guideline inspires new institutional or industrial policies, and all such combinations. Plus, at the end of the day, it's still about people, so if a regulation needs fixing, well, it's not like you're not trying to change the physics of the universe, are you?

  [0]: Vijay Arya et al. "One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques" https://arxiv.org/abs/1909.03012
  [1]: High-Level Expert Group on AI "Ethics Guidelines for Trustworthy AI"
  Apologies, will have to just cite those, since while there are some papers associated with the others, it's quite late now, so I hope the recognisable names suffices.