by hnuser355 2678 days ago
What the hell is that supposed to mean, and how is it different from automatic differentiation?
1 comments

I think it's just the realization that the execution graphs of machine learning models are, at this point, not really different from the AST of any programming language, which means there is potential in exploring the intersection between writing programs and writing machine learning models with AD tools.
Sorry for the curse words, I'm very confused.
It's ok. At this point the focus seems to be on creating tools that make that exploration easier, including making the entirety of a programming language valid syntax for building any model (with AD support). The efforts I know of are Zygote in Julia and Swift for TensorFlow.
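
To make "the whole language is valid syntax for a model" a bit more concrete, here is a rough sketch in plain Python. It uses forward-mode AD with dual numbers, which is only one way to do AD and not how Zygote or Swift for TensorFlow work internally. The names `Dual`, `power`, `f` and `derivative` are made up for this example. The point is only that an ordinary loop and an ordinary `if` get differentiated without being rewritten as a graph by hand.

```python
from dataclasses import dataclass


@dataclass
class Dual:
    """A value paired with its derivative (hypothetical helper for this sketch)."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def power(x, n):
    """x**n written as a plain Python loop."""
    result = 1.0
    for _ in range(n):
        result = result * x
    return result


def f(x):
    """An ordinary program with a branch: x^3 + 2x for x > 0, else -x."""
    if x.val > 0:
        return power(x, 3) + 2 * x
    return -1 * x


def derivative(fn, x):
    """Seed the input with derivative 1 and read the derivative off the output."""
    return fn(Dual(x, 1.0)).der


slope_right = derivative(f, 2.0)   # branch x > 0: 3x^2 + 2
slope_left = derivative(f, -3.0)   # other branch: -1
```

Real tools do this at the compiler or source-transformation level and usually in reverse mode, but the idea that normal control flow can be differentiated is the same.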

I think the differentiable Forth example in the article is interesting in this context, since it has a differentiable program with gaps and uses the universal approximation property of a neural network to fill them. When your code is differentiable, it's possible to embed ML models in it, perhaps to learn one part of an equation when you already know most of it, and which would otherwise have too large a search space. You might even have, like another commenter said, a compiler smart enough to rewrite the AST to reduce convergence problems (which seem to be the main problem with such models). Or you could download libraries of pretrained models/architectures, the same way you would any regular program library, and embed them in larger systems.
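
A toy version of the "program with a gap" idea, as I understand it, might look like the sketch below. Everything in it is an assumption for illustration: the physics formula, the numbers, and the names `height`, `d_height_dg` and `fit_gap`. A single learnable number stands in for the neural network that the Forth example uses, and the derivative is written by hand where an AD tool would derive it for you.

```python
# Hypothetical example: all names and numbers are illustrative.
V0 = 20.0  # known initial velocity (the part of the equation we already know)


def height(t, g):
    """Known program structure; the constant g is the 'gap' to be learned."""
    return V0 * t - 0.5 * g * t * t


def d_height_dg(t):
    """Derivative of height with respect to g.

    Written by hand here; with a differentiable language the AD tool
    would produce this from `height` automatically.
    """
    return -0.5 * t * t


def fit_gap(times, observed, g=0.0, lr=0.1, steps=200):
    """Gradient descent on mean squared error, updating only the gap g."""
    n = len(times)
    for _ in range(steps):
        grad = sum(
            2.0 * (height(t, g) - y) * d_height_dg(t)
            for t, y in zip(times, observed)
        ) / n
        g -= lr * grad
    return g


# Synthetic observations generated from a "true" value the fit never sees directly.
TRUE_G = 9.81
times = [0.5, 1.0, 1.5, 2.0]
observed = [height(t, TRUE_G) for t in times]

learned_g = fit_gap(times, observed)
```

Because most of the equation is fixed, the search is over one number instead of over every possible function of `t`, which is the "smaller search space" argument in miniature.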

Though I honestly can't tell if any of those are actually valid pursuits or I'm misunderstanding the possibilities.