Hacker News
by InkCanon 509 days ago
The question depends on what you mean by first principles. Usage of the phrase "first principles" has sprawled into many different things since (I think) Musk first mentioned it as a way to learn. In the original, philosophical sense, a first principle is a fundamental truth from which others can be derived. Much of the philosophising of thinkers like Aristotle or Descartes was aimed at uncovering these truths (e.g. "I think, therefore I am"). In physics and other sciences, it means calculating directly from established laws rather than from approximations or assumptions. Then it got borrowed by certain circles of the tech crowd with the vague meaning of thinking about what's important or true and ignoring the rest. Then it trickled down into the learning/self-help world as a learning hack of some sort.

If we take the original meaning of first principles, there aren't a great many absolute truths in machine learning. It is a very empirical, approximate and engineering-oriented endeavor. Most of the research involves thinking of a new approach, building it and trying it on new datasets.

The other big question is why you want to learn it. If you want to learn ML for its own sake, then anything, including the search algorithms you mentioned (which used to be considered core to ML a long time ago), is part of that. But if you want to learn ML to contribute to modern developments like LLMs, then search algorithms are virtually useless. If you aren't going to be engineering any ML or ML products, what you want is to gain some insight into its future and the business of it. So learning things like the transformer architecture is going to be far less helpful than, say, reading about the economics of compute clusters.

Given the empirical/engineering quality of current ML, I'd say building it from scratch is a really good way to get at the handful of possible first principles (the fundamental functions involved, data cleaning, training, etc.).
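
To make "from scratch" concrete, here is a rough sketch of what I have in mind, not a recommended curriculum. It fits a one-variable linear model with plain gradient descent using only the standard library. Everything in it (the synthetic data, the function names `make_data`, `clean`, `train`, the learning rate and epoch count) is made up for illustration; the point is only that the model function, the loss, the data cleaning and the training loop are each a few lines once you strip the frameworks away.

```python
import math
import random


def make_data(n=200, true_w=3.0, true_b=-1.0, noise=0.1, seed=0):
    """Synthetic (x, y) pairs from y = true_w * x + true_b plus Gaussian noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = true_w * x + true_b + rng.gauss(0.0, noise)
        data.append((x, y))
    # One deliberately broken row so the cleaning step has something to do.
    data.append((0.5, float("nan")))
    return data


def clean(data):
    """Data cleaning: drop rows that contain NaN or infinite values."""
    return [(x, y) for x, y in data if math.isfinite(x) and math.isfinite(y)]


def predict(w, b, x):
    """The model: a single linear function."""
    return w * x + b


def mse(w, b, data):
    """The loss: mean squared error over the dataset."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)


def train(data, lr=0.1, epochs=500):
    """Training: full-batch gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        # Analytic gradients of the MSE with respect to w and b.
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in data) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


if __name__ == "__main__":
    data = clean(make_data())
    w, b = train(data)
    print(f"learned w={w:.3f}, b={b:.3f}, loss={mse(w, b, data):.4f}")
```

Swapping the linear function for something nonlinear and computing the gradients automatically instead of by hand is, very roughly, the road from here to neural networks.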

2 comments

Ya, the phrase "first principles" is vague... I meant starting from an axiomatic and actionable definition of AI and learning from there. The first chapter of AIMA does a swell job of enumerating different definitions of AI and then explicitly declaring which one is used and the foundational premises for the concepts and methods to follow. And it doesn't define AI and then jump to neural networks; it gradually layers more atomic concepts, like agents (which, I know, have been bastardized) and environments, until it gets to machine learning.

> The other big question is why you want to learn it.

Good question. I'm just looking for a wider context to understand contemporary AI. I don't know if this serves any practical purpose but I'm someone who likes to understand the "why" behind everything and starting from "first principles" helps uncover that.

By "first principles" do you mean something like "learn from the ground up" or "from basic building blocks"?

I like learning things starting from small, atomic things, then building up and learning higher layers of abstraction and functionality later. I tend to find hands-on tutorials too "top down" in the sense that they start with all the tools in place and then give you a cursory look into what's actually happening.

Personally I feel like most things in the world aren't really that complicated when you understand the building blocks. There are a few core ideas and then a bunch of layers on top to organize and utilize those ideas for different applications. So if I have an interest in something I want to learn from the ground up.

Exactly. Nailed it.

> Usage of the phrase "first principles" has sprawled into many different things since (I think) Musk first mentioned it as a way to learn

In pop culture in 2010+ sure, but he was essentially parroting Feynman IIRC.

"How to learn AI from first principles?"

Start with https://en.wikipedia.org/wiki/Zermelo%E2%80%93Fraenkel_set_t... and eventually you'll get to AI, exercise left to the reader ;)