| HN Mirror


by mjburgess 1201 days ago

Formulate the loss function -- you'll find it's just

    loss(the-right-answer(perfect-x) - perfect-y)

The most important aspect of "the-right-answer" is its ability to ignore almost all the data.

The existence of planets is "predictable" from the difference between the data and the theory -- if the theory is just a model of the data, it has no capacity to do this.

If you want to "do physics" by brute force optimization you'd need to have all possible measures, all possible data, and then a way of selecting relevant causal structures in that data -- and then be able to try every possible model.

    loss(Model(all-data|relevant-causal-structures) - Filter(...|...)) forall Model

Of course, (1) this is trivially not computable (equivalent to computing the reals), (2) "all possible data with all possible measures" doesn't exist, and (3) selecting relevant causal structure requires having a primitive theory not derived from this very process.

Animals solve this in reverse order: (3) is provided by the body's causal structure; (2) is obtained by using the body to experiment; and (1) we imagine simulated ways-the-world-might-be to reduce the search space down to a finite size.

ie., we DO NOT make theories out of data. We first make theories then use the data to select between them.

This is necessary, since a model of the data (ie., modern AI, ie., automated statistics, etc.) doesn't decide between an infinite number of theories of how the data came to be.
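A minimal sketch of the "make theories first, then let data select between them" step described above. Everything here is an illustrative assumption, not something from the comment: the three candidate families in `THEORIES`, the coarse grid over the single parameter `k`, and the noise-free synthetic data. It only shows selection among a finite, hand-written set of candidates; it says nothing about where those candidates come from.

```python
import math

# Hypothetical candidate "theories": each is a one-parameter family y = f(x; k).
THEORIES = {
    "linear": lambda x, k: k * x,
    "quadratic": lambda x, k: k * x * x,
    "exponential": lambda x, k: math.exp(k * x),
}

def fit_error(f, xs, ys, grid):
    """Return (smallest squared error, k that achieves it) over a grid of k values."""
    return min((sum((f(x, k) - y) ** 2 for x, y in zip(xs, ys)), k) for k in grid)

def select_theory(xs, ys):
    """Use the data only to choose among the pre-written theories."""
    grid = [i / 100 for i in range(1, 301)]  # k = 0.01 .. 3.00 (assumed range)
    scores = {name: fit_error(f, xs, ys, grid) for name, f in THEORIES.items()}
    best = min(scores, key=lambda name: scores[name][0])
    return best, scores[best][1]

# Synthetic observations generated from one of the candidates (assumed k = 0.7).
xs = [i / 10 for i in range(21)]
ys = [math.exp(0.7 * x) for x in xs]
best_name, best_k = select_theory(xs, ys)
```

The search space is finite only because the candidate list and the grid were fixed in advance, which is the role the comment assigns to imagination.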

1 comments

XorNot 1201 days ago

> ie., we DO NOT make theories out of data. We first make theories then use the data to select between them.

No we don't, we make hypotheses and then test them. Hypotheses are based on data.

There are physics experiments being done right now where the exact hope is that existing theory has not predicted the result they produce, because then we'd have data to hypothesize something new.[1]

You are literally describing what deep learning techniques are designed to do while claiming they can't possibly do it.

[1] https://www.scientificamerican.com/article/measurement-shows...


mjburgess 1200 days ago

Hypotheses are "based" on data in the sense that via imagination we simulate ways the world might be, and then "data" is a clue to a contradiction.

Deep learning models are data: they are just associations between points.

Train a NN on data generated from an exponential function, and the model produced is not exponential.
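A rough sketch of one narrow version of this claim, assuming NumPy is available. The setup is an assumption chosen for determinism, not the commenter's experiment: a one-hidden-layer ReLU network whose hidden kinks are fixed at hand-picked positions (`knots`), with only the output weights fit by least squares on exp(x) over [0, 2]. Such a network is piecewise linear, so past its last kink it can only continue as a straight line. This illustrates that one architecture; it does not settle what other architectures can or cannot represent.

```python
import numpy as np

def relu_features(x, knots):
    """Hidden layer of a tiny ReLU net: columns are 1, x, and relu(x - knot)."""
    x = np.asarray(x, dtype=float)
    hidden = np.maximum(0.0, x[:, None] - knots[None, :])
    return np.column_stack([np.ones_like(x), x, hidden])

# Assumed, hand-picked kink positions inside the training range.
knots = np.linspace(0.25, 1.75, 7)

# Training data generated from an exponential function on [0, 2].
x_train = np.linspace(0.0, 2.0, 400)
y_train = np.exp(x_train)

# Fit only the output-layer weights by ordinary least squares.
weights, *_ = np.linalg.lstsq(relu_features(x_train, knots), y_train, rcond=None)

def predict(x):
    return relu_features(np.atleast_1d(x), knots) @ weights

# Inside the training range the fit is close to exp(x).
in_range_error = float(np.max(np.abs(predict(x_train) - y_train)))

# Far outside it, the model is a straight line: its second difference vanishes,
# while exp(x) keeps accelerating away from it.
far = predict(np.array([5.0, 6.0, 7.0]))
extrapolation_error = float(abs(far[1] - np.exp(6.0)))
second_difference = float(far[0] - 2.0 * far[1] + far[2])
```

Comparing `in_range_error` with `extrapolation_error` shows the gap between fitting the sampled points and reproducing the generating function.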

Train a NN on the covid pandemic, and you will never obtain the SIR model.

AI is just associative statistical modelling. The model is the data.
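For readers unfamiliar with it, the SIR model mentioned above is a three-compartment system: dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I. A minimal sketch using forward-Euler steps follows. The function name `simulate_sir`, the step size, and the values of `beta` and `gamma` are illustrative assumptions and are not fitted to covid data.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Forward-Euler integration of the SIR equations; returns a list of (S, I, R)."""
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r  # total population, constant in this model
    history = [(s, i, r)]
    for _ in range(int(round(days / dt))):
        new_infections = beta * s * i / n * dt  # flow S -> I
        new_recoveries = gamma * i * dt         # flow I -> R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history

# Assumed parameters: beta / gamma = 3, so the outbreak grows before it fades.
history = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, r0=0, days=160)
peak_infected = max(i for _, i, _ in history)
```

The structure (who can infect whom, and that recovery removes people from the infectious pool) is written down before any data is seen; data would only be used to estimate `beta` and `gamma`.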


dangond 1196 days ago

I know this discussion is a bit old at this point, but I came across this[1] essay for the first time today, and this shows more of what I was trying to get across earlier in the thread. Hopefully you'll find it interesting. Essentially, they trained a GPT on predicting the next move in a game of Othello, and by analyzing the weights of the network, found that the weights encode an understanding of the game state. Specifically, given an input list of moves, it calculates the positions of its own pieces and that of the opponent (a tricky task for a NN given that Othello pieces can swap sides based on moves made on the other side of the board). Doing this allowed it to minimize loss. By analogy, it formed a theory about what makes moves legal in Othello (in this case, the positions of each player's pieces), and found out how to calculate those in order to better predict the next move.

[1] https://www.neelnanda.io/mechanistic-interpretability/othell...


XorNot 1199 days ago

Proving any given AI architecture can't do something doesn't prove all AI architectures forever will never be able to do something. Neural networks aren't all AI, they're not even "neural networks" since the term wraps up a huge amount of architectural and design choices and algorithms.

Unless you believe in the soul, then the human brain is just a very complicated learning architecture with a specific structure (which we freely know doesn't operate like existing systems...sort of, of course we also don't know that it's not just a convoluted biological path to emulating them for specific subsystems either).

But even your original argument is focused on just playing with words to remove meaning: calling something data doesn't meaningfully make your point, because mathematical symbols are just "data" as well.

Mathematics has no requirement to follow any laws you think it does - 1 + 1 can mean whatever we want, and it's a topic of discussion as to why mathematics describes the physical world at all - which is to say, it's valid to say we designed mathematics to follow observed physics.
