|
I don't think this makes sense and I'm not quite sure why you went to ML, but that's okay. I am a machine learning researcher, but also frustrated with the state of machine learning, in part because, well... you can probably see how "proof by empirical evidence" is dialed up to 11.
Sorry, long answer incoming. It is far from complete, but I think it will help build strong intuition around your questions.
Will knowledge transfer? That depends entirely on the new problem and on how related it is to the old one. It also depends on what information was used to solve the pre-transfer problem. Take LLMs for example. A lot of work has shown that they are difficult to train to do calculations: they do well on problems with the same number of digits they were trained on, but performance degrades rapidly as the number of digits increases. It can be weird to read some of these papers, as there will sometimes be periodic relationships with the number of digits, but that should give us information about how they're encoding the problems. That lack of transferability indicates that just because we'd believe two tasks are really the same problem doesn't mean they are.
So you have to be really careful here, because us humans are really fucking good at generalization (yeah, we also suck, but a big part of it is that our proficiency makes us recognize where we're lacking. Also, this is more a "humans can" than a "humans do" type of thing, so be careful when comparing). This generalization is really because we're focused on building causal relationships, while the ML algorithms are built around compression (i.e. fitting data). Which, if you notice, is the same issue I was pointing to above.
> Ie, you are the Oracle and whatever model is being trained doesn't know the answer, only if it is right or wrong. But I don't know if the reward function must be binary or on a scale.
This entirely depends on the problem. We can construct simple problems that illustrate success as well as failure. What you really need to think about here is the information gain from the answer. If you check how to calculate that, you will see the dependence (we could get into Bayesian learning or experiment design, but this is long enough).
But let's think of a simple example in the negative direction. If I ask you to guess where I'm from and only tell you whether you're right or wrong, you're going to have a very hard time pinning down the exact location. There is definitely an efficient method in this example, but our ML algorithms don't start with prior knowledge about strategies, so they aren't going to know to binary search. If you gave that to the model, you baked in that information. This is a tricky form of information leakage. It can be totally fine to bake in knowledge, but we should be aware of how that changes how we evaluate things (we always bake in knowledge btw. There is no escaping this). But most models would not have a hard time if we instead played "hot/cold", because the information gain is much higher. We've provided a gradient to the solution space. We might call these hard and soft labels, respectively.
I picked this because there's a rather famous paper about emergent abilities (I fucking hate this term[0]) in ML models[1], and a far less famous counter to it[2]. There are a lot of problems with [1] that require a different discussion, but [2] shows that a big part of the issue is that many of the loss landscapes are fairly flat, so when feedback is discrete the smaller models just wander around that flat landscape, needing to get lucky to find the optimum (btw, this also shows that technically this can be done too! But that would require different training methods and optimizers). But when given continuous feedback (i.e. you're wrong, but closer than your last guess), they are able to actually optimize.
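To put rough numbers on the guessing game, here is a minimal sketch of the information gain from one answer. It assumes a uniform prior over 1024 locations on a line and feedback that is deterministic given the target, in which case the expected gain is just the entropy of the answer. The `feedback_entropy` helper, the 1-D setup, and the specific guesses are all made up for illustration.

```python
import math
from collections import Counter


def feedback_entropy(candidates, feedback):
    """Expected information gain (in bits) from one feedback signal.

    Assumes a uniform prior over `candidates` and feedback that is a
    deterministic function of the target, so the gain equals the
    entropy of the distribution of answers.
    """
    counts = Counter(feedback(target) for target in candidates)
    n = len(candidates)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical 1-D stand-in for "where I'm from".
candidates = range(1024)

# Hard label: "is your guess (500) exactly right?"
hard = feedback_entropy(candidates, lambda t: t == 500)

# Soft label: "is your new guess (600) closer than your last guess (400)?"
soft = feedback_entropy(candidates, lambda t: abs(t - 600) < abs(t - 400))

print(f"hard feedback: {hard:.4f} bits")
print(f"soft feedback: {soft:.4f} bits")
```

Under these assumptions the right/wrong answer carries a small fraction of a bit, while the closer/farther answer carries close to a full bit, because it splits the remaining candidates roughly in half.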
A big criticism of the work is that it is an unfair comparison because there are "right and wrong" answers here, but it'd be naive to not recognize that some answers are more wrong than others. Plus, their work shows a clear, testable way to confirm or deny whether this works. We schedule learning rates; there's no reason you cannot schedule labels. In fact, this does work.
But also look at the ways they tackled these problems. They are entirely different. [1] tries to do proof by evidence while [2] uses proof by contradiction. Granted, [2] has an easier problem since they only need to counter the claims of [1], but that's a discussion about how you formulate proofs.
So I'd be very careful when using the recent advancements in ML as a framework for modeling reasoning. The space is noisy. It is undeniable that we've made a lot of advancements, but there are some issues with what work gets noticed and what doesn't. A lot does come down to this proof by evidence fallacy. Evidence can only bound confidence; unfortunately, it cannot prove things. But this is helpful and, well, we can bound our confidence to limit the search space before we change strategies, right? I picked [1] and [2] for a reason ;)
And to be clear, I'm not saying [1] shouldn't exist as a paper or that the researchers were dumb for doing it. Read back over this paragraph, because we've got multiple meta layers here. It's good to plant a flag in the ground, even if it is wrong, because you gotta start somewhere, and science is much, much better at ruling things out than ruling things in. We mostly focus on proving things don't work until there's not much left, and then accept what remains (limits here too, but this is too long already).
I'll leave with this, because now there should be a lot of context that makes this much more meaningful: https://www.youtube.com/watch?v=hV41QEKiMlM
[0] It significantly diverges from the terminology used in fields such as physics.
ML models are de facto weakly emergent by nature of composition. But the ML definition can entirely be satisfied by "Information was passed to the model but I wasn't aware of it" (again, same problem: exhaustive testing).
[1] (2742 citations) https://arxiv.org/abs/2206.07682
[2] (447 citations) https://arxiv.org/abs/2304.15004 |
So knowledge transfer is something incredibly specific and much narrower than I thought. Models don't transfer concepts by generalization; they compress knowledge instead. I assume the difference is that generalization is much more fluid while compression is much more static, like a dictionary where each key has a probability of being chosen and all the relationships are frozen. The only generalization that happens is whatever the training method itself expresses, since the training method freezes its "model of the world" into the weights, so to speak? So if the training method itself cannot generalize, but only compress, why would the model it produces be able to? Is that understood correctly?
Is there a computational model that can be used to analyse a training method and put a bound on the expressiveness of the resulting model?
It's fascinating that the emergent abilities of models disappear if you measure them differently. I guess the difference is that "emergent abilities" are kinda nonsensical, since they come with no causal explanation (i.e. it "just" happens), while seeing the model get linearly better with training fits into a much saner framework. That is, like you said, when your success metric measures discretely, you also see the model itself as discrete, and it hides the continuous hill climbing you would otherwise see with a continuous metric.
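A toy sketch of how a discrete metric can hide steady improvement, in the spirit of what I understand [2] to argue. It assumes per-token accuracy improves linearly, that tokens are independent, and that an answer is 10 tokens long; these numbers are made up, not taken from either paper.

```python
def exact_match_rate(per_token_accuracy: float, answer_length: int) -> float:
    """Chance that every token of an answer is right, assuming each token
    is correct independently with the same probability."""
    return per_token_accuracy ** answer_length


# Hypothetical answer length, e.g. the digits of an arithmetic result.
answer_length = 10

# Per-token accuracy improving in equal steps: 0.50, 0.55, ..., 0.95.
per_token = [0.5 + 0.05 * step for step in range(10)]
exact = [exact_match_rate(p, answer_length) for p in per_token]

for p, e in zip(per_token, exact):
    print(f"per-token accuracy {p:.2f} -> exact match {e:.4f}")
```

The per-token column climbs in equal steps, while the exact-match column sits near zero for most of the run and then shoots up at the end, which looks like a sudden jump if exact match is the only thing you plot.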
But the model still gets better over time, so would you expect the model to get progressively worse on a more generalized metric, or does it only relate to the spikes in the graph that they talk about? I.e., they answer the question of why jumps in performance are not emergent, but they don't answer why the performance keeps increasing, even if linearly, or whether that is detrimental to other, less related tasks?
And if you wanted to test "emergence", wouldn't it be more interesting to test the model on tasks that are much less related to the task at hand? That would be testing generalization, closer to how we see it in humans. So it wouldn't really be emergence, but generalization of concepts?
It makes sense that it is more straightforward to refute a claim by contradiction. Would it be good practice for papers to try to refute their own claims by contradiction first? I guess that would save a lot of time.
The knowledge leakage point is interesting, because I was thinking about world simulations and using models to learn about scenarios through simulation and consequence. But the act of creating a model to perceive the world taints the model itself with bias, so the difficulty lies in creating a model that can rearrange itself to get rid of incorrect assumptions while disconnecting from its initial inherent bias. I thought about models that can create other models, etc., but then how does the model itself measure success? If everything is changing, then so is the metric, so the model could decide to change what it measures as well. I thought about hard-coding a metric into the model, but what if the metric I choose is bad? Then we are stuck with the same problem of bias. So it seems like there are only two options: it either converges towards total uncontrollability or it is inherently biased. There doesn't seem to be any in-between?
I admit I'm trying to learn things about ML because I just find general intelligence research fascinating (neuroscience as well), but the more I learn, the more I realize I should really go back to the fundamentals and build up. Even things that seem to make sense on a surface level really have a lot of meaning behind them, and need a well-built intuition not at a practical level, but at a theoretical level.
In the papers I've read and found interesting, there always seems to be just the right kind of creativity in the thinking. Sometimes my own intuition or curiosity about things has proved right, but I lack the deeper understanding, which can lead to false confidence in results.