Yah but do those words - as we would typically understand them - mean the same thing when applied to a computational model? A great deal of intellectual legerdemain is going on in many of these descriptions.
I don’t know that it’s sleight of hand. Seems to me that we are finally approaching a point where it might be possible to more precisely define terms like “morality” in a mathematical sense.
This would suggest that moral philosophy could be transformed - in much the same way that natural philosophy turned into physics.
I’m just a layperson and don’t really know what I’m talking about, but I find this all very exciting.
I think it's sleight of hand in that "moral self-correction" is a very complicated phenomenon, and proving that a computing system is doing it would require an incredible amount of detailed theoretical and empirical work. Some of which, yes, might include much more careful definitions of morality. Until that work is done, I think it's somewhere between foolish and negligent to anthropomorphize LLMs.
I agree with you that there are outcomes that would be less ideal. If it helps, I refer to them as Intelligent Tools. I do prefer the "tool" metaphor (and so does Bing Chat) and I hope that companies like Microsoft rethink their "copilot" and "assistant" metaphors.
I don't think they're "dangerous" per se, I think metaphors matter and we should choose the best ones.
I meant mathematically as in a proof. With LLMs knowledge has become - well, tokenised. We can now study it at an information processing level which is likely to be at least similar in structure to how knowledge is organised in the brain. This in turn is likely to give us access to the way knowledge itself works, in a way that was not previously possible.
So we can actually look at concepts like “morality” and see how that is encoded. And I’m confident that this will give us empirical insight into these concepts in a way no philosophy has been able to do until now.
What assures us that this apparent morality is not a side effect of morally-aware or biased training data? The lack of adherence to the scientific process in this field is saddening.
Morality is biologically and sociologically constructed. It's not consistent between people, social cliques, regions, or nations. It's a nebulous concept. And unlike, say, our understanding of disease processes, which may be fuzzy and inexact at times but has an external truth we can hope to discover, there is no ground truth that we are approaching in moral philosophy. There is no platonic morality that we approximate with our muddled intuitions. Morality is nothing more or less than those muddled intuitions. It cannot be distilled to cold logic.
What AI might enable however is a superior form of democratic process, wherein an AI surveys the entire population through a natural language interface and synthesizes the nuanced and conflicting desires of the entire society. This deliberative democracy process would ameliorate the distorting effects of campaign funding and issues like uninformed or misinformed voters and low voter participation. It could also allow a sort of citizen feedback line-by-line on proposed legislation and government action.
I suppose they're talking about utilitarianism? But that's hand-wavy math at best. E.g. the whole "torture someone for 50 years or remove a speck of dust from trillions of eyeballs" debate. Those who choose the torture see utility as strictly additive, and I'm not aware of any strict definition of utility which requires this. So any math in that instance is built on a shaky or even illusory foundation.
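To make the additivity point concrete, here is a rough sketch. Every number in it is made up purely for illustration (the per-speck harm, the torture harm, the head count, the cap), and the saturating function is just one hypothetical alternative to a plain sum, not a standard definition of utility. The only point is that the "answer" flips depending on which aggregation rule you assume.

```python
import math

# All values below are hypothetical, chosen only to illustrate the argument.
SPECK_HARM = 1e-9      # assumed harm of one dust speck, arbitrary units
TORTURE_HARM = 1e6     # assumed harm of 50 years of torture, same units
PEOPLE = 1e18          # assumed number of people who get a speck


def additive_total(harm_per_person: float, people: float) -> float:
    """Strictly additive aggregation: total harm grows without bound."""
    return harm_per_person * people


def saturating_total(harm_per_person: float, people: float, cap: float) -> float:
    """One possible non-additive rule: total harm approaches `cap` but never exceeds it."""
    return cap * (1.0 - math.exp(-harm_per_person * people / cap))


speck_additive = additive_total(SPECK_HARM, PEOPLE)
speck_bounded = saturating_total(SPECK_HARM, PEOPLE, cap=1.0)

# Under the additive rule the specks outweigh the torture;
# under the bounded rule they never can, however many people are involved.
print("additive: specks worse than torture?", speck_additive > TORTURE_HARM)
print("bounded:  specks worse than torture?", speck_bounded > TORTURE_HARM)
```

Neither rule is "the" right one. The math is only as solid as the choice between them, and that choice is exactly the part that isn't math.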