| HN Mirror


by cproctor 579 days ago

I agree that it can be helpful to think of identity as a trajectory shaped by interactions along the way. However, we also continually shape our environments in large and small ways. TFA ignores this completely. Can this be effectively modeled in RL?

Over 130 years ago, Dewey [1] criticized the model of psychology which looked at human behavior in terms of stimulus -> internal processing -> response. Stimuli don't just come to us; we seek them out and modify the world around us to cause them to occur. Dewey and other pragmatists proposed reframing stimulus/response in terms of "acts" or "habits," or changes to the unified agent+environment. Popper was getting at the same entanglement of agent and environment in "Three Worlds" and Simon in "The Sciences of the Artificial."

I see RL as an elaboration of the stimulus/response paradigm: the agent is discrete from the environment. Does RL work well in an environment like Minecraft, where the real game is modifying the relationship between actions and future states? What about in contexts like Twitter, where you're also modifying the value function (e.g. by cultivating audiences or by participating in a thread in a way which conditions the value function of future responses)?

[1] https://plato.stanford.edu/entries/dewey/#ReflArcDeweRecoPsy...

3 comments

jerf 579 days ago

"I agree that it can be helpful to think of identity as a trajectory shaped by interactions along the way. However, we also continually shape our environments in large and small ways. TFA ignores this completely. Can this be effectively modeled in RL?"

You don't need to. All that is necessary for an attraction basin to emerge is an iterative system. If you prefer to model the human being and their entire environment rather than the human being and their input, you'll still get attraction basins. You'll just get two views on the same reality, suitable for different uses and different understandings, but it's not like "ah, if we model a human's iterations we get these attraction basins, but if we include environmental interactions suddenly we get a uniformly random distribution of personalities across the total personality space; it's all totally different once you consider the environment as part of the iterative system too".
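
A minimal sketch of that claim, not taken from the thread or the article: one iterated variable on a double-well potential, then the same variable coupled to an "environment" variable that it also pushes around. The potential, the coupling strength `k`, and all function names are assumptions chosen only for illustration. Under these assumptions both versions settle into two basins; the coupled one just has its basins in the joint (agent, environment) space.

```python
import random

def step_agent_only(x, lr=0.05):
    # One gradient step on the double-well potential V(x) = (x^2 - 1)^2.
    return x - lr * 4.0 * x * (x * x - 1.0)

def step_coupled(x, e, lr=0.05, k=0.1):
    # Same agent update, plus a pull toward the environment e,
    # while the environment itself drifts toward the agent.
    new_x = x - lr * 4.0 * x * (x * x - 1.0) + k * (e - x)
    new_e = e + k * (x - e)
    return new_x, new_e

def settle_agent_only(x, steps=2000):
    for _ in range(steps):
        x = step_agent_only(x)
    return x

def settle_coupled(x, e, steps=2000):
    for _ in range(steps):
        x, e = step_coupled(x, e)
    return x, e

# Hypothetical random starting points for (agent, environment).
rng = random.Random(0)
starts = [(rng.uniform(-1.5, 1.5), rng.uniform(-1.5, 1.5)) for _ in range(200)]

# Round the end states to see which attractor each trajectory fell into.
agent_only_ends = {round(settle_agent_only(x)) for x, _ in starts}
coupled_ends = {tuple(round(v) for v in settle_coupled(x, e)) for x, e in starts}

print("agent-only attractors:", sorted(agent_only_ends))
print("coupled attractors:   ", sorted(coupled_ends))
```

This is a toy: real personality dynamics are not a one-dimensional gradient system, and the only thing the sketch shows is that adding the environment to the iterated state does not by itself dissolve the basins.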


cproctor 579 days ago

Thanks; I agree--both that you could train an agent in these situations, and that "You'll just get two views on the same reality, suitable for different uses and different understandings." I think the latter seriously undercuts the article's attempt to explain these trajectories in terms of personality; they could just as easily be attributed to the power of culture or social structure.


jerf 579 days ago

Heh, well, another lesson from chaos mathematics is that in iterative systems, you don't really get "explanations" the way we humans like to think of them... the answer to "what caused X" for any X that has taken a long time to develop is "everything". So rather than culture "or" social structure, I'd say "and", "and" also a lot of other things, and also the culture and social structure are themselves affected by the very personality structures we're trying to discuss.

Determining "causes" isn't as hopeless as that makes it initially sound, but you need something more sophisticated than the normal human concept of "cause" to even approximate useful answers. The good news is, this isn't impossible; we all live in an iterative world and we operate in it even so, which requires us to have certain models that conform to the world. It's one of those cases where I don't really love the "humans are just horribly irrational" gloss; our instincts and intuitions often have greater rationality than we realize, because they were formed in this iterative world, and sometimes it is in fact the particular naive concept of "rationality" we are trying to measure them by that is deficient, whereas if you use a more sophisticated one we look less bad.

(But sometimes humans just act suboptimally, no question about that.)

Another thing that helps is that you aren't generally interested in modelling the entire system. For considering myself and whether I may want to, as the article discusses, make changes in myself, I can take my culture and environment more-or-less as a given; I need some flex to consider options like "well what if I just up and moved to another country?", but I don't need to consider my own effects on society very much because they are some complex combination of "tiny" and "utterly unpredictable". While society is chaotic, the time frame of the impact on society from me changing from excessively introverted to somewhat less introverted is way, way past my horizon for making decisions.


jonnycomputer 579 days ago

I agree that the discussion in the blog post is incomplete because it does not consider that we shape the environments that shape us, though it does briefly touch on the fact that other RL agents (people) try to shape us, and we them. But it is certainly more than that.


nephanth 579 days ago

RL absolutely considers that the environment is stateful and that said state is affected by the actions of the agent, yes.
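
A minimal sketch of what that looks like in the standard setup, assuming a made-up two-state environment (none of this comes from the article): the agent can pay a one-time cost to "build a tool", which changes the environment's state and with it the payoff of every later "harvest" action. The environment, rewards, and hyperparameters are all hypothetical, and tabular Q-learning is used only because it is the simplest thing that fits.

```python
import random

HARVEST, BUILD = 0, 1

def env_step(has_tool, action):
    """Hypothetical toy environment: returns (next_has_tool, reward)."""
    if action == BUILD:
        # Building costs something now but changes the environment state.
        return 1, -1.0
    # Harvesting pays much better once the tool exists.
    return has_tool, (1.0 if has_tool else 0.1)

def train(episodes=500, horizon=20, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # q[state][action], with state 0 = no tool and state 1 = tool built.
    q = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(episodes):
        state = 0  # every episode starts without the tool
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                action = max((HARVEST, BUILD), key=lambda a: q[state][a])
            next_state, reward = env_step(state, action)
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train()
greedy_policy = [max((HARVEST, BUILD), key=lambda a: q_table[s][a]) for s in (0, 1)]
print("greedy action without tool:", "build" if greedy_policy[0] == BUILD else "harvest")
print("greedy action with tool:   ", "build" if greedy_policy[1] == BUILD else "harvest")
```

The learned policy pays the cost to change the environment first and exploits the changed environment afterwards. Whether this scales to something like Minecraft, or to settings where the reward signal itself shifts, is a separate question that this sketch does not address.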
