A more interesting example of transformers learning a process may be [1].
There's a large literature on applying language models to reasoning tasks, but not much on what's actually going on inside them; see, for example, [2]. https://transformer-circuits.pub/ also has a body of work on this, though it's still at a very early stage (see in particular "In-context Learning and Induction Heads").