Dasher http://www.inference.phy.cam.ac.uk/dasher/
takes some of the Markov chain ideas to the next level.
If you are a visual thinker, it is a good way to get a feel for probabilistic compression techniques, specifically PPM and arithmetic coding.
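
To make the arithmetic coding connection concrete, here is a rough sketch of the interval narrowing that Dasher's zooming visualizes. It is not Dasher's code and it is not real PPM: it assumes a toy three-symbol alphabet and a plain order-1 Markov model with add-one smoothing, and the names `train`, `probs`, and `narrow` are made up for illustration.

```python
from collections import Counter, defaultdict
from fractions import Fraction

ALPHABET = "ab_"  # toy alphabet, an assumption for this sketch


def train(text):
    """Count how often each symbol follows each previous symbol (order-1)."""
    counts = defaultdict(Counter)
    for prev, cur in zip(text, text[1:]):
        counts[prev][cur] += 1
    return counts


def probs(counts, prev):
    """Next-symbol probabilities given the previous symbol, add-one smoothed."""
    seen = counts[prev]
    total = sum(seen[s] for s in ALPHABET) + len(ALPHABET)
    return {s: Fraction(seen[s] + 1, total) for s in ALPHABET}


def narrow(counts, message, start="_"):
    """Shrink [low, low + width) inside [0, 1) one symbol at a time.

    Each symbol owns a slice of the current interval proportional to its
    probability; picking a symbol means zooming into its slice.
    """
    low, width = Fraction(0), Fraction(1)
    prev = start
    for sym in message:
        p = probs(counts, prev)
        for s in ALPHABET:
            if s == sym:
                break
            low += width * p[s]  # skip past the slices of earlier symbols
        width *= p[sym]
        prev = sym
    return low, width
```

With a model trained on `"abababab_"`, `narrow(model, "ab")` ends up with a wider interval than `narrow(model, "aa")`, because "b after a" is the likelier continuation. A wider interval needs fewer bits to pin down (roughly `-log2(width)`), which is the whole trick.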
The general field is called "algorithm animation" but it seems to be out of fashion. The old SRC Modula-3 distribution used to contain a cool package for creating animations of algorithms. Here is a video made by the authors: http://www.youtube.com/watch?v=zIgu9q0vVc0