
My state space mapping approach, which is reminiscent of expert systems and tabular RL, only makes sense when you repeat the task in the same environment, so that you can gradually discover the states and their policies. After each execution you can look at the execution trace and make targeted policy adjustments.
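To make the idea concrete, here is a minimal sketch of what a tabular state-to-policy map with trace-based adjustments could look like. Everything in it is an assumption for illustration: the `StatePolicyMap` class, the state and action names, the scalar outcome and the step size are all hypothetical, not a description of the actual implementation.

```python
from collections import defaultdict


class StatePolicyMap:
    """Hypothetical table mapping discovered states to action preferences."""

    def __init__(self, step_size=0.5):
        self.step_size = step_size
        # state -> action -> preference score; states appear as they are discovered
        self.prefs = defaultdict(dict)

    def choose(self, state, actions):
        """Pick the highest-scoring action; actions never tried count as 0.0."""
        scores = self.prefs[state]
        return max(actions, key=lambda a: scores.get(a, 0.0))

    def adjust_from_trace(self, trace, outcome):
        """Nudge each (state, action) pair visited in one run toward its outcome."""
        for state, action in trace:
            old = self.prefs[state].get(action, 0.0)
            self.prefs[state][action] = old + self.step_size * (outcome - old)


policy_map = StatePolicyMap()
actions = ["retry", "ask_user", "search_docs"]

# Run 1: "retry" was tried in the "build_failed" state and the task failed.
policy_map.adjust_from_trace([("build_failed", "retry")], outcome=-1.0)
# Run 2: "search_docs" was tried in the same state and the task succeeded.
policy_map.adjust_from_trace([("build_failed", "search_docs")], outcome=1.0)

best = policy_map.choose("build_failed", actions)
print(best)
```

The table only becomes useful because the same states show up again across runs, which is why repeating the task in the same environment matters.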

Here is an example of a state space map rendered in 2D by PCA. It maps LLM research papers from 2025. It does not have policies attached to state positions yet, but can be used as a visual map.
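For anyone curious how a 2D rendering like this can be produced, below is a rough sketch of PCA in plain Python using power iteration. It assumes each state is already a numeric vector (for example an embedding) and that the data varies in at least two directions; the toy vectors and the function names `top_component` and `pca_2d` are made up for illustration. In practice a library implementation such as scikit-learn's `PCA` would be the more sensible choice.

```python
import math


def top_component(cov, iterations=200):
    """Power iteration: approximate the dominant eigenvector of a symmetric matrix."""
    n = len(cov)
    vec = [1.0 / (i + 1) for i in range(n)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-12:  # no variance left in any direction
            break
        vec = [x / norm for x in nxt]
    return vec


def pca_2d(rows):
    """Project row vectors onto their two directions of largest variance."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    axes = []
    for _ in range(2):
        axis = top_component(cov)
        eigenvalue = sum(axis[i] * cov[i][j] * axis[j]
                         for i in range(dim) for j in range(dim))
        # Deflation: remove the variance explained by this axis before finding the next.
        cov = [[cov[i][j] - eigenvalue * axis[i] * axis[j] for j in range(dim)]
               for i in range(dim)]
        axes.append(axis)
    return [[sum(c[j] * axis[j] for j in range(dim)) for axis in axes]
            for c in centered]


# Toy 3D "state vectors": most spread along x, less along y, none along z.
states = [[-2.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 1.0, 0.0]]
coords = pca_2d(states)
for point in coords:
    print([round(v, 3) for v in point])
```

Each state ends up with an (x, y) position, which is all a visual map needs; policies could later be attached to those positions.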

The projection: https://i.imgur.com/a9ESiXs.png

The map itself: https://pastebin.com/pmGzFcPM

A cool thing about both intent weaving and the state space policy approach is that they do not prescribe a sequence of steps. They are more like a GPS map that allows rerouting towards the goal state at any moment, which is a more flexible description than a static procedure.
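The GPS analogy can be sketched as a graph of states where the route is recomputed from wherever you currently are, rather than replayed from a fixed script. This is only an illustration under my own assumptions: the state names, the transition graph and the `route` helper are hypothetical, and plain breadth-first search stands in for whatever planner one would actually use.

```python
from collections import deque


def route(graph, start, goal):
    """Breadth-first search: shortest list of states from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical map: each state lists the states reachable from it.
graph = {
    "draft": ["tests_pass", "tests_fail"],
    "tests_fail": ["debugging"],
    "debugging": ["tests_pass"],
    "tests_pass": ["merged"],
}

# Planned route from the starting state.
planned = route(graph, "draft", "merged")
# Execution drifted into "tests_fail"; re-plan from the current state instead.
rerouted = route(graph, "tests_fail", "merged")

print(planned)
print(rerouted)
```

A static procedure would have no entry for "tests_fail", whereas the map still knows a way to the goal from there.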