| HN Mirror

by mitthrowaway2 1206 days ago

I think "simulating" in this context means internally executing a process that is very similar to the process that generated the original material, as part of the prediction process. In general, that's the most compact way to predict and reproduce the original material.

For example, the string "1010101010"... could be the output of this function:

  import random

  def generate_char_random(prev_string):
      # Ignores prev_string: each character is a fair coin flip.
      x = random.random()
      if x > 0.5:
          return "1"
      else:
          return "0"

It could also be the output of this function:

  def generate_char_alternating(prev_string):
      # Emits the opposite of the last character seen.
      x = float(prev_string[-1])
      if x < 0.5:
          return "1"
      else:
          return "0"

Even if it's not explicitly running those two functions, a model that is very good at predicting the next character of this input string might have, embedded within it, analogues of both. The longer the output continues to follow the "101010" pattern, the more confidence it should place on the _alternating version. On the other hand, if it encounters a "...110001..." sequence, it should shift most of its confidence to the _random version.
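To make that confidence shift concrete, here is a rough sketch of Bayes' rule applied to just these two hypotheses. The function names, the 50/50 prior, and the noise level `eps` are my own assumptions for illustration (`eps` lets the alternating hypothesis tolerate an occasional repeat instead of dropping to exactly zero); none of this is a claim about what an LLM computes internally.

```python
def likelihood_random(s):
    # Coin-flip hypothesis: every character has probability 1/2.
    return 0.5 ** len(s)

def likelihood_alternating(s, eps=0.01):
    # Alternating hypothesis: the first character is a free choice (1/2);
    # each later character flips the previous one with probability 1 - eps.
    # eps is an assumed noise level, not something from the original text.
    p = 0.5 if s else 1.0
    for prev, cur in zip(s, s[1:]):
        p *= (1 - eps) if cur != prev else eps
    return p

def posterior_alternating(s, prior=0.5, eps=0.01):
    # Bayes' rule over the two hypotheses; prior is P(alternating).
    a = prior * likelihood_alternating(s, eps)
    r = (1 - prior) * likelihood_random(s)
    return a / (a + r)

# Confidence in the alternating hypothesis grows with each matching character.
for n in (2, 4, 10):
    print(n, posterior_alternating("10" * (n // 2)))

# A stretch like "110001" shifts almost all the weight to the random one.
print(posterior_alternating("1010110001"))
```

With these assumed numbers, ten characters of clean alternation already put nearly all the weight on the alternating hypothesis, and three repeats are enough to take it away again.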

Of course, the LLM does not literally hold an infinite list of generative functions and weight their outputs. But to the extent that it works well and compactly approximates Bayesian reasoning, it should approximate a program that does.
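For a finite list, "a program that does" might look something like the sketch below: each hypothesis reports its probability that the next character is "1", the weights are updated by how well each one predicted the characters so far, and the final prediction is the weighted mixture. The names, the uniform prior, and `eps` are assumptions of mine, and a two-item list obviously stands in for something far larger.

```python
def p_one_random(prefix):
    # Coin flip: the next character is "1" half the time, whatever came before.
    return 0.5

def p_one_alternating(prefix, eps=0.01):
    # Flip the last character with probability 1 - eps (eps is an assumed
    # noise level); with no history, either character is equally likely.
    if not prefix:
        return 0.5
    return 1 - eps if prefix[-1] == "0" else eps

def predict_next(s, hypotheses):
    # Start from a uniform prior over the hypotheses.
    weights = [1.0 / len(hypotheses)] * len(hypotheses)
    for i, ch in enumerate(s):
        # Reweight each hypothesis by the probability it gave to the
        # character that actually appeared, then renormalize.
        for k, h in enumerate(hypotheses):
            p1 = h(s[:i])
            weights[k] *= p1 if ch == "1" else 1 - p1
        total = sum(weights)
        weights = [w / total for w in weights]
    # Posterior-weighted mixture: probability that the next character is "1".
    p_next_one = sum(w * h(s) for w, h in zip(weights, hypotheses))
    return p_next_one, weights

hypotheses = [p_one_random, p_one_alternating]
print(predict_next("1010101010", hypotheses))  # leans strongly toward "1"
print(predict_next("1010110001", hypotheses))  # close to a coin flip
```

The predictor never picks one function outright; it just lets the evidence decide how much each one contributes to the next guess.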