Hacker News new | ask | show | jobs
by PeterisP 2950 days ago
Given a sufficiently large table, an agent can certainly perceive without acting (testing a hypothesis) that "A+C=B" through its compressibility; the most compact(least complex) representation of the data can replace the many individual datapoints in one column with a learned rule how to calculate it.

There's related research on how children learn language, namely, how much observed evidence (i.e. based on cases where we have monitored and counted every word a child has heard in their life) is needed for a child to switch from a "lookup table" approach for certain features to "rule based" approach (detectable by observing overregularization, applying a systematic rule even when the actual language, including examples the child has heard, has an exception to that rule) and then to a "rule+exceptions" correct understanding; the experiments point towards "learning a rule" then and only then when a "compressed representation" is beneficial from information theory point of view.