Hacker News
by aaroninsf 1003 days ago
TL;DR: you ground symbols by connecting them to non-linguistic features in your network. Specifically, to the sensorium, but also to utility.

Things have names, but things are what we perceive about them and, largely, what we can do and actually do with them.

This is what any agent embodied in an environment must do, and does do.

Many of the criticisms of LLMs will evaporate as models become multimodal and gain (or infer from) agency.

1 comment

If you think about it, for embodied agents symbol grounding isn’t really the “problem”.

Rather, embodied agents start with reference and indices. The hard problem is actually ungrounding (which takes work) to eventually get to things that approach what people typically think of as “symbols”.

It's metaphors all the way down, until you hit sensory grounding, space and time.

Discrete objects give integer arithmetic. Correspondence gives equality. Spatio-temporal behavior gives basic logic: concurrent AND, choice OR, inside/outside, under/over, up/down, more/less... Properties and behaviors cluster to give categories in a context. Action frames give role bindings for actors...

It's Lakoff & Johnson all the way down.