|
|
|
|
|
by codenlearn
460 days ago
|
|
Doesn't Language itself encode multimodal experiences? Let's take this case write when we write text, we have the skill and opportunity to encode the visual, tactile, and other sensory experiences into words. and the fact is llm's trained on massive text corpora are indirectly learning from human multimodal experiences translated into language. This might be less direct than firsthand sensory experience, but potentially more efficient by leveraging human-curated information. Text can describe simulations of physical environments. Models might learn physical dynamics through textual descriptions of physics, video game logs, scientific papers, etc. A sufficiently comprehensive text corpus might contain enough information to develop reasonable physical intuition without direct sensory experience. As I'm typing this there is one reality that I'm understanding, the quality and completeness of the data fundamentally determines how well an AI system will work. and with just text this is hard to achieve and a multi modal experience is a must. thank you for explaining in very simple terms where I could understand |
|
> The sun feels hot on your skin.
No matter how many times you read that, you cannot understand what the experience is like.
> You can read a book about Yoga and read about the Tittibhasana pose
But by just reading you will not understand what it feels like. And unless you are in great shape and with greate balance you will fail for a while before you get it right. (which is only human).
I have read what shooting up with heroin feels like. From a few different sources. I certain that I will have no real idea unless I try it. (and I dont want to do that).
Waterboarding. I have read about it. I have seen it on tv. I am certain that is all abstract to having someone do it to you.
Hand eye cordination, balance, color, taste, pain, and so on, How we encode things is from all senses, state of mind, experiences up until that time.
We also forget and change what we remember.
Many songs takes me back to a certain time, a certain place, a certain feeling Taste is the same. Location.
The way we learn and the way we remember things is incredebily more complex than text.
But if you have shared excperiences, then when you write about it, other people will know. Most people felt the sun hot on their skin.
To different extents this is also true for animals. Now I dont think most mice can read, but they do learn with many different senses, and remeber some combination or permutation.