by pixl97
1015 days ago
The token model of LLMs doesn't map well onto how humans experience the world of informational glyphs. Left and right are intrinsic qualities of our visual system. An LLM has to map the idea of left and right into symbols via text and line breaks. I do think it will be interesting as visual input and internal graphical output are integrated with text-based LLMs, as that should help bring their internal experience closer to what we as humans experience.
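A rough sketch of the point about text and line breaks, assuming a made-up 3x3 grid and treating each character as one symbol (real tokenizers group characters differently, so this is only an illustration):

```python
# Hypothetical 3x3 grid, serialized the way a text-only model receives it.
grid = ["ABC", "DEF", "GHI"]
width = len(grid[0])
text = "\n".join(grid)  # one flat string: "ABC\nDEF\nGHI"

def offset(row: int, col: int) -> int:
    # Index of a cell in the flat string; each row adds one newline character.
    return row * (width + 1) + col

center = offset(1, 1)               # the cell holding 'E'
left = text[center - 1]             # "left" is the adjacent symbol
above = text[center - (width + 1)]  # "above" is a whole row plus a newline away

print(text[center], left, above)
```

In the flat sequence, "left" happens to be the neighbouring symbol, but "above" only exists as an offset that depends on the row width, which is something our eyes get for free.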
Oh yeah, that's why I suggested it :)
I do wonder, though: if we give LLMs enough examples of texts where people describe their spatial position relative to each other and to things, will they eventually "learn" to work these things out a bit better?