Hacker News
by
SoftTalker
138 days ago
LLMs are trained on text. Why would we expect them to understand a visual and tactile 3D world?
1 comment
azinman2
138 days ago
Because they’re also multimodal vLLMs.