by piva00
1 hour ago
What? LLMs were designed for text; it's in their name: "large language model". Only with specialised encoders like vision transformers were they able to process images as well, but you're absolutely wrong about the original design intent. In the end you just added misinformation. Just save the comment to your favourites and set a reminder to check it again in a few years, like you wanted.