by Jackson__
980 days ago
I too would like to know about the training dataset. I just took a look at the one for LLaVA[0] and found that they used a fairly large number of BLIP auto-generated captions. This seemed a bit surreal to me, like trying to train an LLM on the outputs of a smaller, worse-performing LLM.

[0] https://github.com/haotian-liu/LLaVA/blob/main/docs/Data.md#...