|
|
|
|
|
by treesciencebot
518 days ago
|
|
For traditional LLMs this might be true (especially large MoEs at bs=1) but I highly disagree with "multi-modal models" phrase since most of the models that output in other modalities are generally compute bound. Which means less flops will make the experience so much worse (imagine waiting a couple minutes for an image and hours for videos). |
|