Hacker News
by bilsbie 972 days ago
I’m guessing that for more embedded, low-power, or real-time applications you’d still need to train a model?

I’d imagine you wouldn’t have the resources to run a souped-up foundation model?

1 comment

Thanks for your question. You're right that current vision-language foundation models are quite heavy. In NLP, however, there is already some work on smaller foundation models. You could also use a foundation model to help train your smaller model, or to label more data for it.
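
To make those last two options a bit more concrete, here is a rough sketch, not taken from any particular library. It assumes the large model (the "teacher") and your small model (the "student") both output class logits. The function names, the temperature of 2.0, and the 0.9 confidence threshold are all made up for illustration. The first function is a standard knowledge-distillation loss (KL divergence on temperature-softened outputs), the second keeps a teacher prediction as a label only when the teacher is confident.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared.

    The temperature value is an illustrative assumption, not a recommendation.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

def pseudo_label(teacher_logits, min_confidence=0.9):
    """Return the teacher's predicted class index, or None if it is not confident enough.

    The 0.9 threshold is a hypothetical default for this sketch.
    """
    probs = softmax(teacher_logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= min_confidence else None
```

In practice you would run the heavy model offline on a workstation or in the cloud, store its logits or pseudo-labels, and train only the small model that ships to the embedded device.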