Hacker News
by Broge 1208 days ago
Yeah I feel like for development, OBM is great and super flexible.

But when you actually want to deploy, a lot of tiny, more efficient models would probably be the best bet.

I read somewhere that a company ended up fine-tuning FLAN-T5 instead of going with GPT-3, which I imagine saved them lots of $$.

1 comment

FLAN-T5 is a very capable model for anything that is non-generative.