| HN Mirror



	by ntonozzi 1211 days ago
	I think Flan-T5 is fast enough, but I don't think it generates text or handles abstract reasoning at nearly the same level as current GPT-3 models. This indicates a deficiency in the benchmarks and metrics that we use to evaluate LLMs. For generating embeddings it might work well enough, though.

2 comments

billythemaniam 1211 days ago

It's certainly not quite as good out of the box, at least with the open-source checkpoints. However, so far I have found it can achieve similar accuracy with enough examples and/or fine-tuning for my use cases. Like everything, it depends on what you are doing too.


throwaway1851 1211 days ago

For embeddings, it may be overkill. Smaller BERT-type models can provide good embeddings when fine-tuned with a contrastive learning objective. E.g., https://sbert.net.
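
To make "contrastive learning objective" a bit more concrete, here is a minimal sketch of one common variant that uses in-batch negatives: each anchor embedding should be closer to its own paired positive than to any other positive in the batch. This is plain Python over toy vectors, not the API of any particular library. The function names and the `scale` value of 20.0 are assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_contrastive_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over the batch (hypothetical helper).

    positives[i] is the correct match for anchors[i]; every other
    entry of `positives` is treated as a negative for that anchor.
    `scale` sharpens the softmax and is an illustrative assumption.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true pair.
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings: correctly paired vs. deliberately mismatched.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = in_batch_contrastive_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
shuffled = in_batch_contrastive_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
```

In actual fine-tuning the vectors would come from the encoder, and the loss would be minimized by gradient descent so that paired texts end up with nearby embeddings. Here `aligned` comes out far smaller than `shuffled`, which is the signal the training would follow.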
