| HN Mirror



	by occamrazor 868 days ago
	Note that the model is based on RoBERTa and has only 125M parameters. It is not competing against any of the new popular models, not even small ones like Phi or Gemma.

2 comments

jerpint 867 days ago

It’s also not meant to be a generative model, only to be used as an encoder model (they list retrieval as a potential use case).


3abiton 867 days ago

Given the current state of LLMs, I am not even sure this qualifies to be called an LLM.


mistrial9 867 days ago

Second opinion: the BERT family is transformer-based, and that is a big threshold right there. Secondly, I am not sure that two one-minute comments could capture what exactly went on with fine-tuning, graph-based methods of constraint, or whatnot, with respect to the fitness of the production tools for their intended purposes.
