| HN Mirror



	by primordialsoup 1046 days ago
	This is very interesting work, but it's not really an LLM. It doesn't have language abilities. They should have called it a seq2seq model, but I think that term is not in vogue these days :)

1 comment

cec 1045 days ago

We use the same architecture as other LLMs, but we include no natural language in our pretraining. We figured a single-domain training corpus would make evaluation easier. Next, we’ll be looking at layering this on top of something like Code Llama.
