Hacker News
by primordialsoup 999 days ago
This is very interesting work, but it's not really an LLM. It doesn't have language abilities. They should have called it a seq2seq model, but I think that term is not in vogue these days :)
1 comment

We use the same architecture as other LLMs, but we include no natural language in our pretraining. We figured a single-domain training corpus would make evaluation easier. We’ll be looking at layering this on top of something like Code Llama next.