by wokwokwok 1250 days ago
> or are those small models completely unusable for anything?

Sadly, they really offer almost no value.

For the effort, you’re better off with an NLP framework like spaCy.

You can play with the small GPT-Neo models on Hugging Face, e.g. https://huggingface.co/EleutherAI/gpt-neo-125M

…but the tl;dr is that they’re cute to play with; in practice, the content they generate is short, inconsistent, and full of errors.

1 comment

...which is actually not "almost no value." The value of smaller models is just different. For example, I have anonymized data with fields removed. The smaller models do fine at filling those fields in with plausible values.

The smaller models do okay at zero-shot clustering of data in many cases (e.g. liberal versus conservative text), and where they don't, minimal training usually gets them there. For generating statistics or probabilistic information about large amounts of text, they're great.
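
For anyone wondering what zero-shot labeling with a small model might look like, here is a rough sketch and not necessarily how I do it. The idea is to ask which label the model finds most likely as a continuation of the text, then tally the labels over a pile of documents. The function names, the prompt template, and the toy scorer are all made up for illustration. The toy scorer only exists so the snippet runs without downloading anything. `make_lm_scorer` assumes `torch` and `transformers` are installed and uses the gpt-neo-125M checkpoint linked upthread.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Sequence

# A scorer returns how plausible `continuation` is after `context` (higher is better).
Scorer = Callable[[str, str], float]


def zero_shot_label(text: str, labels: Sequence[str], score: Scorer,
                    template: str = "Text: {text}\nThis text is") -> str:
    """Return the label the scorer rates as the most plausible continuation."""
    context = template.format(text=text)
    scores = {label: score(context, " " + label) for label in labels}
    return max(scores, key=scores.get)


def label_counts(texts: Iterable[str], labels: Sequence[str], score: Scorer) -> Dict[str, int]:
    """Tally predicted labels over many texts (the 'statistics' use case)."""
    return dict(Counter(zero_shot_label(t, labels, score) for t in texts))


def toy_scorer(context: str, continuation: str) -> float:
    """Hypothetical stand-in for a language model: counts hand-picked cue words."""
    cues = {" positive": {"great", "love"}, " negative": {"awful", "hate"}}
    words = set(context.lower().split())
    return float(len(words & cues.get(continuation, set())))


def make_lm_scorer(model_name: str = "EleutherAI/gpt-neo-125M") -> Scorer:
    """Build a scorer from a small causal LM (imports are lazy; downloads the model)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    def score(context: str, continuation: str) -> float:
        ctx_ids = tokenizer(context, return_tensors="pt").input_ids
        full_ids = tokenizer(context + continuation, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(full_ids).logits
        # Row i of log_probs is the distribution over the token at position i + 1.
        log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
        targets = full_ids[0, 1:]
        token_lp = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
        # Sum log-probs of the continuation tokens only. This assumes the context
        # tokenizes to a prefix of the full string, which usually holds when the
        # continuation starts with a space.
        return token_lp[ctx_ids.shape[1] - 1:].sum().item()

    return score


if __name__ == "__main__":
    docs = ["I love this and it is great", "I hate this and it is awful"]
    print(label_counts(docs, ["positive", "negative"], toy_scorer))
```

Swapping `toy_scorer` for `make_lm_scorer()` gives the real thing. How well a 125M-parameter model separates any particular pair of labels is something you'd have to check on your own data.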

GPT-3, they're not, but I use them in my day-to-day work quite a bit more than I thought I would. I bought a GPU for one purpose, and I find I spin it up a lot these days.

I /really/ want to be able to use a large-scale language model locally, though. For the types of things I'd like it for, such as helping me draft emails, I don't trust OpenAI with my data.