| How bad is bad? What sort of output are we talking about? I am asking because I once had a Markov-chain IRC bot* and, while it often struggled to string together a sentence, it was quite hilarious sometimes. Absolutely pointless other than the occasional laugh. Can it form sentences, or are those small models completely unusable for anything? I'm not thinking of OpenAI-level uses; it's more like comparing a Postgres cluster to a SQLite file (not literally, conceptually I guess). Can it be used for single tasks in any way? Could it figure out how to map search terms to URLs for a knowledge base type thing? Forgive me if these are silly questions. The extent of my knowledge in this field is asking ChatGPT questions and going "that's so cool" when it answers. * Your phone's predictive text, except it finishes the sentence itself based on a word someone in chat used, so that it felt on topic. In my case it also learned how to form sentences from other people talking in chat; in hindsight it's amazing I never had a Tay issue. https://en.m.wikipedia.org/wiki/Tay_(bot) |
If you are interested in running these models yourself without having a beefy GPU, you can try my custom inference implementation. It's written in pure C/C++ with no third-party dependencies, runs entirely on the CPU, and builds very easily. I think it is relatively well optimised. For example, on a MacBook M1 Pro I can run GPT-2 XL (1.5B params) at 42 ms/token and GPT-J / GPT-JT (6B params) at 125 ms/token.
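To put those per-token timings in perspective, here is a rough conversion to tokens per second and to the wait for a whole completion. The 100-token completion length is only an illustrative assumption, and the helper names are made up for this sketch:

```python
def tokens_per_second(ms_per_token):
    """Convert a per-token latency in milliseconds to throughput."""
    return 1000.0 / ms_per_token


def seconds_for(n_tokens, ms_per_token):
    """Time to generate n_tokens at a fixed per-token latency."""
    return n_tokens * ms_per_token / 1000.0


# Timings quoted above for a MacBook M1 Pro.
timings_ms = {"GPT-2 XL (1.5B)": 42, "GPT-J / GPT-JT (6B)": 125}

for name, ms in timings_ms.items():
    rate = tokens_per_second(ms)
    wait = seconds_for(100, ms)  # 100 tokens is an assumed completion length
    print(f"{name}: {rate:.1f} tokens/s, {wait:.1f} s per 100 tokens")
```

So GPT-J at 125 ms/token works out to 8 tokens per second, which is slow but usable for short completions.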
Here are a couple of generated examples using GPT-J:
https://github.com/ggerganov/ggml/tree/master/examples/gpt-j
These examples use a zero-shot prompt, where the model auto-completes a text given only a starting prompt. You can try to build a conversation bot with a few-shot prompt, but the results are not great. The model probably needs some fine-tuning for that to become feasible.
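If you want to experiment with the few-shot approach, a minimal sketch of how such a prompt could be assembled is below. The "User"/"Bot" tags, the example exchanges, and the function names are all assumptions made for illustration and are not part of the linked code; the resulting string would simply be passed to the model as the starting prompt.

```python
def build_few_shot_prompt(examples, user_message, user_tag="User", bot_tag="Bot"):
    """Join example exchanges and the new message into one completion prompt."""
    lines = []
    for question, answer in examples:
        lines.append(f"{user_tag}: {question}")
        lines.append(f"{bot_tag}: {answer}")
    lines.append(f"{user_tag}: {user_message}")
    # End on the bot tag so the model's continuation is the bot's reply.
    lines.append(f"{bot_tag}:")
    return "\n".join(lines)


def extract_reply(completion, user_tag="User"):
    """Keep only the first reply: cut where the model starts a new user turn."""
    return completion.split(f"\n{user_tag}:", 1)[0].strip()


# Hypothetical example exchanges that show the model the expected format.
examples = [
    ("What is the capital of France?", "Paris."),
    ("How many legs does a spider have?", "Eight."),
]

prompt = build_few_shot_prompt(examples, "What colour is the sky?")
print(prompt)

# A made-up raw completion, to show how the reply would be trimmed.
print(extract_reply(" Blue.\nUser: What about at night?\nBot: Black."))
```

A base model tends to keep going and invent further turns of the conversation, which is why the sketch cuts the output at the next user tag.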