by jjice
1144 days ago
Does anyone have any resources they recommend for just understanding the base terminology of models like this? I always see the terms "weights", "tokens", "model", etc. I feel like I understand what these mean, but I have no idea why I need to care about them in open models like this. If I were to download an open model to run on my machine, would I download the weights? I guess I'm just ignorant in the ML space, but I'm not sure where to start.
I have felt the same in the past about a completely different topic. I know how it feels: it's like people aren't calling things what they are, just using weird words.
"weights" - synapses in the AI brain; the learned numbers that make up the file you download
"tokens" - word fragments
"model" - of course, the model is the AI brain
"context" - the model can only handle a limited amount of text at once (you can't put whole books in); that limited window is the context
"GPT" - predicts the next token, trained on everything; if you feed its last predicted token back in, it can write long texts
"LoRA" - a lightweight plug-in model for tweaking the big model
"loss" - a score telling how bad the output is
"training" - change the model until it fits the data
"quantisation" - making a low-precision version of the model because it still works, but now is much smaller, needs less memory, and often runs faster
"embedding" - just a vector; it stands for the meaning of a word token or a patch of an image; these embeddings are learned
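
If a concrete toy helps, here is a minimal sketch in Python that puts several of the words above in one place. Everything in it is an assumption made up for illustration: the function names, the whitespace "tokenizer", and the trick of counting which word follows which. A real GPT is a neural network trained by gradient descent, not a count table, but the roles of tokens, weights, context, loss, and the feed-it-back-in loop are roughly the same.

```python
import math

def tokenize(text):
    # Toy tokenizer: splits on whitespace. Real models use sub-word fragments.
    return text.lower().split()

def train(tokens):
    # "Training": fit the model to the data. Here the "weights" are just
    # counts of which token follows which (hypothetical stand-in for the
    # billions of learned floats in a real model).
    weights = {}
    for prev, nxt in zip(tokens, tokens[1:]):
        row = weights.setdefault(prev, {})
        row[nxt] = row.get(nxt, 0) + 1
    return weights

def next_token_probs(weights, context):
    # Context window of one token: the model only looks at the last token.
    row = weights.get(context[-1], {})
    total = sum(row.values())
    return {tok: count / total for tok, count in row.items()} if total else {}

def generate(weights, prompt, max_new_tokens):
    # Predict the next token, append it, and feed the result back in.
    out = tokenize(prompt)
    for _ in range(max_new_tokens):
        probs = next_token_probs(weights, out)
        if not probs:
            break
        # Greedy choice; ties are broken alphabetically to stay deterministic.
        out.append(max(sorted(probs), key=probs.get))
    return out

def loss(weights, tokens):
    # Average negative log-probability of each actual next token.
    # Lower is better; unseen continuations get a tiny probability.
    total = 0.0
    for i in range(1, len(tokens)):
        p = next_token_probs(weights, tokens[:i]).get(tokens[i], 1e-9)
        total += -math.log(p)
    return total / (len(tokens) - 1)

def quantize(values, levels=127):
    # Store floats as small integers plus one shared scale factor.
    scale = max((abs(v) for v in values), default=0.0) / levels or 1.0
    return [round(v / scale) for v in values], scale

def dequantize(ints, scale):
    # Recover approximate floats; some precision is lost on purpose.
    return [q * scale for q in ints]

corpus = tokenize("the cat sat on the mat and the cat slept")
model = train(corpus)
print(generate(model, "the", 3))  # ['the', 'cat', 'sat', 'on']
print(loss(model, corpus))        # small: the model fits its training text
print(loss(model, tokenize("the mat sat")))  # larger: "mat sat" was never seen
```

The quantisation helpers are separate from the toy model: `quantize([0.5, -1.0, 0.25])` returns a list of small integers and a scale, and `dequantize` gives back values close to, but not exactly, the originals, which is the trade being made when a model is quantised.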