Sorry, noob here trying to make sense of this: do you mean you can extract embeddings from the model file, or that the embeddings are already available in the repo and you can just use those files?
Kind of. You feed the LLM the input text you want a prediction for, then extract the activations of its final hidden layer (i.e. what you get when that layer's weights are applied to the output of the previous layers). That activation vector is the embedding, and you use it as the input to a separate model. This separate model can be any classifier or regression model. A common use case for this is document classification.
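Here's a rough sketch of that pipeline, in case it helps. To keep it runnable without downloading anything, a tiny frozen random network stands in for the LLM; `VOCAB`, `EMBED`, `W_FINAL`, `embed` and `train_logreg` are all made-up names for illustration, and the texts and labels are toy data. With a real model you'd pull the final hidden states instead (in Hugging Face `transformers`, that's `last_hidden_state` on the model output) and would probably use a library classifier rather than the hand-rolled logistic regression here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a frozen, pretrained LLM: a token embedding table
# plus one "final" layer. In practice these weights come from the real model.
VOCAB = {w: i for i, w in enumerate(
    "great loved excellent awful hated terrible the movie was it i".split())}
EMBED = rng.normal(size=(len(VOCAB), 16))
W_FINAL = rng.normal(size=(16, 8))

def embed(text):
    """Return the final-layer activation vector (the embedding) for one text."""
    ids = [VOCAB[w] for w in text.lower().split() if w in VOCAB]
    hidden = EMBED[ids]                  # output of the previous layers: (tokens, 16)
    final = np.tanh(hidden @ W_FINAL)    # final-layer activations: (tokens, 8)
    return final.mean(axis=0)            # mean-pool over tokens into one vector

def train_logreg(X, y, lr=0.5, steps=2000):
    """The separate model: plain logistic regression fit by gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
        grad = p - y
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

# Toy document-classification data (1 = positive, 0 = negative).
texts = ["the movie was great", "i loved it", "it was excellent",
         "the movie was awful", "i hated it", "it was terrible"]
labels = np.array([1, 1, 1, 0, 0, 0])

X = np.stack([embed(t) for t in texts])   # one embedding per document
w, b = train_logreg(X, labels)            # only the small model is trained
preds = (X @ w + b > 0).astype(int)
print("train accuracy:", (preds == labels).mean())
```

The key point is that the "LLM" weights are never updated: it only turns text into vectors, and all the training happens in the small model on top.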