Thanks, but I already worked with this model and it was not good at all for my domain. Therefore, I wanted to finetune llama for my domain and then use llama for embeddings. Should I finetune this model then?
(I want to draw more attention to that "tl;dr", which I would argue is carrying a lot of load in that response: the high-level answer to how one does this using the llama weights is "you don't, as that isn't the right kind of model; you need to use a different model, of which there are many".)