|
|
|
|
|
by ftxbro
1139 days ago
|
|
So I tried RedPajama-INCITE-Instruct-7B-v0.1 and the AutoModelForCausalLM.from_pretrained(...) call takes two minutes every time. My GPU is big enough. I don't know why it's so slow. I feel like it's somehow precomputing stuff that can be used across queries, and I had hoped that this stuff would have already been precomputed on the disk and I could just load it up. |
|