by mr_magoo
1161 days ago
I've also been struggling to run anything but the smallest model you have shared on Paperspace:

    import torch
    from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM

    generate_text = pipeline(model="databricks/dolly-v2-6-9b", torch_dtype=torch.bfloat16, trust_remote_code=True, device=0)
    generate_text("Explain to me the difference between nuclear fission and fusion.")

This causes the kernel to crash, even though the GPU should be plenty:

    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 510.73.05    Driver Version: 510.73.05    CUDA Version: 11.6     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  Quadro P6000        Off  | 00000000:00:05.0 Off |                  Off |
    | 26%   45C    P8    10W / 250W |   6589MiB / 24576MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+

I'm extremely excited to try these models, but this has been by far the most difficult experience I've ever had trying to do basic inference.
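
As a rough sanity check on the "should be plenty" claim, here is a back-of-the-envelope sketch of the arithmetic. It assumes about 6.9 billion parameters (read off the model name, not verified against the checkpoint) and 2 bytes per parameter for bfloat16, and it counts only the weights. The memory figures are taken from the nvidia-smi output above; the names `PARAMS`, `BYTES_PER_PARAM` and `weights_mib` are made up for illustration.

```python
# Rough estimate only: model weights in bfloat16, ignoring activations,
# generation cache, and CUDA context overhead.
PARAMS = 6.9e9        # assumption: parameter count inferred from "dolly-v2-6-9b"
BYTES_PER_PARAM = 2   # bfloat16 is a 16-bit format

TOTAL_MIB = 24576     # total GPU memory, from the nvidia-smi output above
USED_MIB = 6589       # memory already in use, from the same output


def weights_mib(params, bytes_per_param):
    """Return the approximate size of the weights in MiB."""
    return params * bytes_per_param / 2**20


need = weights_mib(PARAMS, BYTES_PER_PARAM)
free = TOTAL_MIB - USED_MIB
print(f"weights: ~{need:.0f} MiB, free: {free} MiB, fits: {need < free}")
```

Under these assumptions the weights alone fit in the free GPU memory. The estimate says nothing about host RAM used while loading the checkpoint, which the nvidia-smi output does not show.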