by davrosthedalek
1208 days ago
I would like to support this request on behalf of AI-challenged developers :) For things like these, I always wonder:
How much slower would it be to run such a model on a CPU? Clearly it would be a lot less interactive, but is it possible at all? Could the model be chopped up and "streamed" to a GPU with less memory in a halfway efficient way?
What is the current bottleneck on GPUs: memory bandwidth or compute?
|
Yes, models can be split up. See, e.g., Hugging Face Accelerate.
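
To make "split up" a bit more concrete, here is a rough sketch of the general idea: walk through the layers in order and put each one on the first device that still has room, spilling over to the next device (for example CPU RAM) once the current one is full. This is only an illustration and not Accelerate's actual algorithm; the function name `plan_device_map`, the layer sizes, and the memory budgets are all made up for the example.

```python
from typing import Dict, List


def plan_device_map(
    layer_sizes_gb: List[float],
    budgets_gb: Dict[str, float],
) -> Dict[int, str]:
    """Greedily assign consecutive layers to devices.

    Devices are filled in the order they appear in `budgets_gb`
    (hypothetical names such as "gpu0" or "cpu"). Once a layer does
    not fit on the current device, all later layers go to the next one.
    """
    devices = list(budgets_gb.items())
    placement: Dict[int, str] = {}
    dev_idx = 0
    used = 0.0
    for i, size in enumerate(layer_sizes_gb):
        # Move on to the next device while this layer does not fit.
        while dev_idx < len(devices) and used + size > devices[dev_idx][1]:
            dev_idx += 1
            used = 0.0
        if dev_idx == len(devices):
            raise MemoryError(f"layer {i} does not fit on any remaining device")
        placement[i] = devices[dev_idx][0]
        used += size
    return placement


# Made-up numbers: ten 1 GB layers, a 4 GB GPU, and 8 GB of CPU RAM.
placement = plan_device_map([1.0] * 10, {"gpu0": 4.0, "cpu": 8.0})
for layer, device in placement.items():
    print(f"layer {layer} -> {device}")
```

With these made-up numbers the first four layers land on the GPU and the remaining six on the CPU. Layers that end up off the GPU either run there or have to be copied over when needed, which is where the slowdown comes from.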