|
|
|
|
|
by OJFord
234 days ago
|
|
I believe GP means it still to be connected to 'if this kind of latency is unacceptable to you' - i.e. you can't load/use/unload, you have to keep it in RAM all the time. In that case it's massively increasing your memory requirement not just to the peak the model needs, but to + whatever the other biggest use might be that'll be inherently concurrent with it. |
|