by nullc
1142 days ago
To the extent that you're memory-bandwidth limited, you should be able to run multiple inferences at once. Latency stays high, but getting multiple samples can be extremely useful for many applications and can somewhat make up for the high latency.
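
A rough back-of-the-envelope sketch of why this works, assuming a simple roofline-style model in which each decoding step reads all the weights once (shared by every sequence in the batch) and compute scales linearly with batch size. The function names and hardware numbers are hypothetical, and the model ignores KV-cache and activation traffic, so treat it as an illustration rather than a prediction for any real system.

```python
def step_time_s(batch, weight_bytes, flops_per_token, mem_bw, peak_flops):
    """Estimated time of one decoding step for `batch` sequences."""
    # Weights are streamed from memory once per step, regardless of batch size.
    memory_time = weight_bytes / mem_bw
    # Arithmetic work grows linearly with the number of sequences.
    compute_time = batch * flops_per_token / peak_flops
    # The step is bounded by whichever resource is the bottleneck.
    return max(memory_time, compute_time)


def tokens_per_second(batch, **hw):
    """Aggregate tokens generated per second across the whole batch."""
    return batch / step_time_s(batch, **hw)


# Hypothetical setup: ~7B parameters at 2 bytes each, ~2 FLOPs per parameter
# per token, 1 TB/s memory bandwidth, 100 TFLOP/s peak compute.
HW = dict(weight_bytes=14e9, flops_per_token=14e9, mem_bw=1e12, peak_flops=100e12)

if __name__ == "__main__":
    for batch in (1, 8, 64, 200):
        print(batch, step_time_s(batch, **HW), tokens_per_second(batch, **HW))
```

Under these assumptions the per-step latency is the same for batch sizes 1 and 8, while aggregate throughput grows with the batch. Latency only starts to climb once the batch is large enough that compute, not memory bandwidth, becomes the bottleneck.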