|
|
|
|
|
by 323454
1568 days ago
|
|
Sorry if this was unclear - in a datacenter use case you are right, but for an edge deployment, you will usually need to quantize, prune or compress your ML model to get it working as fast as you'd like on a sufficiently small CPU/GPU. Compared with running your ML model unchanged on those platforms, Tensil can run with the performance ranges listed above. You can also quantize and use Tensil too! |
|