by avin_regmi
2659 days ago
Hey, the predictions for both TF Serving and panini serving were run in a single thread on a machine with the same specification. We used a simple image classification model for the CIFAR dataset. Roughly 500 predictions were made for panini and 200 for TF Serving. You can always download all of panini to your own private server and not pay anything, i.e. use Helm to install it in your own Kubernetes cluster, or use DockerHub. For now, we're making it free for models under 2GB. Our main goal is to make it usable, and we don't want cost to be a factor.
|
This seems surprising. What makes it so much faster?
Edit: Unless of course you are hitting the cache for a lot of the predictions?
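
One way to check the cache question is to run the same single-threaded benchmark twice, once with repeated inputs and once with all-unique inputs, and compare throughput. Below is a minimal sketch of that idea, not panini's or TF Serving's actual code: `fake_model`, `cached_predict`, and `throughput` are hypothetical names, the model is a stub that just sleeps, and the cache is Python's `functools.lru_cache`, used purely for illustration.

```python
import time
from functools import lru_cache


def fake_model(x):
    # Hypothetical stand-in for a real inference call (assumption, not a real API).
    time.sleep(0.001)  # simulate a fixed per-prediction cost
    return x % 10      # pretend class label


@lru_cache(maxsize=None)
def cached_predict(x):
    # Serving layer with a result cache in front of the model (illustrative only).
    return fake_model(x)


def throughput(predict, inputs):
    # Single-threaded loop; returns predictions per second.
    start = time.perf_counter()
    for x in inputs:
        predict(x)
    elapsed = time.perf_counter() - start
    return len(inputs) / elapsed


repeated_inputs = [i % 5 for i in range(100)]  # only 5 distinct inputs
unique_inputs = list(range(100))               # every input is new

cached_predict.cache_clear()
rate_repeated = throughput(cached_predict, repeated_inputs)
hits_repeated = cached_predict.cache_info().hits

cached_predict.cache_clear()
rate_unique = throughput(cached_predict, unique_inputs)
hits_unique = cached_predict.cache_info().hits

print(f"repeated inputs: {rate_repeated:.0f} predictions/s ({hits_repeated} cache hits)")
print(f"unique inputs:   {rate_unique:.0f} predictions/s ({hits_unique} cache hits)")
```

If the gap between the two systems shrinks a lot once every input is unique, the cache is probably doing most of the work.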