by alextttty
668 days ago
Yeah, we want to do exactly this: benchmark and add more data from different GPUs and cloud providers. We'd appreciate your help a lot!
There are many inference engines that can be tested and updated to find the best inference methods.
|
It's a lot of work. Your target users are companies that use Runpod and AWS/GCP/Azure, not Fireworks and Together. Those two are in the game of selling tokens; you are selling the cost of running seconds on GPUs.