by emef
2206 days ago
I wrote our internal lightweight version of Neuropod at another SDC startup where we did use TensorRT. Our ML researchers worked in PyTorch, and more often than not the PyTorch -> ONNX -> TensorRT conversion did not work. We ended up needing to replicate the network architecture using the TensorRT library and manually convert the weights from PyTorch. Then we'd use TensorRT's serialization to compile the models so they could be run in C++. I imagine that they may have tried this in Neuropod and seen the same conversion problems. TensorRT was a big investment to get running smoothly, but it did shave 20% or so off our inference latency.
That said, I'm getting ridiculously good performance with it, even without using the Tensor Cores.