by emef 2206 days ago
I wrote our internal lightweight version of neuropod at another SDC startup where we did use TensorRT. Our ML researchers worked in PyTorch, and more often than not the PyTorch -> ONNX -> TensorRT conversion did not work. We ended up needing to replicate the network architecture using the TensorRT library and manually convert the weights from PyTorch. Then we'd use the TensorRT serialization to compile the models so they could be run in C++. I imagine that they may have tried this in neuropod and saw the same conversion problems. TensorRT was a big investment to get running smoothly, but it did shave 20% or so off our inference latency.
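
For anyone curious what the "manually convert the weights" step can look like, here is a rough sketch in plain Python. It deliberately makes no TensorRT or PyTorch calls. Everything in it is an assumption for illustration: `LAYER_SPEC`, `convert_weights`, the layer names, and the shapes are hypothetical, and the exported weights are assumed to arrive as nested lists keyed by parameter name. The idea is only to check the export against the hand-rebuilt architecture and produce flat float32 buffers that a layer-by-layer network builder could consume.

```python
from array import array

# Hypothetical spec of the hand-rebuilt network: parameter name -> expected shape.
LAYER_SPEC = {
    "conv1.weight": (2, 1, 3, 3),
    "conv1.bias": (2,),
    "fc.weight": (4, 2),
    "fc.bias": (4,),
}


def num_elements(shape):
    """Number of scalars in a tensor of the given shape."""
    n = 1
    for dim in shape:
        n *= dim
    return n


def flatten(nested):
    """Flatten arbitrarily nested lists of numbers in row-major order."""
    if isinstance(nested, (int, float)):
        return [float(nested)]
    out = []
    for item in nested:
        out.extend(flatten(item))
    return out


def convert_weights(exported, spec):
    """Return {name: array('f')} of flat float32 buffers.

    Raises KeyError if the exported names and the spec disagree, and
    ValueError if a parameter has the wrong number of elements. Only the
    element count is checked, not the exact nesting of the input.
    """
    missing = sorted(set(spec) - set(exported))
    unexpected = sorted(set(exported) - set(spec))
    if missing or unexpected:
        raise KeyError(f"missing={missing} unexpected={unexpected}")

    buffers = {}
    for name, shape in spec.items():
        flat = flatten(exported[name])
        if len(flat) != num_elements(shape):
            raise ValueError(
                f"{name}: expected {num_elements(shape)} values, got {len(flat)}"
            )
        buffers[name] = array("f", flat)  # 'f' is a C float (float32)
    return buffers
```

Failing loudly on a missing or unexpected parameter is the point of the sketch: when the architecture is rebuilt by hand, a silent name or size mismatch is the easiest mistake to make.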
1 comment

It's gotten better in TensorRT 7. I'm using it quite successfully. It does have a lot of corner cases though, that much is true, and the documentation is really poor, which, coupled with it being mostly closed source, severely limits adoption.

That said, I'm getting ridiculously good performance with it, even without using the TensorCores.