by waleedka 3152 days ago
This architecture is optimized for accuracy rather than speed. The official paper reports an inference time of about 200ms per image on a GPU. This implementation is likely a bit slower because a couple of its layers run in Python. That part is easy to optimize, but we haven't gotten around to it yet.

That said, there are a lot of things you could do to make this much faster. For example, use ResNet50 instead of ResNet101 as the backbone. You can also reduce the number of anchors or the number of proposals that get classified. Either change should speed up inference significantly at the cost of a small loss in accuracy.
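
To make those knobs concrete, here is a rough sketch in Python. Every name in it (DetectorConfig, anchors_per_image, the field names, and the default values) is hypothetical and chosen for illustration; it is not this implementation's actual config. It also assumes a feature pyramid with five levels and one anchor scale per level, which may not match your setup.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DetectorConfig:
    """Hypothetical settings; the names and defaults are assumptions."""
    backbone: str = "resnet101"
    anchor_ratios: tuple = (0.5, 1.0, 2.0)
    anchor_stride: int = 1        # 1 = an anchor at every feature-map cell
    max_proposals: int = 1000     # proposals passed on to the classifier


def anchors_per_image(cfg, image_size=1024, feature_strides=(4, 8, 16, 32, 64)):
    """Count anchors for a square image, assuming one scale per pyramid level."""
    total = 0
    for stride in feature_strides:
        cells_per_side = image_size // stride
        # Ceiling division: sample every `anchor_stride`-th cell along each side.
        sampled = (cells_per_side + cfg.anchor_stride - 1) // cfg.anchor_stride
        total += sampled * sampled * len(cfg.anchor_ratios)
    return total


accurate = DetectorConfig()
# Trade a little accuracy for speed: lighter backbone, sparser anchors,
# and fewer proposals to classify.
fast = replace(accurate, backbone="resnet50", anchor_stride=2, max_proposals=300)

print(anchors_per_image(accurate), anchors_per_image(fast))
```

Under these assumed settings, doubling the anchor stride cuts the anchor count by 4x (261,888 down to 65,472 for a 1024x1024 image). How much wall-clock time that actually saves depends on the hardware and the rest of the pipeline, so measure before and after.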