by lostdog
2078 days ago
This is such a great post. It really shows how much room for improvement there is in released deep learning code. Almost none of the open source work is really production ready for fast inference, and tuning the systems requires a good working knowledge of the GPU. The article does skip the most important step for getting great inference speeds: drop Python and move fully into C++.

It's entirely valid to trade off performance for a more straightforward design or a shorter development time, and just throw hardware at the problem as needed. Companies do it all the time.