by viig99 2116 days ago
Yes, accuracy, latency & throughput are the 3 poles we try to achieve; C++ helps with latency & throughput and helps keep the cost low.

Why would C++ help with latency in comparison to, say, Python with numpy / numba / Cython? All the production-critical “this needs to be as fast as possible” stuff I’ve ever worked on has been all Python, achieving complete speed parity with C, at a much faster development speed and with way, way less boilerplate code.
If you have hard constraints at inference time, then it can be much easier to tune to a time budget with C++.

Like, it's normally not worth it, but when you need it, you really need it.
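
To make "tune to a time budget" concrete, here is a minimal sketch of checking a tail-latency budget, using only the Python standard library. Everything in it is an assumption for illustration: `fake_inference` is a hypothetical stand-in for a real model call, and the 5 ms budget and p99 target are made-up numbers, not anything stated in this thread.

```python
import statistics
import time


def measure_latencies_ms(fn, runs=200, warmup=20):
    """Call fn repeatedly and return per-call wall-clock latencies in ms."""
    for _ in range(warmup):  # untimed warm-up calls
        fn()
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def within_budget(latencies_ms, budget_ms, percentile=99):
    """Return (ok, observed), where observed is the requested percentile."""
    # quantiles(n=100) returns 99 cut points; index p - 1 is the p-th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100)
    observed = cuts[percentile - 1]
    return observed <= budget_ms, observed


def fake_inference():
    # Hypothetical stand-in for a model call (assumption, not a real workload).
    sum(i * i for i in range(1000))


if __name__ == "__main__":
    lat = measure_latencies_ms(fake_inference)
    ok, p99 = within_budget(lat, budget_ms=5.0)  # 5 ms is an illustrative budget
    print(f"p99 = {p99:.3f} ms, within budget: {ok}")
```

The check looks at a tail percentile rather than the mean because a hard inference-time constraint is usually violated by the slow outliers, not by the typical call.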

I definitely agree that could be a case where you want a statically compiled module that avoids any interpreted-language overheads or high-cost abstractions. But what would make C++ easier to write, tune, integrate, or deploy in that case than using Cython to create the C++ extension for you?
I dunno man, I was always against running stuff in C++ if I didn't have to, but I got overruled. I guess that the high availability of C++ developers helped swing the decision.
I personally find C++ + pybind11 vastly easier to work with; also, transitioning completely to C++ from there was a pretty small leap.
Interesting, I’ve never heard anyone who frequently uses Python and C++ together express this preference; it’s always the other direction, that Cython is easier.
PyTorch is pybind11 + C++