by kahnjw
2502 days ago
This is just not true. The Python runtime is not the bottleneck. DL frameworks are DSLs written on top of piles of highly optimized C++ code that executes as independently of the Python runtime as possible. Optimizing the Python or swapping it out for some other language is not going to buy you anything except a ton of work. We can argue about using Rust instead of C++ to implement the lower-level ops. That might be sensible, though not from a performance perspective. In a "serving environment" where latency actually matters, there are already a plethora of solutions for running models directly from a C++ binary, no Python needed. This is a solved problem, and people trying to reinvent the wheel with "optimized" implementations are going to be disappointed when they realize their solution doesn't improve anything.
Yeah, it’s not reasonable right now because Python has the best ecosystem. But that will not always be the case!