| The 20-100 times slower figure is a bit cherry-picked, but the use case does matter. From a user's perspective, startup time is typically either manageable or, for long-running services, imperceptible, although there are other costs. If you look at the examples that make the above claim, they are almost always tiny toy programs where the cost of producing byte/machine code isn't easily amortized. This quote from the post is an oversimplification too: "But the program will then run into Amdahl's law, which says that the improvement for optimizing one part of the code is limited by the time spent in the now-optimized code." I am a huge fan of Amdahl's law, but I also realize it is pessimistic and most realistic for parallelization. It runs into serious issues when you are multiprocessing vs. parallel processing, due to preemption, etc. Yes, you still have the costs of abstractions, etc., but in today's world, zero pages on AMD, 16k pages and a large number of mapped registers on Arm, barrel shifters, etc. make that much more complicated, especially with C being forced into trampolines. If you actually trace the CPU operations, the actual operations for 'math' are very similar. That said, modern compilers are a true wonder. Interpreted languages are often all that is necessary and sufficient, especially when you have the Internet, databases, and other aspects of the system that also restrict the benefits of the speedups due to... Amdahl's law. |
In summary, it depends. I am talking about compute performance, not I/O or general purpose task benchmarking. Yes, if you have a mix of compute and I/O (which admittedly is a typical use case), it isn't going to be 20-100x slower, but more likely "only" 3-20x slower. If it is nearly 100% I/O bound, it might not be any slower at all (or even faster if properly buffered). If you are doing number crunching (w/o a C lib like NumPy), your program will likely be 40-100x slower than doing it in C, and many of these aren't toy programs.
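To put rough numbers on the "it depends" above, here is a minimal back-of-the-envelope sketch in Python. It is not a benchmark: the function names are made up for illustration, the 50x compute slowdown is an assumed figure picked from inside the 20-100x range discussed here, and the model assumes I/O time is unchanged and does not overlap with compute.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Amdahl's law: overall speedup when a fraction p of the original
    runtime is made s times faster and the rest is left untouched."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0.0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


def mixed_slowdown(compute_fraction: float, compute_slowdown: float) -> float:
    """Same idea in reverse: overall slowdown relative to a baseline
    (say, a C version) when only the compute part gets slower.

    compute_fraction is the share of the *baseline* runtime spent on
    compute; the remaining share (I/O, waiting) is assumed unchanged.
    """
    if not 0.0 <= compute_fraction <= 1.0:
        raise ValueError("compute_fraction must be between 0 and 1")
    if compute_slowdown <= 0.0:
        raise ValueError("compute_slowdown must be positive")
    return (1.0 - compute_fraction) + compute_fraction * compute_slowdown


if __name__ == "__main__":
    # Hypothetical: the interpreter is 50x slower on pure compute.
    for fraction in (0.0, 0.1, 0.5, 1.0):
        overall = mixed_slowdown(fraction, 50.0)
        print(f"compute share {fraction:.0%}: about {overall:.1f}x slower overall")
```

Under these assumptions, a fully I/O-bound program sees no slowdown, a pure number-crunching one sees the full 50x, and mixed workloads land in between. Real programs can deviate in either direction (buffering, overlapping I/O with compute, cache effects), so treat this only as an illustration of why the compute share dominates the answer.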