by kuang_eleven 1266 days ago
Because it honestly doesn't matter most of the time.

In the majority of use cases, your runtime is dominated by I/O. For the remaining use cases, you either have low-level functions written in other languages and wrapped in Python (numpy, etc.), or you genuinely have a case that Python is a terrible fit for (e.g. low-level graphics programming or embedded).

Why bother making a new variant language with limitations and no real benefit?


This perspective is a common one, but it lacks credibility.

https://twitter.com/id_aa_carmack/status/1503844580474687493...

Look, Carmack is a genius in his own corner, but he is taking that quote vastly out of context. The actual linked article[1] is quite fascinating, and does point to overhead costs as being a potential bottleneck, but that specific quote is more to do with GPUs being faster than CPUs rather than anything about Python in particular.

More specifically, overhead (Python + PyTorch in this case) is often a bottleneck when tensors are comparatively small. The article also claims that overhead largely doesn't scale with problem size, so the overhead would only matter when running very small tensor operations in PyTorch under a very tight latency requirement. This is... rare in practice, but if it does occur, then sure, that's a good reason to not use Python as-is!
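
To put rough numbers on the "overhead doesn't scale with problem size" claim, here is a back-of-the-envelope sketch. The 10 µs per-operation overhead and 1 ns per-element compute cost are made-up assumptions for illustration, not measurements from the article or from PyTorch, and `overhead_fraction` is a hypothetical helper.

```python
def overhead_fraction(n_elements, overhead_s=10e-6, per_element_s=1e-9):
    """Share of total time spent on fixed per-call overhead.

    overhead_s and per_element_s are illustrative assumptions,
    not measured values.
    """
    compute_s = n_elements * per_element_s  # grows with problem size
    return overhead_s / (overhead_s + compute_s)  # overhead stays constant


if __name__ == "__main__":
    for n in (1_000, 1_000_000, 1_000_000_000):
        share = overhead_fraction(n)
        print(f"{n:>13,} elements: {share:.2%} of the time is overhead")
```

Under those assumptions the fixed cost dominates only for tiny inputs and becomes negligible as the tensors grow, which is the shape of the argument above.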

1. https://horace.io/brrr_intro.html

You’ve posted this multiple times in this thread, and not once has it been relevant to the point being made. You are sticking your fingers in your ears and deferring to a contextless tweet by a celebrity.
If you want somebody to engage in a serious discussion, insulting them is not the way to go.

The post is highly relevant. Next time, if you don't understand why, just ask.

His code has had 16.6ms to execute since before a lot of people here were born. Of course Python is hopeless in his domain. Its creator and development team will be the first to admit this.
John Carmack is hardly an unbiased source.

In any case, if your program is waiting on network or file I/O, who cares whether the CPU could have executed one FLOP's worth of bytecode or 9.75 million FLOPs' worth of native instructions in the meantime?

It's trivial to show that this is true for most software. Luckily, modern OSes are able to measure and report various performance stats for processes.

You can open some software such as htop right now, and it will show how much CPU time each process on your system has actually used. On my system the vast majority of processes spend the majority of their time doing nothing.

Is it true for all software? Of course not! My compositor, for example, spends a lot of time doing software compositing, which is fairly expensive, and the stats show this quite clearly.
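
The same CPU-time-versus-wall-time comparison can be made from inside a single process. This is a minimal Python sketch, assuming `time.sleep` is a fair stand-in for waiting on network or file I/O; `measure`, `io_like_task`, and `cpu_task` are made-up names for illustration, and the exact numbers will vary by machine.

```python
import time


def measure(task):
    """Return (wall_seconds, cpu_seconds) for one call to task()."""
    wall_start = time.perf_counter()   # wall-clock time, includes waiting
    cpu_start = time.process_time()    # CPU time of this process, excludes sleep
    task()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def io_like_task():
    # Assumption: sleeping approximates blocking on a socket or disk read.
    time.sleep(0.2)


def cpu_task():
    # Pure-Python arithmetic, so the interpreter is busy the whole time.
    sum(i * i for i in range(200_000))


if __name__ == "__main__":
    for name, task in (("io-like", io_like_task), ("cpu-bound", cpu_task)):
        wall, cpu = measure(task)
        print(f"{name:>9}: wall={wall:.3f}s cpu={cpu:.3f}s")
```

For the I/O-like task the CPU time should be a small fraction of the wall time, which is what the idle processes in htop look like from the inside.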

The "vast majority of software" is now defined as "processes that happen to run in the background on chlorion's machine"? That reasoning is not sound.
I would add high-volume parsing / text processing to the list of bad fits for Python, although I'm not sure if there are native extensions for the different use cases?
Quite possibly; do you specifically mean NLP work? I'll admit, it's not something I work in myself. spaCy seems to be the go-to high-performance NLP library and does appear to use C under the hood, but I couldn't say how it performs compared to other languages.
I had SAX-style parsing of XML and XSL transformation as concrete use cases in mind, because that happened to be what I worked with. I believe I went with Node.js at the time, which had a library that was much easier to work with than what was common for Python. Although, I mostly used Microsoft's or Saxon's XSLT processors overall in that job.
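
For readers who haven't seen it, SAX-style parsing means reacting to a stream of start/end events instead of building the whole tree in memory. A minimal sketch with Python's standard-library `xml.sax` follows; `ElementCounter` and `count_elements` are hypothetical names, and this is not the Node.js library or the XSLT setup from that job.

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Counts how often each element name starts in the event stream."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once per opening tag; no document tree is ever built.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(xml_bytes):
    handler = ElementCounter()
    xml.sax.parseString(xml_bytes, handler)
    return handler.counts


if __name__ == "__main__":
    sample = b"<orders><order id='1'/><order id='2'/></orders>"
    print(count_elements(sample))
```

Whether this is fast enough at high volume is a separate question; the sketch only shows the programming model.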

Another use case was parsing proprietary, text-based file formats with ancient encodings. I believe I did use Python for that as there wasn't that much data to convert anyway and it just worked.
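
As a rough idea of what that kind of conversion can look like in Python, here is a sketch. The `key=value` line format, the `#` comment convention, and the cp437 default are all assumptions invented for illustration, since the actual proprietary formats and encodings aren't described; `parse_records` is a hypothetical helper.

```python
def parse_records(raw: bytes, encoding: str = "cp437"):
    """Decode legacy-encoded bytes and extract key/value pairs.

    The key=value layout and '#' comments are a hypothetical format.
    """
    # errors="replace" keeps going on undecodable bytes instead of raising.
    text = raw.decode(encoding, errors="replace")
    records = []
    for line in text.splitlines():  # handles \n, \r\n, and \r line endings
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without a separator
            records.append((key.strip(), value.strip()))
    return records


if __name__ == "__main__":
    # 0x81 is "ü" in code page 437.
    legacy = b"name=M\x81ller\r\n# exported from old system\r\nunits=mm\r\n"
    print(parse_records(legacy))
```

For small files like the ones described, the decode step is a one-liner and performance is unlikely to matter.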