Hacker News
by budwin 5693 days ago
"- It is compiled into machine code (no interpreter, unlike Python)."

Does anyone have any kind of measure of how bytecode compares to machine code these days? (Most "interpreted" languages have a hidden compile step these days.) I understand that the answer depends largely on the runtime and the task being executed.

3 comments

Compare on what basis?

Speed? http://shootout.alioth.debian.org/ is an up-to-date, super detailed answer to that question. A rule of thumb simplification I go by is: current Python and Ruby implementations are at least 10x slower than C doing algorithmic work.
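
For a rough feel of the interpreter overhead without leaving Python, here is a minimal sketch that times a pure-Python loop against the built-in sum, whose loop runs in C inside CPython. This is only a proxy and not a real Python-vs-C benchmark; the function name python_loop and the sizes are made up for illustration, and the ratio you see will vary by machine and Python version.

```python
import timeit

N = 100_000  # arbitrary size chosen for illustration
data = list(range(N))

def python_loop(values):
    # Every iteration here is dispatched through the bytecode interpreter.
    total = 0
    for v in values:
        total += v
    return total

loop_total = python_loop(data)
builtin_total = sum(data)  # same work, but the loop runs in CPython's C code

# Time 20 repetitions of each; absolute numbers depend on the machine.
loop_time = timeit.timeit(lambda: python_loop(data), number=20)
builtin_time = timeit.timeit(lambda: sum(data), number=20)
print(f"interpreted loop: {loop_time:.4f}s, built-in sum: {builtin_time:.4f}s")
```

Both versions compute the same total; only the time spent differs.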

A really good bytecode VM can, of course, get much closer to C performance (as shown by Java, C# or even modern JavaScript implementations), especially if the JIT is given enough time to profile the app at runtime and generate highly optimized code based on that profile data.

There are of course other aspects you can compare. Bytecode is, inherently, cross-platform and machine code isn't.

Bytecode is usually more compact than equivalent machine code (but then you need the constant overhead of the runtime to interpret that bytecode).

Python is my main language, and I think you underestimate the speed difference for the same implementation by an order of magnitude. That is, CPU-bound tasks will most likely be around 100x slower.

The real argument, of course, is that in a given time frame, with people of similar skills, you will not end up with the same implementation unless your team contains only Vulcans. Several people in the scipy community have reported moving from C++ to numpy/scipy and getting faster code at the same time: because C++ is so hard to use correctly, people whose job is not even programming ended up doing things very fast but a million times over, because they didn't understand their code.

This point is surprisingly not understood by a majority of programmers. Most of the time, you see benchmarks for some trivial or even non-trivial algorithms, well specified, and get "look, this language is N times faster". But in my experience, this almost never happens in real life: the specification keeps changing, and you constantly need to redesign what you're doing.

I think 10x understates the difference. Python (I'm not familiar with Ruby) eats memory, so a lot of the number crunching I do turns into a swap disaster. I end up with Python using 10% of the processor while it waits around for disk I/O.

Your question is relatively meaningless, since 'bytecode' covers such a wide gamut of possibilities.

Python's bytecode is very high-level and basically just saves the expense of parsing the code. The interpreter is not JIT-ed and thus very slow (compared to machine code).
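
To see how high-level that bytecode is, here is a minimal sketch using the standard dis module. The function add is a made-up example, and the exact opcode names depend on the CPython version (for instance BINARY_ADD in older releases, BINARY_OP in newer ones).

```python
import dis

def add(a, b):
    return a + b

# The function already carries compiled bytecode; the source is not
# re-parsed each time the function is called.
code = add.__code__
print(type(code.co_code))  # the raw bytecode is a bytes object

# Collect the opcode names; the exact names vary between CPython versions.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)
```

The whole addition is a single generic "binary operation" instruction; what it actually does is decided at run time from the operand types.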

Java's bytecode is much less dynamic. 'a + b' can execute arbitrary code in Python, but in Java it is known at compile time whether it is addition between native types or string concatenation. Java has a very good JIT compiler and thus can get to within about 1.5x of C's running time.
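
A minimal sketch of the Python side of that point. The class Loud is hypothetical, invented here to show that the same 'a + b' in one function can mean integer addition, string concatenation, or arbitrary user code, depending on what gets passed in at run time.

```python
class Loud:
    """Hypothetical class whose + operator runs user-defined code."""

    def __init__(self, value):
        self.value = value
        self.log = []

    def __add__(self, other):
        # Anything could happen here: logging, I/O, network calls, ...
        self.log.append(f"adding {other!r}")
        return self.value + other

def add(a, b):
    # Compiled once; the meaning of + is resolved on every call.
    return a + b

x = Loud(40)
result_int = add(1, 2)      # integer addition
result_str = add("a", "b")  # string concatenation
result_obj = add(x, 2)      # calls Loud.__add__
print(result_int, result_str, result_obj, x.log)
```

Since the interpreter cannot know ahead of time which of these cases it will hit, it has to look up the operation on each call, which is part of why a non-JIT interpreter is slow on this kind of code.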

Yeah, Python is bytecode-compiled and run in a virtual machine. Hell, you can even JIT it into machine code (psyco, Jython with a JIT-enabled JVM, IronPython with a JIT-enabled CLR). Python never had a traditional interpreter.

Tons of people were walking around 10 years ago thinking Smalltalk ran on "traditional" bytecode interpreters, when the majority of VMs in use back then were JIT VMs. There was also widespread ignorance about Lisp implementation and performance. The sad part: the degree of misconception is still surprisingly bad, and much the same holds true for Lisp today.

The interpreter/VM distinction became hazy and lost its meaning years ago. I think people still hold onto it only as a means of excusing poor language performance.