by andreasvc 4319 days ago
Indeed, Cython is not a very sophisticated compiler (although it does benefit from existing optimized C compilers), but the same argument goes for Julia. That's a matter of implementation for which there is room for improvement; the interesting discussion is which is the more appropriate architecture.

As for debugging, I admit it can get ugly with Cython, but in my experience this only happens when you decide to do low-level manual memory management. This ability to shoot yourself in the foot is part of the trade-off of close-to-the-metal performance. It's not pretty, but then again, descending to that level is entirely optional.

You say that the bridge between C and Cython is not cheap, but this characterization is mistaken. It is easy to write Cython code that maps 1-to-1 to C code without using any part of Python whatsoever. What is expensive is bridging back to Python; e.g., calling a Python function requires constructing a tuple and all Python objects are heap allocated. However, you can choose to use this bridge as little as you want (the extreme case is only using Python to call a main function defined in Cython).

You argue that Cython only produces modules but not native binaries. That is not true: it can produce such binaries, although that means including a Python interpreter as part of the binary (the --embed option).

> and want more speedup than this affords perhaps better approaches are needed.

What are you alluding to here? Cython offers the speed of pure C/Fortran. I can think of two limitations: calling back and forth between C code and Python code is expensive (but then don't do that frequently; if it's in a tight loop, it's worth optimizing), and the lack of JIT optimizations.

1 comment

> the interesting discussion is which is the more appropriate architecture.

I am quite convinced that between the two, Julia's is the better way. I don't think you will be convinced, so I will leave this thread with this last comment.

With Julia's JIT, macros, multiple dispatch and type specialization, there is a whole world of things that you can do in Julia _now_ that you cannot do in the Cython/Python split world. Another advantage Julia has is that it is not saddled with Python the way Cython is. You might consider this an unfair advantage though.

> As for debugging...

I think you are coming from a position that allows you to brush such issues aside with "low level is hard, so suck it up. That experience is going to be bad anyway."

I disagree. First, with Julia I probably won't need to drop down to that level as often. Second, Cython takes away one major redeeming quality of Python in the numeric context: NumPy array syntax. I cannot use it any more in any performant sense, because it would call back into the NumPy API. So now I have to write that indexing code as low-level C in Cython syntax. Third, if you give me the full power of C or C++, I can get low level with less complexity than in the Python/Cython split world, and with fewer things saddled on me.
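To illustrate the second point, here is a rough sketch of the explicit indexing style I mean. It is plain Python rather than Cython, with lists standing in for typed buffers, and `axpy_loop` is a made-up helper name for this comment only:

```python
def axpy_loop(a, x, y):
    """Update y in place so that y[i] becomes y[i] + a * x[i]."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    # With NumPy arrays this whole loop is the one-liner `y += a * x`.
    # The explicit index loop below is the shape the code takes once
    # the array syntax is no longer the fast path.
    for i in range(len(x)):
        y[i] += a * x[i]
    return y


x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
axpy_loop(2.0, x, y)
```

For a single update like this the loop is harmless; the cost shows up when every array expression in a numeric kernel has to be unrolled by hand this way.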

Why do you think that it is better to talk to C through those limitations?

If I do, I would be writing C in a Python syntax that supports some fuzzy subset of C and some fuzzy subset of Python, which then gets compiled by a simplistic compiler into a sizable amount of C code, which I then compile with a C compiler to get a module with questionable debugging support. Compared to Julia this looks clearly worse to me; you may feel otherwise.

If I really need to write C, I would prefer to have full C at my disposal without the multi-language split brain. I would like to speak to the C compiler without an indirection through another compiler.

I stand corrected about Cython's ability to produce a binary, but I don't find the argument "oh, by the way, it will come with the Python interpreter" convincing unless I really do want to embed a Python interpreter. Don't get me wrong, Cython is awesome if you want to integrate C with Python; I have already said this before. It's great if you have legacy Python code, or co-workers who are unfamiliar with or unwilling to work with C. But when you are free of such constraints, Cython only saddles you with more.

As for the speed of the Cython_module <---> Python bridge, I think we disagree about what is fast. Take a concrete example: gradient descent code. One way to do it is to have the gradient descent hot loop as a Cython function that takes two Python callbacks: the function that you are trying to minimize, and another to compute its gradient (after all, the NumPy syntax is nice for such things). If you do this, the speed is going to be abominable. The next option is to have the loop in Python but have the objective function and the gradient function in Cython. Even if you manage to bind these two names in the closest scope possible, Python will look up the names again and again before calling them through a dynamic interface, and it's a Python loop, not known for speed. Furthermore, in that Cython-implemented function I have lost the pleasant syntax of NumPy. On top of that, this bridge is a compiler optimization barrier.

So the only really viable option is to convert the containing loop to Cython as well, and after that, what remains? If this is what is required, I would have just written the whole thing in plain C or C++ and had the full language and tooling at my fingertips. What additional advantage is Cython giving me here? It is not without advantages: one I have already mentioned, integration with Python code; another is prototyping. With Julia the latter is taken care of, and the former is covered somewhat. Although if I had a strong need to play well with Python _now_, I would choose Cython over Julia.
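For concreteness, this is roughly the shape of the callback-driven loop I am describing. It is only a sketch in plain Python, not the code I actually ran: the quadratic objective, the step size and the names `gradient_descent`, `objective` and `gradient` are all made up for illustration.

```python
def gradient_descent(objective, gradient, x0, step=0.1, iterations=200):
    """Fixed-step gradient descent driven by two Python callbacks."""
    x = list(x0)
    for _ in range(iterations):
        # One dynamic Python call per iteration. If this loop lived in
        # Cython and `gradient` were a Python function, this call would
        # be where the loop crosses back over the bridge.
        g = gradient(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x, objective(x)


# Hypothetical test problem: f(x) = sum((x_i - 3)^2), minimized at x_i = 3.
def objective(x):
    return sum((xi - 3.0) ** 2 for xi in x)


def gradient(x):
    return [2.0 * (xi - 3.0) for xi in x]


x_min, f_min = gradient_descent(objective, gradient, [0.0, 10.0])
```

The loop body does almost no work per iteration, so the cost of calling `gradient` dominates. That is the situation where neither side of the bridge can inline or optimize across the call.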

The example was by no means hypothetical. I have done this, and the code sped up by an order of magnitude when I redid it entirely in C++. Doing away with the lookups by itself sped it up, but it was coaxing the compiler to optimize across the boundary, in particular to inline the functions, that gave the order-of-magnitude improvement. Julia's design permits such things to happen without the split-brain problem.