by p4wnc6 3815 days ago
Numba certainly does not obviate all of these issues. I think user `tadlan` is referring to some of the compiler optimizations, like loop unrolling or fusion (e.g. noticing that two subsequent loops can be 'fused' into the subordinate execution block of one single loop). These things can offer speed-ups even beyond NumPy, and they can work even when the Python code you start out with already uses NumPy.
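To make "fusion" concrete, here is a minimal hand-written sketch of the before and after shapes of the transformation. It is plain Python with hypothetical function names, not Numba's actual output or implementation. The unfused version mirrors what a NumPy expression like `a * xs + b` does when it allocates a temporary array for `a * xs` before adding `b`.

```python
# Illustrative sketch only: hand-written "before" and "after" versions of
# loop fusion. The function names are hypothetical, and this is not the
# transformation as Numba actually performs it.

def scale_then_shift_unfused(xs, a, b):
    # First loop: scale every element into a temporary list.
    scaled = [0.0] * len(xs)
    for i in range(len(xs)):
        scaled[i] = a * xs[i]
    # Second loop: walk the temporary again to apply the shift.
    out = [0.0] * len(xs)
    for i in range(len(xs)):
        out[i] = scaled[i] + b
    return out


def scale_then_shift_fused(xs, a, b):
    # One loop does both steps per element, so no temporary is needed
    # and the data is traversed only once.
    out = [0.0] * len(xs)
    for i in range(len(xs)):
        out[i] = a * xs[i] + b
    return out


data = [1.0, 2.0, 3.0]
unfused_result = scale_then_shift_unfused(data, 2.0, 1.0)
fused_result = scale_then_shift_fused(data, 2.0, 1.0)
```

Both versions compute the same values. The potential win from fusing is one pass over memory and no intermediate buffer, which is why it can help even when the starting code already uses NumPy.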

What `tadlan` seems unaware of is that a lot of this stuff just fails in production environments, where code hits corner cases that the Numba compiler does not handle. I'm talking about the first part of the Numba compiler pipeline, where it converts to Numba IR and not yet to LLVM IR: it examines the CPython bytecode and rewrites CPython's stack-based representation into a register-based representation that will be compatible with LLVM and ultimately with the actual machine itself. In that compilation step, the only things that can be handled are things that the Numba team (I used to be a member of it) explicitly supports. They don't provide a full-blown compiler for the entire Python language, nor even for every type of NumPy operation. That's not a knock against Numba at all: it's a specializing compiler, and obviously they need to prioritize what to support and set longer-term goals about supporting more general things. But the point still remains that you cannot just assume that if you call `jit` on any arbitrary Python code, it will always become faster. In some cases, it can even become slower.
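For anyone who has not looked at what "stack-based" means here, the standard library's `dis` module shows the bytecode CPython produces for a function. The sketch below is only an illustration: `axpy` and `axpy_three_address` are hypothetical names, the exact opcode names vary between CPython versions, and the second function is a hand-written picture of a register-style (one named temporary per operation) form, not real Numba IR.

```python
import dis


def axpy(a, x, y):
    # A tiny arithmetic function to inspect.
    return a * x + y


# Collect CPython's bytecode for axpy. Loads push values onto an implicit
# operand stack, and each binary operation pops two values and pushes one.
instructions = list(dis.get_instructions(axpy))
opnames = [ins.opname for ins in instructions]

for ins in instructions:
    # Opcode names differ across CPython versions
    # (for example BINARY_MULTIPLY in older releases, BINARY_OP in newer ones).
    print(ins.opname, ins.argrepr)


def axpy_three_address(a, x, y):
    # Hand-written, register-style equivalent: every intermediate value
    # gets an explicit name instead of living on an implicit stack.
    t0 = a * x
    t1 = t0 + y
    return t1
```

A construct can only be compiled if the front end knows how to translate each bytecode pattern it encounters into this kind of explicit form, which is where unsupported features show up.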

I suspect `tadlan` is very interested in Numba and enthusiastic about knowing the taxonomy of Numba details, but it does not seem like that user has had real world experience trying to get Numba to work in production, and seeing all of the numerous buggy and missing features. I don't want to diminish anyone's enthusiasm for Numba, so it's probably best just not to engage with `tadlan` about it. That user's mind seems made up already.


As a former member of the Numba team, what is your outlook on the technology and the broader Continuum ecosystem?

I'm looking at Numba and friends (DyND, Blaze) for a new stack, but I'm not sure where the development arc will end up vs., say, Julia. I'm also curious about the sustainability of Continuum's business model and practices in the medium and long term.

Any thoughts on this? I understand if you are limited in what you can say, but I'm open to any nuggets.

This is one of those questions that is extremely hard to answer. Even though I was part of that team, it doesn't mean I have any special insight into what will happen.

A lot will depend on sources of funding. Will folks like Nvidia start sponsoring Numba, and if so what will it mean for support of Nvidia alternatives like OpenCL?

It seems like NumbaPro as a stand-alone for-pay product is not viable on its own, but that claim could be wrong based on more recent data that I don't have access to. So external sponsorship may be necessary.

One form of this could be through Continuum's already established business model of consulting and support services. But then the question is whether the nature of those consulting and support projects will allow developers to actually further the cause of Numba, or merely hack in poorly conceived features that are demanded by the consulting and support customers. Since Numba is open source, it should be easy enough for anyone to follow along with commits and discussions on GitHub and form their own opinion about what direction it is going.

The other question that is always hard is staffing. Far and away the colleagues I had the chance to work with on the Numba team were amazingly good. But it's not clear if working solely on Numba can justify the sort of salary that would be required to attract very top engineers and grow the team. You might start to see more interns and/or post-doc type labor feeding into Numba, and again I don't know what that will mean for the project ... could be good or bad.

At the same time, you've also got a lot of active development on Julia and PyPy, and a lot of people still prefer to use Cython rather than jitting functions. Some people even call into question the entire goal of making something that is "easy" but also a "black box", like the way just dropping in `jit` works for people who merely use, but don't understand the inner workings of, CPython.

It's an exciting area, and the Numba team has as much talent and ability to claim a significant piece of the tool space surrounding high performance computing as anyone else. Whether that will pan out for them is still really hard to predict.

Thanks very much for your thoughts.

What do you think about the foundational tech of Numba itself?

Do you think it is any more of a black box than, say, Julia? Is there anything about it that would impede extension into a more stable, predictable, and feature-rich product?

BTW looks like Intel is doing some stuff with Julia: https://github.com/IntelLabs/ParallelAccelerator.jl

There has also been a lot of recent funding for Continuum, and development of Numba seems to be going strong. There has also been some recent work with AMD.