| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by xxxpupugo 2502 days ago

CUDA is an important part of the story.

I think the industry is moving to 'MLIR' solution (Yes, there is a Google project called exactly that, but I am referring to the general idea here), where the network is defined and trained in one place, then the weights are exported, delegated to optimized runtime to be executed.

If such trend furthers down, then there will be very little reason to replace Python as the glue layer here. Instead it will become that everything is training in Python -> exported to shared format -> executed in optimized runtime, kind of flow.

Rust's opportunity could be to replace C++ in this case. But do mind that this is also a competitive business, where the computation is pushed further down to the hardware implementations, like TPUv1 and T4 chips etc.

2 comments

skohan 2502 days ago

Personally I hope for alternatives because I don't find Python that nice to work with compared to other languages. It's not awful, but I really miss having type and nullability errors detected by the compiler and reported to me with meaningful error messages.

Also there are a few weird things with the Python workflow, like dealing with Python2 vs Python3, and generally having to run everything in the context of a virtual environment, and the fact that it doesn't seem to really have a single "right way" to deal with dependency management. It's a bit like the story with imports in Javascript: those problems were never really solved, and the popular solutions are just sort of bolted on to the language.

It's amazing how many great tools there are available in Python, but it does sometimes seem like it's an under-powered tool which has been hacked to serve bigger problems than it was intended for.

link

ImNotTheNSA 2502 days ago

> like dealing with Python2 vs Python3, and generally having to run everything in the context of a virtual environment, and the fact that it doesn't seem to really have a single "right way" to deal with dependency management.

Not to mention, this problem seems to be getting worse, not better. People are moving off of python 2.7, which was kind of the de-facto LTS release of python... leaving (currently) no LTS version of python and no clear path for the community to establish a new LTS version that the community will still support — there are so many releases with so many breaking changes in python 3 within the last few years that there is seemingly no consensus and no way to compromise.

> It's amazing how many great tools there are available in Python, but it does sometimes seem like it's an under-powered tool which has been hacked to serve bigger problems than it was intended for.

This is becoming more and more clear with every release of python IMO. The language is evolving too quickly for it to be used on large developments, but it’s still being used for that.

We have an entire test framework and libraries for high performance embedded systems level testing which is written entirely in python. The amount of high speed operations (timing tests, measurements, etc) is very obviously overwhelming for the intention of the language and libraries, yet the company keeps pushing ahead with this stuff. In order to mitigate the issue, we are developing more high speed embedded systems to offload the work from the test framework and report measurements to the framework. I think it’s quickly becoming extremely expensive and will only become more expensive — the framework is extremely “pythonic” to the point of being unreadable, using a number of hacks to break through the python interpreter. Jobs much better allocated to readable C++ libraries are being implemented in unreadable python spaghetti code with no clear architecture - just quick-and-dirty whatever-it-takes python.

I love python but I think it’s a quick-and-dirty language for a reason. What python does well cannot be beat by other languages (for example, prototyping), but I think it is often misused, since people can get something up and running quickly and cleanly in python, but it eventually has diminishing (and even negative) returns.

link

llukas 2502 days ago

Can you name breaking changes in between python 3.x versions?

link

mplanchard 2502 days ago

Python 3.7 made async a reserved word, which broke the widely used celery package up until their most recent release.

Raising a StopIterationErorr in a generator in 3.7 now raises instead of ending the iteration, which broke several functions at my workplace.

3.8 has several further backwards incompatible changes incoming: https://docs.python.org/3.8/whatsnew/3.8.html#changes-in-pyt...

link

cwyers 2502 days ago

There are alternatives -- Julia and R both have bindings for Keras, which gets you Tensorflow, CNTK and Theano. (I think there are bindings for Tensorflow directly from R and Julia as well.) Once you have a trained model, it doesn't really matter from the production standpoint how you trained it.

link

MiroF 2502 days ago

Python should and will be replaced, but not at all for any of the reasons mentioned in this thread.

A good ML language is going to need smart and static typing. I am so tired of having to run a whole network just to figure out that there's a dimension mismatch because I forgot to take a transpose somewhere - there is essentially no reason that tensor shapes can't just be inferred and these errors caught pre-runtime.

link

ddragon 2502 days ago

Do you have an example of a tensor library that keep track of shapes and detect mismatches at compile time? I had the impression that even in static languages having tensors with the exact shape as a parameter would stress the compiler, forcing it to compile many versions of every function for every possible size combination, and the output of a function could very well have a non deterministic or multiple possible shapes (for example branching on runtime information). So they compromise and make only the dimensionality as a parameter, which would not catch your example either until the runtime bound checks.

link

MiroF 2502 days ago

If you explain a little more of what you mean, I might be able to respond more effectively.

> I had the impression that even in static languages having tensors with the exact shape as a parameter would stress the compiler, forcing it to compile many versions of every function for every possible size combination, and the output of a function could very well have a non deterministic or multiple possible shapes (for example branching on runtime information).

I was a bit lazy in my original comment - you're right. What I really think should be implemented (and is already starting to in Pytorch and a library named NamedTensor, albeit non-statically) is essentially having "typed axes."

For instance, if I had a sequence of locations in time, I could describe the tensor as:

(3 : DistanceAxis, 32 : TimeAxis, 32 : BatchAxis).

Sure, the number of dimensions could vary and you're right that, if so, the approach implied by my first comment would have a combinatorial explosion. But if I'm contracting a TimeAxis with a BatchAxis accidentally, that can be pretty easily caught before I even have to run the code. But in normal pytorch, such a contraction would succeed - and it would succeed silently.

link

ddragon 2502 days ago

You understood it correctly. Named dimensions is certainly a good idea even in dynamic languages as a way of documentation and runtime errors that actually make sense (instead of stuff like "expected shape (24, 128, 32) but got (32, 128, 24)"). I hope it catches on.

link

MiroF 2502 days ago

But combined with static checking, it could be very very powerful. Definitely agree re: the power even in dynamic languages, I use namedtensor for almost all my personal development now (less so in work because of interop issues)

link

vbarrielle 2502 days ago

There's a research language that supports compile-time checking of array dimensions: futhark [1]. It's an interesting language that compiles to CUDA or OpenCL. However it's probably not ready for production (not sure if there's even good linear algebra implementations yet). It does feature interesting optimizations to account for the possible size ranges of the arrays (the development blog is very instructive in that respect).

[1] https://futhark-lang.org/

link

ddragon 2502 days ago

Thanks, I'll look into it.

Julia for example does have an array library that does compile-time shape inference (StaticArrays), but it cannot scale for large arrays (over 100 elements) exactly because it gets too hard for the compiler to keep track, I'm definitely curious about possible solutions.

link

mruts 2501 days ago

Normal static typing won't help you there. You would need some sort of dependent typing. For example, look at Idris.

link