Hacker News
by brrrrrm 1237 days ago
The issue right now with Python support in WASM (at least for machine learning, the main driver of the language) is that Python is largely a wrapper language, and none of the utilities that make it so powerful (numpy, PyTorch, JAX) work particularly well in WASM, since it's so limited performance-wise (no FMA, no GPU support).

I'm excited about pairing WASM with WebGPU, which will likely unblock these projects from building support for the web/untrusted ecosystem. A useful project would be one that makes this integration really easy to build today and lets you turn it on with the flip of a switch in the future.


A few times recently I've come across the notion that machine learning nowadays provides (in some sense) the biggest group of Python users.

What reason is there to suppose this is true? It seems surprising to me.

It's really hard to do much ML in anything _except_ Python. Virtually everyone improving the ML ecosystems of other languages (e.g. R, Julia) got their start in Python and is knowingly competing with Python. If you want to get started in ML today, Python is the obvious easiest path forward.

So, most ML users are Python users. I don't know how that group compares to non-ML Python users, but I have a feeling there isn't a flood of eager new Django devs the way there is a flood of PyTorch users. Most non-ML things you could do with Python can be done similarly well in Go/Rust/TypeScript, but there's no other option for most ML stuff.

I found a recentish (2021) survey at [1] which suggests that in 2021 ML was some way behind web development, sysadmin stuff, and data analysis among Python users (and didn't seem to be on the way up the list).

[1] https://lp.jetbrains.com/python-developers-survey-2021/#Gene...

The obvious next question being, what's the difference between ML and data analysis from the perspective of the survey participants (is ML a strict subset)? Given the values don't add up to 100%, there's likely lots of overlap and so you could easily have web developers choosing Python for the ML ecosystem.
Great source; looks like I've quite underestimated the python-web-dev crowd's size.

I'm curious what the longer-term trends look like; not much change between consecutive years.

Data analysis is basically a prerequisite for ML, so the combined "data stuff" usage is quite a lot bigger than web dev usage!

I can see that a huge part of the ML space is in Python. But is the ML space really such a large community? I mean, in the companies I have worked for, maybe 1 team in 100 works with some ML stuff, while a large majority of the rest come in contact with web/devops/unix scripting. Granted, non-ML work takes place in many different languages, but Python is used a lot there.
> What reason is there to suppose this is true? It seems surprising to me.

One reason is it's just super easy for input/output operations. ML is all about data, and getting the data to the right place is really easy in Python compared to some other languages.

Which languages?

Python is OOP, but the "classical" data-centric languages are actually all more or less in the FP space. (I count array languages and APL-likes as FP in this case.)

Just an example: You don't have immutable data types by default in Python. This is actually a pretty bad default for data processing tasks.
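To make the point about defaults concrete, here is a minimal sketch using only the standard library. The names `normalize_in_place` and `Sample` are made up for illustration; they don't come from any ML library. It shows that a plain list passed to a function can be silently changed, and that immutability is something you have to opt into (tuples, `@dataclass(frozen=True)`).

```python
from dataclasses import dataclass, FrozenInstanceError, replace


def normalize_in_place(values):
    """Hypothetical helper: scales the list so it sums to 1, mutating the caller's list."""
    total = sum(values)
    for i, v in enumerate(values):
        values[i] = v / total
    return values


# Default behaviour: lists are mutable, so the callee changes the caller's data.
raw = [1.0, 3.0]
normalize_in_place(raw)  # raw itself has been overwritten


# Opt-in immutability: a frozen dataclass holding a tuple.
@dataclass(frozen=True)
class Sample:  # hypothetical record type, for illustration only
    features: tuple
    label: int


sample = Sample(features=(1.0, 3.0), label=0)

try:
    sample.label = 1  # assignment on a frozen dataclass raises
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True

# "Changing" a frozen value means building a new one; the original is untouched.
relabeled = replace(sample, label=1)
```

Note that `frozen=True` only blocks attribute assignment; a list stored inside a frozen dataclass would still be mutable, which is why the sketch uses a tuple.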

Python has a huge library ecosystem, and the average machine learning programmer is not a CS geek and so prefers pythonic quasi-OOP over FP (one of the hurdles for JAX adoption is its functional paradigm).
The claim was that Python is better suited to writing such libs and frameworks than other languages.

You now say it's like that because Python already has quite a few libs / frameworks in that direction.

This looks like circular reasoning.

Also, the "prototypical ML dude" comes from the math department. People with a math background have a much easier time grasping FP than procedural programming. FP is much more "natural" when you're used to math.

(Procedural programming says things like `x = x + 1`, but even my grandma would know that "this has no solution", or is likewise "plain wrong" ;-))

Just because python makes sense for ML does not mean that it's primarily used for ML.
I have integrated pyodide + WebGPU recently (you can do matmul using WebGPU's compute pipeline). The real problem is that browser tabs have a 4 GB max memory size, so training neural networks on this stack is almost impossible. (I don't even want to mention PyTorch's dependency hell.)
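For a rough sense of scale, here is a back-of-the-envelope sketch. It assumes everything has to live in a single wasm32 linear memory (32-bit indices, so at most 2**32 bytes) and that training keeps four float32 copies per parameter (weights, gradients, and two Adam-style moment buffers). That factor of four is an assumption for illustration, and activations and the Python runtime itself are ignored, so these are upper bounds rather than measurements.

```python
# wasm32 addresses linear memory with 32-bit indices: at most 2**32 bytes (4 GiB).
WASM32_MAX_BYTES = 2**32
BYTES_PER_FLOAT32 = 4


def max_float32_values(budget_bytes: int) -> int:
    """How many float32 values fit in the given byte budget."""
    return budget_bytes // BYTES_PER_FLOAT32


def max_trainable_params(budget_bytes: int, copies_per_param: int = 4) -> int:
    """Upper bound on trainable float32 parameters.

    ASSUMPTION: copies_per_param=4 stands for weights + gradients + two
    optimizer moment buffers; activations and runtime overhead are ignored.
    """
    return budget_bytes // (BYTES_PER_FLOAT32 * copies_per_param)


inference_params = max_float32_values(WASM32_MAX_BYTES)   # 2**30 values
training_params = max_trainable_params(WASM32_MAX_BYTES)  # 2**28 parameters

print(f"weights only: {inference_params:,} float32 values")
print(f"training:     {training_params:,} parameters (under the stated assumptions)")
```

So even in the best case you are looking at a few hundred million parameters for training before anything else in the tab gets memory.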
WebAssembly Memory64 is coming

https://webassembly.org/roadmap/

My claim is that it’s not easy, not that it’s impossible. There’s little incentive to hack in JavaScript or maintain a Pyodide-compatible build. The 4 GB limit isn’t a technical limitation, just a standards thing (it could change easily).
Don't expect too much from WebGPU; the hardware model its MVP targets is what version 1.0 of the modern GPU APIs looked like several years ago.

In fact, as far as compute is concerned, it is hardly any better than the GL ES compute shaders that Chrome refused to add to WebGL.