by crubier 1106 days ago
Controversial, but I think that rather than trying to make Web stuff (e.g. React) work in Python, a more fruitful direction would be to make ML stuff (e.g. PyTorch, OpenCV) work in Typescript.

Javascript/Typescript is way faster than Python, is ubiquitous and can run pretty much everywhere, has many engine implementations, has an incredibly wide ecosystem, has a type system (Typescript) that blows any Python type system out of the water, runs in browsers, has non-stupid package management systems (PIP is a joke), is easy to get started with, etc.

<The world if data scientists/ML/CV people used Typescript.JPEG>

10 comments

The speed of the output of the compiler of the glue language almost does not matter. Whether it is Python or TypeScript, it will not matter. Typically people will not use pip directly in serious projects, except for quick and dirty package installs to prototype something. In any serious project I would expect people to use any of the package managers, that support creating a lock file. But then NPM for example is not better than poetry. Or at least I have not had a single case, where I thought: "Ah, if only it was behaving like NPM." Rather the opposite.

But anyway, the whole language specific package manager business is starting to annoy me. I would like to be able to simply use something like GNU Guix and sometimes I am able to do just that.

> The speed of the output of the compiler of the glue language almost does not matter. Whether it is Python or TypeScript, it will not matter.

Depends on the use case. For ML, sure, it doesn't. For web servers, it depends. For stuff like this React-like tool, it matters.

It matters if it's actually meant to be used on the web, but I suspect the use case here is an easy front end for small python program in which case it doesn't matter at all and python interop is critical.
> But then NPM for example is not better than poetry.

Poetry is good exactly because it copied the concepts from NPM / Yarn. It's almost a port of NPM to the Python ecosystem, 8 years later.

Citation needed.

Poetry itself said Composer and Cargo were the main inspirations[1]. Both of them work differently from NPM, and Poetry shares most of those differences, so its claim checks out. You would be in for quite a few surprises down the road if the superficial similarities persuaded you to apply one tool’s paradigm to another.

[1]: https://github.com/python-poetry/poetry/tree/0.1.0#why

Poetry has more to do with pip and virtualenv than NPM to be honest.
That is not a typical lock file. If it is, then it is a bad one. Lock files need checksums, not version numbers (oh well, both.). Version numbers do not protect from changes. At least not in all important cases or scenarios. I've had packages in the PHP world change their checksum and when I alerted them about it, they were like "So? We only changed some documentation of that version." ... Who knows what else people change in the same version. No. You need checksums.
Pip supports checksums too. A better link might be https://pip.pypa.io/en/stable/topics/secure-installs/
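
To make the checksum point concrete, here is a minimal Python sketch of the idea behind hash pinning. It is not pip's implementation; the helper names `sha256_of` and `verify_pinned`, and the notion of a `pinned_hex` digest read from a lock file, are hypothetical and only illustrate why a re-published file with the same version number but different bytes would be rejected.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(64 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned(path: Path, pinned_hex: str) -> bool:
    """True only if the file's bytes match the digest recorded at lock time.

    `pinned_hex` stands in for a digest stored in a lock file (an assumption
    made for this sketch, not a real lock-file format).
    """
    return sha256_of(path) == pinned_hex.lower()
```

In pip itself the equivalent check is driven by `--hash=sha256:...` entries in a requirements file together with `--require-hashes`, as described on the secure-installs page linked above.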
PyPI does not allow file uploads to be changed [0] but while that means this particular scenario is not an issue for PyPI/pip I'm not completely confident it's impossible to come up with a problematic scenario. Perhaps if a broken wheel was later published for an existing release with a working source distribution. In practice this is not something I've run into though.

[0] https://pypi.org/help/#file-name-reuse

pip-compile provides exactly that: https://pip-tools.readthedocs.io
OK, I guess the question is how pip-tools relates to pip then.

Is it a random third party package (someone just grabbing that name) or who manages it? As it is, it looks like not part of pip. (But maybe you were merely posting it as an alternative?) I've never heard of it before. I have been using various tools already, including merely pip, pipenv, poetry. Do I need to look for the newest tool every month? It begins to feel like the JS ecosystem.

There are different tools for different purposes.
No please no. Can we just leave javascript and all of its flavors behind already? All of this innovation and people want to program in a _transpiled_ programming language? It's like turtles all the way down, but it's just hacks all the way down in the js world and it'll start catching up with us if it hasn't already.
It’s not transpiled. Only some implementations of it are. Just use Bun or Deno with TypeScript. There is literally zero runtime impact. It’s just natively TypeScript.
i believe deno compiles typescript on the fly, in a V8 isolate; it is not interpreted natively in the same way javascript is (but, yes, you don't have to bundle it or transpile it first)
JavaScript is a joke when you consider things like prototypes and classes. Not saying Python is perfect but JS really feels atrocious to me.

TypeScript requires you to compile, the whole point of Python for ML and data science is you're running in an interpreted environment where you delegate as much code out as you can for both data processing and algorithms to invisible C++.

The interpreted part is key as ML is about experimentation. It's not like there's any overlap between current web developers and ML people anyway to need an unified bridge.

You don't exactly need to compile Typescript to make it run, you can just strip the types out which is much faster than compilation. If you want you can run `tsc` to have the compiler compile it and check for errors, but this can be running as you develop to act more as a linter.
> It's not like there's any overlap between current web developers and ML people anyway to need an unified bridge.

I guess you have completely missed people running language models on WebGPU then.

I think webasm + webgpu will be a target for a lot of new ml libraries.

Well yeah, as a sort of compilation target for sure, not just for language models but all ML models need to be integrated somewhere. The ML code and web code will be independent though.
Why does it have to be like that long term? There is no reason to keep them separate if they use the same language and platform. See https://news.ycombinator.com/item?id=36006626
Running esbuild to strip types and then running node, or using deno/bun, is much faster than Python. There’s also nothing that would prevent interactive JS or TS either… That’s what the browser debug console does.
We do with Typescript what a lot of people do with Python: handle most things with Typescript and then contract off the really compute-heavy parts to C/C++ (slowly moving to Rust). I’ve worked extensively with both Python and JavaScript, and while we use Typescript because it’s much easier to set up an environment where your code is protected from you than it is with JavaScript, you can actually do the same with JSDoc. I want to love Python, hell I do love Python, but I just love JavaScript a lot more.

I think the key difference is that it’s hard to set up good governance for JavaScript. You’re going to need a fascist linter that is actually enforced, and you’re going to need a tight grip on your developers to force them to really, really, think about their dependencies, but once you have that it’s quite honestly the best language I can think of. The ability to use isolated functions instead of putting them into “classes” is just such a great way to do a solid mix of functional and OOP programming, which is obviously heresy to hardliners of either, but it’s just so magical when you do it right.

I think Python is getting there, it was such a great language for such a long time that it sort of forgot to improve. But now it has copied the NPM/Yarn package handler, and hopefully it’ll soon be possible to actually do a Typescript sort of Python, so maybe it’ll be able to win me back. Or to be fair, I think it’s a great language until you have to work together. It’s just so hard to get the codebase governance up with Python that it only really becomes worth it in ML shops where your developers want to work with Python. I’m not sure how Instagram managed, and there are certainly the projects that fit into the Django box which absolutely should be put into the Django box, but the only general purpose language to me personally is currently JavaScript.

Part of that is because we need initiatives like this one. We need “React” in Python or Rust or whatever if we want small dev teams in non-tech enterprise to be able to work with other languages than Typescript. Yes I know we have some C++, and a little Rust, but unlike the rest of our many different projects I’m the only one who can maintain them. Which is actually the primary reason we work with JavaScript, because if we don’t, then the React developer won’t ever be able to go on a vacation. :p It helps that JavaScript has become such a great language, and it likely has exactly because the React dev wants to go on vacation in a trillion IT departments. But I’m all for Python having this React Python so ML heavy shops don’t need that React dev.

But to say JavaScript is atrocious is sort of silly to me and I’m not sure you would have that opinion if you gave it a real chance.

Personally I only use SvelteKit which I find is nice by building around what the web technology currently is.

As for static types to be honest I don't see how it's useful in the context of ML. You might use a Pandas dataframe, which already comes with types enforced inside of it. You shouldn't ever use a for loop on a Pandas dataframe or whatever because then you're running Python instead of C++, Pandas has inbuilt functions and operators.

However Python does have type hints but probably not strict enough, Mojo may improve in this if it really supports both AOT and JIT.

Package management in Python isn't that bad with requirements.txt. The real problem is Python versions are breaking and whatnot (very often ML libraries are months behind latest) but the main Python installation you have only supports virtual environments for packages. Really it should support something like conda out of the box where you create an environment with a Python version.

I think the whole environment thing in Python is sort of a thing of the past. I got into Python after containers were the default, but I can see why the environments would’ve been brilliant before that. Today they feel more like a really terrible to work with version of node_modules though. I think Python is one of the languages that does dependencies the absolutely most annoying because of how they sort of tie into that.

But it’s not too terrible; it’s not Node or Cargo, but it is really hard to build governance around.

My experience of doing maths in JavaScript was terrible. Fairly simple things like [1,2] < [10,2] return false (unlike every other language I’ve ever used), as < is implemented by converting to strings, then comparing.

Is that fixed by Typescript, or does it inherit all those kinds of weird JavaScript behavior?

I don’t get the example - how is it obvious what should be the result of comparison of 2 lists ?
In Python, when comparing lists using the less than (<) operator, the comparison is performed element-wise.

The first element of each list is compared, and if they are not equal, the comparison result is determined based on the comparison of the first unequal elements.

In this case, [1, 2] and [10, 2] both have different first elements (1 and 10). Since 1 is less than 10, the comparison result is True.

The second element of both lists (2) does not affect the comparison result because the first element is already sufficient to determine the outcome.

[1, 2] < [10, 2] evaluates to True.
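
A short Python sketch of the rule described above. `lex_less_than` is a hypothetical helper written only to spell out the comparison step by step; Python's built-in `<` on lists already behaves this way.

```python
def lex_less_than(a, b):
    """Spell out lexicographic comparison of two sequences."""
    for x, y in zip(a, b):
        if x != y:
            # The first unequal pair decides the result.
            return x < y
    # All shared positions are equal: the shorter sequence sorts first.
    return len(a) < len(b)


examples = [([1, 2], [10, 2]), ([1, 2], [1, 3]), ([1, 2], [1, 2]), ([1], [1, 2])]
for a, b in examples:
    # The helper agrees with the built-in operator on every pair.
    assert lex_less_than(a, b) == (a < b)

print([1, 2] < [10, 2])  # True
```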

All of these could be equally valid results of your list comparison: True, False, [True, False], [[True, True], [True, False]].

I like that typescript does not rely on an implicit choice, but lets me express exactly what comparison I care about.

I agree there are a few versions you could do, but most languages tend to do lexicographic.

This has been a surprising thread to me -- I just assumed "everyone knew" that the vast majority of languages do lexicographic comparison of lists.

I will say typescript does "rely on an implicit choice": it has a default implementation (the "convert to string"), which, I'm going to be honest, doesn't ever seem like a sensible choice to me -- although maybe it feels more natural to javascript/typescript people.

My personal upset (I lost like a day to this) is that if you keep your numbers under 10, you do get the lexicographic ordering, as then lexicographic = string. I had a bunch of unit tests (all using numbers under 10), just larger inputs kept breaking, and it didn't occur to me to go read the docs for < :)
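
The "works under 10, breaks on larger inputs" effect can be reproduced with a rough Python emulation of the string conversion. This is a sketch, not JavaScript itself: `js_style_less_than` is a hypothetical helper that assumes arrays of non-negative integers are joined with commas and then compared as strings.

```python
from itertools import product


def js_style_less_than(a, b):
    """Emulate JS `a < b` on integer arrays: join with commas, compare as strings."""
    return ",".join(map(str, a)) < ",".join(map(str, b))


# With single-digit entries the string order and the lexicographic order agree...
small = [list(p) for n in (1, 2) for p in product(range(10), repeat=n)]
assert all(js_style_less_than(a, b) == (a < b) for a in small for b in small)

# ...but a multi-digit entry can break the agreement.
print([2, 1] < [10, 1])                      # True: 2 < 10 as numbers
print(js_style_less_than([2, 1], [10, 1]))   # False: "2,1" vs "10,1", and "2" > "1"
```

One caveat from this emulation: the pair [1, 2] and [10, 2] quoted earlier in the thread still comes out as true under string comparison, because "," sorts before "0". A pair such as [2, 1] and [10, 1] is where the two orderings actually disagree.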

What's the use case for this?
Lots of mathematical algorithms sort lists lexicographically. It comes up in graphs and lots of other combinatorics problems. Often you want a total order on 2D coordinates, and this ordering is (usually) the simplest and best.
TypeScript is a JavaScript superset. It adds features, but doesn’t change the meaning of existing stuff. (Almost all of what it adds can just be stripped out syntactically to yield JavaScript, but there are a few features that generate actual code, like enums.)
> Fairly simple things like [1,2] < [10,2] return false (unlike every other language I’ve ever used)

What other languages, besides Python, does this builtin list comparison work in? What's the result when the comparison is `[1, 2] < ["10", "2"]`?

I've never seen this usage of list comparison before but apparently it works in Rust (both of these print true):

  println!("{}", vec![1, 2] < vec![10, 2]);
  println!("{}", (1, 2) < (10, 2));

List comparison works lexicographically in Haskell, C++ (using std::vector or std::array), Rust, GAP, every language I personally use.

Comparing strings and bits is always going to be weird, so I’ve not tested it.

Good luck recreating all the data science libraries in Javascript, that was decades of work. We now have WASM so it's far more likely JS just isn't needed.
You might be surprised. https://news.ycombinator.com/item?id=36006626

But of course yes, The ML stuff should be Wasm and WebGPU. The point is you can access it from Typescript. Just like most data science libraries are not Python, they just have Python bindings.

Why not just implement a Typescript backend that outputs Python instead of Javascript? (-:
TS is a joke

tsc is the only compiler I've had to step through with a debugger multiple times.

No DS/ML researcher wants to deal with VM args just to use more than 1GB of memory. That alone would cause so much frustration.

Not to mention unpredictable generational GC.

Or the crazy crap people do with the type system (what you call better other people call a mess).

At least Python has some semblance of runtime type safety.

> TS is a joke

I'll discuss facts

> tsc is the only compiler I've had to step through with a debugger multiple times.

99.9% of Typescript developers never ever had to do that. Sounds like a "you" problem

> No DS/ML researcher wants to deal with VM args just to use more than 1GB of memory. That alone would cause so much frustration.

You're talking about NodeJS, which is just one of the many JS engines. It's also 200% easier to start NodeJS with a flag to increase the (sane) default memory limit, as compared to the insanity of setting up a Python environment.

> Not to mention unpredictable generational GC.

Is Python GC better? Really? The good thing with Python is that the whole language is so slow that GC is just a drop in the bucket. On the other hand, Millions (Billions?) have been spent optimizing JS engines and it shows. Also: GIL.

> Or the crazy crap people do with the type system (what you call better other people call a mess).

I don't know what you're talking about, TypeScript go Brrr and I get magnificent Intellisense and subtle type checking, while MyPy and friends keep crapping their pants

> At least Python has some semblance of runtime type safety.

No. And critically it has no semblance of comp time type safety either.

The Python GC is slow, but predictable. I've seen the Node runtime fall apart past like 8gb heap all the time, but I have python scripts that run just fine with a 32gb heap.

Requiring the max old space flag is already too much. It's annoying.

Python does indeed have some runtime type safety. Way more than any JS runtime I've used. And I've written way more Node/JS/TS than Python.

In terms of the compiler, Nah. The whole ecosystem stinks. I cannot wait to move my larger Node projects to Java or something else. Idk how much dev time people waste on arcane tsc or npm issues but the answer is "too high" regardless. At least esbuild is okay.

I spent a good chunk of last year learning go, and felt dumber for doing so. I feel like Rust is the way to go in 2023 and beyond -- particularly if the rustaceans can get past their own stupidity.
I haven't used go for some reason but it always looked nice. I like the simplicity, for some things. I know old C programmers trying to write web services in C (with hand rolled json...) and I wish I could get them using Go. :)

Rust is neat. I wrote a decent sized side project backend (weatherfy.com) in Rust, with sqlx, postgres, and axum. Shared cache was tough to figure out but everything else was easy.

The whole backend has like one lifetime annotation, so that wasn't too bad. Honestly, I'd use it again, but the compile times scare me.

There's a bunch of negatives I found. I'll gloss over them here.

- New libraries can't just be deployed to the OS, they must be compiled into the project.

- A check for thing==nil doesn't always work for all types of nil. Yes nil does not always equal nil.

- Panics aren't necessarily a bug. They could be a feature (?). Crashes are always a bug in C.

- Garbage collection is mandatory in go. C people are used to managing memory themselves like going for a walk. It's kinda natural.

- GC causes hard to debug performance issues. This was true in Java. And it's true in Golang.

- Golang channel primitives have bugs of their own with deadlocks and other interesting behaviors, whereas a lot of the threading issues in C are pretty well understood.

You cannot really compare Rust and Go.

Rust is for when you really need to perform manual memory management. If you don't need that, then a GC language is better.

If you're choosing manual memory management over GC, you better have a really good reason to do so or a really small application that you're writing.

Literally the whole point of rust is to not do manual memory management and let a compiler handle that for you.
>Also: GIL.

javascript is terminally single threaded, with the only solution being either multiprocessing and message passing, or cooperative multitasking style promises/async. This is not a situation where javascript pulls ahead, as hard as python seems to try to fall behind.

and before you question my experience again, I have 100k+ LOC Node projects.

You admitted your opinion would be controversial. Guess what? It is. Hardly anyone wants to do ML in TS. No ML researcher <wants ? <to> | deal | with < type signatures > (like ? This)>>>>

Interesting that you say that, this was the top voted comment on a post recently about using typescript for ML stuff: https://news.ycombinator.com/item?id=36007493
> > At least Python has some semblance of runtime type safety.

> No.

I mean.

  >>> 1 + "1"
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  TypeError: unsupported operand type(s) for +: 'int' and 'str'
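
The same point as a small script rather than a REPL session. `try_op` is a hypothetical helper for this sketch; it also covers the mixed int/str list comparison asked about elsewhere in the thread. For contrast, JavaScript evaluates `1 + "1"` to the string "11" instead of raising.

```python
def try_op(description, thunk):
    """Run thunk and report whether Python rejected it at runtime."""
    try:
        return (description, "ok", thunk())
    except TypeError as exc:
        return (description, "TypeError", str(exc))


results = [
    # Mixing int and str in arithmetic is rejected at runtime.
    try_op('1 + "1"', lambda: 1 + "1"),
    # So is ordering an int against a str inside a list comparison.
    try_op('[1, 2] < ["10", "2"]', lambda: [1, 2] < ["10", "2"]),
    # Homogeneous lists compare without complaint.
    try_op("[1, 2] < [10, 2]", lambda: [1, 2] < [10, 2]),
]
for row in results:
    print(row)
```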
> I'll discuss facts

> the insanity of

welp, so much for that.

Whoops ¯\_(ツ)_/¯
> has non-stupid package management systems

Excuse me what

Isn’t mojo the future on that front?
Mojo looks cool. But I'll wait for it to be open sourced AND as easy to use as Typescript + React for real complex web apps before considering it. I expect this to take 2-5 years if it ever happens. It's a possible future, not the present.
I agree.