by coder543 2681 days ago
As I pointed out in two lengthy comments on day one[1][2], that reasoning is nonsense. If Chris wants to use the language he created in this new endeavor for machine learning simply because he made it, that's totally fine and completely his prerogative, but he should just say so, rather than trying (and failing) to convince people that other languages aren't better suited for this task.

From my point of view, a weak justification is worse than no justification in cases like this.

Rust is much better suited to this task than Swift from a technical point of view. Its far superior platform support for Windows and Linux is reason enough to say so, since very few data scientists will be training models on macOS. However, that's only one of several areas where Swift has shortcomings for a project like this. Swift is great for iOS and macOS development, of course, since it was designed for that. I don't think Swift is a bad language by any means, and with enough effort, it can be reshaped to be good for TensorFlow... the GitHub document just provides zero useful justification for the work required to get it there.

EDIT: to some of the replies talking about Rust's learning curve, that mostly applies when you start trying to design efficient, interlinked data structures involving ownership. For most applications of machine learning, this simply wouldn't be a problem. The library would provide the data structures, you just have to use them. Rust can provide simple interfaces to complicated things.[3] The compiler's error messages are usually incredibly helpful.

The learning curve of Rust should not be relevant here, compared to Swift, which is also full of idiosyncrasies. Swift and Rust both have a large learning curve for someone coming from Python. This is because they're statically typed languages that are just different from a scripting language. For an application like this, I would say those learning curves are roughly equal at the language level, but as I pointed out in my comments, Swift carries the enormous additional cost of requiring many data scientists to either install and learn Linux, or throw out their current computer, buy a Mac, and learn macOS.

My point here is not that Rust is the most suitable language for TensorFlow (although it could be), but rather that Rust is more suitable than Swift for a project like this, and therefore this document is just annoying. It would be better for them to delete this document and just say "we're using Swift because our team has a lot of experience with it and because the creator of Swift is leading this project, so we would lack enthusiasm and momentum if we were using something else, even if it were more suitable."

Julia would be really interesting to see explored further, since it would appeal much better to many existing data scientists who would be transitioning from Python. The times that I've played with Julia, I was amazed at how slow the JIT is for even tiny scripts. LLVM is powerful stuff, but it is painfully slow at everything. It would be nice if Julia offered an alternative backend for rapid development.

[1]: https://github.com/tensorflow/swift/issues/3#issuecomment-38...

[2]: https://github.com/tensorflow/swift/issues/3#issuecomment-38...

[3]: http://kiss3d.org/

6 comments

I personally find Rust to have quite a learning curve (which I guess is also an opinion shared by others). The language is great though.

I do agree with your criticism of the document here, though. It feels very much like Swift happens to check many boxes, but the lack of Windows support is baffling. It's simply table stakes to be able to run, fully supported, on Windows, macOS, and major Linux distributions. That should be the very first thing anyone considers.

But beyond that, I think even with Rust's macro system it could be difficult to make it work for Tensorflow in a way that feels appropriate for Rust programmers _and_ for TensorFlow. This was explored in F# for TensorFlow research[0], and a completely different approach[1] was taken because making a type system suitable for TensorFlow got too unwieldy.

[0]: https://github.com/fsprojects/TensorFlow.FSharp

[1]: https://github.com/fsprojects/TensorFlow.FSharp#live-checkin...

> But beyond that, I think even with Rust's macro system it could be difficult to make it work for Tensorflow in a way that feels appropriate for Rust programmers _and_ for TensorFlow.

If you're talking about matrix shape compatibility (matching up rows from one with columns from another) I'm hopeful about const generics here: https://github.com/rust-lang/rfcs/blob/master/text/2000-cons...
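
To make the shape-compatibility idea concrete, here is a minimal sketch, assuming a compiler with const generics support (the feature has since been stabilized in Rust). `Matrix`, `matmul`, and `demo` are hypothetical names made up for illustration; none of this is part of any TensorFlow binding.

```rust
// Hypothetical illustration: `Matrix`, `matmul`, and `demo` are invented for
// this sketch and do not come from any real TensorFlow or Rust ML library.

/// A row-major matrix whose shape (R rows, C columns) is part of its type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Matrix<const R: usize, const C: usize> {
    pub data: [[f64; C]; R],
}

impl<const R: usize, const C: usize> Matrix<R, C> {
    /// Multiply an R x C matrix by a C x K matrix, producing an R x K matrix.
    /// The shared dimension `C` must match, or the call does not type-check.
    pub fn matmul<const K: usize>(&self, rhs: &Matrix<C, K>) -> Matrix<R, K> {
        let mut out = [[0.0; K]; R];
        for i in 0..R {
            for j in 0..K {
                for k in 0..C {
                    out[i][j] += self.data[i][k] * rhs.data[k][j];
                }
            }
        }
        Matrix { data: out }
    }
}

/// Small usage example: (2 x 3) * (3 x 2) gives a 2 x 2 result.
pub fn demo() -> Matrix<2, 2> {
    let a = Matrix::<2, 3> { data: [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]] };
    let b = Matrix::<3, 2> { data: [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]] };
    // `a.matmul(&a)` would be rejected at compile time, because the argument
    // would need to be a Matrix<3, _> and `a` is a Matrix<2, 3>.
    a.matmul(&b)
}
```

The point is only that a mismatched inner dimension becomes a compile error rather than a failure partway through a run. Shapes that are known only at runtime would still need a dynamically checked type alongside something like this.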

It seems likely that the justification is retrofitted and the team's familiarity with Swift was the bigger driver. I am surprised they didn't find Scala to be a good fit, given that it has already been used with great success in Spark, which I presume has similar technical requirements. Can anyone shed light on the short explanation below? Does it really apply to Scala?

"Java / C# / Scala (and other OOP languages with pervasive dynamic dispatch): These languages share most of the static analysis problems as Python: their primary abstraction features (classes and interfaces) are built on highly dynamic constructs, which means that static analysis of Tensor operations depends on "best effort" techniques like alias analysis and class hierarchy analysis. Further, because they are pervasively reference-based, it is difficult to reliably disambiguate pointer aliases."

I agree with you regarding lack of Windows support, however I would rather see Julia as a better alternative than Rust, given the language ergonomics.
More to the point, static typing is just not that important for data scientists. Arguably it's not that important for backend devs either (e.g. Lisp, Erlang).
Should be prefaced with, "I think".

Having done user research on this by speaking to data scientists, I can say that static typing is desired by a nonzero number of people who practice what we would consider to be data science and machine learning. Much like how TypeScript is seen as a revelation to hordes of JavaScript programmers who have never used static types before, the ability to get some level of correctness verification at design-time matters.

The more time I spend with strongly typed languages, the more I am convinced it is the right way to go. For modern languages with good type inference, and good tools for protocols/interfaces not tied to an inheritance hierarchy, it is at worst a minor inconvenience for a huge benefit.
> I can say that static typing is desired by a nonzero number of people who practice what we would consider to be data science and machine learning

Who would trade static typing for fast prototyping any time.

Data science is a really nebulous term covering many drastically different domains of CS. Many data scientists I have talked with don't really produce code as a product; they write code to produce analysis, which is the actual deliverable. For them, code is ad hoc and disposable, created on demand and left in the dust until it is rediscovered when the need arises.

Some of the code does survive and enters the production stage, and I guess that is where they would seek some assurance from static typing. But I do think they could mitigate most of the pain if they committed to writing some unit or functional tests, yet such awareness is rare among the data scientists I know and have worked with.

So all in all, yes static typing MIGHT help, in some way, but I don't think it addresses the underlying pain point as much.

> Who would trade static typing for fast prototyping any time.

These need not be at odds. Many ML languages like F# or OCaml, by use of type inference, get you type safety without having to type a bunch of stuff or sacrifice fast prototyping. And certainly in F# there is a history of having productive tooling that lets you prototype easily. Simply writing some F# code in an F# script in an IDE, hitting alt+Enter, and letting it execute in an interactive shell is hugely productive for exploratory tasks. And features like Type Providers build out types for an arbitrary data set that let you guarantee your code is actually correct for the data.

What I've mentioned isn't without its flaws, and eventually someone is going to reach head-scratching problems just as they would in any other environment. I don't think there's an objective way to measure productivity across a wide range of professionals, but I do believe that some subset of them would prefer static types for their work. This is backed by conversations with some of them about problems they encounter.

Although I am a big fan of a couple of dynamic languages, when a codebase scales we really need static types to make any sense of it, even to our older selves a couple of months down the line.

So gradual typing like in Julia is already a good thing for having the best of both worlds.

Correctness verification at the level that data scientists need can generally be achieved with optional typing (presuming a well designed type system)
Perhaps! I personally think it's still a very young field, and there's likely a spectrum of professionals who prefer some strong degree of typechecking.

This is being explored with "Live Checking" in F#[0], which offers a form of static typing over TensorFlow without actually forcing you to express every complex interaction with data in types.

[0]: https://github.com/fsprojects/TensorFlow.FSharp#live-checkin...

> achieved with optional typing (presuming a well designed type system)

Enter stage left: Julia

Julia is already pretty great, I'd really love to see what cool stuff we could have with a swell in community size and investment!

Yeah, that's kind of what I'm referring to, but the default array typing in flux.ml doesn't encode tensor dimensionality in the type system. If it did (which it very easily could in Julia), you wouldn't wind up with a situation where your learning task halts in the middle of a training run, which can happen in flux.ml.
In that regard Julia is hardly any different to TypeScript.
Julia compilation time is much improved in 1.1 and they are working on tiered compilation to make it better.
It's fairly well accepted that Rust has a steep learning curve, and their target users are not software engineers, so I wouldn't say their point is nonsense.
> If Chris wants to use the language he created in this new endeavor for machine learning simply because he made it, that's totally fine and completely his prerogative, but he should just say so, rather than trying (and failing) to convince people that other languages aren't better suited for this task.

Do you have any insider knowledge that Chris Lattner had the unilateral power to choose Swift for this project? I would imagine with the importance of TensorFlow at Google, the decision to go in this direction had to be agreed on by a number of people.

> The learning curve of Rust should not be relevant here, compared to Swift, which is also full of idiosyncrasies. Swift and Rust both have a large learning curve for someone coming from Python.

How exactly would Rust-Python interoperability work? Swift for TensorFlow allows any Python library to be called like a native library in Swift. Could you do that in Rust?

> Could you do that in Rust?

Yes, and companies are even doing it in production. Sentry is probably the best-known example.