Hacker News
by arc619 893 days ago
Personally, I think Python's success is down to the productivity of its pseudocode-like syntax letting you hack prototypes out fast and easy. In turn, that makes building libraries more attractive, and these things build on each other. FORTRAN is very fast but it has a less forgiving syntax, especially coming from Python.

In that regard, I'm surprised Nim hasn't taken off for scientific computing. It has a similar syntax to Python with good Python interop (e.g. Nimpy), but is competitive with FORTRAN in both performance and bit twiddling. I would have thought it'd be an easier move to Nim than to FORTRAN (or Rust/C/C++). Does anyone working in SciComp have any input on this - is it just a lack of exposure/PR, or something else?

11 comments

Most code in science is written by grad students and postdocs. For them, trying a new language is an enormous career risk. Your advisor might not understand it, and you might be all alone in your department if you try Nim.

That makes any sort of experimentation a really tough sell.

As a rule, I have found scientific computing (at least in astronomy, where I work) to be very socially pressured. Technical advantages are not nearly as important as social ones for language or library choice.

Change does happen, but extremely slowly. I am not exaggerating when I say that even in grant applications to the NSF as recently as 2020, using Python was considered a risky use of unproven technology that needed justification.

So, yeah, Nim is going to need a good 30 years before it could plausibly get much use.

Yep, going against the grain in graduate school is counterproductive unless there's a compelling reason.

Many grad students forget that their main purpose is to generate research results and to publish papers that advance the field, not to play around with cool programming languages (unless their research is about coding).

Here's a bunch of mistakes I made in grad school which unnecessarily lengthened my time in the program (and nearly made me run out of stipend money):

* Started out in Ruby because I liked the language, but my research involved writing numerical codes, and at the time there just wasn't much support for it so I ended up wasting a lot of time writing wrappers etc. There was already an ecosystem of tools I could use in MATLAB and Python but nooo, I wanted to use Ruby. This ended up slowing me down. I eventually gave in to MATLAB and Python and boy everything just became a lot easier.

* Using a PowerPC-based iBook instead of an Intel Linux machine. Mac OS X is a BSD (plus I was using a PPC arch) and Brew didn't exist back then, so I ended up troubleshooting a lot of compile errors and tiny incompatibilities because I liked being seen to be using a Mac. When I eventually moved to Linux on Intel, things became so much easier. I could compile stuff without any breakages in one pass.

I also knew a guy who used Julia in grad school because it was the hot new performant thing when all the tooling was in Python. I think he spent a lot of time rejigging his tooling and working around stuff.

Ah the follies of youth. If only someone had pulled me aside to tell me to work backwards from what I really needed to achieve (3 papers for a Ph.D.) and to play around with cool tech in my spare time.

I guess the equivalent of this today is a grad student in deep learning wanting to use Rust (fast! memory-safe! cool!) even though all the tooling is in Python.

A grad student using a new language definitely does not face any career risk IMO... I can't imagine a single professor or recruiter caring about something like this over material progress in their work.

My guess is that grad students are swamped and are looking for the shortest path to getting an interesting result, and that is most likely done with a tool they already somewhat know.

The question for Nim, like many other new products, is: why is it worth the onboarding cost?

My professor would have asked me what the relevance of Nim is to the actual subject of the research. Going against the grain has a cost, unless you're studying Nim itself.

And not only that, your code is likely to become the next student's code. The professor doesn't need to understand it, per se, but they do need to ensure it's useful for future maintainers/extenders. Will the next Aerospace Engineering grad student coming in understand Nim or be motivated enough to learn Nim and have time to continue the work? They likely already had Fortran, Matlab, or Python experience (which depends on their undergrad and when they went to school). Picking a novel language for the research group needs to have value for the group going forward, not just to satisfy the curiosity or taste of the RA.

None, but neither is Python…?

Depends on the surrounding body of work. In my case, 99% of papers in my references had Python/PyTorch implementations. Which is the entire point of this post.

Absolutely but being the devil’s advocate here - also what does that have to do with the research?

It’s good engineering and good management but research shouldn’t really care.

Python itself isn't really used for scientific computing. Python's bindings to high performance libraries, many of which use Fortran under the hood, are used for scientific computing.

Combine that with the ease of displaying results à la Matlab, but with much less of the jank, and you have an ideal general purpose sci comp environment.
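
To make the "bindings" point concrete, here is a minimal sketch, assuming NumPy is installed. The 2x2 system and the variable names are made up for illustration. NumPy's documentation says `numpy.linalg.solve` is computed with the LAPACK routine `_gesv`; which LAPACK build sits underneath (and whether it was compiled from Fortran) depends on how NumPy was installed.

```python
import numpy as np

# Hypothetical 2x2 linear system, chosen only for illustration:
#   3x + 1y = 9
#   1x + 2y = 8
coefficients = np.array([[3.0, 1.0],
                         [1.0, 2.0]])
right_hand_side = np.array([9.0, 8.0])

# The Python call is a thin wrapper: the factorization and the
# solve itself run inside the compiled LAPACK library.
solution = np.linalg.solve(coefficients, right_hand_side)

print(solution)
```

The Python layer only describes the problem and hands over the arrays, which is why the interpreter's speed matters so little for this kind of work.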

Back when I worked in scientific programming, we adopted a similar approach. The heavy lifting functions we wrote in C, but they were called from R which allowed us to plot the results, etc., easily. And the libraries we used (for solving differential equations) were all old school Fortran libraries.

If I were to start again today, I think I'd give Julia a look, though.

You can make this claim about anything that isn't direct machine instruction.

End users are using python - the advantage of modern computing is that whatever happens afterwards is irrelevant.

Still, people leave a lot of performance on the table when using Numpy in sub-optimal ways or when the problem just doesn't translate neatly to Numpy.

It's tempting to get lazy and just use a for-loop to iterate over an array sometimes and that will absolutely kill your performance.
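
A minimal sketch of the gap being described, assuming NumPy is installed. The function names, the array size, and the sum-of-squares task are all made up for illustration, and actual timings will vary by machine.

```python
import timeit

import numpy as np


def sum_squares_loop(values):
    """Iterate element by element in the Python interpreter."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_squares_vectorized(values):
    """Hand the whole array to NumPy's compiled dot product."""
    return float(np.dot(values, values))


# Hypothetical input: 100,000 double-precision values.
data = np.arange(100_000, dtype=np.float64)

loop_result = sum_squares_loop(data)
vec_result = sum_squares_vectorized(data)


def time_both(repeats=5):
    """Return (loop_seconds, vectorized_seconds) for `repeats` runs each."""
    loop_t = timeit.timeit(lambda: sum_squares_loop(data), number=repeats)
    vec_t = timeit.timeit(lambda: sum_squares_vectorized(data), number=repeats)
    return loop_t, vec_t
```

Calling `time_both()` on your own machine shows how large the difference is for your setup. Both functions compute the same value, so the only thing that changes is where the loop runs.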

There is often no real value in optimizing such code, if the computation finishes in a time that doesn’t mess with your workflow. Spending more time on it will often just take time away from something more valuable to the research.

Ahh yes, that's a good point. If you're, for example, working in a Jupyter Notebook, it absolutely doesn't matter if a cell needs 3 seconds or 3 milliseconds to execute.

Frequently because those performance gains aren't actually needed. We live in an age where you can cheaply and quickly scale the hardware for 99% of tasks. Tasks that are too expensive to compute inefficiently are also unlikely to be profitable enough to be doing at all.

> Python's success is down to the productivity of its pseudocode-like syntax letting you hack prototypes out fast and easy

Python's success starts with academia's movement of replacing Matlab with free software, namely numpy/scipy.

I love Nim and would absolutely use it for every piece of native code I need to write. Unfortunately, I find it suffers from a few big problems. First, the developer tooling and documentation are kind of inconsistent. The build tool, which is also used to install packages, supports a different set of args than the main compiler, which causes some weirdness. Second, the network effect. Most libraries are maintained by a single person, who put in a lot of effort, but a lot of bugs, edge cases, missing features and other weirdnesses remain. It's usually best to use libraries made for C or Python instead, really.
I work in scientific computing and I'm a huge fan of nim. I started writing a few tools at work in nim and was quite quickly asked to stop by the software development team. In their eyes, they are responsible for the long term maintenance of projects (it's debatable how much they actually carry out this role), and they didn't want the potential burden of a codebase in a language none of them are familiar with.

It's sad, as I feel nim would be easier to maintain compared to a typical c or R codebase written by a biologist, but that's what's expected.

I second this. There's often a huge difference between the languages and tools we'd love to be using, and those that we are allowed / forced to use in the workplace.

I for instance just moved to a company where the data stack is basically OracleSQL and R. And I dislike both. But as _Wintermute pointed out, a whole company / department won't change their entire tech stack just to please one person.

Python is very easy to teach because syntax doesn't get as much in the way as with other languages. You can basically start with mostly English and then slowly introduce more complex concepts. With C for example you would have to delve into data types as soon as you declare the first variable.
I'm trying to switch from traditional software engineering to something sciencier--I've been taking computational biology classes and learning Nim.

I like Nim a lot. And I know that it'll scratch a necessary itch if I'm working with scientists. I also know that it's too much to ask that the scientists just buckle down and learn Rust or something like that.

But as someone who is not afraid of Rust but is learning Nim because of its applicability to the crowd that I want to help... The vibrancy of the Rust community is really tempting me away from this plan.

I've really enjoyed the Nim community also. I even contributed some code into the standard library (a first) and was surprised at how easy they made it.

But I have also written issues against Nim libraries which have gone unanswered for months. Meanwhile, certain Rust projects (helix, wezterm, nushell) just have a momentum that only Nim itself can match.

Python benefitted from there being no nearby neighbors which resembled it (so far as I'm aware). If you needed something like python, you needed python.

Rust and Go and Zig are not for scientists, but they're getting developer attention that Nim would get if they didn't exist. Also, Julia is there to absorb some of the scientist attention. It's a Tower of Babel problem.

I can't say why the scientists aren't flocking to Nim, but as someone who wants to support them wherever they go, this is why I'm uncertain if Nim is the right call. But when I stop and think about it, I can't see a better call either.

> I can't say why the scientists aren't flocking to Nim, but as someone who wants to support them wherever they go, it's why I'm uncertain if it was the right call.

Because most scientists are only using programming as a tool and don't care one bit about it beyond what they need it to do. They don't go looking for new tools all the time, they just ask their supervisor or colleague and then by default/network effects you get Python, Fortran, or C"++". You need a killer argument to convince them to do anything new. To most of them suggesting a new language is like suggesting to use a hammer of a different color to a smith - pointless. With enough time and effort you can certainly convince people, but even then it's hard. It took me years to convince even just one person to use matplotlib instead of gnuplot when I was working in academia. You can obviously put that on my lack of social skills, but still.

Why is Go often lumped in with languages that don't have garbage collectors? I'm always confused by this. Is Go suitable for systems programming? I myself use Go, but for web development.
It’s advertised as a systems programming language, though the system definition it uses casts a much wider net (think kubernetes) than some people’s understanding of system programming (think bare metal bit banging).
I don't know. I lumped it in because when I use Nim or Rust for something my coworkers ask "why not Go?"
Yes, I agree that Python's success is most probably due to the productivity of its pseudocode-like syntax, which makes building libraries more attractive.

In addition to Nim, the D programming language is also Pythonic due to its GC-by-default approach, and it is a very attractive Fortran alternative for HPC, numerical computation, bit twiddling, etc. D's support for C is excellent and the latest D compiler can compile C code natively, and it is in the GCC ecosystem, similar to Fortran. Heck, D's native numerical library GLAS was already faster than OpenBLAS and Eigen seven years ago [1]. In terms of compilation speed D is second to none [2].

[1] Numeric age for D: Mir GLAS is faster than OpenBLAS and Eigen:

http://blog.mir.dlang.io/glas/benchmark/openblas/2016/09/23/...

[2] C++ Compilation Speed:

https://news.ycombinator.com/item?id=1617133

The nim syntax only looks like python on the surface. It actually feels quite different when more complex language features are involved. Nim is more restrictive than python and harder to write. IMHO, nim is not the language that common python programmers would like especially if they only know python.
Absolutely. Fortran takes about 500 lines of code vs <20 lines for Python. The ease of use and flexibility of Python across so many application types is a good reason for its popularity. The rise in hardware computing performance makes the speed tradeoff trivial.
For code implementing a numerical algorithm, I think the ratio of lines needed in Fortran vs. Python is much less than 25, maybe 2 or 3. And once the code is written in Fortran you just compile with -O3 etc. to get good performance and don't need to think about translating to Cython, Numba, or some other language.
I think I asked this in a Nim thread a month or two ago, but I don’t see a chance at competing in scientific computing without a good interactive EDA story, and Python, with a good out-of-the-box IDE, Jupyter Notebooks and IPython, has an amazing story for interactive scientific computing.
I haven't tried this, but presumably nim would fit into jupyter pretty easily.