Hacker News
by crabbone 1171 days ago
For research? -- Julia seems to be definitely better. It's purpose-built for doing just that. R would also be there. If you want general statistics, then add J to the fold.

But specific fields often have their own, bespoke solutions. I've only ever dealt with math, but it has plenty of its own niche languages that are much better than, say, Sage. My personal choice was Maxima, but that's because I like Common Lisp.

Furthermore, it's just the situation today. It doesn't mean that this is what it has to be tomorrow. Any of the languages I used in this domain have their issues, and could still be improved. We are nowhere near a place where it's hard to imagine something better than what we have. So, I'd say, if you really want a very good language, you might as well start building one now -- you have a very good chance yours will be the best one so far.


> For research? -- Julia seems to be definitely better. It's purpose-built for doing just that.

What does this even mean lol, “research” is incredibly broad

I'll elaborate. When I use the word "research" I mean activity performed by accredited researchers. Researcher is a rank, or a title if you want. It's similar to "software engineer" -- if you are in an academic institution the word "researcher" could be part of your title / job description.

The goal of the activity is to publish findings. Findings are expected to be meaningful and to be found in the domain of sciences. It is also possible to extend the scope to arts, but I'm hesitant about it, and the way I used the word doesn't truly relate to what happens in arts departments (for this purpose, math is an art).

Accredited here means associated with an academic institution through employment or similar.

On a conceptual level, all discoveries in the sciences must have a hypothesis whose validity and truthfulness is to be established through empirical evidence, i.e. experiment. To interpret experiments one needs to use statistics. This is the vehicle driving experiments. This is where the kind of programming I'm talking about comes in.

Note that researchers might use programming for other activities, such as keeping their diaries or automating their correspondence. But I meant specifically the statistical aspect of doing research.

I think what OP means is that Julia has a number of features that work very well for the workflows and processes of programming for scientists.

1. Interactive workflows. One of the defining features of doing science is that you don't know what the right answer or right approach is. This makes interactive workflows (like REPLs) really valuable since you can load data once and do 100 different analyses on it. Notebooks are also really useful as a means of showing both code and results at the same time, and Pluto.jl is one of the best here since it removes the possibility of ending up with inconsistent state by tracking dependencies between cells.

2. Reproducibility. Another important feature for scientific code is that you want someone to be able to take your data and code, easily install the code, run it, and get the same answer. This is one of Python's biggest shortcomings. Python has an incredibly rich package ecosystem, but lacks a good unified system for reproducibly installing packages (Poetry is the closest, but it has problems with binary dependencies). Julia (and Rust) have built-in virtual environments and the idea of a manifest file that records the exact version of all your (transitive) package dependencies, which makes it trivial for someone else, with no instructions, to get an exact clone of all the software needed to run your code.

3. Ease of use. Most scientists are scientists first and bad to mediocre at programming (there are obviously exceptions, some scientists are great programmers). Static type systems and manual memory management are major impediments to beginner use. C++ gets some scientific use for its performance, but there's a reason Python, R, and Matlab are the languages of most scientists.

4. Performance. Lots of fields (bio, astronomy, high energy physics, chemistry) need a fast language to be able to get results in a reasonable amount of time. Julia is fast and is one of the easier languages to write GPU accelerated code in.

5. Open source. Closed source languages (Matlab) are a total pain to deal with.
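To make point 2 a bit more concrete, here is a minimal Python sketch of what "a manifest that records the exact version of every transitive dependency" amounts to. Everything in it is hypothetical: the package names, the versions, the `REGISTRY` table, and the `build_manifest` / `render` helpers are made up for illustration, and it is not how Julia's Pkg or Cargo actually resolve anything (real resolvers also have to satisfy version constraints, which this toy skips by giving each package exactly one version).

```python
from typing import Dict, List, Tuple

# Hypothetical toy registry: package name -> (resolved version, direct dependencies).
# None of these packages or versions are real.
REGISTRY: Dict[str, Tuple[str, List[str]]] = {
    "analysis": ("0.3.1", ["frames", "plots"]),
    "frames": ("1.4.0", ["tables"]),
    "plots": ("2.0.2", ["colors"]),
    "tables": ("1.1.0", []),
    "colors": ("0.9.5", []),
}


def build_manifest(direct_deps: List[str]) -> Dict[str, str]:
    """Walk the dependency graph and pin every package reached, direct or transitive."""
    manifest: Dict[str, str] = {}
    stack = list(direct_deps)
    while stack:
        name = stack.pop()
        if name in manifest:
            continue  # already pinned via another path
        version, deps = REGISTRY[name]
        manifest[name] = version
        stack.extend(deps)
    # Sort by name so the output is stable and diff-friendly.
    return dict(sorted(manifest.items()))


def render(manifest: Dict[str, str]) -> str:
    """Render the pinned versions as simple `name = "version"` lines."""
    return "\n".join(f'{name} = "{version}"' for name, version in manifest.items())


if __name__ == "__main__":
    # The project only names "analysis"; the manifest pins everything it pulls in.
    print(render(build_manifest(["analysis"])))
```

The project itself only asks for `analysis`, but the manifest also pins `frames`, `plots`, `tables`, and `colors`. Anyone installing from that file gets the same five versions without needing separate instructions.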

>R would also be there.

Surely you jest. Whatever problems Python may have with proper development or deployment practices, R is ten-fold worse. R is and will forever be an interactive, make-it-work-now language first and a production backbone second. The language is far too accommodating and will silently cast its way past all manner of sins in the name of keeping the program running. Package management is still a huge headache, as proper isolation is still not well addressed. Too many R packages assume they have root during installation and can do whatever they wish. Volumes can be written about R namespacing.

> R is and will forever be an interactive

I don't see this as a problem for researchers. Quite the contrary. To compare this to Python: Python's interactivity sucks. Bad syntax prevents you from using it interactively with any efficiency. Abysmal debugger. And if you consider that the typical environment in which Python is used in a research setting is a notebook, then add to it the even worse wrapper for the debugger available in notebooks.

Python is garbage for production systems too. If you want to use the results of your research for practical stuff, you will not use Python code you wrote for research. I'm more familiar with the world of medical research, and can confidently say that I've never seen a practical medical device or software product that used Python. Medical equipment typically wants to be realtime, which is a world completely closed to Python, for example.

However, Python (or R) being bad for production systems is perceived to be an acceptable price... I wish it wasn't, but that's how things are.

You cannot unironically claim that Python has superior package management... It's the shittiest ever. I have not seen a language which has done it worse than Python, and I've worked with at least 2-3 dozen of them. In my day job, I'm in the ops / infra department. I maintain a lot of Python packages both for commercial entities and for independent open-source developers. Part of me is very upset that Python is such a shit-show when it comes to packaging, but another part of me is happy because it means that I will have a job dealing with the fallout of Python packaging for a while.

Bottom line, if I had to support a bunch of scientists doing their research, and I had to deal with packaging their stuff, I'll take R over Python in a heartbeat.