by crabbone 1172 days ago
> Python is a superb API for researchers,

Where does this nonsense come from? No. It isn't. It's just a stupid fashion. Something that should be discouraged, not encouraged by trying to make it work when it's obviously broken.

As someone who does work with researchers who do use Python a lot, I see the everyday painful experiences of people who use it. And this pain doesn't need to be there. It's just masochism. And the only real reason is that they don't know any better. The only other thing they know is Matlab, and that's even worse.

Python is just a bad language. Popular, but awful. Ironically, while researchers are supposed to be on the forefront of discovery and technology... well, they aren't. Industry outpaced research. So much so that today there are government programs to onboard researchers into more automated and more automatically verified ways to do research. And we aren't talking about making an elite force here. These programs are meant for people in research who copy data from Excel sheets one data point at a time into another spreadsheet. It's that kind of bad.

My wife happened to work in such a government center, and that's how I know about what's going on inside these programs. And it's very sad that decisions about the preferred tools for research automation are made by people who, unlike most of their peers, had some exposure to what happens in the industry, but had no deeper understanding of the reasons any particular technology ended up in any particular niche, nor any independent ability to assess the capabilities of any particular tech. It's really sad what's going on there.


>Python is just a bad language. Popular, but awful.

You couldn't be more wrong. Python is the language going forward. There is a reason why bleeding edge ML stuff is done through Python, why it is the backend of several very popular web platforms, and why it is the second most used language on GitHub, behind JS solely because JS is hard-tied to the web.

I have a feeling the hate for Python just comes from paradigms that are taught in extremely poor CS curricula in schools. If you think that Python is bad because of dynamic typing, you haven't been paying attention to the direction compute is going.

No, you couldn't be more wrong. Nice argument you dug up there!

Language going forward... I haven't had such a laugh in a while... It was dead in the 70s. Long before it was born. Are you under the influence or something?

> There is a reason why bleeding edge ML stuff is done through Python

No, there isn't, and bleeding edge ML isn't done in Python. It's done mostly in C++ and more seldom in C. Which are also garbage languages, but that's just the reality we live in. Python is just a tiny fraction of what's being done, the tip of the tip of the iceberg.

And, no, Python is not the backend of popular platforms. It is, again, a tiny little bit of what's going on in those "popular platforms", and, unlike you, I actually worked for those "popular platforms", so I would know how that is.

But, even, imagine that in your fairy-tale world Python is somehow so super-important and successful? Didn't I already explain how this is possible while still being a garbage language? -- I bet I did. You just rushed to spit your despair and frustration at me, because I offended something that you like... well, you think that you like, but really, you just don't know any better, just like other people using Python :( And that's really the sad part. You lack critical thinking to be really able to tell the quality of your tools. You substituted the ability to judge the quality of something by looking at what the majority is doing and following the herd.

And, I didn't study when Python became taught in school. In my days the language of choice for intro to CS was Visual Basic :| It's also garbage... and the whole idea that the intro to CS has to be done by learning a random fashionable language of the day is just dumb. It's a misunderstanding of SICP, which to someone who didn't understand what the book is about looks as if the author tried to teach the students Scheme, and later intro to CS classes were modeled on this book, but replaced Scheme with another language, while completely forgetting that Scheme was used in the first place as a language with "little syntax", to avoid spending time on learning the language.

You again are confused when you use the expression "dynamic typing", but you don't even know what that is. But you will never accept it because, again, you don't have the ability to think critically, you follow the herd. You read a "definition" somewhere that says that Python is a "dynamically typed language", and you believed this nonsense, even though you never really thought about what that might mean...

One thing we agree on though: the direction compute is going is to utilize more low-skill programmers, and that means, among other things, following fads more than doing any sensible work. Python is the current fad, and so, for a while, we are stuck with this garbage. And, the way things are going, at best, I can hope for another equally idiotic language to come to replace it. But, probably not very soon.

What are the best alternatives to Python then?
For research? -- Julia seems to be definitely better. It's purpose-built for doing just that. R would also be there. If you want general statistics, then add J to the fold.

But specific fields often have their own, bespoke solutions. I've only ever dealt with math, but it has plenty of its own niche languages that are much better than, say, Sage. My personal choice was Maxima, but that's because I like Common Lisp.

Furthermore, it's just the situation today. It doesn't mean that this is what it has to be tomorrow. Any of the languages I used in this domain have their issues, and could still be improved. We are nowhere near a place where it's hard to imagine something better than what we have. So, I'd say, if you really want a very good language, you might as well start building one now -- you have a very good chance yours will be the best one so far.

> For research? -- Julia seems to be definitely better. It's purpose-built for doing just that.

What does this even mean lol, “research” is incredibly broad

I'll elaborate. When I use the word "research" I mean activity performed by accredited researchers. Researcher is a rank, or a title if you want. It's similar to "software engineer" -- if you are in an academic institution the word "researcher" could be part of your title / job description.

The goal of the activity is to publish the findings. Findings are expected to be meaningful and to be found in the domain of the sciences. It is also possible to extend the scope to the arts, but I'm hesitant about it, and the way I used the word doesn't truly relate to what happens in arts departments (for this purpose, math is an art).

Accredited here means associated with an academic institution by means of employment or similar.

On a conceptual level, all discoveries in the sciences must have a hypothesis whose validity and truthfulness is to be established through empirical evidence, i.e. experiment. To interpret experiments one needs to use statistics. This is the vehicle driving experiments. This is where the kind of programming I'm talking about comes in.

Note that researchers might use programming for other activities, such as automating their diaries or their correspondence and so on. But I meant specifically the statistical aspect of doing research.

I think what OP means is that Julia has a number of features that work very well for the workflows and processes of programming for scientists.

1. Interactive workflows. One of the defining features of doing science is that you don't know what the right answer or right approach is. This makes interactive workflows (like REPLs) really valuable since you can load data once and do 100 different analyses on it. Notebooks are also really useful as a means of showing both code and results at the same time, and Pluto.jl is one of the best here since it removes the possibility of ending up with inconsistent state by tracking dependencies between cells.

2. Reproducibility. Another important feature for scientific code is that you want someone to be able to take your data and code, install the code easily, run it, and get the same answer. This is one of Python's biggest shortcomings. Python has an incredibly rich package ecosystem, but it lacks a good unified system for reproducibly installing packages (Poetry is the closest, but it has problems with binary dependencies). Julia (and Rust) have built-in virtual environments and the idea of a manifest file that records the exact version of all your (transitive) package dependencies, which makes it trivial for someone else, with no instructions, to get an exact clone of all the software needed to run your code.

3. Ease of use. Most scientists are scientists first and bad to mediocre at programming (there are obviously exceptions, some scientists are great programmers). Static type systems and manual memory management are major impediments to beginner use. C++ gets some scientific use for its performance, but there's a reason Python, R, and Matlab are the languages of most scientists.

4. Performance. Lots of fields (bio, astronomy, high energy physics, chemistry) need a fast language to be able to get results in a reasonable amount of time. Julia is fast and is one of the easier languages to write GPU accelerated code in.

5. Open source. Closed source languages (Matlab) are a total pain to deal with.
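To make the manifest idea in point 2 concrete, here is a minimal sketch in Python using only the standard library. It is not how Julia's package manager or any Python tool actually works; the function names `snapshot_environment` and `diff_environments` are made up for illustration, and a real manifest also pins sources and hashes, not just version strings.

```python
import json
from importlib import metadata


def snapshot_environment():
    """Return a sorted {package name: exact version} mapping for this interpreter."""
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip installs whose metadata lacks a name
            versions[name.lower()] = dist.version
    return dict(sorted(versions.items()))


def diff_environments(recorded, current):
    """Map each package whose version differs to (recorded, current).

    A version of None means the package is absent on that side.
    """
    names = sorted(set(recorded) | set(current))
    return {
        name: (recorded.get(name), current.get(name))
        for name in names
        if recorded.get(name) != current.get(name)
    }


if __name__ == "__main__":
    # Print the snapshot as JSON; redirect it to a file to keep it next to the code.
    print(json.dumps(snapshot_environment(), indent=2))
```

Someone rerunning the analysis could load the saved snapshot, call `diff_environments(saved, snapshot_environment())`, and see which packages drifted before trusting that the results are comparable.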

>R would also be there.

Surely you jest. Whatever problems Python may have with proper development or deployment practices, R is ten-fold worse. R is and will forever be an interactive, make-it-work-now language first and a production backbone second. The language is far too accommodating and will go on silently casting all manner of sins in the name of keeping the program running. Package management is still a huge headache, as proper isolation is still not well addressed. Too many R packages assume they have root during installation and can do whatever they wish. Volumes could be written about R namespacing.

> R is and will forever be an interactive

I don't see this as a problem for researchers. Quite the contrary. To compare this to Python: Python's interactivity sucks. Bad syntax prevents it from being used efficiently in an interactive session. The debugger is abysmal. And if you consider that the typical environment in which Python is used in a research setting is a notebook, then add to that the even worse debugger wrapper available in notebooks.

Python is garbage for production systems too. If you want to use the results of your research for practical stuff, you will not use Python code you wrote for research. I'm more familiar with the world of medical research, and can confidently say that I've never seen a practical medical device or software product that used Python. Medical equipment typically wants to be realtime, which is a world completely closed to Python, for example.

However, Python (or R) being bad for production systems is perceived to be an acceptable price... I wish it wasn't, but that's how things are.

You cannot unironically claim that Python has superior package management... It's the shittiest ever. I have not seen a language that has done it worse than Python, and I've worked with at least 2-3 dozen of them. In my day job, I'm in the ops / infra department. I maintain a lot of Python packages both for commercial entities and for independent open-source developers. Part of me is very upset that Python is such a shit-show when it comes to packaging, but another part of me is happy because it means that I will have a job dealing with the fallout of Python packaging for a while.

Bottom line, if I had to support a bunch of scientists doing their research, and I had to deal with packaging their stuff, I'll take R over Python in a heartbeat.