by dr_kiszonka 864 days ago
Folks using multiple languages, what is your workflow?

I do most DS/ML work in Python but move to R for stats and for publication-ready plots and tables (gt is really great). I switch between them frequently, which is a hassle in the EDA and prototyping stages, especially when using notebooks. I enjoy Quarto in RStudio, but the VS Code version is not that great.

How do you make it work?

Also, after so many years using Python and R, I would love to learn a new language, even if only for a couple of use cases. I considered Elixir for parallel processing and because it has a nice syntax, but ultimately decided against it because it can be a little slow and isn't used much in my area (sadly!). Rust seems to require too much time to get decent at. Any recommendations? (Prolog?)

4 comments

I use Python and write my results to a CSV that I quickly import into R to do my fancy stats.
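For anyone wanting to try the CSV handoff, here is a rough sketch of the Python side using only the standard library. The function name `write_results_csv`, the helper `_to_r`, and the example column names are made up for illustration and are not from any particular project. The one thing it does beyond a plain dump is write missing values as the string NA, which R's `read.csv()` treats as missing by default.

```python
import csv
import math


def _to_r(value):
    # None and NaN become the string "NA", which R's read.csv()
    # reads as a missing value with its default settings.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return "NA"
    return value


def write_results_csv(rows, path):
    """Write a list of dicts (one dict per row) to a CSV file for R."""
    if not rows:
        raise ValueError("no rows to write")
    # Column order follows the keys of the first row.
    fieldnames = list(rows[0].keys())
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for row in rows:
            writer.writerow({k: _to_r(row[k]) for k in fieldnames})


if __name__ == "__main__":
    # Hypothetical results, just to show the call.
    results = [
        {"model": "baseline", "auc": 0.81},
        {"model": "tuned", "auc": None},
    ]
    write_results_csv(results, "results.csv")
```

On the R side, `read.csv("results.csv")` should then pick the file up as a data frame with the missing value as NA. If you already have a pandas DataFrame, its `to_csv` method with `na_rep="NA"` and `index=False` does roughly the same job.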

Tbf python's stats implementations can be garbage; the last time I checked you can't do multiple levels for hierarchical regression.

My workflow is similar to yours: python for deep learning and surface reconstruction. R for stats and plots.

I use go extensively for data preprocessing. Sounds weird but it works well for highly repetitive conversion tasks like DICOM parsing, converting EKGs to numpy, etc.

It's hard to learn a language for fun, so I'd pick something that fits your needs to build something (or even just your curiosity). Elixir and Prolog, although both cool, might not fit the bill because they really excel at one particular thing.

Golang is a popular answer, as you can start building stuff with it fairly quickly (especially compared to Rust). Java can also be useful if you haven't learned it and find a use case (although you will hear it bemoaned as the "New COBOL", there is still a lot of work done using it).

I've been thinking about learning Rust for these use cases, but I always get frustrated with the complexity.

I find Go is a great middle ground though! And there are now starting to be a few more bio-related tools and toolkits out there, including:

- https://github.com/vertgenlab/gonomics

- https://github.com/biogo/biogo

- https://github.com/pbenner/gonetics

- https://github.com/shenwei356/bio

... apart from there being some really popular bio tools written in Go, like:

- https://github.com/shenwei356/seqkit

I think Go lost a bit of steam in bio after Rust started to take off, but the field as a whole seems to be growing a lot, and people are also starting to realize Rust isn't the answer to everything. I.e. it is fantastic for fast tools, but for replacing Python for all of the various ad hoc coding in biology ... nah, not so much. That's where I think Go shines.