


	by superdimwit 2535 days ago
	I'd really recommend that anyone doing mildly numerical / data-ey work in Python give Julia a patient and fair try. I think the language is really solidly designed, and it gives you ridiculously more power AND productivity than Python for a whole range of workloads. There are of course issues, but even in the short time I've been following and using the language, these are being rapidly addressed. In particular: a generally less rich ecosystem of libraries (though some Julia libraries are state of the art across all languages, mainly thanks to easy metaprogramming and multiple dispatch) and generally slow compile times (though this is improving rapidly with caching etc.). I would also note that you often don't need as many "libraries" as you do in Python or R, since you can typically just write down the code you want to write, rather than being forced to find a library that wraps a C/C++ implementation like in Python/R.


opportune 2535 days ago

>you can typically just write down the code you want to write, rather than being forced to find a library that wraps a C/C++ implementation like in python/r.

I don't think this is really a feature. It's nice that you can write more performant code in Julia directly and don't need to wrap lower-level languages, without question, but the lack of libraries or library features is not a good thing. It's always better to use a general-purpose library that's been battle-tested than to write your own numerical mathematics code (because bugs in numerical code can take a long time to get noticed).

For specialized scientific computing applications, which would normally be written in C/C++, I would absolutely look into using Julia instead (though I'm not sure what the OpenMP/MPI support is like). But I would also recommend against rolling your own numerical software unless you need to.
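
To illustrate how a bug in numerical code can go unnoticed, here is a minimal sketch that is not from the thread. The helper name `naive_variance` and the sample data are made up for the example. The textbook one-pass formula E[x^2] - E[x]^2 gives the right answer on small inputs but loses its precision once the values carry a large offset, while the standard library's `statistics.pvariance` handles the same data.

```python
import statistics


def naive_variance(xs):
    """Population variance via E[x^2] - E[x]^2 (hypothetical hand-rolled helper)."""
    n = len(xs)
    mean = sum(xs) / n
    mean_of_squares = sum(x * x for x in xs) / n
    # Subtracting two nearly equal large numbers: catastrophic cancellation.
    return mean_of_squares - mean * mean


small = [4.0, 7.0, 13.0, 16.0]       # true population variance is 22.5
shifted = [x + 1e9 for x in small]   # same spread, large constant offset

# On small values both approaches agree.
print(naive_variance(small), statistics.pvariance(small))

# On shifted values only the library routine stays close to 22.5.
print(naive_variance(shifted), statistics.pvariance(shifted))
```

A test suite that only used inputs like `small` would pass, which is how this kind of bug can survive for a long time in hand-written code.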

jjoonathan 2535 days ago

I don't just think it's a feature, I think it's a killer feature.

You are much less likely to reinvent the wheel if you can add your one critical niche feature / bugfix to an existing library. In Python, learning C, C build systems, and Python's C API are gigantic barriers to doing that.

More importantly, if every fast data manipulation needs to be written in C, a few of them can be profitably shared, but you need more than a few of them. Pretty soon you wind up with a giant dumping ground of undiscoverable API bloat. See: pandas.

tomrod 2535 days ago

Maybe I don't understand what API bloat is in this context -- can you give some more detail regarding your thoughts on pandas?

jjoonathan 2535 days ago

Here's one of the fifteen API ref sections in pandas:

https://pandas.pydata.org/pandas-docs/stable/reference/serie...

Even though it's long, it undersells the problem, because many of these methods have nontrivial overload semantics that open up like a fractal when you look in turn at their docs. The link also undersells the problem because this junkheap is evidently so incomplete that people are frequently forced to rely on numpy to extend it.

APIs should make hard things easy, but API gloveboxes like this make easy things hard. Minimal API + Performant Glue >> We do everything for you + You can't ever touch your own data or your perf dies + Good luck reverse engineering these semantics if you've forgotten the context and need to port it.

tomrod 2534 days ago

Okay, I think I see your point. You are viewing the different object methods as API calls, and because they are granular and can handle many common and uncommon tasks, you see this as bloat. Makes sense from that perspective. Thanks.

ChrisRackauckas 2535 days ago

While Python has good libraries in general computing, and it has good ML libraries, it's really lacking in scientific computing (numerical linear algebra, differential equations, etc.). For example, what's a Newton-Krylov IMEX integrator in Python? Boundary value DAEs? I know of libraries for these things in Fortran, C++, and Julia... but not Python. It's also well-known that Python lacks a lot of the statistics libraries of R. When you chart it out, Python tends to just have the bare minimum of support in every area (except ML, it has good ML libraries), which if it's what you need, great! But...

cauthon 2535 days ago

Are all the plotting/visualization options still half baked?

spacedome 2535 days ago

I've found Plots.jl and PyPlot.jl to work well for most basic things, despite not always being entirely pleasant to use (the compilation time issue, for example, though this should hopefully improve). The only real problem I had is that they are not quite sufficient for plots to be published in a paper: many visual tweaks you might want are broken or terribly documented, so I have to just use matplotlib or R. They are generally great for Jupyter notebooks though. I see the current deficiencies as highlighting just how much work went into matplotlib and others to get where they are today (and even mpl is in some ways still lacking, for example 3D surfaces and meshes). It is unfortunate though, as plotting is core functionality for their main target of computational science. But to answer your question, mostly yes. Everything seems to be slowly improving though.

kmundnic 2535 days ago

No matter which tool I use for plotting these days, I export the result as a .tex file to use with PGFPlots. matlab2tikz, matplotlib2tikz, and the savefig function in Plots.jl (with the pgfplots backend) all do the job. This way you can tweak the figure in the final document, which I prefer. You can adjust all of the properties of the plot in LaTeX.

3jckd 2535 days ago

Yes, they are. Slow, and hardly as expressive or rich as their Python/R counterparts.

ViralBShah 2535 days ago

One can use matplotlib in Julia by PyCall'ing it. So it is at least as good as anything else.

newen 2535 days ago

Or ggplot2 using RCall, which is what I use and it's quite nice.
