
by opportune 2533 days ago

>you can typically just write down the code you want to write, rather than being forced to find a library that wraps a C/C++ implementation like in python/r.

I don't think this is really a feature. It's undeniably nice that you can write performant code directly in Julia without wrapping a lower-level language, but the lack of libraries or library features is not a good thing. It's always better to use a battle-tested general-purpose library than to write your own numerical mathematics code, because bugs in numerical code can take a long time to get noticed.

For specialized scientific computing applications, which would normally be written in C/C++, I would absolutely look into using Julia instead (though I'm not sure what the OpenMP/MPI support is like). But I would also recommend against rolling your own numerical software unless you need to.

2 comments

jjoonathan 2533 days ago

I don't just think it's a feature, I think it's a killer feature.

You are much less likely to reinvent the wheel if you can add your one critical niche feature / bugfix to an existing library. In Python, learning C, C build systems, and Python's C API are gigantic barriers to doing that.

More importantly, if every fast data manipulation needs to be written in C, a few of them can be profitably shared, but you need more than a few of them. Pretty soon you wind up with a giant dumping ground of undiscoverable API bloat. See: pandas.

tomrod 2533 days ago

Maybe I don't understand what API bloat is in this context -- can you give some more detail regarding your thoughts on pandas?

jjoonathan 2533 days ago

Here's one of the fifteen API ref sections in pandas:

https://pandas.pydata.org/pandas-docs/stable/reference/serie...

Even though it's long, it undersells the problem, because many of these methods have nontrivial overload semantics that open up like a fractal when you look in turn at their docs. The link also undersells the problem because this junkheap is evidently so incomplete that people are frequently forced to rely on numpy to extend it.

APIs should make hard things easy, but API gloveboxes like this make easy things hard. Minimal API + Performant Glue >> We do everything for you + You can't ever touch your own data or your perf dies + Good luck reverse engineering these semantics if you've forgotten the context and need to port it.
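To make the "easy things hard" point concrete, here is a minimal sketch of the kind of one-off data manipulation at issue. The task (longest run of strictly increasing readings) and the name `longest_increasing_run` are hypothetical, chosen only for illustration; they do not come from the thread. Written as a plain loop it is a few obvious lines, but in CPython a loop like this runs in the interpreter, which is what tends to push people toward composing several vectorized library calls instead.

```python
def longest_increasing_run(values):
    """Return the length of the longest run of strictly increasing consecutive values."""
    best = 0
    current = 0
    previous = None
    for value in values:
        if previous is not None and value > previous:
            # Still climbing: extend the current run.
            current += 1
        else:
            # First element, or the run was broken: start a new run here.
            current = 1
        best = max(best, current)
        previous = value
    return best


# Hypothetical sample data, made up for this sketch.
readings = [3, 4, 5, 1, 2, 2, 3, 4, 5, 6]
print(longest_increasing_run(readings))
```

The loop states the intent directly and is easy to port. The trade-off described above is that a language where such a loop is already fast lets you keep it as is, while in Python you would usually have to re-express it through a library's vectorized methods to get acceptable performance on large data.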

tomrod 2532 days ago

Okay, I think I see your point. You are treating the different object methods as API calls, and because they are granular and can handle many common and uncommon tasks, you view this as bloat. Makes sense from that perspective. Thanks.

ChrisRackauckas 2533 days ago

While Python has good libraries for general computing, and it has good ML libraries, it's really lacking in scientific computing (numerical linear algebra, differential equations, etc.). For example, what's a Newton-Krylov IMEX integrator in Python? Boundary value DAEs? I know of libraries for these things in Fortran, C++, and Julia... but not Python. It's also well known that Python lacks a lot of the statistics libraries of R. When you chart it out, Python tends to have just the bare minimum of support in every area (except ML, where it has good libraries), which, if it's what you need, great! But...