by makmanalp 2770 days ago
Argh, it's so frustrating to see this kind of sentiment. We're truly spoiled by choice. The libraries don't all serve the same purpose, and not everyone needs every library. This is literally a guide to help pick which one I may want. What more could one ask for??

That said, I understand it sucks to build on top of someone else's code only to have it be discontinued, but honestly it only takes a few minutes to judge a project's maturity. Here are some very easy rules of thumb. Don't write a lot of important code relying on a library that:

- Is younger than 3 years old.
- Has fewer than 3-4 major contributors and 20 overall contributors.
- Has lost steam: a lot of issues and pull requests open and stale with no triage tags, no discussion or responses.
- Doesn't seem to have any automated testing / packaging infrastructure set up.
- Doesn't seem to have a regular ongoing cycle of releases, be it long or short.
- Doesn't have nicely laid out documentation.
- Doesn't seem to have a user base, as indicated by a preponderance of questions and answers on Stack Overflow, etc.
- Just posted their "we made a cool new thing" post on HN a few weeks ago.
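For anyone who wants the checklist in a form they can run against notes on a candidate library, here is a rough sketch. Everything in it is made up for illustration: `ProjectStats`, `red_flags`, and the field names are hypothetical, the facts are assumed to be collected by hand, "3-4 major contributors" is read as a floor of 3, and "a few weeks ago" is arbitrarily taken to mean under 8 weeks. The cutoffs are as flexible here as they are in the list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProjectStats:
    """Hand-collected facts about a library (hypothetical structure)."""
    age_years: float
    major_contributors: int
    total_contributors: int
    has_stale_untriaged_issues: bool
    has_ci: bool
    has_regular_releases: bool
    has_good_docs: bool
    has_visible_user_base: bool
    # Weeks since a "we made a cool new thing" HN post; None if there was none.
    weeks_since_hn_post: Optional[float] = None


def red_flags(p: ProjectStats) -> List[str]:
    """Return the rules of thumb that the project trips (empty list = none)."""
    flags = []
    if p.age_years < 3:
        flags.append("younger than 3 years")
    # Assumption: either shortfall counts as a flag.
    if p.major_contributors < 3 or p.total_contributors < 20:
        flags.append("too few contributors")
    if p.has_stale_untriaged_issues:
        flags.append("stale, untriaged issues and pull requests")
    if not p.has_ci:
        flags.append("no automated testing / packaging")
    if not p.has_regular_releases:
        flags.append("no regular release cycle")
    if not p.has_good_docs:
        flags.append("poor documentation")
    if not p.has_visible_user_base:
        flags.append("no visible user base")
    # Assumption: "a few weeks" means fewer than 8.
    if p.weeks_since_hn_post is not None and p.weeks_since_hn_post < 8:
        flags.append("launch post on HN was only weeks ago")
    return flags


if __name__ == "__main__":
    candidate = ProjectStats(
        age_years=0.5, major_contributors=1, total_contributors=4,
        has_stale_untriaged_issues=False, has_ci=True,
        has_regular_releases=False, has_good_docs=True,
        has_visible_user_base=False, weeks_since_hn_post=2,
    )
    for flag in red_flags(candidate):
        print("-", flag)
```

An empty list doesn't prove a library is safe to build on; it only means none of the quick warning signs fired.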

Yes, the cutoffs are arbitrary (and flexible!), but this hasn't failed me yet. The Python world is filled with many wonderful and mature libraries. It just also has a lot of up-and-coming, promising young ones. Use whichever!


> What more could one ask for??

A refactoring and merging of these libraries to enable fewer people to maintain more functionality, such that the bus-factor of any given part of the functionality is higher.

That's entirely backwards. To raise the bus factor, you need more people maintaining less, not fewer people maintaining more.

In any case, the incentives are simply not there, as different projects have very different priorities (which is why there are so many projects). Some folks want to monetize their special sauce (Plotly). Some folks want to focus on high level statistical charting (Altair, Chartify). Some folks want to focus on interactive data exploration (Holoviews). Some folks want to focus on high performance and streaming (Bokeh). Some folks want to focus on high quality static image generation (MPL). The human and economic cost of getting all those groups under one tent is astronomical.

> you need more people maintaining less, not fewer people maintaining more

I see how you got confused, but I was making two separate assertions.

Assertion 1: If you merge the library projects together, and strip out the redundancies between them, then you'll have the same number of contributors, now distributed over fewer total lines of code. So each contributor can learn more of the codebase. (Bus factor goes up.)

Assertion 2: If you refactor the resulting libraries to reduce the total complexity (i.e. reduce the API surface from that of the union of all the merged-in libs), then you can begin to strip out technical debt from the project from the outside in. By eliminating now-dead code, you remove places where bugs can arise, and you lower the number of dependencies (which could otherwise have been sources of API-breaking changes when they update.) Thus, the number of people needed to maintain the project goes down. So the same total functionality can then be maintained by fewer contributors. (You don't actually remove contributors; they just become reserve capacity, with each contributor able to be less-overworked for the same result.)

> If you merge the library projects together

Alas, it's not so simple. How much actual overlap is there to merge in Bokeh and Matplotlib, for instance? MPL renders images in Python and has no JS component. Bokeh does all of its rendering in JavaScript! The Python API is mostly just a thin wrapper around BokehJS. MPL has no server component at all. Neither of them does what Datashader does for large data sets. Neither of them has a high level statistical charting API; that's only in Seaborn or Chartify. Merging all these things together would cost a fortune in time and money, and at the end of the day not actually reduce the total code size to any appreciable degree.