
The problem isn't the discoverability of these libraries; it's that they are generally not that useful to anyone except their authors. One common type of library handles data transformation, normalization, and maybe even workflows. These abound. But they are rarely useful in other people's hands, because extending them enough to actually get work done takes as much time to learn as writing the thing from scratch. And the advantage of writing it from scratch is that you know the code intimately, including all of its assumptions and flaws, which you don't know about somebody else's code, even if it's extremely well documented.

Take something like Taverna [1], which is probably very useful to some people and had been enthusiastically recommended to me by many. After spending three hours reading documentation and searching the web, I still could not get it to do what I needed, so I wrote a simple one-off bash script that interfaced with our cluster system. Alternatively, I could try to hack loops into it, but that would take me 10x as long, would require me to interact with many other people who obviously don't understand my problem (since they did not consider it a fundamental need), and the change might not even be accepted back into the mainline, at which point I'm off on my own fork and lose the benefit of using a common code base. Waiting 1-10 hours to hear back from the dev mailing list is unacceptable when you're trying to get work done.

Is it more important to get the result, or to use other people's code? Reinventing the wheel is a minor sin compared to not getting results.

[1] http://www.taverna.org.uk