by jimjag 3555 days ago
I am increasingly bothered by the "not invented here" syndrome where instead of taking existing projects and enhancing them, in true open source fashion, people instead re-create from scratch.

It is then justified that their creation is needed because "no one else has these kinds of problems" but then they open source them as if lots of other people could benefit from it. Why open source something if it has an expected user base of 1?

Again, I am not surprised by this. The whole push of GitHub is not to create a community which works together on a single project in a collaborative, consensus-based method, but rather lots of people doing their own thing and only occasionally sharing code. It is no wonder that they follow this meme internally.

13 comments

here's an alternative interpretation: Due to their unique requirements, they were forced to investigate alternative approaches to the problem, and the approach they went with, they believe, is useful to a lot more people than just themselves. This is both plausible and matches the announcement, so why take a bad faith approach to someone releasing some code to the open source? Would you prefer if they kept it closed?

And since we're talking about github, haven't they already launched a highly successful pair of projects in atom/electron, in areas where both had competition? why start with negativity before we see what they come out with?

Huh? Atom/Electron is another great example of GitHub duplicating a ton of existing projects (whether dependencies such as CEF and node-webkit or high-level solutions such as ACE) without seemingly having any interest at all in joining those existing projects. Just because someone is successful at doing this does not make what they are doing any more reasonable: if anything it should just put them in a similar place in your mind to the Microsoft of the 90s which many people here would denigrate. GitHub's model of "open source"--the one which it is, devastatingly, teaching to an entire generation of developers--is only about code being available as opposed to being about community and collaborative design. Asking if one would prefer an alternative where the code is simply kept closed source ignores the premise of the complaint: that advertising the code as a separate project undermines the idea of working with an existing project to solve a problem that tons of its users almost certainly also have. :/
Open source does not entail any responsibility to work or not work together with any pre-existing project. What you describe is more cathedral than bazaar. There are thousands of possible reasons, from architectural choices to personalities, that someone may have not chosen to work with an existing project. I will not fault them for giving me a superior result, for free (in both senses). To compare this with Microsoft in the nineties is sheer madness.
Or maybe they wanted to spend their energy actually solving their problems rather than trying to persuade the maintainers of other projects about their approach. For example, could they have proposed some changes to IPVS to get rid of multicast for state sharing? Maybe. But then they'd spend all that time arguing with other users of that project about the relative merits of each. Instead they built a new solution and users now can have a choice, including the choice to take some of these ideas and apply them to the other projects if they are clearly superior.

I would accuse them of NIH if they simply reinvented another wheel when there was a perfectly acceptable solution already out there. But it doesn't seem like that was the case. Instead they clearly evaluated the existing solutions, found shortcomings, and decided to solve those problems for themselves, and then publish the resulting code. I see nothing wrong with that approach.

> Due to their unique requirements [...] This is both plausible

I'm skeptical that Github's load balancing requirements are that different from any of the other large file hosts and SaaS companies. But it's possible and I'm not in a position to tell. That being said, the "NIH syndrome" is a largely overlooked problem in our industry and I think it's reasonable to raise concerns over new projects that may be reinventing the wheel.

Some of the requirements they list are pretty unique to github since they serve .git to git clients as well as http to http clients (which is what most SaaS companies do). Like a very long running git clone from someone with slow internet not having its connection dropped.
As a person who has used github on totally lousy connections in remote parts of the world, it fared far better than most isomorphic web apps. People in Nepal can use github to actually get work done. Most current generation web apps won't even load.
I second this from China. Apple Appstore based OSX updates are literally impossible here. Annoyingly, they are required to upgrade Xcode.
One technique I used is to download updates directly via

    wget -c --limit-rate=200K http://support.apple.com/downloads/DL1833/en_US/osxupd10.10.5.dmg
-c resumes the download where it left off; the rate limit can be set to match the speed of your connection or intermediaries, reducing stalls, drops and angry network users. I would often keep a list of URLs in `download-queue.txt` and use the above flags with -i to load the list of URLs, letting it run overnight at some much lower speed.
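The overnight queue described above could be wrapped up roughly as follows. This is only a sketch: `fetch_queue` is a made-up helper name, the 50K default rate is an assumed value to tune for your own link, and the `WGET` override exists purely so the command can be previewed without touching the network. The `-c`, `--limit-rate` and `-i` flags are the ones mentioned above.

```shell
#!/bin/sh
# Sketch: resumable, rate-limited batch download with wget.
# fetch_queue is a hypothetical helper name, not part of wget.
# Usage: fetch_queue [url-list-file] [rate]
fetch_queue() {
    queue_file="${1:-download-queue.txt}"   # one URL per line
    rate="${2:-50K}"                        # assumed overnight cap; tune to your link
    # -c            resume partially downloaded files
    # --limit-rate  cap the bandwidth used
    # -i            read the URLs from the given file
    # Set WGET=echo to print the arguments instead of downloading.
    ${WGET:-wget} -c --limit-rate="$rate" -i "$queue_file"
}
```

Calling `fetch_queue download-queue.txt 50K` before going to bed would work through the list; if the connection drops, running the same call again picks up partially downloaded files thanks to -c.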
Have you ever contributed to HAProxy? Have you ever tried committing massive alterations to major open source projects?

It isn't as simple as here's my massive rewrite, click the accept button and everything works out for the open source community.

Let me be the first to say that the level of politics, circle jerking and knowing people is ridiculous.

Given the good reaction to an out-of-the-blue patch from me on the HAProxy mailing list, I'd imagine that contributing even major changes to HAProxy probably would go rather well. It's one of the best open source development communities I've experienced. Welcoming, but still highly focused on quality contributions. The quality and performance of HAProxy reflects this approach.
HAProxy is exceptionally good in this regard.
As someone who's contributed multiple patches to multiple open source projects, this is 100% truth.

I have >3 month old pull requests to add tests which have never been looked at - whereas someone who knows the project maintainer will get a PR looked at and merged the next day.

"Contributing to open source is hard, let's write our own and open source it instead."
Sounds like you're suggesting you can't use your own modified fork until your PR is accepted.
NIH syndrome is overrated. A small application doing specifically what you want is usually more robust, easier to maintain and easier to extend as you need than a pre-existing application that includes most of what you need as a subset of its functionality.
> in true open source fashion, people instead re-create from scratch.

True open source fashion is also the freedom to work on whatever you want.

"Just think about it, if they would have contributed to foo instead of working on bar, then foo would be twice as good!" keeps being thrown around every time someone announces something new around here.

This is just ridiculous.

Open Source is about just that - allowing anyone and everyone to have access to the source and (if free) making their own version.

There never needs to be justification to write something from scratch, even if it's been done a million times before.

You can appreciate open-source, you can wish that proprietary code was open source, but you never have the right to tell people what they should do, nor are you ever likely to be correct as to what they should do - you are not them.

It may not be based on HAProxy but they definitely used it prior to switching. If we're being generous in our interpretation that would indicate that they found that it didn't work for them. I'm not saying your overall point isn't valid but I wouldn't be so quick to judge.
by that rationale, Apache httpd would have stayed Apache httpd, and we wouldn't have gotten nginx.
Shudder.
imagine a world full of sendmail, not postfix
I'm not sure I agree. They mention HAProxy and Foo over UDP, so they are leveraging existing open source technologies. Custom additions to suit one's particular use case aren't necessarily the same as NIH syndrome.
They mention HAProxy, but it doesn't look like it's based on it.
Joe from GitHub here. We'll talk about it in later posts, but GLB is based on a number of open source projects including HAProxy, iptables, FoU and pf_ring.

Many existing open source solutions are optimized for short lived HTTP requests and don't address the long running connection issue (like a large git clone). We wanted something better for our use case.

I'm currently working with GitHub Support on dealing with zip downloads of a 5GB repo failing after 2-3 minutes, with curl error "transfer closed with outstanding read data remaining".

Sure about the long running connection issue being solved? :-)

That is good to know. Thx.
> I am increasingly bothered by the "not invented here" syndrome

I'm bothered by the increasing prevalence of "never invent here".

FOSS is great, but if it's not meeting your needs then writing your own is perfectly valid.

And often one of your needs is "have people on hand who know exactly how it works", and the easiest way to achieve that is to have those people build it from scratch. That's the only way to actually ensure it makes your needs priority #1.
> It is then justified that their creation is needed because "no one else has these kinds of problems" but then they open source them as if lots of other people could benefit from it. Why open source something if it has an expected user base of 1?

Two reasons.

1) Recruiting. Check out our awesome code! Don't you want to work on this too?

2) Our unique problem today will be the problems of everyone in three years.

This has been borne out by the Netflix open source projects. At the time it was a problem unique to Netflix -- now a bunch of people are using that software or derivatives.

For me, NIH is about the idea of already having deep knowledge of the problem domain so a solution is relatively straightforward. Sure it'll take X time, and there might be some hiccups, but it's all about effort and not having to learn anything new.

Joining a pre-existing project is more reasonable when you just can't replicate the basis of the project without learning a lot of new things.

That's the reason why there's a plethora of compile-to-JS languages, but only a few actual JavaScript virtual machines.

Basically, a load balancer must often do special stuff, and most companies end up building custom software around HAProxy, nginx or whatever to support their needs.
Why would you start a new company when you can join and enhance an existing one?