by lt 5424 days ago
From the full article:

The confidence of the SuperCollider programming language has been set to 80%. The reason for this is quite funny. In order to prove that the TIOBE index can be manipulated easily, Adam Kennedy created an empty Perl library called Acme-SuperCollider-Programming. This was to boost the unknown programming language SuperCollider by adding it to Perl's popular library archive CPAN. Now 20% of all +"SuperCollider programming" come from this artificial library.

Yes, quite funny indeed, but not for the reasons they think. It just goes to show that the TIOBE methodology is a joke. I don't understand how people take this seriously.
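For anyone wondering where the 80% in the quote comes from, here is a minimal sketch of the arithmetic, assuming "confidence" simply means the share of search hits that are not known to be spurious. The function names and hit counts are made up for illustration; the quoted article does not spell out the exact formula.

```python
def confidence(total_hits, spurious_hits):
    """Fraction of hits that are NOT attributed to a known spurious source.

    Assumption for illustration: confidence = genuine hits / total hits.
    """
    if total_hits <= 0:
        raise ValueError("total_hits must be positive")
    if not 0 <= spurious_hits <= total_hits:
        raise ValueError("spurious_hits must be between 0 and total_hits")
    return (total_hits - spurious_hits) / total_hits


def adjusted_hits(total_hits, spurious_hits):
    """Hit count after discounting by the confidence factor."""
    return total_hits * confidence(total_hits, spurious_hits)


# Hypothetical numbers: 20% of the hits come from the artificial library.
print(confidence(1000, 200))
print(adjusted_hits(1000, 200))
```

The catch is that this only corrects for the one manipulation somebody happened to notice.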

Who has a better methodology? Is there even a good one?
If that was a serious question, places that do developer tools and programming languages for a living (at least MSFT, I assume others as well) pay ridiculous amounts of money to independent third-party companies that specialize in gathering this kind of data. The raw data was then kept fairly private (marketing + upper mgmt only), but the rank and file would see some of it occasionally when things such as trends on the number of VBA or VB5 or VC++ programmers appeared in slide decks talking about the direction for upcoming versions of the product.

Having worked with the raw data, I can say it was pretty fantastic. It was segmented by industry and business size, and it handled issues such as multiple programming languages or companies where one section used one language and another used other ones. We even knew which tools and add-ons were used for which languages and which compiler on each platform (e.g. how many commercial shops using C++ targeting Linux are using gcc vs. icc?).

But that data was also stunningly expensive. My marketing friends tell me that accurate market data always is.

Interesting. It's a tautology but the Internet sees only the Internet - there's a huge swathe of programming work that just isn't advertised online, so is invisible to TIOBE.
http://www.langpop.com is better because

1) I use more metrics and

2) I don't go around making statements about how, from one month to the next, someone displaced someone else, or climbed into a certain ranking, or things like that.

3) I let people reweight the chart based on the metrics they like.

I think the numbers LangPop comes up with are pretty good, but by their very nature are a bit fuzzy.
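To make the reweighting idea concrete, here is a rough sketch of one way it could work: normalize each metric to shares, then combine the shares with user-chosen weights. This is not LangPop's actual code; the metric names, counts, and function names are all invented for illustration.

```python
def normalize(counts):
    """Scale the raw counts for one metric so that they sum to 1.0."""
    total = sum(counts.values())
    if total == 0:
        return {lang: 0.0 for lang in counts}
    return {lang: n / total for lang, n in counts.items()}


def combined_ranking(metrics, weights):
    """Return (language, score) pairs sorted by score, highest first.

    metrics: {metric_name: {language: raw_count}}
    weights: {metric_name: weight}; metrics without a weight count as 0.
    """
    weight_total = sum(weights.get(name, 0.0) for name in metrics)
    if weight_total == 0:
        raise ValueError("at least one metric needs a nonzero weight")
    scores = {}
    for name, counts in metrics.items():
        w = weights.get(name, 0.0) / weight_total
        for lang, share in normalize(counts).items():
            scores[lang] = scores.get(lang, 0.0) + w * share
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up numbers, purely for illustration.
metrics = {
    "search_hits": {"C": 500, "Ruby": 300, "OCaml": 200},
    "hosted_projects": {"C": 100, "Ruby": 800, "OCaml": 100},
}

# Someone who only trusts search hits and someone who only trusts
# project hosting end up with different leaders.
print(combined_ranking(metrics, {"search_hits": 1.0})[0][0])
print(combined_ranking(metrics, {"hosted_projects": 1.0})[0][0])
```

With these toy numbers the top language flips depending on the weights, which is why the fuzziness is inherent rather than a bug.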

F# isn't listed on any of your charts. I think it's probably popular enough to show up in all of your data sources at this point.

Also, maybe add GitHub and StackOverflow as data sources?

F# should probably go there, yeah.

GitHub and StackOverflow started out really biased in terms of their communities - GitHub with Ruby, and StackOverflow with Microsoft languages. Do you think they've sufficiently lost that bias?

On another note, you know what else could be a great source(s) for data? Google Scholar, CiteSeerX and arXiv. It'd be really interesting to compare the language usage between "industry" and academia.
I think StackOverflow has; GitHub still seems a bit biased towards scripting languages like Ruby or JavaScript though. In any case, since you show the graphs from the various sources, I don't think it matters if one source is more biased than another. If, for example, you used CodePlex as a data source, you'd see a huge bias towards C# -- but visitors to your site could simply draw their own conclusions from the charts.
Where can I see the data for GitHub?

I'm not sure I even trust that... I do lots of bindings from OCaml to C, and whereas I consider them to be OCaml projects, GitHub sees they're more C by LOC and counts them as C.
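The misclassification described above is easy to reproduce with a toy model. The sketch below assumes a host labels each repository with whichever language has the most lines of code; that rule, the function names, and the line counts are assumptions for illustration, not GitHub's documented behavior.

```python
def dominant_language(loc_by_language):
    """Return the language with the most lines of code in a repository."""
    if not loc_by_language:
        raise ValueError("repository has no counted source lines")
    return max(loc_by_language, key=loc_by_language.get)


def language_shares(loc_by_language):
    """Return each language's fraction of the repository's total lines."""
    total = sum(loc_by_language.values())
    return {lang: loc / total for lang, loc in loc_by_language.items()}


# Hypothetical OCaml binding to a C library: the stubs and vendored
# headers outweigh the OCaml code the author actually cares about.
bindings_repo = {"C": 4200, "OCaml": 1800}

print(dominant_language(bindings_repo))
print(language_shares(bindings_repo))
```

Counting by shares instead of a single label would credit OCaml with part of the project, but a one-label-per-repository tally files the whole thing under C.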

If nearly all Ruby programmers put their code on GitHub, and you don't count it but you do count Google Code, then isn't that a bias in the sample? Wouldn't it make more sense to include all the major code hosting sites, including GitHub?
Google's Code Search just searches for code on the internet.
This is, by the way, an honest question. I think at a certain point they'll be mainstream enough to say yes, but it's tough to say when.