by hosay123 4893 days ago
It would take an immensely ballsy release engineer to approve switching from GCC to clang even for a modestly sized, mature codebase. I can think of at least two Fortune 500s where this will almost certainly never happen: the potential cost incurred by the change would far outweigh just hiring a team to keep the old compiler alive, at least in the medium term.

Google and Apple already have.

Switching compilers isn't that difficult, especially when they're as reliable as GCC and Clang.

FreeBSD has switched over to Clang, and that OS is used as the basis for many types of embedded devices such as network routers.
FreeBSD has had a long outspoken goal of not relying on GPL in the base system, an ideological choice. The transition to clang/llvm has also been very long and will (if everything goes as planned) be finalized in FreeBSD 10.

https://wiki.freebsd.org/GPLinBase

Apple (as they would not accept GPLv3) was stuck with shipping the very old GCC 4.2.1. Apple also wanted to be able to incorporate the compiler toolchain into their proprietary solutions like Xcode, something the GPL obviously won't allow. Hence funding LLVM while starting work on Clang made sense for them, but again not a choice made (purely) on technological merits.

> Hence funding LLVM while starting work on Clang made sense for them, but again not a choice made (purely) on technological merits.

I think it's inaccurate to portray the move away from GCC as simply political with happy side effects, though avoiding the GPL3 was certainly a prime motivation. Greg Parker has stated on Apple developer mailing lists that the adoption of Clang/LLVM has allowed them to rapidly improve the Objective-C language, and that there are more Apple employees working on the language and compiler today than 10 years ago.

As far as I understand, FreeBSD is there just because some license fanatics want to maintain a commercializable alternative to Linux. Their switching to Clang doesn't mean much, because it is most probably a dogmatic, ideological move.
I at one point suggested it was the license as well, but cpercival at the time corrected me:

http://news.ycombinator.com/item?id=1351634

Please stop with the accusations of dogmatism. Regardless of whether you agree with their goals or not, the reasons behind avoiding GPLv3 licensed components are very pragmatic.
The common myths and fear about GPLv3 are far from pragmatic.

I can understand those who only want permissively licensed software for components. This makes sense, as otherwise the copyright license will impose some requirements when distributing copies of the new work. If it's proprietary components, you've got to buy a license. If it's copyleft components, you've got to use a particular license.

But if we are talking about GPLv2 vs GPLv3, only a very small subset of people will ever have to care about the differences between those two licenses. It actually only comes down to two questions.

1: Are you planning to sell a product, but want to keep root access to the device after it has been sold, while at the same time directly preventing the new owner who bought your device from having the same access?

2: Are you in a patent license agreement with another company over patents you pay for and which cover aspects of the new product?

If your answer is No and No to those two questions, you can regard GPLv2 and GPLv3 as identical licenses. It's that simple.

There's also an issue of code contribution that some people will not experience or know about. I'm working at a large company and I can contribute to any FOSS project... unless it's GPLv3 or requires a copyright assignment. In that case I have to go via a review process taking weeks to complete and involving 2 levels of management. Or I can simply fork locally, patch and never contribute back (this happens almost every time).

I doubt this is rare inside of large corporations.

The patent agreement part of GPLv3 (which is the difference between GPLv2 and GPLv3 in regard to patents) only covers distribution rights. If you do not distribute the whole work, you can make as many code contributions as you want and have as many patent license agreements as you like at the same time. They do not interact.

For code contributions, GPLv2, GPLv3 and the Apache license have the exact same deal in regard to patent grants. Actually, the text used in GPLv3 that covers patent grants of contributors is copied word for word out of the Apache license. GPLv3 only added additional patent requirements for distribution of the complete work, my guess being for the exact reason of not discouraging companies from contributing code.

If the company reviews GPLv3 contributions but does not review Apache-licensed contributions, they have bitten the FUD apple.

Pretty much the same in any large corporation I worked with.

Not only for contributing, but for using as well, and with every single version change the same process must be followed.

#2 in particular seems impossible to ensure. There's no way to say "No" with any confidence. If you make a product, and someone comes to you with a patent claim asking for money, you put yourself in the position of completely rewriting your project or taking the claim all the way to court on merit. Violating a patent does not require intent, so unless you have intimate knowledge of every possible patent claim that could be made, and you are willing to fight all of them all of the way, you might as well start right off the bat without GPLv3.
There are plenty of ways to say No. If my company is distributing someone else's program which is under the GPL and then gets sued for patent infringement, we would not be willing to pay money per unit so we can continue to distribute that software. We would either rewrite the parts covered by the patent or use different software. But paying for the privilege to distribute someone else's project?

If you're in a situation where you want to use someone else's project, and would be willing to pay per unit to distribute it to your users, what would you pick: a patent deal to get the privilege to distribute a GPLv2 program, or a supported and legally protected proprietary program? I would pick option 3, i.e., write my own program or fight the patent, depending on whichever is affordable. In this regard, what license the original program was under matters very little, if at all.

An ironic statement considering the backstory behind GCC's obfuscation and complexity.
It really depends on the code.

Mozilla switched its Mac builds from gcc 4.2 (or rather the Apple fork of it) to clang, because the cost of doing the switch was lower than the cost of avoiding all the bugs in said fork of gcc 4.2 that didn't exist in the other three compilers the codebase also had to compile with (gcc 4.4 on Linux and MSVC).

FreeBSD is making the change, most of the code for the company I work for is only compiled with clang ... we do government work.
You mean like FreeBSD?
Mozilla switched from gcc to clang on Mac OS X (starting with Firefox 17 last year):

https://groups.google.com/forum/#!topic/mozilla.dev.platform...

I'm confused, but didn't Apple just do this for two operating systems?

And then move their entire OS X and iOS development community as well?

Apple manufacture consumer entertainment devices that are neither safety critical in nature nor difficult to field test. Think more along the lines of Boeing, Northrop Grumman or Raytheon.
There is a whole industry of people providing operating systems and compilers for high assurance embedded development, led by companies like Green Hills Software. Nobody would ship a safety critical embedded project using either GCC or Clang.

At the highest levels of safety, the DO-178B standard for aviation systems requires compilers to generate code in a way that has a fixed structural relationship between source and output patterns, where each pattern is independently verified, so that every output instruction is directly traceable to a source instruction. All other code generated by the compiler has to be manually verified. This eliminates the use of most of the optimization techniques found in modern compilers. Also, the emphasis on Worst-Case Execution Time (WCET) verification means that performance improvements only matter if your WCET estimates can incorporate them.

Back in 1996-1997 I was a federal contractor working as a software consultant and analyst using several different systems for logistics and provisioning for the US Army to downsize a base to a different state.

We used different operating systems and different databases and programming languages to create a redundant system that solved the same problem or generated the same report. We would have at least three programs running with three different technologies and if the reports were all the same, the system was in good shape, but if one of the reports differed somehow we knew we had to check and fix something.
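The cross-checking idea above can be sketched roughly as follows. This is only an illustration in Python, not the original tooling (which used Clipper, PL/SQL and others), and the comment does not describe how the reports were actually compared. The function name `compare_reports` and the report strings are hypothetical.

```python
from collections import Counter


def compare_reports(reports):
    """Compare reports produced by independent implementations.

    `reports` maps an implementation name to its report text.
    Returns (all_agree, suspects), where `suspects` lists the
    implementations that disagree with the majority. If there is no
    strict majority, every implementation is listed as a suspect.
    """
    if not reports:
        raise ValueError("need at least one report")

    counts = Counter(reports.values())
    majority_text, majority_count = counts.most_common(1)[0]

    if majority_count == len(reports):
        return True, []
    if majority_count * 2 <= len(reports):
        # No strict majority: we cannot tell which system is wrong.
        return False, sorted(reports)

    suspects = sorted(
        name for name, text in reports.items() if text != majority_text
    )
    return False, suspects


if __name__ == "__main__":
    # Hypothetical report contents from three independent systems.
    reports = {
        "oracle": "items=120 total=4500",
        "clipper": "items=120 total=4500",
        "sqlserver": "items=119 total=4480",
    }
    print(compare_reports(reports))
```

With three implementations, a single disagreeing report points at the system to check first; a three-way split only tells you that something is wrong somewhere.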

We did use SunOS/Solaris, HP-UX, Linux (Slackware IIRC), DOS/Windows 3.0, Windows 95, Windows NT, etc. In some cases there were C programs using the native C compiler with whatever Unix system that was in use, and in the case of Windows or DOS we used whatever language or software was available. I worked in Clipper, DBase, Oracle PL/SQL, SQL Server, MS-Access, and sometimes even flat text files downloaded from mainframes or FTP sites on the Military network that needed converting to different databases. As a federal contractor I was limited in what I could do, for example my PC did not have a CD drive and they would not allow ODBC drivers to be installed. I did not have administrative access but instead access to a maintenance account that others shared to work with databases that was limited in many ways.

My point is you work with whatever they give you to work with, you may be limited in what you do, but you use what you have available. So yes you might be limited to GCC on a Linux system and no root access to install CLANG or LLVM.

> PC did not have a CD drive

Was this before or after Bradley Manning? (I think he used a CD-RW drive on a "secure" computer to extract the data he sent to Wikileaks.)

> account that others shared

Shared account? Really? Sounds like top-notch security in action. [/sarcasm]

> no root access to install CLANG or LLVM

You don't need root access, you can just do:

./configure --prefix=$HOME/clang

or whatever the equivalent is if clang uses a different build system.

If /home is mounted noexec, that's a real problem, but if you're doing software development, noexec would mean you couldn't run the programs you're writing either.

Of course, just because there's no technical reason you can't do something, doesn't mean there isn't a nontechnical / political reason you shouldn't. If you'd made a project change as large as using a different compiler without approval from (or at least notification to) higher up, you might have gotten in trouble.

It was in 1996-1997 so it was before. They would install the OS and then remove the CD-ROM drive for security reasons. One of which was to make sure no unauthorized software was installed, and yes they had CD Burners back then and saw that as a security risk for copying information.

You will find that not all federal systems are secure and run by experts. Some federal employees are not qualified for their jobs, and that is why federal contractors are hired to make up for it. For example some of our servers and systems were not behind a firewall and had public Internet IP addresses. I agree a shared account is a security risk, and when someone changed the password to the maintenance account we were locked out and had to file a form to learn the new password. In fact to do anything like install software or configure a system we had to file a form first.

Yes, the federal government and chain of command require that I file a request before I use a new software product. Even if it can be downloaded and run in the home directory, if I don't get permission for it, I am in deep trouble. Even for something as small as refreshing an IP address, if I do it myself I am in trouble; I have to call their help desk and have them refresh and renew it for me.

You don't need root to install Clang/LLVM.
Actually VxWorks uses GCC (and make for the build system). At Level A, I think you are required to have tests with 100% coverage at the machine-instruction level, but I don't remember a requirement for that kind of fixed relationship between source code and compiler output. I have developed only for Level B and have never been involved in toolchain verification, so maybe there is something I am missing.
> Nobody would ship a safety critical embedded project using either GCC or Clang.

They would and they do.

I'm not sure what the second paragraph is trying to say. GCC (and Clang) optimizations can be turned off selectively or altogether.

DO-178B doesn't prohibit optimizations in general.

I was writing some networking code for an avionics project in C, and stumbled upon a bug in GCC that was causing structures to be packed incorrectly in memory. The bug had long been fixed in newer versions of GCC, but it's so time-consuming and money-consuming to approve new compilers for avionics use that we just had to work around the bug.
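The comment does not say what the workaround was. One common way to avoid depending on how a compiler lays out a struct in memory is to serialize each field explicitly at fixed offsets. Here is a minimal sketch of that idea, in Python for brevity rather than the C the project used. The header fields (`kind`, `length`, `flags`) and the function names are invented for illustration.

```python
import struct

# Hypothetical wire format: 1-byte kind, 4-byte length, 2-byte flags.
# "!" selects network (big-endian) byte order with no padding between
# fields, so the layout does not depend on any compiler's alignment rules.
HEADER_FORMAT = "!BIH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 1 + 4 + 2 = 7 bytes


def encode_header(kind, length, flags):
    """Serialize the header fields into exactly HEADER_SIZE bytes."""
    return struct.pack(HEADER_FORMAT, kind, length, flags)


def decode_header(data):
    """Parse the header fields back out of the first HEADER_SIZE bytes."""
    if len(data) < HEADER_SIZE:
        raise ValueError("buffer too short for header")
    return struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])


if __name__ == "__main__":
    raw = encode_header(1, 0x01020304, 0x0506)
    print(raw.hex(), decode_header(raw))
```

The same approach in C means writing each field into a byte buffer by hand instead of casting a struct pointer, which costs some code but takes the compiler's packing behavior out of the picture.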

If upgrading from one version of GCC to another is a big deal, I can only imagine how excited avionics stakeholders wouldn't be to switch compilers altogether.

It may happen eventually, but then again... it may not. Just so long as they are happy to use some GCC build from 2002 or whatever, and just so long as there's no pressing need to use something else, why should they take on the effort and expense of switching?

I'd be surprised and disappointed if those companies were using GCC to compile avionics code written in C++.
Really, why? What should they have been using - at least just a few years ago before LLVM became mainstream?
Probably icc (Intel's C++ compiler), MSVC++, or one of IBM's (e.g., xlC).
ICC is a proprietary compiler which not only focuses its optimizations on Intel's own CPU range but also has a history of deliberately generating subpar codepaths for non-Intel CPUs.

MSVC is a Windows-only proprietary compiler and certainly hasn't had stellar C++ standard support, although they are trying to rectify that with their recent 'Go native' push.

I have had no experience with IBM's XL C/C++ compilers though, maybe someone else has?

Are they really that much better? I remember hitting a bunch of incorrectly generated code for MSVC++ at least back in the 6.0 days. Like, the disassembly revealed dangerously incorrect code.
cwzwarich's comment above covers it pretty well.
What else do you propose they should be using?
Does your back hurt from moving those goalposts?
Are you objecting to the removal of "overpriced" (irrelevant opinion) or the addition of specific examples?
You made a claim ("It would take an immensely ballsy release engineer to approve switching from GCC to clang even for a modestly sized, mature codebase"); people pointed out counter-examples (Apple and Google); you move the goalposts. Would it have killed you to admit that you were wrong in the post as written?
There is little motivation in commenting unless in the process I can enhance or validate my understanding. Often this naturally means arguing details until out of breath, or my diatribe is so sufficiently beaten back that I'm too exasperated to continue – at which point the topic will sink in, until reiterated at some later date in some later argument – with a level of belief that cannot be attained unless all alternatives have first been eliminated. I'd much rather die fierce yet wrong than live a life shrouded in vagueness.