Hacker News
Bazel 2.0 (blog.bazel.build)
160 points by the_alchemist 2364 days ago
12 comments

One issue we hit with our CI and mix of build systems is this: given a changelist, find out which targets need to be built, which ones need to be tested on pre-submit, and which on post-submit.

Because of that, we end up spending so much extra time building everything over and over without need, and then not building things that we ought to.
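To make the "given a changelist, which targets are affected" question concrete, here is a minimal sketch. The target names, file paths, and the `TARGETS` table are all made up for illustration; a real system would read them from its build files. (Bazel's query language has an `rdeps` function for the same kind of reverse lookup.)

```python
from collections import defaultdict, deque

# Hypothetical build graph: target -> (source files, direct dependencies).
TARGETS = {
    "//lib:core": ({"lib/core.cc", "lib/core.h"}, set()),
    "//lib:net": ({"lib/net.cc"}, {"//lib:core"}),
    "//app:server": ({"app/main.cc"}, {"//lib:net"}),
    "//app:server_test": ({"app/main_test.cc"}, {"//app:server"}),
    "//tools:fmt": ({"tools/fmt.py"}, set()),
}


def affected_targets(changed_files, targets=TARGETS):
    """Return every target that owns a changed file or transitively depends on one."""
    # Invert the dependency edges so we can walk from a target to its dependents.
    dependents = defaultdict(set)
    for name, (_, deps) in targets.items():
        for dep in deps:
            dependents[dep].add(name)

    # Seed the walk with targets that directly own a changed file.
    queue = deque(
        name for name, (srcs, _) in targets.items() if srcs & set(changed_files)
    )
    seen = set(queue)
    while queue:
        for parent in dependents[queue.popleft()]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen
```

With this table, a change to `lib/net.cc` touches `//lib:net`, `//app:server`, and `//app:server_test`, while `//lib:core` and `//tools:fmt` are left alone. Splitting the result into pre-submit and post-submit sets is then a policy decision on top of the same graph.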

So that's one reason to switch, but at the same time lots of people simply do not get it. To them it seems intrusive, new, and opinionated, and they are not happy to use it. I used it for 2+ years at Google, and yes, initially my reaction was "WTF is this?" Then it hit me... And I'm sure the same is for buck, pants, please.build, gn and other similar systems.

At the end of the day, you need a way to express your build graph "end to end", from any individual source file, shell script, or configuration down to building your executables, deploying them, etc.

It's an industry tool that needs to be looked after, and if it takes 5 people to support it, then it takes 5 people to support it. But you won't be wasting other people's time on issues like "Why did this build in the CI not trigger?" or "Why does presubmit take so long and waste my time?"

Yes, it does not come for free, but it's worth knowing and trying it out at least.

If nothing else, here is the takeaway: try to use a system with a static graph, where relationships are known before you start building things. The information is not always there up front, e.g. what your #include "header.h" pulls in is only discovered dynamically, but Bazel forces you to express even that, later checks whether you've done it, and breaks the build until it's fixed.
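As a rough illustration of "declare it statically, then get caught if you didn't", the sketch below compares the quoted `#include` lines in a source file against the headers a target declared. This is only a toy check written for this thread, not how Bazel implements it; the function name and the sample source are made up.

```python
import re

# Matches lines such as: #include "path/to/header.h" (angle-bracket includes are ignored).
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)


def undeclared_includes(source_text, declared_headers):
    """Return quoted #include paths that the target did not declare up front."""
    included = set(INCLUDE_RE.findall(source_text))
    return sorted(included - set(declared_headers))


SOURCE = '''
#include <vector>
#include "header.h"
#include "util/log.h"
'''
```

Calling `undeclared_includes(SOURCE, {"header.h"})` reports `util/log.h` as missing from the declaration, which is the point where a strict build would fail instead of silently picking the file up.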

> Then it hit me... And I'm sure the same is for buck, pants, please.build, gn and other similar systems.

There’s an exercise you can do where you design a build system on the basis that it shouldn't do unnecessary work (which can be very slow and frustrating in practice).

My personal experience is that you can really quickly get to the point where just reading the entire graph into memory gets expensive. People talk about how Google is huge… but long before you get to that scale, you can end up with a build graph that just takes forever to parse and evaluate. (At Google's scale, it doesn't even fit in memory any more.)

So you decide that, as a hard design requirement, you should be able to only load the portion of the repository that you are building. And then you want to make this cacheable, so you can change the repository and know what’s changed in some quick / reasonable way.

If you go down this path, you end up rediscovering some of the big design decisions behind Bazel, Buck, Pants, Please, and GN.
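A small sketch of the "only load the part of the repository you are building" requirement. The `PACKAGES` table and the `package:target` label format are simplified stand-ins invented for this example; `read_package` represents whatever expensive parse step a real tool would perform.

```python
# Hypothetical per-package build files: package -> {target: [labels it depends on]}.
PACKAGES = {
    "app": {"server": ["net:http"]},
    "net": {"http": ["base:strings"], "grpc": ["base:strings", "proto:codec"]},
    "base": {"strings": []},
    "proto": {"codec": []},
    "docs": {"site": []},
}


def load_closure(label, read_package):
    """Load only the packages needed to build `label`.

    Returns (labels in the transitive closure, names of packages that were read).
    """
    loaded = {}     # package name -> parsed targets, filled in on demand
    needed = set()  # labels reachable from `label`
    stack = [label]
    while stack:
        current = stack.pop()
        if current in needed:
            continue
        needed.add(current)
        package, name = current.split(":")
        if package not in loaded:
            loaded[package] = read_package(package)  # the expensive step
        stack.extend(loaded[package][name])
    return needed, set(loaded)
```

Building `app:server` here reads the `app`, `net`, and `base` packages and never touches `proto` or `docs`. Caching then becomes a matter of remembering the parsed result per package and invalidating it when that package's files change.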

Good heuristic for whether it's worth considering moving to bazel for your build system:

- Do you have 200+ developers working on a monorepo?

- Are you willing to vendor all of your dependencies and maintain their builds yourself?

If so, consider it. The productivity you're losing to unnecessary rebuilding and re-running unchanged unit tests will probably be paid back if you can contort your development process to the one Bazel expects.

If you're a small shop, the benefits Bazel is going to provide over, say, Make (or whatever standard build system your primary language uses), are going to be minimal. And the overhead of maintaining Bazel is going to cost you a ton of developer time you may not be able to afford.

Another factor: are your languages supported by Bazel? If you use the same languages that Google uses (C++, Python, Go), it's fair to say that those are well supported. For all other languages, even if they are widely used outside of Google (JavaScript, Node.js), you may be out of luck.
Go support is not great either. Bazel can build Go just fine, but you will need to throw away the standard Go tooling and use Bazel instead. There are third-party helpers like Gazelle, but you know you’re in for a bumpy ride when even basic operations require a helper.
Go support is awesome, IMO. Personally I have favored Bazel over “go build” for a while, except for pure Go projects with no generated sources.

Gazelle is wonderful and it doesn’t belong in Bazel core. Bazel is a build system for every language, and Gazelle is for a subset of Go developers. Since it’s not part of Bazel core, you can always replace it with something else.

But would you recommend using Bazel and Go without Gazelle or an equivalent third party?
I recommend Gazelle for importing third-party Go dependencies but not for your own Go code. If you are using Bazel, just write the BUILD.bazel file yourself with the appropriate go_library / go_binary / go_test rules.
Gazelle is a recommended tool, developed by the same people who make the Go rules. It's not a third party tool, it's part of the normal experience.
Python is not well supported. The official rules_python has an inadequate third-party packaging solution, which has led to ~5 open source alternatives that each fix a subset of the issues in the official rules.

Given that the Python community is really oriented around the PyPI registry and the pip packaging tool, having a good Bazel-native packaging solution is near essential, but right now it’s not quite there.

Yeah, what's missing is first-party integration with the package manager (like Gazelle for Go) with proper dependency locking.
I don't know much about bazel, but JavaScript is very widely used inside Google. It's the main language I work in.
So I always wonder how Google does this. Somehow they're able to determine which individual unit tests are impacted by a change. Maybe this is only for Java, but I'm pretty sure I recall claims that they can change 1 line & know that only 1 other test case is impacted (i.e. not even the other test cases within the same unit). Now, Google's monorepo also implies that if you are changing a line that is foundationally part of a lot of other components, they get rebuilt & retested too. That's the risk tolerance Google has chosen (i.e. they'd rather tests take longer if you're changing a core component than skip tests & risk destabilizing 1000+ other engineers).
It depends on the granularity you write your BUILD files with. You can certainly write a target for each individual test, and track dependencies independently. In practice, you might glob together all files in a directory and put up with a few extra test runs in exchange for less bookkeeping.
What I was saying is that Google somehow manages to track test dependencies at the source level so globbing wouldn't matter.
Frankly you heard wrong. There isn't anything like this.
Yeah, I was misremembering the post from ~8 years ago. I was thinking of this https://testing.googleblog.com/2011/06/testing-at-speed-and-... and it sounds like they do just use Blaze to keep track of explicitly expressed dependencies.
I don't think you need 200 developers to make it worth considering.
I was subjected to Bazel on a small project because the manager insisted we use it. The rest of the company used a mix of custom tools, CMake, or premake.

It is utter hell when you have tons of third party libraries (internal or external to the company) that you don’t have the source to, and it is especially painful when trying to integrate Bazel's behavior with other build systems. Also, Bazel's packaging and use of internal symlink renaming was a constant source of suffering. Bazel pretty much breaks a number of perfectly valid everyday Linux commands for locating .so files.

Bazel might be useful in the case of a monorepo with a massive engineering pool AND a massive cloud infrastructure backing that repo to handle all the artifact sharing, but after having used cmake, premake, waf, random perl and ruby scripts, or just checking vs projects into perforce manually, I’d pick any of those before bazel for most projects. I say that having worked on code bases from a few 10s of thousands to 25+ million LoC with teams small, large, and distributed.

Bazel probably has its place but I have yet to find it.

My personal experience is that Bazel cut through a bunch of the problems that I’ve had with CMake, Waf/SCons, etc. Builds were fragile, they were not reproducible, and there were implicit dependencies. This is mostly as someone who’s rewritten a few build systems, rather than as someone who’s been subjected to build systems by others (I mostly inflict these changes on other people). With Bazel, I have much higher confidence that I’ll get consistent results when I check out the repository on different computers or work with other people.

That said, the major sore point with Bazel for me is the general lack of expertise about how to work with it sanely. Depending on what part you’re looking at, it’s somehow both “too opinionated” and “too flexible” at the same time.

I think it will capture a big chunk of the mindshare for build systems over the next few years, and you’ll see more and more of it. Over that time, people will develop the expertise and best practices for different development problems.

For managing third-party dependencies specifically, Bazel gives you a ton of options, including options that only really make sense for huge orgs like Google. Google vendors their third-party libraries directly into the monorepo. If that doesn’t make sense for your org, Bazel lets you work with external Git repos, with artifact repositories, with package repositories like NPM, or with tools like pkg-config.

The thing that makes this hell, right now, is that few people know how to use it well and the documentation is rough. I’m personally very happy with it, even for small codebases, but I’ve used it a lot.

Lack of docs plus lack of user base is also a giant failing of Bazel. It’s almost always impossible to figure out how to make something work in Bazel that I could make happen 6 ways with most other build systems. And there’s little community, so now instead of getting work done I am debugging Bazel source.

Also, building distributable packages with Bazel never seemed to work well due to the constant aliasing of .so files. Things that would work in the direct Bazel build would fail in packages and vice versa, so now we had even more pain.

Trying to suck up just header files and multiple .so files was always arcane bullshit as well.

We did work with git and other such functionality, but if you had to build a package from another build system to bring into bazel there were always annoying pain points.

Also, Bazel managed to bring implicit dependencies into our system, so avoiding them clearly isn’t something Bazel magically handles but rather a product of your expertise.

After reading "Build Systems à la Carte" I am just more convinced Bazel is not the build system that I would really ever need. I’m not sure that build system exists yet, to be honest :). But in the work I do, other systems solve my problems better.

For anyone thinking about Bazel for their project/organization... run as fast as you can in the opposite direction. It's easily the most complex and unintuitive build system in the world, and I'm saying that as someone who used SCons. At the last job where I used it, I was on a team of 5 whose responsibilities included Bazel upkeep, which required anywhere from 10 to 50% of our time. This was used by a broader engineering team of 50, working on 3-5 "big" projects and a few dozen small ones.
If you are an organization with a large enough codebase (especially if it's in a monorepo) that you need a shared remote cache of build artifacts, or remote build sharding and execution, and have multiple languages (even protocol buffers) interacting in complex dependencies, then you should run as fast as you can away from less rigorous Blaze-alikes (Pants, Buck, etc.) straight towards Bazel.

Yes, it's complicated, but it's also quite rigorous, and the rigor pays off.

(We at Square had already found a Blaze-alike necessary. We are currently busy converting our Java build from Pants to Bazel.)

I'll never understand the fascination with monorepos.
Once you reach a certain size of codebase, you're either going to be investing significantly in making many repositories work together and look a bit like a monorepo, or you're going to be investing significantly in making working on individual parts of a monorepo more efficient and look a bit like an isolated repo.

Both approaches take a huge amount of work and tooling.

The big selling point of a monorepo is that the time and effort taken to follow strict versioning and upgrade discipline for multiple interdependent projects can be somewhat avoided, at least on the code side.

If you're looking for a magic bullet argument proving that either approach is strictly better, I'm not the person to ask.
At a certain size, monorepo becomes the worst way to do it except for all the others.

Essentially: version skew across numerous artifacts in a large organization starts to look like the version skew across an industry or ecosystem. The aggregate cost of dealing with it project by project is probably higher, at least that is what most of the biggest tech companies have concluded, than dealing with it at the source level using a monorepo and single-version policy.

Well, don't have version skew then? Require that anything merged to master doesn't break any tests? Require that tests exist in the first place? Google makes it work at a dramatically larger scale. Everything at tip-of-tree is always ready to go.

EDIT: Looks like I've misread the parent's argument as one against monorepo. It was in fact an argument in favor, and one I agree with.

Yeah but Google does that by being a monorepo.
Well for one you can commit to multiple projects in a single PR. Makes coordinating changes across projects much easier.
It gives you that illusion; it doesn't solve versioning and deployment orders, and I'd argue that that's the harder part of changes across projects. Polyrepos make messy things...messy.
Deployment ordering at large scale is usually avoided by not making breaking changes. 4-phase migrations, always: roll out new API, update existing software to use new API, wait for everything to stop using old API + backfill, remove old API.
It pretty much does solve the versioning issue. “Latest, always”. The downside is the abysmal state of monorepo build tools. With multirepos, who updates the downstream repos’ dependency files (e.g., requirements.txt) when an upstream project releases a change? And is the policy “latest, always” or do you support N versions of every package? I would argue that the latter is insane at any scale, and the former leaves you dealing with dependencies manually (someone is updating the downstream repos’ dependency files when an upstream change is released) or you build automation that does it and you’re well on your way to implementing your own monorepo-like build tool.

Everything is hard, unfortunately.

Oddly, this is also one of the bad sides. Committing to two projects, by necessity, means deploying to two workflows. If not more.

Doing that in one repo makes the commit part easier, but hides the complexity of deploying separately. Or to other places.

Not that two repos makes it easy. Just gives a much earlier signal to where it happens.

Or you can have a single workflow that includes all the projects in the repo. I found it's actually easier to do things like wait for project A to deploy before project B.
There are tradeoffs both ways. With multirepos you likely have a dependency hell problem and you often have to submit and release several PRs for otherwise small updates. With monorepos, (if you want reasonable build times) you have to be able to determine what has changed and what needs to build (including tests, etc) as a result. This is technically true of multirepos as well, but the problem is pushed into git and manual process.

Having looked seriously at both options, I think the monorepo world is the right one, but it presently lacks good tooling to sanely model your dependency graph AND create custom build rules while still being affordable for small or medium-sized orgs. Git/hub simply isn’t designed for this kind of modeling and everything I’ve seen built atop it is either way too manual or a kludge. Maybe the “kludge” solutions are actually reasonable, but my confidence is low.

Bazel is the right idea, but its execution disappoints. The documentation is abysmal. Last I checked they advertised Python 3 support, but it’s been broken for years with no signs of progress. Building custom rules also looked hopelessly complex (by which I mean, “not something our organization can afford to implement and maintain”) but maybe there’s some undocumented happy path that I’m missing out on? These things seem easy enough to implement. We’re using Pants right now, and for all its many similar problems (bugs, documentation, poor code base, difficult extensibility), it at least does a passable job at building Python projects.

I’ve thought about it a fair amount, and I think it’s reasonable to build something simpler that might not meet Google’s use case, but would at least enable small and medium sized shops to play the monorepo game.

rules_python has supported py3 for a while.

The next obvious question is, what would you do to make it simpler? Tons of people have tried (you listed 5), and they all rebuilt the same thing. What features do you drop?

Last I checked (maybe 6 months ago), it definitely _didn't_ support py3, although it was advertised. I thought I was doing something wrong, but there were half a dozen issues in the tracker that indicated it was critically broken.

I understand that "it should be simpler" is a pretty lazy criticism. It's been a while since I audited Bazel and friends, and I've forgotten which issues apply to which tool. Moreover, because of the awful state of the documentation and the messiness of the code base (or perhaps this is just standard quality for Java projects?), it's really difficult to tell whether any given issue is actually a fundamental shortcoming in the application or whether it's simply a knowledge gap.

As far as what I want, keep the starlark configuration file format; implement all rules as starlark libraries (such that no one needs to write Java to extend, and if you must write Java then for goodness' sake fix the plugin interface or document it better or something such that one doesn't need to be a core contributor to implement a plugin--perhaps this is fine for an enterprise audience, but it's not fine for my use case). The rules should call into a base `mktarget()` or similar that takes args like the target's ID (the package:target_name pair), a target type that identifies the code used to build the target, and a dict of args/params that are passed into the aforementioned code. The args/params can be an arbitrarily nested JSON-like type so long as the leaves are primitives (int, string, etc), references to source files, or other targets and all leaves (and transitively, the whole structure) must be hashable such that we can identify a given execution of the build.
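A minimal sketch of what such a `mktarget()` could look like, under several assumptions that are not in the comment above: the args are encoded as canonical JSON, source files and other targets are represented as plain strings (a real tool would presumably substitute content digests and dependency keys), and SHA-256 is used as the hash.

```python
import hashlib
import json


def mktarget(target_id, target_type, args):
    """Describe a target and derive a stable key identifying this exact build step."""
    # sort_keys makes the encoding independent of dict insertion order;
    # json.dumps raises TypeError if a leaf is not a JSON-compatible value.
    canonical = json.dumps(
        {"id": target_id, "type": target_type, "args": args},
        sort_keys=True,
        separators=(",", ":"),
    )
    key = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {"id": target_id, "type": target_type, "args": args, "key": key}
```

For example, `mktarget("app:server", "py_binary", {"srcs": ["main.py"], "deps": ["lib:core"]})` gets the same key no matter what order the dict keys are written in, and a different key as soon as any leaf changes. The label and the `py_binary` type name are placeholders.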

Beyond that core operating model, the code and the user interface should be clean and well documented. Ideally, small and medium-sized projects shouldn't need to run it in daemon mode to get reasonable performance. This is important because a daemon running on local development machines introduces a larger maintenance burden (there's just more that can go wrong). Language-specific plugins (custom rules, whatever you want to call them) should adhere pretty closely to the conventions of the target language. Lastly, there should be good support for building toplevel artifacts--this means I should be able to build a whole CloudFormation package including lambdas, Docker images, etc just like I would build a JAR or a C++ binary.

I realize that those things are easy enough to say, but the devil is in the details. I've actually gone so far as to prototype the implementation, so I'm confident that those goals are achievable. Unfortunately, it's a pretty significant effort (mostly due to the breadth of project types/languages to support and the nuance/expertise required to support any of them), so I'm bound by free time. If anyone is interested in collaborating or discussing more in-depth, hit me up on Twitter @weberc2 or email me (username at gmail.com).

Cross building is even worse with many repos. I've been there, done that, and it broke so often. Now we have everything in one repo and we barely have problems. BTW, we are a small shop with fewer than 5 people, but we have a product on metal that requires multiple services (that sometimes interact with each other).

We don't use Bazel (yet), because dotnet is not that well supported.

I am sure you will once you end up working in a huge organization with intricate and heterogeneous project/team interdependencies.

You will soon experience:

- dependency hell due to transitive and conflicting dependencies

- one backward-incompatible change in some obscure library ends up breaking some other unknown service that happens to transitively depend on it

- the entire codebase will become a mess due to inconsistent code styles and formatting because hey, we are developers and we can never agree on anything. Thus each team lead will have their own opinion

- each team will have to maintain its own CI/CD jobs

- heterogeneous builds: maven, node, sbt, webpack, etc ...

the list goes on ...

All (or most of) this mess is solved by centralizing the codebase in a monorepo.

Yeah, I imagine the test is “do you need those things badly enough to dedicate a large portion of a team’s capacity”?
Nix also exists...
Nix is not (yet) suitable for fine-grained (read: file-level) build targets, though, due to the lack of recursive Nix and a content-addressed store. This means you don't get early cutoff, and you get mass rebuilds if only one file changes, for example. Both are being worked on actively though.
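For readers unfamiliar with the term, early cutoff means a downstream step is skipped when an upstream step reruns but produces byte-identical output. Below is a toy two-step pipeline to illustrate it; the `Step` class and the step functions are made up for this example and are unrelated to how Nix or Bazel are implemented.

```python
import hashlib


def digest(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class Step:
    """One build step that reruns only when the digest of its input changes."""

    def __init__(self, action):
        self.action = action
        self.last_input = None
        self.output = None
        self.runs = 0

    def build(self, input_text):
        key = digest(input_text)
        if key != self.last_input:
            self.output = self.action(input_text)
            self.last_input = key
            self.runs += 1
        return self.output


def strip_comments(src):
    return "\n".join(line for line in src.splitlines() if not line.startswith("#"))


preprocess = Step(strip_comments)
compile_step = Step(lambda text: f"compiled({digest(text)[:8]})")


def build(src):
    # The compile step is keyed on the *content* of the preprocessed output.
    return compile_step.build(preprocess.build(src))
```

Building `"# v1\nx = 1"` and then `"# v2\nx = 1"` runs `preprocess` twice, because its input changed. `compile_step` runs only once, because the preprocessed text is identical both times. Without content addressing, the second build would recompile everything downstream.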
If I understand it correctly (unlikely), Nix has the degree of purely-functional rigor necessary to do this correctly, right? Sounds like it would eventually be awesome for Bazel usecases.
Nix isn't great as a build system, because it throws a lot out and rebuilds everything when something changes. It's intended to get a correct, isolated package installed, not to maximize sharing.

Bazel goes to great pains to only rebuild the minimum necessary for correctness. It's able to do that because Bazel build files get a lot more information about the source-level dependencies than a Nix file does.

I've only experienced issues when using Bazel with third-party package management systems. If you can own all of your source, it and its descendants are easily my favorite build systems. Its features complement modern software development in a very ergonomic way: uniform build language, API for learning about your source, testing your entire code base in every language with one command, hermetic and reproducible builds, distributed builds, and caching I can actually trust.

Using Bazel with external packages on the other hand is one of the most tedious and frustrating endeavors imaginable. If you can vendor all of your source it's much less frustrating. This is extremely manageable in the C and C++ worlds where there aren't really any package managers and you end up needing to do that anyway.

I would advise checking out https://www.tweag.io/posts/2018-03-15-bazel-nix.html for external dependencies
Pretty much the same experience. 2y old monorepo company with >150 engineers now, being slowed down by hacks upon hacks in the usage of the Bazel build system, with around 5 people at the company understanding how Bazel works. And those people are discouraged quickly when trying to improve things because of the sheer brittleness of the existing build files and "opinionated" approaches in the community that don't quite cut it for our use cases. Even things like enabling remote caching and execution take many months (and even external companies) and still drag on.

Edit: I don't want to come off as too negative about Bazel. But I really think it needs more time and is nowhere near something that I would call a 1.0 let alone a 2.0

Indeed. If Bazel had a slogan, it would be "Everyone who doesn't do things my way is stupid". That's probably great within Google, but trying to integrate it into a company that has already made conflicting decisions (for good reasons) is hell.
I feel like that slogan could apply to most OSS projects Google dumps on the world.
To each their own I guess.

I’ve been using it for the last 2 years, and I am not doing any project again without it.

Bazel is not that complex if you start a project with it. Migrating to it and learning it at the same time will be hard though since you’re likely to uncover a lot of skeletons.

This comment is not internally consistent.
How so?
> Bazel is not that complex if you start a project with it.

Bazel is quite complex, and when you start a project you do not yet need it. Rarely will an organization start with something like Bazel; they use it because (you hope) they need it.

vs:

> Migrating to it and learning it at the same time will be hard though since you’re likely to uncover a lot of skeletons.

So the bulk of Bazel use cases will revolve around migrating an existing build system to use Bazel instead, and that is hard, because Bazel is difficult and has a very steep learning curve, and requires a lot of work to keep it running.

Tooling should adapt to use cases; if you need to adapt your use cases to the tooling, then that's a fault of the tool. If that limits use of the tool to those projects that are started with it, then you have already lost the vast majority of your potential audience. So yes, if you start using Bazel right from day #1 then that might be the way to go. But I doubt, and so far have not seen any evidence, that that is the way it is actually used.

It is true that you rarely start with a new build system.

Bazel is hard in the same way Rust is hard. If you port your existing project to it, chances are you will run into issues because you were doing things wrong with respect to hermeticity or reproducibility. It goes really far to make things correct. You may not need it, but when you do it’s a godsend. Or at least it was for a lot of people I talked to. And my own experience as well.

If your project is vanilla enough, things will go mostly smoothly and the benefit will be immediate (ie bazel clean is a legend).

Think of Bazel as a framework. If you do things its way, it will spoil you. But sometimes a framework is not what you need. That said, if you’re happy with your current system, then good for you!

Seconded. Any org with fewer than a few hundred engineers (and many with that many and more) would do better to stay away from this. I've had the dubious honor of using it for one project, and to me the slogan should be 'tools should work, not require attention' rather than the opposite. Bazel will require a lot of your attention, and for smaller companies that could easily be a big percentage of their available capacity.

For very large organizations with the capability of assigning one or more teams to tooling it may very well be the right choice.

Could you elaborate? I've been using it for a decade, for all my projects big and small, and it's been a _massive_ time saver compared to any of the alternatives. It's fast, well documented, and it doesn't rebuild/retest stuff when it doesn't have to. I've also done multiplatform builds with it, as well as cross-builds to ARM. Not once did I have the need for any "upkeep". At Google there was a team to do that, of course, but even outside Google, even very early on when Bazel was just released, the maintenance is pretty minimal, and the build files are by far the most readable of any build system I have used so far in 20+ years in this industry.
> run as fast as you can in the opposite direction

But which one? Are there any other (non blaze-like) build systems enabling hermetic (possibly remote) builds and caching?

If you don't need these properties and have a single-language project, the language's native build system will surely fit and may be a better choice.

There is build2. It has "high-fidelity" (instead of hermetic) builds meaning that besides sources it keeps track of changes to options, compilers, etc. This gives you similar benefits at a fraction of the cost. There is no distributed compilation or caching yet but it's coming. In other benefits, it doesn't need Java or Python (or any other "platform").
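To illustrate the idea of tracking options and compilers in addition to sources, here is a small sketch of a cache key that changes whenever any of the three changes. This is not build2's or Bazel's actual scheme; the function names, the in-memory `cache` dict, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def action_key(source_bytes, compiler_version, flags):
    """Digest of everything assumed to influence the output of one compile step."""
    h = hashlib.sha256()
    for part in (source_bytes, compiler_version.encode(), "\0".join(flags).encode()):
        # Length-prefix each part so adjacent fields cannot run together.
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()


cache = {}  # stand-in for a local or shared artifact store


def compile_cached(source_bytes, compiler_version, flags, compile_fn):
    """Reuse a stored result only when source, compiler, and flags all match."""
    key = action_key(source_bytes, compiler_version, flags)
    if key not in cache:
        cache[key] = compile_fn(source_bytes)
    return cache[key]
```

Recompiling the same source with the same compiler string and flags is a cache hit. Switching from `-O2` to `-O0`, or upgrading the compiler, produces a different key and forces a rebuild. A timestamp-based tool such as plain Make would not notice either change unless you told it to.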
Assuming you didn't actually care about hermetic builds, the challenge with build2 is that it's c++ only AFAIK. Larger orgs turn into polyglot scenarios (python/bash for scripting/gluing things together, Go for web services, C/C++ for high performance code, now Rust, etc).

You can of course try to use the native solution on each but that makes it more difficult for people to jump between projects/languages as the syntax for describing the build changes. Moreover for centralized build infra this becomes more difficult to orchestrate/co-ordinate because now you have to add remote caching/parallel compilation & whatnot to multiple places (with all the associated challenges of trying to upstream the same set of logical changes into many different projects with their own maintenance schedules/philosophies).

One of the main benefits of Bazel (and similar systems) is that you get a build cache that you can mostly trust. When you have a project that takes an hour or more to build and test, and lots of machines to run a distributed build, it really makes a difference.

If you have a small project that you can rebuild in a couple of minutes, Bazel is probably overkill.

Hey Drew, I'm a huge fan and have a lot of respect for your work.

I feel like I see the pattern of people on HN being disappointed in some of the tools that come out of Google and other large engineering orgs, when they don't work out really well in orgs that are not operating at the same scale. People have similar complaints about the complexity of other projects that come out of Google. K8s comes to mind as one such example. Often times these tools must be robust to such a large variety of uses that they are simply overkill for smaller organizations. I'll readily admit that I could be wrong and Bazel is simply poorly designed, but it is perhaps worth considering that the build system used by an engineering team of 50 need not be as complex as the build system used by one of the largest engineering orgs in the world. My guess is we'd see a lot less backlash if people tried to step off the hype train for a moment and critically evaluate whether they really need to use something like Bazel or K8s when something simpler would suffice.

Bazel advertises itself as

> Build and test software of any size, quickly and reliably

Given the comments here, a tagline focusing on its strengths for large orgs/projects probably would be better marketing.

They need to do a better job of making the assumptions behind the design of the tools clearer then. Because, from what I can tell, many people get the idea that the path to success is to do what Google does (even knowing about the meme that people just try and copy Google). This doesn't just apply to their software tools, but also to their corporate processes (OKRs, etc).
> the path to success is to do what Google does

This kind of cargo-cult process copying has infested the start-up world and is akin to sending a lot of shirts to the laundromat because that's what rich people do and you want to be rich too.

These things work for large companies because they are large companies. Their problems and associated solutions rarely if ever are a good match for the kind of issues that your average start-up contends with, especially early on in the life cycle.

You and your buddy the first-hire developers are not going to gain anything by copying the Spotify development model, and other examples in that vein.

OKRs came from Intel, FWIW. Google got them via John Doerr.
I was talking with an engineer who saw two people burn out over Bazel, though the specific gripe was with Scala support. I'd expect first-class languages at Google (C++, Java, Python, Go) to get better support.
The languages themselves have decent support. The problem is that it works great if you code the way Google does internally with all your dependencies vendored. Outside of the googleplex where we have, you know, package managers, bazel adds a ton of complexity and bugginess. The core algorithms are battle-hardened in Google, and the third-party package manager support is an afterthought tacked on to the open-sourced version.

I don't blame the bazel authors, but the development process it was designed for is not the development process of 99% of companies out there. Maintaining BUILD files for all of your vendored dependencies is expensive for your company. You need a full time team working on it.

Unless hermeticity and build correctness issues are absolutely killing your team's productivity (and at a certain size, they might be!), think twice before adding bazel to your maintenance overheads. You can always move to it later if you need it, and it might be more stable and have a better 3rd party package story by then.

You were an order of magnitude too small to need Bazel.
I think it entirely depends on how you use tools. Where I currently work people decided that it would be a great idea to extend waf (another niche build system) with all kinds of features. If you ever worked with waf you can probably imagine how bad that can turn out. As long as you stick to bazel’s default features and don’t start to extend it, I’ve found it really pleasant to use, especially for C++ and Python projects (where you want to expose C++ libraries to Python as well). If you start to extend it, it can probably become horrible really quickly. The only experience I have with that is abortive attempts at integrating SystemVerilog tools (turned out to be hard) and integrating a custom GCC toolchain (worked fairly well).
As with many projects using semantic versioning, the major version bump just signifies there are some breaking changes. Most projects will just switch from 1.x to 2 without noticing.
I can’t understand why the Bazel team hasn’t learnt from Go team how to handle breaking changes. Bazel is an amazing piece of tech, but it can definitely be a lot of work to keep up to date.
I don't know the team specifically, but I suspect the difference comes down in part to Go open sourcing early in development, thus finding a bunch of the rough edges that exist outside Google's walled garden early in the project's life when there wasn't much compatibility _to_ break. Blaze was a mature project within Google for years before Bazel was opened up. Many of the breaking changes seem to be taking one-offs built for features within Google (e.g., the handling of protocol buffer rules) and building those in terms of more general and composable features.

The net result is a Bazel (and Blaze) that are less burdened by the baggage of legacy, but the cost is a faster treadmill to keep pace with changes.

> The net result is a Bazel (and Blaze) that are less burdened by the baggage of legacy, but the cost is a faster treadmill to keep pace with changes.

(I work on Bazel.)

This is accurate. A few of the biggest breaking change themes are:

1) Converting functionality linked within the Bazel binary into the extensibility mechanism implemented in Starlark. An example includes converting the native Java, C++, Android, Python, Protobuf, Obj-C and packaging rules into rules_java, rules_cc, etc. Many languages now are already implemented exclusively in Starlark. See rules_scala, rules_rust, rules_go and rules_haskell.

2) Starlark and Build API cleanups that accumulated over organic growth and development within Google for the past decade.

3) New build system features to support seamless integration with other build systems and package managers.

What's the Go strategy? Just never have breaking changes?
Don’t break, wait. After a decade revisit.
That "changed some corner cases that likely won't affect you" and "rewrite it all" look the same in SemVer makes it next to useless, not that any other system would be better. We just shouldn't have any expectations about version numbers conveying much information.
How is SemVer next to useless? The major version bump informs you that you should go look up what breaking changes have occurred before you upgrade. It is inherently useful for under-approximating the "safe" range of versions of a piece of software that can be used, which is seen in practice in many package managers.

That it can't differentiate between those two cases is because it's not meant to. It's like complaining that the blurb of a novel is "next to useless" because it doesn't tell you the complete story in a detailed way over several hundred pages.
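
To make the "under-approximating the safe range" point concrete, here is a minimal Python sketch. It is not how any particular package manager implements ranges; `parse` and `in_safe_range` are hypothetical helpers, and the rule assumed here is simply "same major version, and not older than what is installed".

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def in_safe_range(installed: str, candidate: str) -> bool:
    """True if `candidate` shares the installed major version and is not older.

    This is deliberately conservative: some releases across a major bump
    would work fine, but the check rejects all of them until a human has
    read the changelog.
    """
    have, want = parse(installed), parse(candidate)
    return want[0] == have[0] and want >= have


candidates = ["1.2.0", "1.2.1", "1.3.0", "2.0.0"]
safe = [v for v in candidates if in_safe_range("1.2.1", v)]
print(safe)  # ['1.2.1', '1.3.0']
```

The check says nothing about how big the jump to 2.0.0 is, only that it falls outside the range that can be taken without looking.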

SemVer isn't useless because of major bumps, but because of the minor and bugfix.

Theoretically every version change can introduce a bug, which leads to an implicit API change and as such require being a major version bump.

Also, fixing a bug can also introduce an API change, because the API can behave differently with and without the bug.

SemVer just covers the intent, not what's actually happening, which makes it kinda useless in most scenarios. I guess Elm gets it right, tho'.

> SemVer just covers the intent, not what's actually happening

If I say "I'm leaving the office to get a sandwich", that statement only covers my intent. If I then sprain my ankle badly, my statement doesn't say what's actually happening.

SemVer has this flaw because it is a way for a human to say "this change does not introduce a change to the API" and that human can be wrong. That seems to me not useless, it just means it is only useful for projects that are willing to trust the maintainers of their dependencies to avoid being wrong about introducing bugs.

--------

It seems like you're arguing that a project which uses a dependency should:

1) Have humans check the dependencies anyway.

or

2) Wire up their automated test suite to something which can record calls to the API of the dependency and the results of those calls. Turn the record of those calls into a set of API contract test cases. Then, on any version bump (minor, major, or patch), run those autogenerated test cases on the new version.

... I think option 2 might be a good idea? It could be a required reviewer for any dependabot PR.
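
A rough Python sketch of what option 2 could look like, assuming the dependency's API is plain functions with comparable return values. Everything here is hypothetical: `recording`, `replay`, and the two `slugify` versions are made up for illustration and are not taken from any real tool.

```python
import functools
from typing import Any, Callable, Dict, List, Tuple

# Each record is (function name, positional args, keyword args, observed result).
Record = Tuple[str, tuple, dict, Any]


def recording(func: Callable, log: List[Record]) -> Callable:
    """Wrap a dependency function so every call and its result are logged."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        log.append((func.__name__, args, dict(kwargs), result))
        return result
    return wrapper


def replay(log: List[Record], new_api: Dict[str, Callable]) -> List[str]:
    """Re-run the recorded calls against a new version; return the mismatches."""
    failures = []
    for name, args, kwargs, expected in log:
        actual = new_api[name](*args, **kwargs)
        if actual != expected:
            failures.append(f"{name}{args}: expected {expected!r}, got {actual!r}")
    return failures


# Hypothetical dependency, old and new versions.
def slugify(text):  # pretend this is v1.0.0
    return text.lower().replace(" ", "-")


def slugify_v2(text):  # pretend this is v1.0.1, a "bug fix" that changes behavior
    return text.strip().lower().replace(" ", "-")


log: List[Record] = []
tracked = recording(slugify, log)  # the test suite calls this instead of slugify
tracked("Hello World")
tracked(" padded ")

print(replay(log, {"slugify": slugify}))     # []
print(replay(log, {"slugify": slugify_v2}))  # one mismatch, for ' padded '
```

The patch-level "fix" is caught because the recorded call with padding now returns something different, which is exactly the case where the version number alone would not have warned you.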

Yes, and this is my biggest frustration with semver. It adds something valuable by communicating breaking changes, but it loses something else valuable, signaling the magnitude of the changes.

Hopefully in the coming years something will eclipse semver which solves both problems sufficiently. I don't know of any candidates offhand though.

Don't know how you could reliably do it.

You could do something like "LoC from last version" + SemVer.

So, 1.2.3k to indicate Major 1, minor 2, 3k lines of code changed from 1.1. It would also possibly be a good way to say "2.0.3" meaning, we moved from 1.2 to 2.0, but only changed 3 lines of code. The breaking change is likely not going to affect you but it is there.

That might make magnitude changes easier to communicate.

I'm not sure how useful this would be for automated build tools though. Would you set bounds on how far drift would go before automated updates?
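
As a sketch of how an automated tool might consume that format, assuming the third field is "lines changed" with an optional `k` suffix meaning thousands. The `SizedVersion` type, the `auto_update_ok` policy, and the 500-line bound are all made up for illustration.

```python
import re
from typing import NamedTuple


class SizedVersion(NamedTuple):
    major: int
    minor: int
    lines_changed: int


_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)(k?)$")


def parse(version: str) -> SizedVersion:
    """Parse 'MAJOR.MINOR.LINES', where LINES may carry a 'k' suffix (x1000)."""
    match = _PATTERN.match(version)
    if match is None:
        raise ValueError(f"not a sized version: {version!r}")
    major, minor, lines, kilo = match.groups()
    return SizedVersion(int(major), int(minor), int(lines) * (1000 if kilo else 1))


def auto_update_ok(current: str, candidate: str, max_lines: int = 500) -> bool:
    """Policy sketch: allow automated updates only within the same major
    version and below a drift bound. The 500-line default is arbitrary."""
    cur, new = parse(current), parse(candidate)
    return new.major == cur.major and new.lines_changed <= max_lines


print(parse("1.2.3k"))                       # SizedVersion(major=1, minor=2, lines_changed=3000)
print(auto_update_ok("1.1.40", "1.2.3k"))    # False: 3000 lines exceeds the bound
print(auto_update_ok("1.1.40", "1.2.120"))   # True
```

Whether a line count is a good proxy for risk is a separate question; the sketch only shows that a drift bound is easy to express once the number is in the version string.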

Semver provides low resolution data on the nature of a change in a release in an easily comprehensible format.
It's just a few config flag flips.

This is a bizarre naming. They could call it Blaze installer 2.0 for Blaze 1.

(I work on Bazel)

This is how Bazel rolls out incompatible changes:

- Introduce a new behavior behind a flag

- Wait

- If there was no push back (and key projects could migrate), flip the flag to enable the new behavior by default.

The goal is to give an opportunity for users to update their code in advance, and get feedback about migration issues.
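
The three steps above can be modeled in a few lines of Python. This is only a toy model of the rollout idea, not Bazel's implementation: the flag name, the release numbers, and the assumption that an explicit user setting always overrides the default are all illustrative.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class IncompatibleFlag:
    name: str
    flipped_in: Optional[int] = None  # major release where the default became True

    def default(self, release: int) -> bool:
        """The new behavior is on by default only from the flip release onward."""
        return self.flipped_in is not None and release >= self.flipped_in


def resolve(flag: IncompatibleFlag, release: int, overrides: Dict[str, bool]) -> bool:
    """An explicit user setting wins; otherwise use the release default."""
    return overrides.get(flag.name, flag.default(release))


# Hypothetical flag name, for illustration only.
flag = IncompatibleFlag("incompatible_example_behavior")

# Step 1: introduced behind the flag, off by default. Early adopters opt in.
print(resolve(flag, 1, {}))                                        # False
print(resolve(flag, 1, {"incompatible_example_behavior": True}))   # True

# Steps 2-3: after waiting without pushback, the default flips in release 2.
flag.flipped_in = 2
print(resolve(flag, 2, {}))                                        # True
# Assumed here: an explicit override can still opt out after the flip.
print(resolve(flag, 2, {"incompatible_example_behavior": False}))  # False
```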

Bazel 1.0 was released on October 10. Not much more than two months ago. At best, semantic versioning expresses the intent to break existing code. It's an anti-pattern. It's not a feature. Projects using semantic versioning are explicitly caveat emptor.
I didn't know, so, just for anyone else who didn't:

> Bazel is an open-source build and test tool similar to Make, Maven, and Gradle. It uses a human-readable, high-level build language. Bazel supports projects in multiple languages and builds outputs for multiple platforms. Bazel supports large codebases across multiple repositories, and large numbers of users.

I’ll jump here to say that Bazel 1 was awesome, and I’m looking forward to trying out Bazel 2.

I was wondering how to make sure Bazel doesn’t rebuild something it has built previously? (Caching)

There are many layers of caching within Bazel (remote/local, inmemory/disk), but the central functional incremental engine is called Skyframe [1]. Almost every computation within Bazel that can be incrementally executed is managed in this engine.

[1]: https://bazel.build/designs/skyframe.html
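
For anyone wondering what "incrementally executed" means in practice, here is a toy Python evaluator. It is emphatically not Skyframe and makes no claim about Skyframe's design; the `Graph` class and the file names are invented. It only illustrates the general idea that a node is recomputed when one of its dependencies has a new value and reused otherwise.

```python
from typing import Callable, Dict, List, Tuple


class Graph:
    """Toy incremental evaluator: a node is recomputed only if a dependency's
    value changed since the node was last computed."""

    def __init__(self) -> None:
        self.inputs: Dict[str, object] = {}
        self.rules: Dict[str, Tuple[List[str], Callable]] = {}
        self.cache: Dict[str, Tuple[tuple, object]] = {}
        self.recomputed: List[str] = []  # kept only to show what actually ran

    def set_input(self, name: str, value: object) -> None:
        self.inputs[name] = value

    def add_rule(self, name: str, deps: List[str], func: Callable) -> None:
        self.rules[name] = (deps, func)

    def evaluate(self, name: str) -> object:
        if name in self.inputs:
            return self.inputs[name]
        deps, func = self.rules[name]
        dep_values = tuple(self.evaluate(dep) for dep in deps)
        cached = self.cache.get(name)
        if cached is not None and cached[0] == dep_values:
            return cached[1]  # dependencies unchanged: reuse the old result
        result = func(*dep_values)
        self.cache[name] = (dep_values, result)
        self.recomputed.append(name)
        return result


g = Graph()
g.set_input("a.c", "int a;")
g.set_input("b.c", "int b;")
g.add_rule("a.o", ["a.c"], lambda src: f"obj({src})")
g.add_rule("b.o", ["b.c"], lambda src: f"obj({src})")
g.add_rule("app", ["a.o", "b.o"], lambda x, y: f"link({x}, {y})")

g.evaluate("app")
print(g.recomputed)  # ['a.o', 'b.o', 'app']

g.recomputed.clear()
g.set_input("b.c", "int b = 1;")
g.evaluate("app")
print(g.recomputed)  # ['b.o', 'app']  (a.o is reused)
```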

Does bazel use the word “provenance” at all?

Provenance is a word I first saw advertised in a platform called dotscience.io — that I find fundamentally interesting. And it seems quite relevant to hermetic builds.

Provenance is about giving any state derived from an arbitrary computation an identity that is derived from the content hash of the inputs needed to re-compute that state ... in dotscience they achieve this by instrumenting io and creating zfs filesystem snapshots when computing new provenance artifacts.

I think this concept could be the ultimate building block for a build system — and it could become the job of oses/containers/runtimes/databases to Coordinate to allow this abstraction to be tracked with sufficient efficiency that programmers would feel allowed to freely use the concept of provenance when building ... it seems to me like provenance could provide all the information needed to support a distributed build cache? You wouldn’t actually need a build language at all — just an api in each language to ask for the saving of provenance artifacts. The artifact would hold all the info needed to be able to recompute the artifact with the same state — which is also all the info needed to decide when the artifact is out of date ...?
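
A minimal Python sketch of that idea, under one big assumption: the inputs are declared up front as a name-to-bytes mapping rather than discovered by instrumenting IO as described above. `provenance_id`, `build`, and the in-memory `cache` are hypothetical names; a real system would use a shared store instead of a dict.

```python
import hashlib
from typing import Callable, Dict, Mapping


def provenance_id(command: str, inputs: Mapping[str, bytes]) -> str:
    """Identity of a derived artifact: a hash over the command and the
    content hash of every input, visited in a stable (sorted) order."""
    digest = hashlib.sha256()
    digest.update(command.encode("utf-8"))
    for name in sorted(inputs):
        digest.update(b"\0" + name.encode("utf-8") + b"\0")
        digest.update(hashlib.sha256(inputs[name]).digest())
    return digest.hexdigest()


# In-memory stand-in for a shared (distributed) artifact cache.
cache: Dict[str, bytes] = {}


def build(command: str, inputs: Mapping[str, bytes],
          run: Callable[[Mapping[str, bytes]], bytes]) -> bytes:
    """Return the cached artifact if its provenance matches, else recompute."""
    key = provenance_id(command, inputs)
    if key not in cache:
        cache[key] = run(inputs)
    return cache[key]


calls = []


def concat(inputs: Mapping[str, bytes]) -> bytes:
    calls.append(1)  # count how often the real work runs
    return b"".join(inputs[name] for name in sorted(inputs))


build("concat", {"a.txt": b"hello ", "b.txt": b"world"}, concat)
build("concat", {"a.txt": b"hello ", "b.txt": b"world"}, concat)  # cache hit
build("concat", {"a.txt": b"hello ", "b.txt": b"there"}, concat)  # input changed
print(len(calls))  # 2
```

The same key answers both questions raised above: it says how to recompute the artifact (command plus inputs) and whether a stored copy is out of date (the key no longer matches).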

Bazel is part of the story of how Google manages provenance for build artifacts (https://cloud.google.com/security/binary-authorization-for-b...)
This is not entirely correct. It's not Bazel but "build system very similar to Bazel" (from your source) and that's I guess their internal Blaze tool.

I wonder what's the real usage of Bazel (not Blaze) in Google.

According to this comment [0] by laurentlb (one of the people working on Bazel who also commented in this post) from a year ago, Blaze is just Bazel but with integrations to Google-internal tools.

[0] https://news.ycombinator.com/item?id=18823546

Build systems seem to sit in that perennial category of things that keep getting re-invented, and either recapitulate existing problems or create new ones.

I don't think people will ever fundamentally all agree on:

    - static vs dynamic configuration
    - custom language vs piggy back on existing
    - intelligent, deeply integrated / understands code 
      it is building vs "language agnostic" but 
      necessarily shallow integration
All of these are fundamental tradeoffs that mean every tool will have limitations that about 50% of people don't like. And so we will keep re-inventing forever I think.
Didn’t this just hit 1.0?
Yes, 2 months ago.
Must have some people from the Chrome team working on it.
Not that they can’t also be contributing to Bazel, but I believe that Chrome uses GN.
I hadn't heard of this, and see there is a lot of concern over using this project except for specific use cases.

I'm always wary of build tools that try to do multiple languages. On Scala projects I use SBT, and for people who have tried to hack on SBT itself or its plugins, you know it's a big mess under there. On other projects I've tried using Gradle with Scala, but I found a lot of times Gradle just wasn't set up for a Scala workflow or was missing essential tooling to make it as effective as SBT (although its configuration is considerably more sane). Most of the tooling and plugins around Scala are built around SBT as well (for better or for worse).

I try to stick with the major tool for a given language; cargo for Rust, SBT for Scala, the built-in tooling for go, with the exception of Java projects where I'd gladly take gradle over the hellscape that is Maven.

That works for small teams/projects. If you're looking at orgs of hundreds, thousands, or even tens of thousands of people, you're spending time training everyone on every individual build system (very time-consuming and error-prone). Additionally because of the lack of consistency & unfamiliarity with having to deal with multiple interconnected projects, these tools fall apart spectacularly in that each team will end up with their own flavor of the build system. This makes transitioning between projects very hard & silos off teams.

That can be fine but can make it even more of an efficiency loss for someone switching projects/contributing partially to another project. Uniformity reduces costs on many fronts but like anything else it's a tradeoff. Now you need a team to maintain your Bazel/Buck/etc for each language & it may not jibe 100% well with languages that have opinionated package managers/build systems already (Node, Cargo, SBT, etc). On the other hand you'd probably end up having to create teams to maintain your company's Node, Cargo, SBT builds anyway except now you need to hire domain experts who not only understand each language but also how it should integrate within your larger infrastructure. A single uniform build system framework makes that easier.

I miss the days when JavaScript frameworks could be built with a simple npm install and executing a Grunt/Gulp file. Now to build Angular I need Yarn, Java, Bazel, and hundreds of megabytes of additional tooling downloaded by the build script. On a slow connection it takes ages to download everything, and even then the build often fails (on Windows I have yet to get it working successfully).

Edit: I'm referring to building the framework itself (e.g. to contribute a fix). Building an Angular project with the CLI works quite well.

I miss the days when JavaScript frameworks required only a script tag.
I miss the days when Javascript didn't exist.
Those days are BAAACK. Use Vue ! (The best JS framework in the world!)

    <script type="module">
      import Vue from 'https://unpkg.com/vue@2.6.0/dist/vue.esm.browser.min.js';
      new Vue({ ... });
    </script>