Hacker News
by minighost 1429 days ago
The author is absolutely right. codesearch is amazing. Blaze/bazel is amazing. When will sourcegraph support bazel repos? :-)
2 comments

Ironically, some of the steps that Bazel takes to make builds hermetic make it more difficult to integrate with and to pull out the information we need from the build for precise code navigation. But we're working on it :) If this is an area of interest to you, pop into our community Discord and say hello!
Have you checked out Kythe? It seems possible to use their existing action magic to pull out build data.
Bazel often comes up as an awesome aspect of Google's tooling but as far as I can see it's open source and hasn't really caught on elsewhere. Is there something I'm missing about why?
Bazel works well when you're in Google's situation: many teams working on separate but connected projects, in multiple languages, infinite money to fund infrastructure teams and near constant migrations, a mandate that all code (*approximately) must be developed using it, and a culture of vendoring every dependency. Meanwhile the typical OSS project is a group of folks working on a monolith in one or two languages, and the predominant development culture avoids vendoring (for better and worse.)

I'm a strong apologist for Bazel, but I can absolutely see why projects avoid it. The more you deviate from what Google does the more paper cuts you'll get, but the less work you'll have to do. Bazel is great but it's certainly not a clear win in the general case.

I also think it is partly because the tooling is not there yet, especially in the typical case where a project depends on lots of external dependencies.

I am looking forward to the new sets of Bazel rules being worked on, e.g. https://github.com/aspect-build/rules_js and https://github.com/jvolkman/rules_pycross, which will make it more idiomatic to work with existing language ecosystems.

I think you've hit the nail on the head with the comment about funding infrastructure teams. Where I work there's a mandate to use Bazel everywhere, but the project I'm working on basically doesn't use it because our use case doesn't fit nicely with the existing Bazel infrastructure (it's currently not hermetic). The infra team who say we should all be using Bazel also say that we can't use the CI infra they maintain unless we're on Bazel. So we're basically out in the cold: despite insisting we must do it their way, they don't offer to support our use case.
Bazel by itself is not nearly as useful as Blaze (the Google-internal version), in large part because of how standardized the internal ecosystem was around Blaze. A few reasons (non-exhaustive list):

- Blaze defaults to building everything remotely, on a massive farm of build servers. (This farm was sizeable enough to have its own capacity planning teams. Yes, plural.) There are a few startups trying to clone this SaaS for Bazel, but it is - predictably - not easy.

- Since all builds were remote, build _telemetry_ was also uploaded and available on the intranet. This meant that if you wanted to ask someone "hey, my build failed, can you help", you could literally send them a link "https://google-build-system/00000-11111-22222-the-rest-of-my..." with all the information about your build: the thing(s) you were trying to build, the options you were building with, all the foreign keys you needed to look up build timing data, etc. You could also query company-wide build telemetry data to see what people were building more frequently than others, if specific areas needed engineering investment, etc.

- Since builds were remote, they pretty much had to be hermetic and reproducible, which meant platform teams - e.g. C++ team, Java team, Python team, Go team, etc. - could make infra-level changes and test out the effects on various teams. This is a very Google-scale specific need: allow another team to do your infra maintenance for you, i.e. upgrade the Python runtime underneath you. (Yes, this comes with caveats. Google eng tends to live at head, so while there are "my team needs to control our rollout to the latest infra version" concerns, there is a consistent mandate for most teams that "upgrading to the latest infra that we run on" is a priority.)
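
To make the hermeticity point a bit more concrete, here is a rough Python sketch of why a remote build farm wants every action to depend only on declared inputs. This is not how Blaze or Bazel actually implement it; `action_key` and its parameters are made-up names for illustration. The idea is just that if the key is derived from nothing but the command line, the contents of the declared input files, and the declared environment, then any machine can compute the same key and reuse (or reproduce) the same result.

```python
import hashlib


def action_key(command, input_files, env):
    """Hypothetical cache key for one build action.

    command:     list of argv strings
    input_files: dict mapping declared input path -> file contents (bytes)
    env:         dict of declared environment variables
    """
    h = hashlib.sha256()
    # The command line is order-sensitive, so hash it as given.
    for arg in command:
        h.update(arg.encode() + b"\0")
    # Inputs and env are sorted so dict ordering cannot change the key.
    for path in sorted(input_files):
        h.update(path.encode() + b"\0")
        h.update(hashlib.sha256(input_files[path]).digest())
    for name in sorted(env):
        h.update(f"{name}={env[name]}".encode() + b"\0")
    return h.hexdigest()


if __name__ == "__main__":
    cmd = ["cc", "-c", "x.c"]
    k1 = action_key(cmd, {"x.c": b"int x;", "x.h": b""}, {"PATH": "/bin"})
    k2 = action_key(cmd, {"x.h": b"", "x.c": b"int x;"}, {"PATH": "/bin"})
    k3 = action_key(cmd, {"x.c": b"int y;", "x.h": b""}, {"PATH": "/bin"})
    print("same inputs, same key:", k1 == k2)
    print("changed input, new key:", k1 != k3)
```

Anything the action reads that is not in those three arguments (a system header, the clock, an undeclared env var) can change the output without changing the key, which is exactly what a non-hermetic build looks like from the cache's point of view.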

The big place where problems came up was when source code needed to exist _outside_ of google3: open-source repos, for instance, very frequently have to do a delicate dance of figuring out (1) is their primary source of truth going to be inside google3, where the tools/infra are uniform and Better TM, or on github/etc, where they have to configure their own test stack, CI stack, etc., and (2) what are the downstream implications of decision (1).

For many reasons, it's not a great general purpose build system. It solves Google's problem fine, but fails to cater to any other use case.