by ninjanomnom 1140 days ago
Code that only triggers on a yearly holiday, disaster alert, leap year, or the like can go unused for long stretches and would likely be very problematic if removed. Unless by dead code you mean unreachable code, in which case it shouldn't exist in the first place and I agree it should be removed.

That isn't dead code.

Dead code is shit that has a build rule and no other build rules in the entire repo depend on it. Or private functions that have no callers in an application that isn't built with any kind of reflection capabilities.

Or code that is behind `if (false)`, or that comes after a `System.exit` or a null pointer dereference. Finding those kinds of things is one of the best parts of modern compilers, and it's nice that they've extended that to the whole ecosystem.
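
To make the first definition concrete, here is a minimal sketch, assuming the build graph is available as a plain mapping from each target to its direct deps. The function name, the `roots` parameter, and the target labels are made up for illustration; this is not how blaze/bazel or any real tool exposes it.

```python
from typing import Dict, Set


def unreferenced_targets(deps: Dict[str, Set[str]], roots: Set[str]) -> Set[str]:
    """Return targets that nothing depends on and that are not known entry points.

    deps:  hypothetical build graph, target label -> labels it directly depends on.
    roots: targets we expect to have no dependents (shipped binaries, tests).
    """
    depended_on: Set[str] = set()
    for target_deps in deps.values():
        depended_on |= target_deps
    return {t for t in deps if t not in depended_on and t not in roots}


# Illustrative graph: ":app" is a shipped binary, the rest are candidates.
graph = {
    ":app": {":lib"},
    ":lib": set(),
    ":old_tool": {":lib"},
    ":orphan": set(),
}
candidates = unreferenced_targets(graph, roots={":app"})
```

This only flags candidates. Anything reached through reflection or listed outside the build graph would need a separate check.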
Dead code or abandoned project with goals that got forgotten but might still be important?
Yes, the nice thing about blaze/bazel + sensenmann is that you can very accurately say "this code was not built into a binary that has run in the past 6 months".

Sometimes you still want it (e.g. python scripts that are used every once in a while for ad-hoc things and might go months between uses), but usually the right thing to do is productionize stuff like that slightly more (and also test it semi-regularly to make sure it hasn't broken).
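
A rough sketch of the "not built into a binary that has run in the past 6 months" check, assuming you have a dependency mapping and a last-run timestamp per binary. Both data structures and every name here are hypothetical, and this is not a description of how sensenmann is actually implemented.

```python
from datetime import datetime, timedelta
from typing import Dict, Set


def stale_targets(
    deps: Dict[str, Set[str]],
    last_run: Dict[str, datetime],
    now: datetime,
    window: timedelta = timedelta(days=180),
) -> Set[str]:
    """Return targets not reachable from any binary that ran within `window`."""
    live: Set[str] = set()
    # Start from binaries that ran recently and walk their transitive deps.
    stack = [b for b, ran_at in last_run.items() if now - ran_at <= window]
    while stack:
        target = stack.pop()
        if target in live:
            continue
        live.add(target)
        stack.extend(deps.get(target, ()))
    all_targets = set(deps) | {d for ds in deps.values() for d in ds}
    return all_targets - live


now = datetime(2024, 1, 1)
graph = {":server": {":core"}, ":old_batch": {":legacy", ":core"}}
runs = {":server": now - timedelta(days=1), ":old_batch": now - timedelta(days=400)}
stale = stale_targets(graph, runs, now)
```

Here `:core` stays live because `:server` still uses it, while `:old_batch` and `:legacy` are flagged. The occasionally used ad-hoc script is exactly the case where such a flag would need a human to look at it.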

You can probably get most of that by just looking at the atime attribute on the file system.
Nah, there's stuff that scans the entire repo regularly for all kinds of interesting purposes, and that's ignoring the fact that `atime` isn't available or a source of truth in piper.

Like conceptually I believe atime could be wrong in both directions. Since there's heavy caching of build artifacts, you can totally build something that transitively depends on a file without actually reading that file (and potentially do this for a relatively long period of time, though I don't think that will happen in practice). In the other direction, stuff will regularly look through large swaths of files that aren't necessarily run.

Like conceptually why wouldn't you turn on atime and like why would you cache an entire file instead of just reading it?
> conceptually why wouldn't you turn on atime

Because piper[1] isn't a normal filesystem. It's often accessible through a FUSE-based API, so it appears like a filesystem to some users sometimes, but it can also be accessed over an RPC API. So the concept of "atime" isn't really a fit: you access the filesystem through a view that is based on your workspace (think of something akin to the git commit hash being in the filepath), and the "real" underlying file isn't necessarily on your machine.

So under a reasonable definition of atime, the atime of most files is never, because on any given arbitrary commit, you don't access/build everything.

> you cache an entire file instead of just reading it

You don't cache the file, you cache the file's outputs, keyed by a hash of the file (or all the files which are deps of a particular output). With bazel, you have a shared build artifact cache that can securely and reliably be shared across every user at a company with tens of thousands of engineers. Say I build some target `:foo`, which corresponds to `foo.o`, generated from `foo.c`, but someone built `:foo` five minutes ago. As long as `:foo` hasn't changed (and you can check that the file hasn't changed without reading it, because the fancy filesystem stores a hash of the file alongside the actual file), I won't actually read `foo.c` or go through the motions of building `:foo`. I can just pull `foo.o` directly from the object cache and use that in any dependencies, which means that I can build my output without ever invoking a compiler, which is really cheap and fast.
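
A toy sketch of that idea, assuming the filesystem hands you a stored digest per input file so the cache key can be computed without opening the source. `ArtifactCache`, `action_key`, and the digest strings are invented for illustration; bazel's real cache protocol is more involved than this.

```python
import hashlib
from typing import Callable, Dict


def action_key(input_digests: Dict[str, str]) -> str:
    """Derive a cache key from the stored digests of all inputs (no file reads)."""
    h = hashlib.sha256()
    for path in sorted(input_digests):
        h.update(f"{path}={input_digests[path]}\n".encode())
    return h.hexdigest()


class ArtifactCache:
    """In-memory stand-in for a shared build artifact cache."""

    def __init__(self) -> None:
        self._outputs: Dict[str, bytes] = {}
        self.compiler_runs = 0

    def build(self, input_digests: Dict[str, str], compile_fn: Callable[[], bytes]) -> bytes:
        key = action_key(input_digests)
        if key not in self._outputs:
            # Cache miss: only now do we read the sources and run the compiler.
            self.compiler_runs += 1
            self._outputs[key] = compile_fn()
        return self._outputs[key]


cache = ArtifactCache()
digests = {"foo.c": "digest-v1"}  # placeholder digest, as stored by the filesystem
first = cache.build(digests, lambda: b"foo.o contents")
second = cache.build(digests, lambda: b"never compiled")  # served from cache
```

The second call never invokes its compile function, so `foo.c` is never touched, which is why an atime on the source would say nothing about whether its output is in use.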

You could argue that upon reading the hash you should update the atime, but that's extremely expensive since now you have to do writes instead of idempotent reads a bunch of times a second.

[1]: https://cacm.acm.org/magazines/2016/7/204032-why-google-stor...

Nah, like conceptually you could check the atime of the cached outputs too