Hacker News new | ask | show | jobs
by watt 814 days ago
I advocate for checking in the auto-generated code. You can see the differences between the tool runs, can see how changes in tooling affect the generated code, can see what might have caused a regression (hey it happens).

Sometimes tooling can generate unstable files, I recall there was time when Eclipse was notorious there, for example when saving XML files they liked to reorder all the attributes. But these are bugs that need to be fixed. Tooling should generate perfectly reproducible files.

2 comments

We started off doing this, but you end up with enormous diffs which are themselves confusing. Example, only about 5% of this change is non-generated:

https://github.com/libguestfs/libguestfs/commit/5186251f8f68...

Probably depends on the project as to whether this is feasible, but for us we intentionally want to generate everything we can in order to reduce systematic errors.

in github, you can mark a file a generated [1], which hides it in the PR view by default

[1] https://docs.github.com/en/repositories/working-with-files/m...

Wouldn’t an attacker like JiaT75 do that to increase the odds of someone skimming it?
They might try - that's why it's important if you're generating + committing generated code that you also have a CI step that runs before merging anything which ensures that the generated code is up-to-date and rejects any change request where generated code is out of date.

Mostly this helps with people simply forgetting to re-run the generator in their PR but it's a useful defence against people trying to smuggle things into the generated files, too!

Yeah, I guess my general thought is that anything which encourages hiding files is actively risky unless you have some kind of robust validation process. As an example, I was wondering how many people would notice an extra property in a typically gigantic NPM lock file as long as it didn’t break any of the NPM functions.
The same feature recently added to GitLab
I disagree - you should ensure your dependencies are clearly listed. Docker excels at this - it's a host platform independent way of giving you a text based representation of an environment.
Docker is a Linux thing, and very much not host-platform independent. It's just "chroot on steroids", and you're essentially just shipping a bunch of Linux binaries in a .tar.gz.

It works on other systems because they emulate or virtualize enough of a Linux system to make it work. That's all fine, but comes with serious trade-offs in terms of performance, system integration, and things like that. A fair trade-off, but absolutely not host-platform independent.

Sort of. I have about 15 containers running on my dev laptop as I type. Which versions of xz are on each of them, and how do I make sure of that?