Hacker News
by jiggawatts 1189 days ago
The mistake is that arbitrary transformations != arbitrary code.

I want the build process to be able to generate arbitrary code based on the inputs given to it from the source control — but nothing else. No reaching out to HTTP command and control endpoints, making database calls, or deleting my home directory.

It’s not just because of security. Security is a side-benefit here.

The real benefit is that unrestricted build processes cannot be versioned with source control. If the build process can “reach out” and pull in data from external sources, then it will always use the “latest” version, not the version in that branch or commit.

It’s about being hygienic.


Then avoid crates that do such things. Other people, however, are able to use compile-time code execution to do some pretty awesome things. For example, the database library sqlx can check at compile time that all the SQL in your code is syntactically correct and correctly typed against a test database. That feature is useful and convenient for users of the library.
"Allowing connections to http://ga1sdf4saf.ru is fine, because it's so convenient not having to put things in source control."

The database example is (largely) a solved problem. Microsoft SQL Server, for example, lets you check an ".MDF" database file into source control. If it's a "schema only" file, it's probably just a few megabytes. It can be loaded locally without a "server" using a connection string that simply references the file name. Similar things can be done with SQLite, etc.

Even these approaches miss the point to a degree. Relying on an external executable is also a mistake. What if the developers update their database engine version on their laptop, and they need to go back to a previous major release branch to produce a security hotfix update? They might not be able to if the build tools have "moved on".

This is not some esoteric scenario. I'm facing this issue right now with some old SOAP endpoints: I need to rebuild a front-end that has been untouched for 10+ years, but I can't, because the endpoints are HTTPS with TLS 1.0 and all new desktops and servers enforce TLS 1.2. So now I'm stuck.

The correct solution, instead of the dirty shortcut, is to check the WSDL file into the source tree and reference it from there.

This also allows builds in cloud-hosted build platforms like GitHub Actions or Azure DevOps Pipelines, because with a hygienic build process no "LAN connectivity" is needed or assumed.

Your convenience will become someone else's security nightmare.

I agree with you and I'm not sure why you're being downvoted.

That being said, it's nice to be able to have guarantees about your build without having to look at the transitive closure of dependencies in your project. It'd be nice if crates could be marked as "hygienic build" or something, and a hygienic crate can only depend on other hygienic crates. And then something like `cargo check-hygienic` which fails if any dependencies are non-hygienic.
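To make the idea concrete, here is a minimal sketch of what such a check might do internally. Everything in it is hypothetical: Cargo has no `hygienic` manifest flag and no `check-hygienic` sub-command today, and the `CrateInfo` type and `non_hygienic_deps` function are names invented for illustration. It only shows the "walk the transitive closure and fail on any unmarked crate" part.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical per-crate metadata. The `hygienic` flag is an assumption,
/// not an existing Cargo manifest field.
pub struct CrateInfo {
    pub hygienic: bool,
    pub deps: Vec<String>,
}

/// Walks the transitive closure of `root` and returns, sorted, the names of
/// all crates that are not marked hygienic or are missing from the graph.
pub fn non_hygienic_deps(graph: &HashMap<String, CrateInfo>, root: &str) -> Vec<String> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut stack = vec![root.to_string()];
    let mut offenders = Vec::new();

    while let Some(name) = stack.pop() {
        if !seen.insert(name.clone()) {
            continue; // already visited; this also guards against cycles
        }
        match graph.get(&name) {
            Some(info) => {
                if !info.hygienic {
                    offenders.push(name.clone());
                }
                stack.extend(info.deps.iter().cloned());
            }
            // A crate we know nothing about cannot be vouched for.
            None => offenders.push(name.clone()),
        }
    }
    offenders.sort();
    offenders
}
```

A sub-command built on this would exit non-zero whenever the returned list is non-empty, so one non-hygienic crate deep in the tree is enough to fail the check.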

'Avoid the crates that do it' requires careful vetting of all code in the crates you use and in all of their dependencies, now and in all future versions. In reality that turns out to be impractical for most projects in most work environments. And even where it is practical, many ways of vetting the code will themselves expand the macros and execute arbitrary code.
> requires careful vetting of all code in the crates you use

I just explained that it would be useful to have a cargo sub-command for automating this.

You = “You and all your coworkers, forever.”
As the proud owner of a production database, a test database, a duly ancient build system, etc, this is entirely wrong.

It would be delightful if my build system checked my SQL against the schema that is checked in to the same repository. It should absolutely not look at my test database, nor should the test database even need to be running, thank you very much.
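As a rough sketch of what "check against the checked-in schema" could look like with no database running: the snippet below parses a toy schema and reports columns a query refers to that the schema does not define. This is not how sqlx or any real tool works. The schema text, the `parse_schema` and `missing_columns` names, and the one-statement-per-table format are all assumptions made for illustration, and the parser would not survive real SQL (types like `VARCHAR(20)` would already break it).

```rust
use std::collections::HashMap;

/// Stand-in for a schema file committed next to the code. In a real project
/// this could be `include_str!("schema.sql")` (hypothetical path).
pub const SCHEMA: &str = "
CREATE TABLE users (id INTEGER, name TEXT, email TEXT);
CREATE TABLE orders (id INTEGER, user_id INTEGER, total REAL);
";

/// Toy parser: maps each table name to its column names.
/// Only understands `CREATE TABLE name (col TYPE, ...);` statements.
pub fn parse_schema(schema: &str) -> HashMap<String, Vec<String>> {
    let mut tables = HashMap::new();
    for stmt in schema.split(';') {
        let rest = match stmt.trim().strip_prefix("CREATE TABLE ") {
            Some(rest) => rest,
            None => continue, // not a statement this toy parser understands
        };
        if let Some((name, cols)) = rest.split_once('(') {
            let columns: Vec<String> = cols
                .trim_end_matches(')')
                .split(',')
                .filter_map(|c| c.split_whitespace().next()) // first word = column name
                .map(str::to_string)
                .collect();
            tables.insert(name.trim().to_string(), columns);
        }
    }
    tables
}

/// Returns the columns from `wanted` that `table` does not define.
/// An unknown table reports every requested column as missing.
pub fn missing_columns(
    schema: &HashMap<String, Vec<String>>,
    table: &str,
    wanted: &[&str],
) -> Vec<String> {
    let known = schema.get(table);
    wanted
        .iter()
        .filter(|w| !known.map_or(false, |cols| cols.iter().any(|c| c.as_str() == **w)))
        .map(|w| w.to_string())
        .collect()
}
```

The point is only that the inputs are the source tree and nothing else: the same commit always yields the same answer, whether or not a test database exists.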

IMO builds should be sandboxed and deterministic by default. And turning off that default should require whoever invokes the build to explicitly grant permission to escape the sandbox.

If you need fancy things in the sandbox, put them in the sandbox.
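A minimal sketch of the "deny by default, explicit grant to escape" policy, assuming a hypothetical build driver that asks before every step that leaves the sandbox. The `Escape` and `Grants` types are invented here, and this models only the policy decision. Actual enforcement would need OS-level sandboxing, which is not shown.

```rust
/// Things a build step might try to do outside the source tree.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Escape {
    Network,
    HostFilesystem,
    ExternalTool,
}

/// Permissions granted by whoever invokes the build.
/// `Grants::default()` grants nothing.
#[derive(Debug, Default)]
pub struct Grants {
    allowed: Vec<Escape>,
}

impl Grants {
    /// Explicitly opt in to one kind of sandbox escape.
    pub fn allow(mut self, escape: Escape) -> Self {
        self.allowed.push(escape);
        self
    }

    /// Called by the (hypothetical) build driver before a step runs.
    pub fn check(&self, escape: Escape) -> Result<(), String> {
        if self.allowed.contains(&escape) {
            Ok(())
        } else {
            Err(format!("build step requested {:?}, which was not granted", escape))
        }
    }
}
```

With this shape, `Grants::default().check(Escape::Network)` is an error, and the only way to get `Ok(())` is for the invoker to have written `.allow(Escape::Network)` somewhere visible.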

These are optional features. You can decide whether or not you want to use them. No one is forcing you to do these checks at compile time.

My point is, the capability is useful to some people, and there are many other ways that doing arbitrary things at build/compile time can be useful or make things easier. The sqlx example is one of many.

Another use is calling out to another tool, e.g. a protobuf code generation tool. That requires the build toolchain to interact with an external tool, which would "break the sandbox."

The ability to reach .ru addresses is also convenient to some people.

Speeding down the highway as fast as possible is also convenient to some people.

Convenience becomes someone else’s bad day.

Isn't there an effort to compile Rust macros to wasm in order to sandbox them?