by laumars 2127 days ago
There are a few problems with forking out:

1. Do those programs exist and what happens if they don’t? That behaviour is already understood in Bash, less so in random 3rd party Rust libraries.

2. Is ‘tar’ calling ./tar, /bin/tar or some other instance of tar? And how do you find out? (eg easy to check $PATH in Bash but does this library honour that? Easy to ‘which tar’ but is that going to be the same tar that this library forks?)

3. Are the people using this software even aware what external programs are being executed? How do they validate this? A .sh file clearly signals that there are external dependencies that need to be audited. A .rs file does not. This problem becomes magnified if you then start shipping compiled binaries rather than source.

I get that people who like Rust are unlikely to be people who like writing shell scripts, but the better way to think of this is as an MVC-like design, where separate concerns are clearly separated in the source.


None of these are, as far as I can tell, "unsafe" in the rust sense. They won't result in memory safety issues in the current process.

Your concern is a generic concern about shelling out which maybe makes sense in some cases, but is untenable in general.

I also don't see how 1 & 2 are real problems. From looking at the readme I understand what happens if tar doesn't exist. The macro raises an error. This is similar to what calling subprocess.check_call would do in Python. It's quite safe and well understood.
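For comparison, here is a minimal sketch of that failure mode using only the standard library's `std::process::Command`, not the macro under discussion (I haven't checked how the macro does it internally). The `check_call` helper is a hypothetical name borrowed from the Python analogy. A program that cannot be started surfaces as an `io::Error`, and a non-zero exit is turned into an error as well.

```rust
use std::io;
use std::process::{Command, ExitStatus};

/// Hypothetical helper: runs `program` with `args` and treats both
/// "could not start" and "exited non-zero" as errors, roughly in the
/// spirit of Python's subprocess.check_call.
fn check_call(program: &str, args: &[&str]) -> io::Result<ExitStatus> {
    // `status()` returns Err if the program cannot be spawned at all,
    // for example with ErrorKind::NotFound when no such program exists.
    let status = Command::new(program).args(args).status()?;
    if status.success() {
        Ok(status)
    } else {
        // The child started but reported failure through its exit status.
        Err(io::Error::new(
            io::ErrorKind::Other,
            format!("{} exited with {}", program, status),
        ))
    }
}
```

The caller gets an ordinary `Result` either way, so a missing tar is just another error to handle.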

And 2 feels truly made up. Not only could you invoke "which tar" within the macro to find the answer, I'd bet you a good bit of money that the answer is whatever is in your path. Anything else would be weirdly complicated. This is exactly the same as every other language that has a way to shell out to a subprocess.

And even if you have an MVC-like design where things are clearly separate, something will need to call the shell at some point. So issues 1, 2 and 3 never go away: even if you stick the script in a .sh file, you still need to invoke that file. And now how do you deploy that?

It’s inside a .rs file so I don’t see how it’s not unsafe in the Rust sense. Maybe drawing parallels with the “unsafe” block wasn’t fair, but safety isn’t just about memory safety. Any seasoned developer will tell you that writing safe software is a multi-paradigm problem.

The MVC point is that separating shell scripts out into their own .sh files means they draw attention to themselves when auditing code. Inlined shell scripts do not.

Writing your own shell script parser also introduces other surprises to new developers to your code base (eg what POSIX tokens are supported?).

I’ve seen all too often people trying to get clever because they don’t like a particular ugly but well understood standard, and it usually results in more problems than it solves. Which is fine if it’s a personal pet project, but such solutions don’t belong in production code.

> It’s inside a .rs file so I don’t see how it’s not unsafe in the Rust sense. Maybe drawing parallels with the “unsafe” block wasn’t fair, but safety isn’t just about memory safety. Any seasoned developer will tell you that writing safe software is a multi-paradigm problem.

If I call a function that can result in an error condition, and I correctly handle the error condition, my code is not "unsafe".

So while yes, calling "rm -rf /" is dangerous, it is no more dangerous when done in rust than anywhere else, since you're just calling a subprocess, and the subprocess API is a safe API. There's nothing "unsafe" (in the rust sense, meaning type- or memory-unsafe) about doing so.

>The MVC point is that separating shell scripts out into their own .sh files means they draw attention to themselves when auditing code. Inlined shell scripts do not.

Yes, but if you have to shell out at some point, the difference between calling out to myscript.sh that contains "foo --flag x" and directly shelling out to "foo --flag x" is practically nonexistent. And yes, there are cases when you need to shell out to another program, because otherwise you reduce yourself to needing to do everything in bash, or have bash be the entrypoint in some weird inversion of control scheme, and I'd much prefer to construct a single command invocation in bash than to parse a complex set of flags, for example.

Is this better than just using rust's builtin `std::process::Command`? Maybe not, but all of your concerns apply equally to using that.
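For reference, a minimal sketch of the standard library route. `foo` and `--flag` are the placeholder program and flag from the example above, and `build_foo` is a made-up helper name. `Command` does not go through a shell: each argument is handed to the child as a separate string, so there is no quoting or token syntax to reason about.

```rust
use std::process::Command;

/// Hypothetical helper: builds the equivalent of `foo --flag <value>`
/// without involving a shell. `foo` is a placeholder program name.
fn build_foo(value: &str) -> Command {
    let mut cmd = Command::new("foo");
    // Arguments are passed to the child verbatim, one per `arg` call,
    // so spaces or semicolons inside `value` are not interpreted.
    cmd.arg("--flag").arg(value);
    cmd
}
```

Calling `.status()` or `.output()` on the returned `Command` is what actually spawns the process. Which `foo` runs and whether it exists are still open questions at that point, exactly as with a .sh file.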

> So while yes, calling "rm -rf /" is dangerous, it is no more dangerous when done in rust than anywhere else, since you're just calling a subprocess, and the subprocess API is a safe API. There's nothing "unsafe" (in the rust sense, meaning type- or memory-unsafe) about doing so.

The point is that code doesn't belong in Rust to begin with!

> Yes, but if you have to shell out at some point, the difference between calling out to myscript.sh that contains "foo --flag x" and directly shelling out to "foo --flag x" is practically nonexistent.

No it isn't. Code auditing and vetting have been a thing for years. Say you have a CI pipeline hooked into ShellCheck to validate your .sh files for errors: that same pipeline wouldn't vet any pseudo shell code inlined in Rust.

> Is this better than just using rust's builtin `std::process::Command`? Maybe not, but all of your concerns apply equally to using that.

Not all, only the concerns you've cherrypicked.

> The point is that code doesn't belong in Rust to begin with!

I'll reiterate: it is often safer to embed short snippets of bash into other languages than to invert control and call out to other languages from bash. By calling out to bash, you do the majority of your work in better languages.

> No it isn't. Code auditing and vetting have been a thing for years. Say you have a CI pipeline hooked into ShellCheck to validate your .sh files for errors: that same pipeline wouldn't vet any pseudo shell code inlined in Rust.

You're making a rather particular set of assumptions there.

> Not all, only the concerns you've cherrypicked.

The three you originally mentioned...

> I'll reiterate: it is often safer to embed short snippets of bash into other languages than to invert control and call out to other languages from bash. By calling out to bash, you do the majority of your work in better languages.

"It depends" is a better way of putting it. However the advantages of embedding Bash doesn't, in my opinion, make up for the problems it creates by obfuscating those calls. Putting Bash inside separate .sh files clearly draws attention to those calls.

It's the same reason Rust has the unsafe block - to draw developer attention to unsafe code. So what I'm talking about here is more idiomatic to Rust.

Not to mention this also creates potential surprises due to being a custom parser which could trip new developers on that code base. Smarter code creates fewer surprises, even if that sometimes means uglier code.

In short, if inlining a shell script in Rust seems like a good idea, then I'd suggest one would need to reinvestigate the original problem and possible solutions. There's bound to be a more predictable and maintainable solution out there, even if it is a little less interesting / fun / trendy.

> You're making a rather particular set of assumptions there.

Inlining code is often regarded as an anti-pattern. Separate out your concerns, separate out your languages. It helps with your IDE (eg syntax highlighting, code completion, etc), your code validation tools (eg Shellcheck), with humans understanding the code (path of least surprises), etc.

> The three you originally mentioned...

They weren't the original points I mentioned nor even the only points I've discussed since. They were only a breakdown of one of the points I had raised.

The subprocess API is without system calls? Otherwise it calls into heretic C code.

Of course Rust itself is not "safe" either:

https://rustsec.org/advisories/CVE-2018-1000810.html

> Do those programs exist

If those programs don't exist, they are simply missing dependencies of the program.

In the shell, we might use the type command. Something similar could be integrated into this scripting system to detect whether some string corresponds to a command that can be found in the PATH.

> Is ‘tar’ calling ./tar, /bin/tar or some other instance of tar? And how do you find out?

PATH is actually used by low-level routines in POSIX, like execvp. If execvp is used as the basis for dispatching commands, then PATH is searched.
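As a rough illustration of that lookup, here is a sketch that walks a PATH-style string the way an execvp-based dispatcher would, which is also roughly what "which tar" reports. The names `resolve_in` and `is_executable` are made up for this example. It is an approximation only: it checks that some execute bit is set, not whether the current user is actually allowed to run the file.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::{Path, PathBuf};

/// Hypothetical helper: returns the first match for `program` in the
/// directories listed in `path_var`, searched left to right.
fn resolve_in(program: &str, path_var: &OsStr) -> Option<PathBuf> {
    // A name containing a slash is used as given and is not searched
    // for in PATH (this mirrors execvp's behaviour).
    if program.contains('/') {
        let p = Path::new(program);
        return if is_executable(p) { Some(p.to_path_buf()) } else { None };
    }
    env::split_paths(path_var)
        .map(|dir| dir.join(program))
        .find(|candidate| is_executable(candidate))
}

/// On Unix: a regular file with at least one execute bit set.
#[cfg(unix)]
fn is_executable(p: &Path) -> bool {
    use std::os::unix::fs::PermissionsExt;
    p.metadata()
        .map(|m| m.is_file() && m.permissions().mode() & 0o111 != 0)
        .unwrap_or(false)
}

/// Elsewhere: fall back to "is a regular file" (a simplification).
#[cfg(not(unix))]
fn is_executable(p: &Path) -> bool {
    p.is_file()
}
```

Passing `std::env::var_os("PATH")` as `path_var` gives the answer for the current process. A scripting system could log the resolved path before dispatching, so the user can see which tar is about to run.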

> A .rs file does not.

That's a fair point. Over the years, I have seen a fair share of C programs break because they were actually using system() or fork()/exec() to run programs that were missing or had some other problem.

I've also seen (and written myself) complex shell scripts that check for their dependencies up-front and complain if some are missing, which is a good idea, especially if not all execution paths use every dependency, or if an unexpected termination could occur after a lengthy process that the user will have to recover from and repeat.
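The same up-front check can be sketched in Rust. This is an illustration under assumptions, not something the library is known to provide: `missing_dependencies` is a made-up name, and the check only looks for a regular file with the right name in each PATH directory, without testing permissions or versions.

```rust
use std::env;
use std::ffi::OsStr;

/// Hypothetical helper: returns the names from `required` that are not
/// found as a regular file in any directory of `path_var`.
fn missing_dependencies(required: &[&str], path_var: &OsStr) -> Vec<String> {
    required
        .iter()
        // Keep only the names for which no PATH directory has a match.
        .filter(|&&name| !env::split_paths(path_var).any(|dir| dir.join(name).is_file()))
        .map(|&name| name.to_string())
        .collect()
}
```

A program could call this at startup with `std::env::var_os("PATH")` and the list of utilities it shells out to, then exit with a clear message if the result is non-empty, before any lengthy work has begun.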

It can also be loudly documented as part of the system requirements of the program. "This program relies on the utilities tar, awk and expect which are expected to be in the PATH. It was tested with GNU tar 1.29, GNU Awk 4.1.4 and Expect 5.45.4."

If we are packaging this program into a distro, we can express those dependencies in the packaging meta-data, so they are pulled in automatically. The package maintainer has to be conscientious and understand the program's requirements.