by laumars 2123 days ago
Please people, don’t do stuff like this for anything other than personal projects. You might think it’s safer than writing Bash but it isn’t.

It results in unsafe Rust code since you’re now forking external code that might be missed by people who are strictly vetting for code inside “unsafe” blocks. Ironically anyone who writes shell scripts will know that there are problems with shell scripting but thankfully dot-sh files stand out and bring attention to themselves as files that need to be audited. This wouldn’t. If you need to embed other languages or even just the approximate concept of them, then please at least keep those language files separate rather than inlining them.

Then you have the issue that people who are already aware of the pitfalls of shell scripts would know to read through any such scripts, but this introduces a newer and unfamiliar scripting language to audit (eg how do we know that what’s been declared is bug free?). At least Bash et al have had many years of eyeballs on them.


> unsafe Rust ... since you’re now forking external code

Are you saying that Rust becomes unsafe because it used a C program as a subroutine? E.g. "tar xvf -" or whatever? What is the fix: rewrite tar, awk, scp and whatever else as Rust functions? That's a lot of work.

I'm surprised that you're simultaneously overlooking what ought to be a more gaping problem: that every system call made by a Rust program is a trip through a kernel written in C.

Could you please be more specific about how it's a "gaping problem" that the underlying kernel is written in C? I think even if you wrote a pure Rust kernel from scratch, it would take considerable time to achieve the same quality/performance ratio we are currently seeing with C-based kernels (*BSD & Linux). It's so easy to throw these "radical claims" around. Yes? =)

I think the idea is that if calling external C binaries is a problem, then a kernel written in C is an even larger problem. It was meant as reductio ad absurdum.

The point is that if forking a process to invoke an external C program to run in another address space is "unsafe", directly calling into the OS (like making that fork call) should be considered "mega unsafe".

Actually I think Python did that? There is tarfile in the standard library and in my experience it worked quite well. So perhaps that is actually the answer. I do not know how tarfile is implemented though, so perhaps it is itself using any available tar implementation?

There’s a few problems with forking out:

1. Do those programs exist and what happens if they don’t? That behaviour is already understood in Bash, less so in random 3rd party Rust libraries.

2. Is ‘tar’ calling ./tar, /bin/tar or some other instance of tar? And how do you find out? (eg easy to check $PATH in Bash but does this library honour that? Easy to ‘which tar’ but is that going to be the same tar that this library forks?)

3. Are the people using this software even aware what external programs are being executed? How do they validate this? A .sh file clearly signals that there are external dependencies that need to be audited. A .rs file does not. This problem becomes magnified if you then start shipping compiled binaries rather than source.

I get people who like Rust are unlikely to be people who like writing shell scripts but the better way to think of this is like an MVC-like design where you have separate concerns that should be clearly separated in source.

None of these are, as far as I can tell, "unsafe" in the rust sense. They won't result in memory safety issues in the current process.

Your concern is a generic concern about shelling out which maybe makes sense in some cases, but is untenable in general.

I also don't see how 1 & 2 are real problems. From looking at the readme I understand what happens if tar doesn't exist. The macro raises an error. This is similar to what calling subprocess.check_call would do in Python. It's quite safe and well understood.
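To make the "missing program" case concrete, here is a minimal sketch using only the standard library's `std::process::Command`, not the macro under discussion. The names `RunOutcome` and `run` are hypothetical and exist only for illustration. It shows that a missing binary surfaces as an ordinary `io::Error` that the caller can match on, separately from a program that ran and exited non-zero.

```rust
use std::io::ErrorKind;
use std::process::Command;

/// Hypothetical outcome type for this illustration (not from any library).
#[derive(Debug, PartialEq)]
enum RunOutcome {
    /// The program ran and exited with status 0.
    Success,
    /// The program ran but exited non-zero (None if killed by a signal).
    Failed(Option<i32>),
    /// The program could not be found.
    NotFound,
    /// Spawning failed for some other reason (e.g. permission denied).
    SpawnError(ErrorKind),
}

/// Run `program` with `args`, capturing its output, and classify the result.
fn run(program: &str, args: &[&str]) -> RunOutcome {
    match Command::new(program).args(args).output() {
        Ok(out) if out.status.success() => RunOutcome::Success,
        Ok(out) => RunOutcome::Failed(out.status.code()),
        Err(e) if e.kind() == ErrorKind::NotFound => RunOutcome::NotFound,
        Err(e) => RunOutcome::SpawnError(e.kind()),
    }
}
```

A caller would write something like `match run("tar", &["--version"]) { ... }` and decide what a missing `tar` should mean for the program. Whether the macro reports errors in this exact shape is not something this sketch claims.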

And 2 feels truly made up. Not only could you invoke "which tar" within the macro to find the answer, I'd bet you a good bit of money that the answer is whatever is in your path. Anything else would be weirdly complicated. This is exactly the same as every other language that has a way to shell out to a subprocess.

And even if you have MVC like design where things are clearly separate, something will need to call the shell at some point. So issues 1, 2 and 3 never go away, even if you stick the script in a .sh file, you still need to invoke that file. And now how do you deploy that?

It’s inside a .rs file so I don’t see how it’s not unsafe in the Rust sense. Maybe drawing parallels with the “unsafe” block wasn’t fair but safety isn’t just about memory safety. Any seasoned developer will tell you that writing safe software is a multi-paradigm problem.

The MVC point is that having shell scripts separated out as their own .sh files means they draw attention to themselves when auditing code. Inlined shell scripts do not.

Writing your own shell script parser also introduces other surprises to new developers to your code base (eg what POSIX tokens are supported?).

I’ve seen all too often people trying to get clever because they don’t like a particular ugly but well understood standard and it usually results in more problems than it solves. Which is fine if it’s a personal pet project but such solutions don’t belong in production code.

> It’s inside a .rs file so I don’t see how it’s not unsafe in the Rust sense. Maybe drawing parallels with the “unsafe” block wasn’t fair but safety isn’t just about memory safety. Any seasoned developer will tell you that writing safe software is a multi-paradigm problem.

If I call a function that can result in an error condition, and I correctly handle the error condition, my code is not "unsafe".

So while yes, calling "rm -rf /" is dangerous, it is no more dangerous when done in rust than anywhere else, since you're just calling a subprocess, and the subprocess API is a safe API. There's nothing "unsafe" (in the rust sense, meaning type- or memory-unsafe) about doing so.

> The MVC point is that having shell scripts separated out as their own .sh files means they draw attention to themselves when auditing code. Inlined shell scripts do not.

Yes, but if you have to shell out at some point, the difference between calling myscript.sh that contains "foo --flag x" and directly shelling out to "foo --flag x" is practically nonexistent. And yes, there are cases when you need to shell out to another program, because otherwise you reduce yourself to needing to do everything in bash, or have bash be the entrypoint in some weird inversion of control scheme, and I'd much prefer to construct a single command invocation in bash than to parse a complex set of flags, for example.

Is this better than just using rust's builtin `std::process::Child`? Maybe not, but all of your concerns apply equally to using that.

> So while yes, calling "rm -rf /" is dangerous, it is no more dangerous when done in rust than anywhere else, since you're just calling a subprocess, and the subprocess API is a safe API. There's nothing "unsafe" (in the rust sense, meaning type- or memory-unsafe) about doing so.

The point is that code doesn't belong in Rust to begin with!

> Yes, but if you have to shell out at some point, the difference between calling myscript.sh that contains "foo --flag x" and directly shelling out to "foo --flag x" is practically nonexistent.

No it isn't. Code auditing and vetting has been a thing for years. Say you have a CI pipeline that hooks into ShellCheck to validate your .sh files for errors: that same pipeline wouldn't vet any pseudo shell code inlined in Rust.

> Is this better than just using rust's builtin `std::process::Child`? Maybe not, but all of your concerns apply equally to using that.

Not all, only the concerns you've cherrypicked.

The subprocess API is without system calls? Otherwise it calls into heretic C code.

Of course Rust itself is not "safe" either:

https://rustsec.org/advisories/CVE-2018-1000810.html

> Do those programs exist

If those programs don't exist, they are simply missing dependencies of the program.

In the shell, we might use the type command. Something similar could be integrated into this scripting system to detect whether some string corresponds to a command that can be found in the PATH.

> Is ‘tar’ calling ./tar, /bin/tar or some other instance of tar? And how do you find out?

PATH is actually used by low-level routines in POSIX, like execvp. If execvp is used as the basis for dispatching commands, then PATH is searched.
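As a rough illustration of that lookup, here is a sketch of an execvp-style search written in Rust. It is a simplification under stated assumptions: `resolve_in` and `resolve` are hypothetical helper names, the code only checks that a candidate is a regular file (it does not check execute permission), and it is not taken from any library discussed in this thread.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Hypothetical helper: find `name` in the directories listed in `path_var`,
/// in order, returning the first match. A name containing a slash is treated
/// as a path and is not searched for, which mirrors execvp's rule.
/// Simplification: only checks that the candidate is a regular file,
/// not that it is executable.
fn resolve_in(name: &str, path_var: &OsStr) -> Option<PathBuf> {
    if name.contains('/') {
        let direct = PathBuf::from(name);
        return if direct.is_file() { Some(direct) } else { None };
    }
    env::split_paths(path_var)
        .map(|dir| dir.join(name))
        .find(|candidate| candidate.is_file())
}

/// Hypothetical convenience wrapper that reads the current process's PATH.
fn resolve(name: &str) -> Option<PathBuf> {
    let path_var = env::var_os("PATH")?;
    resolve_in(name, &path_var)
}
```

Printing `resolve("tar")` before running anything would answer the "which tar is this?" question for the current process, at least to the extent the simplifications above hold.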

> A .rs file does not.

That's a fair point. Over the years, I have seen a fair share of C programs break because they were actually using system() or fork()/exec() to run programs that were missing or had some other problem.

I've also seen (and written myself) complex shell scripts that check for their dependencies up-front and complain if some are missing, which is a good idea, especially if not all execution paths use every dependency, or if an unexpected termination could occur after a lengthy process that the user will have to recover from and repeat.
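The same up-front check can be sketched in Rust. This is only an illustration: `missing_tools` and `path_dirs` are hypothetical names, and the check is simplified to "a regular file with that name exists in one of the search directories" without testing execute permission.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical helper: the directories listed in the current PATH,
/// or an empty list if PATH is unset.
fn path_dirs() -> Vec<PathBuf> {
    env::var_os("PATH")
        .map(|p| env::split_paths(&p).collect())
        .unwrap_or_default()
}

/// Hypothetical helper: return every required tool that cannot be found in
/// `search_dirs`, so all problems are reported at once before any work starts.
/// Simplification: does not check that the file is executable.
fn missing_tools(required: &[&str], search_dirs: &[PathBuf]) -> Vec<String> {
    required
        .iter()
        .copied()
        .filter(|name| !search_dirs.iter().any(|dir| dir.join(name).is_file()))
        .map(|name| name.to_string())
        .collect()
}
```

At startup a program could call `missing_tools(&["tar", "awk", "expect"], &path_dirs())` and refuse to continue, with a message listing the result, if the returned vector is not empty.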

It can also be loudly documented as part of the system requirements of the program. "This program relies on the utilities tar, awk and expect which are expected to be in the PATH. It was tested with GNU tar 1.29, GNU Awk 4.1.4 and Expect 5.45.4."

If we are packaging this program into a distro, we can express those dependencies in the packaging meta-data, so they are pulled in automatically. The package manager has to be conscientious and to understand that program's requirements.