by bayindirh 1090 days ago
Currently, vendoring all of the dependencies for the Rust implementation pulls in 1.75 million SLOC, which I find amusing.

This is a lot of SLOC, and a huge surface area for pulling off targeted supply-chain attacks, IMO.

P.S.: I know you don't compile all of this into the binary, yet consider the eyes and work hours required to verify that the whole chain is sane and safe.

This is how it looks:

    vagrant@rust-playground:~/development/sudo-rs$ cloc .
        3549 text files.
        3271 unique files.                                          
         453 files ignored.
    
    1 error:
    Line count, exceeded timeout:  ./vendor/proc-macro2/src/parse.rs
    
    github.com/AlDanial/cloc v 1.86  T=9.67 s (320.2 files/s, 208065.8 lines/s)
    --------------------------------------------------------------------------------
    Language                      files          blank        comment           code
    --------------------------------------------------------------------------------
    Rust                           2637          49276         111413        1754298
    diff                              2            884          32618          35892
    Markdown                        182           4903              9          13341
    TOML                            137            807           1073           5734
    Assembly                          8             90             71           1244
    YAML                              5             80             26            393
    JSON                            106              0              0            120
    reStructuredText                  1             70              4             90
    C/C++ Header                      9              5              2             79
    Bourne Shell                      5             14             17             64
    C                                 2              6              6             46
    Bourne Again Shell                2              7              8             41
    Python                            1             17             19             38
    Dockerfile                        1              0              0              2
    --------------------------------------------------------------------------------
    SUM:                           3098          56159         145266        1811382
    --------------------------------------------------------------------------------

Quite a few of those ~1 million lines seem to be in libc-type wrappers, which is roughly analogous to libc for C (which also isn't included).

If I remove those the line count is about 100k, which is about the same as sudo.

> Quite a few of those ~1 million lines seem to be in libc-type wrappers, which is roughly analogous to libc for C (which also isn't included).

I didn't know that Rust programs run without a libc.

They don't. The way it's set up is:

The stdlib only exposes some syscalls via Rust-friendly wrappers, and in a way that is generally a cross-platform subset, sometimes with platform extensions. So the `fs` module (https://doc.rust-lang.org/std/fs/) exposes common file operations across most (all?) supported platforms. Some Unix-specific file operations are exposed at: https://doc.rust-lang.org/stable/std/os/unix/fs/index.html . These things do a pretty good job for most work, but sometimes you gotta get weird...
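To make that split concrete, here is a minimal sketch (Unix only; the helper name `write_private` and the file name are made up for illustration). `std::fs` covers the portable part, and the `PermissionsExt` trait from `std::os::unix::fs` adds the Unix-only mode bits:

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt; // Unix-only extension trait
use std::path::Path;

/// Hypothetical helper: write `contents` to `path`, restrict the file to its
/// owner, and return the resulting permission bits.
fn write_private(path: &Path, contents: &str) -> io::Result<u32> {
    // Portable part: available on every platform std supports.
    fs::write(path, contents)?;

    // Unix-specific part: set the mode, then read it back.
    fs::set_permissions(path, fs::Permissions::from_mode(0o600))?;
    let mode = fs::metadata(path)?.permissions().mode();
    Ok(mode & 0o777)
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join(format!("fs-sketch-{}.txt", std::process::id()));
    let mode = write_private(&path, "hello")?;
    println!("mode = {:o}", mode);
    fs::remove_file(&path)
}
```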

The libc crate allows more direct access to the syscalls on your flavor of *nix by creating a Rust-to-C bridge for them and exposing the C types directly. This bridge isn't a lot of code; mostly it just does the work of creating a Rust function that minimally wraps the external C function. For a lot of low-level software you end up pulling in the libc crate to hit your system-specific calls.

And for some very system specific calls (e.g. io_uring) you end up having to pull in another crate that calls into that subsystem for you (often pulling in libc also).

All of this ends up being linked to the libc you build against tho.
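For illustration, this is roughly the shape of such a binding, written by hand with std only (a sketch, not the libc crate's actual source; the wrapper name `current_pid` is made up, and the real crate uses its own aliases such as `pid_t`). It assumes a Unix target and Rust 1.82 or newer for the `unsafe extern` syntax; older toolchains write plain `extern "C"`:

```rust
use std::os::raw::c_int;

// Declaration only: the symbol is resolved against the platform libc at link time.
unsafe extern "C" {
    fn getpid() -> c_int;
}

/// Minimal safe wrapper around the foreign function.
fn current_pid() -> c_int {
    // SAFETY: getpid takes no arguments and always succeeds.
    unsafe { getpid() }
}

fn main() {
    println!("pid via the C function: {}", current_pid());
    println!("pid via std:            {}", std::process::id());
}
```

Both lines should print the same number, since `std::process::id` ends up asking the same underlying platform.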

Speaking from nearly total ignorance of Rust, I don't understand why things can't be hidden in a ton of wrappers and interfaces.
What’s being counted here is the interfaces and wrappers.
Rust programs can run without a libc:

1. on Linux

2. in embedded

The vast majority of them do use libc, as the underlying platform requires. The default on Linux is to use a libc as well.

Doesn’t matter. It’s still quite a big haystack to hide some tactical needles here and there.
It does matter, because to get a comparable number from sudo you'd have to count libc. I don't know how many lines of code that is exactly (and e.g. musl or OpenBSD libc is probably smaller than GNU, so there isn't a fixed number to start with), but I bet it's roughly a comparable number, or at least much closer.

As a general point I agree with you, but both sudo and sudo-rs are pretty large.

libc is used by the entire system. It's not a vendored dependency of just sudo. At best, you'd amortize it over everything that uses it to get a reasonable comparison.
Are we not on the trail to doing the same with Rust, at least in theory? You don't think someone somewhere has the idea to make an all-Rust system eventually? Even without that eventual ultimate expression, if you have even a few Rust components in your system and not just sudo alone, then the same point stands. You don't count Rust itself any more than you count gcc.
> You don't think someone somewhere has the idea to make an all-rust system eventually?

In progress. Looks good so far. I haven't tried it, though.

https://www.redox-os.org/

The sudo-rs Cargo.toml [1] file seems very reasonable. This is the curse of being cross platform. The inclusion of https://github.com/Stebalien/tempfile as a dependency is responsible for the overwhelming majority of lines due to including *-sys crates for multiple OSs.

    ~/Code/tempfile !! tokei vendor
    
    ===============================================================================
     Language            Files        Lines         Code     Comments       Blanks
    ===============================================================================
     GNU Style Assembly      8         1405         1276           39           90
     C                       1            3            2            0            1
     Shell                   1           23           18            1            4
     TOML                   25         1536         1178          234          124
    -------------------------------------------------------------------------------
     Markdown               29         2193            0         1560          633
     |- C                    1            2            2            0            0
     |- Rust                11          277          220           19           38
     |- TOML                 8           60           58            0            2
     (Total)                           2532          280         1579          673
    -------------------------------------------------------------------------------
     Rust                  917       811953       792140         4536        15277
     |- Markdown           304        17252          124        14364         2764
     (Total)                         829205       792264        18900        18041
    ===============================================================================
     Total                 981       817113       794614         6370        16129
    ===============================================================================

[1]: https://github.com/memorysafety/sudo-rs/blob/60985b2f5f7ffa8...
and tempfile is not even used as a runtime dependency. It is only ever called, or even linked, in tests.

(by the way, Herman, much belated thanks for putting together the original Rust LA meetup all those years ago!)

Curious: Does cargo not distinguish between dev and runtime dependencies?
It does, but all dependencies are put in the same folder, which is why the original poster conflated them.

If you look at the linked Cargo.toml you'll see

  [dependencies]
  libc = "0.2.139"
  <snip>

  [dev-dependencies]
  pretty_assertions = "1.3.0"
  tempfile = "3.5.0"
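A rough sketch of how that split can be read out of a manifest. It uses a naive line scan rather than a real TOML parser (an assumption made for brevity; it ignores `[target.*]` tables and multi-line entries), and `split_deps` is a made-up name:

```rust
/// Naive scan: returns (runtime dependency names, dev-only dependency names).
fn split_deps(manifest: &str) -> (Vec<String>, Vec<String>) {
    let (mut runtime, mut dev) = (Vec::new(), Vec::new());
    let mut section = "";
    for line in manifest.lines() {
        let line = line.trim();
        // A `[table]` header switches the current section.
        if line.starts_with('[') && line.ends_with(']') {
            section = &line[1..line.len() - 1];
            continue;
        }
        // Only `name = ...` entries are considered.
        if let Some((name, _)) = line.split_once('=') {
            match section {
                "dependencies" => runtime.push(name.trim().to_string()),
                "dev-dependencies" => dev.push(name.trim().to_string()),
                _ => {}
            }
        }
    }
    (runtime, dev)
}

fn main() {
    let manifest = r#"
[dependencies]
libc = "0.2.139"

[dev-dependencies]
pretty_assertions = "1.3.0"
tempfile = "3.5.0"
"#;
    let (runtime, dev) = split_deps(manifest);
    println!("runtime:  {:?}", runtime);
    println!("dev-only: {:?}", dev);
}
```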
In general, you have a very valid point, but how many lines of code do we need to build the normal sudo if you are generously adding stuff we don't compile into the binary? Compiler, tools, some machine with some userspace and kernel for those to run on and so on?
Doesn't matter. Both implementations run on the same platform, and poisoning the compiler is equally probable for both versions.

If you can pull off a lower-level attack through the general-purpose toolchain, targeted at either implementation, it's a more impressive feat, for sure.

However, the Rust implementation adds significant SLOC on top of that complexity.

Arguing that complexity comes from SLOC feels like paying per LOC... it sort of misses the point.

The languages are different - a lot of C behavior feels "inferred" or "implicit". A lot of Rust behavior is explicit, that is you have to write down exactly what's happening. So things like casting a void* to a $whatever require a couple of lines of rust, not just a single line (or fragment) of `($whatever *) p`.
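For what it's worth, the cast example looks roughly like this on the Rust side (a sketch; `read_i32` is a made-up name). The cast and the dereference are separate, visible steps, and the dereference has to sit inside `unsafe`:

```rust
use std::ffi::c_void;

/// Reads an i32 through an untyped pointer, the Rust spelling of `*(int *)p`.
///
/// # Safety
/// `p` must point to a valid, properly aligned, initialized i32.
unsafe fn read_i32(p: *const c_void) -> i32 {
    // Step 1: the cast is its own explicit operation.
    let typed: *const i32 = p.cast::<i32>();
    // Step 2: the dereference is only allowed in an unsafe context.
    unsafe { *typed }
}

fn main() {
    let value: i32 = 42;
    // Going the other way (typed to untyped) is written out as well.
    let untyped = &value as *const i32 as *const c_void;
    let back = unsafe { read_i32(untyped) };
    println!("{}", back);
}
```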

My personal experience is that the explicit nature of Rust is pretty nice when visiting new code, or revisiting code I wrote a while back - everything is written down for me, whereas I have to puzzle out a lot of behavior from the C. It's a bit annoying at first, "c'mon compiler, why do I have to tell you this?" is still a common refrain in my head, however it's worth it in the long run - revisits to the code are much faster to grok/reload, and once I got used to it, writing it down as it was all loaded in my head the first time wasn't so much of a pain anymore.

I’m on the sudo-rs team: we are actually very mindful of the dependencies we use. Our current main branch already uses significantly fewer dependencies than a few months ago. Aside from security, it also really helps with adoption since it makes packaging way easier. A large part of the output you got is due to the tempfile dependency, which is a dev-only dependency that is not touched when compiling a release/dev binary; only during testing would a small part of it be used. I say a small part because most of it is related to tempfile running on Windows, which is irrelevant for sudo-rs since we don’t support or intend to support Windows.

Just for good measure though and to prevent any further discussion about this I’ve removed the tempfile dependency from the main sudo crate as our usage of it could easily be replaced by a simple timestamp/pid combination. But again, this really only affected testing, I think the size of the code that ends up in the final binary is very reasonable.
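A timestamp/pid combination of that kind could look something like the sketch below. This is not the actual sudo-rs code; `scratch_path` and the `sudo-test` prefix are made up for illustration, and predictable names like this are only reasonable for test scratch files:

```rust
use std::path::PathBuf;
use std::process;
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical helper: builds a scratch path from a timestamp and the pid.
fn scratch_path(prefix: &str) -> PathBuf {
    // Nanoseconds since the epoch; falls back to 0 if the clock is before 1970.
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    std::env::temp_dir().join(format!("{}-{}-{}", prefix, nanos, process::id()))
}

fn main() -> std::io::Result<()> {
    let path = scratch_path("sudo-test");
    std::fs::write(&path, b"scratch")?;
    println!("created {}", path.display());
    std::fs::remove_file(&path)
}
```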

As for people suggesting supply chain attacks via dev dependencies: I doubt we would be the final target of such an attack: i.e. what an attacker would really want is access on all/some machines that have sudo-rs installed. The only way to do that would be to change the release artifacts, which dev dependencies do not have the ability to change, at least not directly, so such an attack would only be the first step in a chain of attacks. I have a feeling that there are way easier and less detectable ways of manipulating us than by using modified dev dependencies. Of course that doesn’t mean we should ignore the risks.

You're probably running into this: https://github.com/rust-lang/cargo/issues/7058

`cargo vendor` will download dependencies and dev dependencies for all platforms, which leads to a lot of unused code being pulled in. In this case, the Windows API and Microsoft compiler wrappers.
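The reason vendored Windows code never reaches a Linux build is conditional compilation: platform-specific items are gated with `cfg`, and Cargo only builds target-specific dependencies when the target matches. A small sketch of the source-level half (the function name is made up):

```rust
// Exactly one of these definitions is compiled, depending on the target.
#[cfg(windows)]
fn platform_family() -> &'static str {
    "windows"
}

#[cfg(unix)]
fn platform_family() -> &'static str {
    "unix"
}

#[cfg(not(any(windows, unix)))]
fn platform_family() -> &'static str {
    "other"
}

fn main() {
    println!("compiled for: {}", platform_family());
}
```

The other definitions are still sitting in the source tree (and get counted by cloc or tokei), but the compiler discards them for this target.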

In this instance, during the build process "tempfile" is used as a dev-dependency, which has a runtime dependency on windows-sys when compiling Windows binaries. I'm not entirely sure why (commenting it out in Cargo.toml doesn't seem to break the build).

After commenting it out and manually removing the spurious Windows API files as well as the unrelated packages (`cd vendor; rm -rf ctor diff output_vt100 pretty_assertions proc-macro2 quote syn unicode-ident yansi win*`), I get the following results:

          0.0358 secs
    ┌───────────────────────────────────────────────────────────────────────────────────────┐
    | Language                        files        size       blank     comment        code |
    ├───────────────────────────────────────────────────────────────────────────────────────┤
    | Bash                                3    939.00 B           7           2          30 |
    | C                                   1     1.31 KB           5           6          44 |
    | C Header                            1    226.00 B           0           0           7 |
    | D                                  15    31.75 KB          32           0         143 |
    | JSON                               22    39.69 KB           0           0          22 |
    | Markdown                           16    53.46 KB         425           0        1054 |
    | Rust                              396     4.98 MB       13852        9502      131650 |
    | Shell                               5     2.24 KB          11          18          50 |
    | Toml                               14     9.60 KB          54          61         319 |
    | Yaml                                2    10.14 KB          70           0         341 |
    ├───────────────────────────────────────────────────────────────────────────────────────┤
    | Sum                               475     5.13 MB       14456        9589      133660 |
    └───────────────────────────────────────────────────────────────────────────────────────┘

As a comparison, this is the output for https://github.com/sudo-project/sudo:

          0.0439 secs
    ┌───────────────────────────────────────────────────────────────────────────────────────┐
    | Language                        files        size       blank     comment        code |
    ├───────────────────────────────────────────────────────────────────────────────────────┤
    | Autoconf                          124     1.90 MB        2618        4317       59031 |
    | C                                 365     4.20 MB       15977       22626      111340 |
    | C Header                           90     1.14 MB        1816        4911       18803 |
    | JSON                                7     9.22 KB           0           0         236 |
    | Markdown                           10   133.62 KB         676           0        2498 |
    | Pascal                              3    33.63 KB          79           0         925 |
    | Perl                                2    12.81 KB          54          83         306 |
    | Plain Text                          1     15.00 B           0           0           1 |
    | Protocol Buffer                     2     5.54 KB          22           0         185 |
    | Python                             10    26.41 KB         152         259         295 |
    | Shell                              77   358.96 KB        1589        2534        8961 |
    | Yaml                                4     7.98 KB          16          38         205 |
    ├───────────────────────────────────────────────────────────────────────────────────────┤
    | Sum                               695     7.81 MB       22999       34768      202786 |
    └───────────────────────────────────────────────────────────────────────────────────────┘
It should be noted that the sudo project's dependencies and autogenerated code aren't included in this overview.
Thanks for this summary. I was going to write something about the safe rust sudo depending on a load of handwritten assembly but it seems that's a spurious windows thing.
The dependency list[0] looks pretty reasonable, AFAICT the overwhelming majority of that line-of-code count comes from autogenerated Windows API methods.

Edit: actual counts:

  $ cargo vendor && cd vendor && {
      for p in * ; do
      echo -n $p
      tokei $p | rg '^\s+Rust'
  done } | sort -n -k 4 | tabulate
  ----------------------------  ----  ---  ------  ------  ----  ----
  errno-dragonfly               Rust    2       9       8     0     1
  windows_aarch64_gnullvm       Rust    2      11       9     0     2
  windows_aarch64_msvc          Rust    2      11       9     0     2
  windows_i686_gnu              Rust    2      11       9     0     2
  windows_i686_msvc             Rust    2      11       9     0     2
  windows_x86_64_gnu            Rust    2      11       9     0     2
  windows_x86_64_gnullvm        Rust    2      11       9     0     2
  windows_x86_64_msvc           Rust    2      11       9     0     2
  winapi-i686-pc-windows-gnu    Rust    2      25      13    12     0
  winapi-x86_64-pc-windows-gnu  Rust    2      25      13    12     0
  windows-targets               Rust    1      54      46     3     5
  output_vt100                  Rust    2      67      55     0    12
  cfg-if                        Rust    2     164     131    16    17
  instant                       Rust    4     316     260     6    50
  ctor                          Rust    2     331     254    21    56
  errno                         Rust    5     375     280    41    54
  diff                          Rust    4     561     485     9    67
  autocfg                       Rust    9     702     558    41   103
  yansi                         Rust    7     741     627     3   111
  signal-hook-registry          Rust    3     818     566   150   102
  fastrand                      Rust    4     830     710    16   104
  hermit-abi                    Rust    4     847     601     5   241
  pretty_assertions             Rust    5    1231    1072    33   126
  glob                          Rust    2    1589    1291   113   185
  bitflags                      Rust   20    1715    1373   105   237
  unicode-ident                 Rust   11    1794    1697    36    61
  signal-hook                   Rust   17    1969    1520   147   302
  tempfile                      Rust   15    2367    1928   102   337
  quote                         Rust   17    2458    1979   148   331
  redox_syscall                 Rust   23    3595    2996    83   516
  log                           Rust    9    3635    2970    97   568
  io-lifetimes                  Rust   15    4218    3605    80   533
  proc-macro2                   Rust   17    5286    4514   139   633
  cc                            Rust   13    5861    4767   488   606
  rustix                        Rust  236   39927   33837  1467  4623
  syn                           Rust   92   51956   48946   493  2517
  libc                          Rust  224  109836   99688  2073  8075
  linux-raw-sys                 Rust   61  145628  145455    84    89
  winapi                        Rust  405  179933  176630  3299     4
  windows-sys                   Rust  281  497624  497608     4    12
  ----------------------------  ----  ---  ------  ------  ----  ----
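If tokei, rg, and tabulate aren't at hand, a similar per-crate table can be sketched in plain Rust with std only. Note the assumptions: it counts raw lines of `.rs` files (no code/comment/blank split like tokei does), so the numbers are only roughly comparable to the second numeric column above, and the function names are made up:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively sums the raw line count of every `.rs` file under `dir`.
fn rust_lines(dir: &Path) -> io::Result<usize> {
    let mut total = 0;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            total += rust_lines(&path)?;
        } else if path.extension().map_or(false, |ext| ext == "rs") {
            // Lossy read so a stray non-UTF-8 file does not abort the scan.
            total += String::from_utf8_lossy(&fs::read(&path)?).lines().count();
        }
    }
    Ok(total)
}

/// One (crate directory, line count) row per subdirectory, smallest first.
fn per_crate(vendor: &Path) -> io::Result<Vec<(String, usize)>> {
    let mut rows = Vec::new();
    for entry in fs::read_dir(vendor)? {
        let entry = entry?;
        let path = entry.path();
        if path.is_dir() {
            let name = entry.file_name().to_string_lossy().into_owned();
            rows.push((name, rust_lines(&path)?));
        }
    }
    rows.sort_by_key(|(_, lines)| *lines);
    Ok(rows)
}

fn main() {
    // Directory to scan; defaults to ./vendor as produced by `cargo vendor`.
    let vendor = std::env::args().nth(1).unwrap_or_else(|| "vendor".to_string());
    match per_crate(Path::new(&vendor)) {
        Ok(rows) => {
            for (name, lines) in rows {
                println!("{:<30} {:>8}", name, lines);
            }
        }
        Err(e) => eprintln!("cannot read {}: {}", vendor, e),
    }
}
```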
[0]: https://github.com/memorysafety/sudo-rs/blob/60985b2f5f7ffa8...
And this is what that small list of dependencies pulls in:

https://github.com/memorysafety/sudo-rs/blob/60985b2f5f7ffa8...

And most of those dependencies are only transitive via [dev-dependencies], which causes the entire Windows API to be pulled in.
Still doesn’t invalidate the fact that it’s a lot of surface area to hide a targeted attack.
If you're intending to install the package on an image, [dev-dependencies] are not going to be included in the package. So, no, it's not actually relevant to the surface area of the package.
It matters.

The great grandparent was talking about audit for supply chain attacks.

dev-dependencies can run code on a dev machine at compile time.

It's weird. You'd think Rust has been around long enough that they could finally take the step toward a mature implementation with a stable ABI and support for shared libraries.

It's OK as long as you're just prototyping the language and need to make changes fast. But I'd hope that we could one day actually build a real system in Rust.