Ask HN: Language design to mitigate supply chain attacks?
2 points by SebastianFish 1704 days ago
Question: Hacker News, if someone were designing a new programming language (or library system within an existing language) to mitigate supply chain attacks, what features would be most important to you?

Background: Open-source code has been an amazing boon for software productivity. However, I believe that most of the code that companies run isn't written by their own developers; it comes from various packages and frameworks written by outside individuals and groups. My main focus is writing data-intensive software, where malicious actors, or just poorly written code, could compromise the confidentiality and/or integrity of customer data via a popular package/dependency. Better sandboxing of relatively untrusted code could be a huge boon for applications that have lots of plugins (think Chrome or VS Code).

A few thoughts I have had on this front: limiting untrusted code's access to the system calls that control file and network access is critical. Also, there would be a need to ensure that dependencies can't "take over" a process by overwriting the call stack to run their own code or by spawning a new thread/process. Interested in your thoughts.
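To make the first idea concrete, here is a minimal sketch of capability passing: the dependency gets an explicit object granting read access to one directory instead of ambient access to the whole filesystem. The names `ReadDir`, `CapabilityError`, `untrusted_plugin`, and `demo` are hypothetical. Python cannot actually enforce this (the plugin could still call `open` directly), so it only shows the shape a language would need to make mandatory.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


class CapabilityError(PermissionError):
    """Raised when code asks for something it was not granted."""


@dataclass(frozen=True)
class ReadDir:
    """Hypothetical capability: read-only access to one directory tree."""
    root: Path

    def read_text(self, relative: str) -> str:
        root = self.root.resolve()
        target = (root / relative).resolve()
        try:
            # relative_to raises ValueError if target escapes the granted root.
            target.relative_to(root)
        except ValueError:
            raise CapabilityError(f"{relative!r} is outside the granted directory")
        return target.read_text()


def untrusted_plugin(files: ReadDir) -> int:
    """Stands in for a third-party dependency; it only receives `files`."""
    return len(files.read_text("input.txt").split())


def demo() -> int:
    """Grant the plugin a temporary directory and nothing else."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "input.txt").write_text("alpha beta gamma")
        return untrusted_plugin(ReadDir(Path(tmp)))
```

The point of the design is that authority is visible in the function signature: a reviewer can see that `untrusted_plugin` was handed a read-only directory and no network handle.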

2 comments

What makes you think a language can mitigate supply chain attacks?

For any arbitrary language you define, I can create a known-bad library and punch it into the download stream.

If you want to mitigate supply-chain attacks, you need to look at all of the following (at least!):

- library source

- file-signature

- signature key-verification

- static analysis of library functionality

- processor-dependent rogue behavior detection

- OS-dependent rogue behavior detection

This is not a language problem - this is a source and runtime problem

I think you have good points about the file signature and key verification. I was hoping that we could expand the conversation around static analysis of library functionality, as some languages offer features that simplify certain kinds of static analysis. For instance, Rust's borrowing semantics are a language feature that makes certain memory-usage attributes possible to verify statically.

From a run-time perspective, there are lots of instances where untrusted code has to be executed, and there are various sandboxing-related approaches there (running in a walled-off VM, for instance). From a deployment standpoint, that doesn't scale if you need to have an actual VMware instance running per package/dependency. My hope is that a language implemented over a virtual machine might be able to achieve similar levels of security with less overhead.

>a language implemented over a virtual machine might be able to achieve similar levels of security with less overhead

So ... anything running on the JVM (Java, Scala, etc), for example?

Can you prevent these types of attacks simply by pinning dependencies to known-safe versions?
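As a rough sketch of what pinning buys you, the check below compares a downloaded artifact against a SHA-256 digest recorded when the version was first vetted. The `PINNED` table, the artifact name, and `verify_artifact` are all made up for illustration; real lock files differ in format.

```python
import hashlib
import hmac
from typing import Mapping

# Hypothetical lock data: artifact name -> SHA-256 digest recorded at vetting time.
PINNED = {
    "leftpad-1.0.0.tar.gz": hashlib.sha256(b"known-good bytes").hexdigest(),
}


def verify_artifact(name: str, data: bytes, pinned: Mapping[str, str]) -> bool:
    """Return True only if `data` matches the digest pinned for `name`."""
    expected = pinned.get(name)
    if expected is None:
        return False  # unpinned artifacts are rejected outright
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected)
```

This catches an artifact that was swapped after you pinned it. It does nothing if the version you pinned was already malicious, so "known-safe" still has to come from somewhere else.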

Because you mention supply-chain attacks, I assume you're familiar with the Nix/Guix Linux distributions/package managers?

Go's dependency management is pretty sophisticated w.r.t. mitigating supply chain attacks.

I'm working on a post about it

https://verdverm.com/go-mods/

That's not a language thing, though - that's a dependency management thing :)

It's both. For example, parsing an import string, or deciding what makes a valid module name, is part of the language. These have restrictions that prevent Unicode glyph attacks.

The fact that imports require a domain name prefix prevents dependency confusion.
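A simplified sketch of the kind of check I mean, in Python for brevity. The rule set here is my own approximation (ASCII only, a restricted character set per path element, and a dot in the first element), not Go's actual specification, and `check_import_path` is a made-up name.

```python
import re
from typing import List

# Assumed rule: each path element is lowercase ASCII letters, digits, and a
# few punctuation characters. This is an illustration, not Go's real grammar.
_ELEMENT = re.compile(r"[a-z0-9][a-z0-9._~-]*")


def check_import_path(path: str) -> List[str]:
    """Return a list of problems found in an import path (empty if none)."""
    problems = []
    if not path.isascii():
        problems.append("non-ASCII character (possible look-alike glyph)")
    elements = path.split("/")
    # Requiring a dot in the first element forces a domain-style prefix.
    if "." not in elements[0]:
        problems.append("first element is not a domain name")
    for element in elements:
        if not _ELEMENT.fullmatch(element):
            problems.append(f"invalid path element: {element!r}")
    return problems
```

With these rules, `example.com/team/lib` passes, a bare `requests` is rejected for lacking a domain prefix, and a path whose first letter is a Cyrillic look-alike is rejected as non-ASCII.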

All the language (and really the interpreter/compiler) is doing in this example is making sure you haven't done something silly like try to assign a `char` to a `long double`.

That's barely addressing supply chain attacks.

Linguistically, an import of `bubbleglyph.myimport` is no different than `bubbleglypf.myimport` (except for one [valid] character) - nor would importing it from repository A be any different than repository B

You still have to rely on outside-the-language security checks to ensure you're getting:

1) what you want

2) from where you want

3) and that it's "correct"/"safe" to use
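One example of such an outside-the-language check, sketched in Python: flag any requested package whose name is within one edit of a name on an allowlist, which would catch `bubbleglypf` when you meant `bubbleglyph`. The allowlist and the `near_misses` helper are hypothetical, and the distance threshold is an arbitrary choice.

```python
from typing import Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the classic two-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute ca with cb
            ))
        previous = current
    return previous[-1]


def near_misses(name: str, allowlist: Iterable[str], max_distance: int = 1) -> List[str]:
    """Return allowlisted names that `name` is suspiciously close to."""
    known_names = list(allowlist)
    if name in known_names:
        return []  # exact match: nothing suspicious about the name itself
    return [known for known in known_names
            if edit_distance(name, known) <= max_distance]
```

This only covers point 1 (getting what you want); where it came from and whether it is safe to use still need separate checks.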