by pulse7 1076 days ago
Here are the language-specific ones:

1. CWE-787 Out-of-bounds Write: C, C++, Assembly

4. CWE-416 Use After Free: C, C++

7. CWE-125 Out-of-bounds Read: C, C++

10. CWE-434 Unrestricted Upload of File with Dangerous Type: ASP.NET, PHP, Class: Not Language-Specific

12. CWE-476 NULL Pointer Dereference: C, C++, Java, C#, Go

15. CWE-502 Deserialization of Untrusted Data: Java, Ruby, PHP, Python, JavaScript

17. CWE-119 Improper Restriction of Operations within the Bounds of a Memory Buffer: C, C++, Assembly

21. CWE-362 Concurrent Execution using Shared Resource with Improper Synchronization ('Race Condition'): C, C++, Java

23. CWE-94 Improper Control of Generation of Code ('Code Injection'): Interpreted


>12. Null pointer deref.

In Java you'll get an exception, while in C you might disappear your cat. Those two are quite incomparable when talking about the "dangerousness" of a mistake.

And C# is making references non-nullable by default.

Kind of. It doesn't work that well with existing libraries, and because of that, even when you enable it, it is only a warning.

> 15. CWE-502 Deserialization of Untrusted Data: Java, Ruby, PHP, Python, JavaScript

> 21. CWE-362 Concurrent Execution using Shared Resource with Improper Synchronization ('Race Condition'): C, C++, Java

Why those languages specifically? I would say these two issues apply to all languages.

So, the memory related ones are in position 1, 4, 7, 12, 17, and 21.

I understand memory safety is important, but still: only one on the podium (though it is first), only 3 in the top 10… clearly security is about much more than memory safety.

Of course. Security is the exercise of making programs not do things. Since the very beginning of computer science we've understood that programs want to be able to do anything at a very fundamental level. We'll never solve security completely.

But it is embarrassing that we've been living with memory safety issues for 50 years and they still remain very common and very severe, despite being addressable via type systems in ways that something like a logical bug that leads to data leakage isn't.

This isn't about choosing security measures from a menu. This is about the foundations of what you build.

To the extent that memory safety is slowly, oh so slowly, but steadily dropping down the list, it is because we are taking it seriously as a foundational issue and actually addressing it. To turn around and then use the success we've had as evidence that it isn't important is making a serious error.

There is no reason to use a memory unsafe language anymore, except legacy codebases, and that is also slowly but surely diminishing. I'm still yet to hear this amazingly compelling reason that you just need memory unsafe languages. In terms of cost/benefits analysis, memory unsafety is literally all costs. Even if you do have one of the rare cases when you need it, and you only need a very particular variant of it (reading bytes in memory of one type as bytes of another type, you never need to write out of the scope of an array or dereference into an unallocated memory page), you can still get it through explicit unsafe support that every language has one way or another. You do not need a language that is pervasively unsafe with every line you write so that on those three lines of code out of millions that you actually need it, you can have it with slightly less ceremony. That's just a mind-blowingly bad tradeoff and engineering decision.

How are we supposed to address the other issues from a foundation of a memory unsafe language? If we can't even have such a basic guarantee, we sure aren't going to get more complicated ones later.
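
To make the "explicit unsafe support" point concrete, here is a minimal Rust sketch of the one variant described above: reading the bytes of one type as another type. The helper name `float_bits` is hypothetical and only for illustration; the standard library already offers the same operation safely as `f32::to_bits`.

```rust
/// Reinterpret the four bytes of an `f32` as a `u32`.
/// Hypothetical helper for illustration; `f32::to_bits` is the safe
/// standard-library equivalent.
pub fn float_bits(x: f32) -> u32 {
    // SAFETY: `f32` and `u32` have the same size (4 bytes) and every bit
    // pattern is a valid `u32`, so this reinterpretation is sound.
    unsafe { std::mem::transmute::<f32, u32>(x) }
}
```

The unsafe part is one expression with a stated justification; nothing else in the program gives up its checks to make it possible.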

Safety is nice, but it often costs significant performance. So what you are saying is… if I need performance my only choice is Rust?

Don’t get me wrong, I absolutely hate the insanity of Undefined Behaviour™ in C and C++ (my pet peeve being signed integer overflow), and I’m totally behind systematic bounds checking (which with compile time support tends to lie between free and cheap). I’m less sold on ensuring the safety of memory shared between threads because I tend to prefer message passing, and I’m not sure how to best address use-after-frees: using the general allocator for each and every object is often even more wasteful than just using a GC, so RAII based schemes aren’t quite enough. I have yet to really test Rust’s borrow checker however.

One thing I have noted, is that C, despite its expressive weakness and its unsafe insanity, remains pretty capable at some niches. Low-level cryptographic code for instance is hardly affected by its flaws (having no heap allocation and constant time code helps a ton).
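
On the signed overflow and bounds checking points above, a rough Rust sketch of what "defined behaviour" can look like. The function names are made up for illustration. Overflow is either reported or explicitly wrapped, and iterating over a slice involves no index, so there is no index to check.

```rust
/// Addition that reports overflow instead of invoking undefined behaviour.
pub fn add_checked(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

/// Addition with explicit two's-complement wraparound.
pub fn add_wrapping(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

/// Summing via an iterator: no indexing, so no per-element bounds check.
/// Each value is widened to `i64` before adding.
pub fn sum_iter(values: &[i32]) -> i64 {
    values.iter().map(|&v| i64::from(v)).sum()
}
```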

"I’m less sold on ensuring the safety of memory shared between threads because I tend to prefer message passing"

Memory safety, at least to my eyes, has not traditionally encompassed that as a requirement. I don't consider this a solved problem, in that it has a lot of solutions and consensus about them is still developing. (e.g., I still expect async as it has been implemented in Node & Rust to eventually be considered a gigantic mistake but clearly that is not an uncontroversial opinion in 2023; check in with me in 2033 or 2043). So I'd advise trying to use one of the better solutions but I'm not quite to "there's no reason to not use one of these things".

So my passion is mostly about out-of-bounds access and use-after-free. If it costs you performance... take the hit. It's not a lot. And if you do need unsafe approaches, they are almost always some tight loop somewhere or something where you can selectively take the gloves off and drop down to assembler or something. You don't need your entire language to be unsafe just so you don't have to wrap "unsafe { }" around your tight inner loop.
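
A sketch of what that looks like in Rust, assuming a hypothetical `dot` function as the hot loop. The bound is established once, outside the loop, and only the two element reads skip their checks.

```rust
/// Dot product over the common prefix of two slices.
/// Hypothetical example of confining `unsafe` to one inner loop.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    let n = a.len().min(b.len());
    let mut acc = 0.0f32;
    for i in 0..n {
        // SAFETY: i < n, and n <= a.len() and n <= b.len(),
        // so both unchecked reads are in bounds.
        acc += unsafe { *a.get_unchecked(i) * *b.get_unchecked(i) };
    }
    acc
}
```

Whether the unchecked version is measurably faster than plain indexing or `zip` depends on the compiler and the loop, so it is worth benchmarking before reaching for it.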

> So my passion is mostly about out-of-bounds access and use-after-free.

Yeah, those are the big ones indeed, and I am willing to take a performance hit to get there. If that’s the only hit I take I’ll still be much better than paying an Electron tax.

I do however still feel some discomfort about use-after-free, because to be honest I just don’t know enough about the relevant use cases, compilation techniques, and runtime checks. So far my only relevant experiences have been GC, RAII, and stack-only. They all solve my problem (or at least I can see how I could write a compiler that would solve each use case for me). But I know those aren’t the only use cases, and I’m not familiar enough with the other allocation patterns (pool, arena…) to have a relevant opinion.

But perhaps I’m just stressing over nothing? The problem is easily stated after all: no object should be accessed after its backing memory has been freed. One way to do that is to make sure the object (and any reference to it) goes out of scope before the backing storage is freed. Which sounds doable enough if the backing storage itself follows a stack discipline…

Hey, I can glimpse here a way to allow allocations and statically guarantee a limit on memory usage (barring input-dependent allocation amounts). Perhaps even avoiding fragmentation, which would be terrific for embedded use cases.
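
One way to sketch that idea in Rust, with the caveat that `Pool` is a made-up type and the limit here is enforced at allocation time rather than fully statically. Backing storage is reserved once, and every reference handed out borrows from the pool, so the compiler rejects any use of an object after the pool is gone.

```rust
/// Hypothetical fixed-capacity pool: storage is reserved up front and
/// never grows, so memory use is bounded by `capacity` elements.
pub struct Pool<T> {
    slots: Vec<T>,
    capacity: usize,
}

impl<T> Pool<T> {
    pub fn with_capacity(capacity: usize) -> Self {
        Pool { slots: Vec::with_capacity(capacity), capacity }
    }

    /// Store a value and return its handle, or `None` when the pool is full.
    pub fn alloc(&mut self, value: T) -> Option<usize> {
        if self.slots.len() == self.capacity {
            return None;
        }
        self.slots.push(value);
        Some(self.slots.len() - 1)
    }

    /// The returned reference borrows `self`, so it cannot outlive the pool.
    pub fn get(&self, handle: usize) -> Option<&T> {
        self.slots.get(handle)
    }

    pub fn used(&self) -> usize {
        self.slots.len()
    }
}

// Rejected at compile time (use after the backing storage is freed):
//     let r = pool.get(0);
//     drop(pool);
//     println!("{:?}", r);
```

Everything is freed together when the pool goes out of scope, which is the stack discipline described above and also sidesteps fragmentation inside the pool.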

> There is no reason to use a memory unsafe language anymore, except legacy codebases, and that is also slowly but surely diminishing. I'm still yet to hear this amazingly compelling reason that you just need memory unsafe languages. In terms of cost/benefits analysis, memory unsafety is literally all costs.

Tell that to the authors of new memory unsafe languages (like Zig) and creators of new projects in those languages (like https://tigerbeetle.com) :(

I do tell them that. I see no reason to be memory unsafe.

It is a huge uphill battle to become a new general purpose language, and the smallest thing can kill it. The fact that Zig is memory unsafe means that I, who am not a bleeding-edge adopter, but am an early adopter and in a position to make decisions about what is used at work, have disqualified it and lost all interest. I have no use for such a language for greenfield projects. Simply offering a more convenient onramp to the sorts of problems that C has is not a compelling value proposition for me.

I extremely strongly suspect I am not even remotely alone.

People have language blinders on. It's not like if you only focus on the ones that affect your language specifically, suddenly you're secure. There's still another 16 bug classes to worry about.

If you don't think about the other classes, I'm still gonna escalate privileges, root your box, ransom your data, send spam, charge a half million dollars in cloud spend to your account, steal your customers' PII/PHI, etc etc etc. Without ever using a language specific exploit.

Yes, but such neglect of other bug classes suggests that those developers aren't focusing on security anyways. For those who do want reasonable security, using a memory-safe language suddenly makes the most pervasive errors go away, and then it's easier to focus on building robust applications.

PHP is uniquely vulnerable to things like XSS and others on that list, because it does not escape strings that are used in templating.

Escaping by default has become a standard practice with HTML templating languages, see the Go html template standard library for a very detailed breakdown of what is escaped where.
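
For illustration, a minimal sketch of escape-by-default, written in Rust. `escape_html` and `render_greeting` are hypothetical names, and this only handles HTML text and quoted attribute values. It is not the context-aware escaping that Go's html/template performs for scripts, URLs, and styles.

```rust
/// Escape the characters that are significant in HTML text and
/// quoted attribute values.
pub fn escape_html(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#39;"),
            _ => out.push(c),
        }
    }
    out
}

/// The template escapes its argument itself, so callers cannot forget to.
pub fn render_greeting(name: &str) -> String {
    format!("<p>Hello, {}!</p>", escape_html(name))
}
```

The design point is that the safe path is the default one: unescaped output would need a separate, deliberately named function.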

More modern PHP frameworks like Laravel provide their own templating solution in part because of this. But the vast majority of websites run on default PHP templates, so it's not surprising that these kinds of vulnerabilities are so high up in the list.

Laravel has had their own share of XSS issues with their Blade templating engine.

The whole problem is that you mix code and data, and that third party resource loading is 'on' by default in browsers, especially for scripts and things that can embed scripts. This is not something you can fix once and for all at the library level.

Isn’t #17 the same as #1 and #7 combined?