by waihtis 1073 days ago
The gap between knowing what a CWE is and actually knowing, on code level, how it manifests and how you avoid these things is very large. Given how much the software industry has grown in the past 10 years it's not particularly surprising.
3 comments

> and actually knowing, on code level, how it manifests and how you avoid these things

You avoid them by using tools that make it difficult or impossible to introduce such vulnerabilities to begin with. Such as modern, memory safe programming languages.

For many decades, carpenters have been educated about table saw safety. But what finally stopped thousands of fingers getting chopped off every year was the introduction of the SawStop, and similar technologies.

Safety is a matter of using the right tools, not of "taking better care".

> For many decades, carpenters have been educated about table saw safety. But what finally stopped thousands of fingers getting chopped off every year was the introduction of the SawStop, and similar technologies.

Afaik the technology isn’t widespread and there are still 10s of thousands of injuries per year.

You mean technology like bounds checking, invented in the 1950s with the creation of Fortran, Lisp and Algol, and every other language derived from them, with the exception of C, C++ and Objective-C?
And why did the whole world write so much code in C, C++ and Objective-C when bounds checking existed long before these languages, which lack it?
It started like this,

"Although we entertained occasional thoughts about implementing one of the major languages of the time like Fortran, PL/I, or Algol 68, such a project seemed hopelessly large for our resources: much simpler and smaller tools were called for. All these languages influenced our work, but it was more fun to do things on our own."

-- https://www.bell-labs.com/usr/dmr/www/chist.html

Then source tapes with an almost symbolic license price for its time, and a commentary book did the rest.

With bounds checking, an out-of-range index still triggers an exception or runtime error. Many of these result in DoS.
Much better than silent data corruption.
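To make the trade-off above concrete, here is a minimal sketch in Python, where every list access is bounds-checked. The helper names `read_slot` and `try_read` are made up for illustration. The out-of-range read fails loudly with an exception (which, if unhandled, could take a service down) instead of silently reading or overwriting adjacent memory.

```python
def read_slot(buffer: list, index: int) -> int:
    """Return buffer[index]; the runtime checks the index on every access."""
    return buffer[index]


def try_read(buffer: list, index: int) -> str:
    """Attempt a read and report whether the runtime rejected it."""
    try:
        return f"ok: {read_slot(buffer, index)}"
    except IndexError as exc:
        # The failure is explicit: no neighbouring memory is read or corrupted.
        return f"rejected: {exc}"


buffer = [10, 20, 30]
print(try_read(buffer, 1))  # in range
print(try_read(buffer, 3))  # one past the end: raises IndexError internally
```

Note that Python treats negative indices as counting from the end, so this only illustrates the past-the-end case.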

Then there is the whole issue of making it more interesting to look elsewhere instead.

When a door is locked I can still break in by throwing a rock to the window, yet most people do lock the door nonetheless, while most thieves only bother to break the window if there is anything actually valuable in doing so.

Yeah at least in the US, it looks like tablesaw accidents that put people in the ER are about as common as they were 15 years ago. I have a buddy who just lost 6 months work because of a tablesaw accident.
Also SawStop doesn't prevent kickback, one of the other major sources of injury from a table saw.
Wow! SawStop is incredible tech. The blade stops within 5ms. That's insane.
Those two things aren't mutually exclusive. I'll bet a non-trivial number of XSS and SQL injection vulnerabilities came from people disabling input and output sanitation on solid frameworks and libraries because they didn't know why they shouldn't. Tools won't solve all of your problems-- you need knowledge, diligence, and tools that make doing the right thing easy.
> I'll bet a non-trivial number of XSS and SQL injection vulnerabilities came from people disabling input and output sanitation on solid frameworks and libraries because they didn't know why they shouldn't.

I will take this bet.

Searching Google for disabled sanitation "vulnerability", the first two hits are articles admonishing developers to not do it, and the third is a CVE, CVE-2023-1159, from a month ago that affects WordPress installations on which the developer disabled unfiltered_html, which is its built-in sanitation functionality.
Memory safety won't stop you writing SQL queries or dynamically generating HTML that accepts unsanitised user input.
You're right. Those things are stopped by other tools, such as query builders and web frameworks.
>Those things are stopped by other tools, such as query builders and web frameworks.

No. All tools can be used with an improper attitude which leads to the creation of weak points.

The proper way is to have a deep understanding of the role of design rules.

A programmer who does not pay attention to design (very basic principles of the design process) can create a good game, and even if this game contains weaknesses the risk related isn't a reason to not use it. The same programmer when creating critical infrastructure software is a source of potential nightmare.

Unfortunately, the software business accepts such specialists for projects of both kinds. Why? Who knows? Perhaps because of legal regulations? Why when an engineer designs a car they don't try to "Move fast and break things"?

> Why when an engineer designs a car they don't try to "Move fast and break things"

They do when they design submersibles or rockets

C is the table saw of programming. C++ is the band saw.
I've used both saws and both programming languages. I still don't know which is worse.
XSS and SQLi can happen independently of the memory safety of your chosen programming language. You can use relatively safe frameworks or ORMs to generate HTML and interact with your DB, but there will sometimes be complex use cases that require you to extend or otherwise not use those safeguards.

Similarly, I imagine that there are cases where someone needs to do complex woodworking tasks that involve dangers which are less obvious than with a table saw.

I agree 100%, but in reality most people work with the language they are presented with.
XSS is a great example of that. On paper a ton of people know exactly what XSS is and does. In practice... simply don't allow user-controlled input to be emitted unescaped, ever. Good luck!
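For reference, the "on paper" rule looks something like the sketch below, using Python's standard `html.escape`. The `render_comment` function is a hypothetical stand-in for a template layer, and this only covers the HTML-body context; attributes, URLs and inline scripts need their own escaping rules, which is part of why the rule is harder in practice than it sounds.

```python
import html


def render_comment(user_text: str) -> str:
    """Wrap user-controlled text in a paragraph, escaping it first."""
    # html.escape replaces &, <, > and (by default) both quote characters.
    return f"<p>{html.escape(user_text)}</p>"


print(render_comment("<script>alert(1)</script>"))
# prints: <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```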

The reason XSS (and CORS) are tricky is because they fundamentally don't work in a world where a website may be spread over a couple different domains. I get a taste of this in my dayjob where we have to manage cookie scoping across a couple different region domains and have several different subdomains for different cookie behaviors. It's easy to be clean on paper up until you need to interface with some piece of software that insists on doing it its own way - for example the Azure excel embedded functionality requires the ID token to be passed in the request body, meaning you have to pull in the request body and parse it in your gateway layer (or delegate that to a microservice)... potentially with multi-GB files being sent in the body as well!

It's super easy on paper to start from greenfield and design something that is sane and clean, bing boom so simple. But once you acquire a couple of these fixed requirements, the cleanliness of the system degrades quite a bit, because that domain uses a format that's not shared by anything else in the system, and it's a bad one, and we can't do anything about it, and now that's a whole separate identity token that has to be managed in parallel.

Anyway, you could say that buffer overflow or use-after-free are kind of an impedance mismatch for memory management/ownership in C. Well, XSS and CORS are an impedance mismatch for domain-based scoping models in a REST-based world. Obviously the correct answer is to simply not write vulnerable systems, but is domain-based scoping making that easier or harder?

Great examples. 1) You have to deal with your own complex systems where it becomes difficult and 2) you have to deal with external complex systems which enforce bad practice on you. One can see how it becomes borderline impossible not to slip once in a while.
Two of those four are things there's no need to make easy to do by mistake, but two popular programming languages choose to do so anyway and they reap the consequences.

Actually the SQL one is arguably in that category too, to a lesser extent. Libraries could, and should, make it obvious how to do parametrized SQL queries in your language. I would guess that for every extra minute of their day a programmer in your language must spend to get the parametrized version to work over just lazy string mangling, you're significantly adding to the resulting vulnerability count because some of them won't bother.

Bonus points if your example code, which people will copy-paste, just uses a fixed query string because it was only an example and surely they'll change that.
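As an illustration of how small the gap between the two can be when the library makes it easy, here is a sketch using Python's built-in `sqlite3` module. The `users` table and the helper names `find_unsafe` and `find_safe` are invented for the example; the only difference between them is string mangling versus a `?` placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "user")],
)


def find_unsafe(conn, name):
    """Lazy string mangling: the input becomes part of the SQL text."""
    query = f"SELECT name, role FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_safe(conn, name):
    """Parametrized query: the input is passed as data, never parsed as SQL."""
    query = "SELECT name, role FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


payload = "x' OR '1'='1"
print(len(find_unsafe(conn, payload)))  # 2: the injected OR clause matches every row
print(len(find_safe(conn, payload)))    # 0: no user has that literal name
```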

I feel there would be some value in SQL client libraries that just flat out ban all literals.

I know it's the nuclear option, but decades of experience have shown that the wider industry just cannot be trusted. People won't ever change[1], so the tools must change to account for that.

[1] Unfortunately, LLMs learned from people... so... sigh.
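A very rough sketch of what such a "no literals" client could look like, as a thin wrapper over Python's `sqlite3`. Everything here is hypothetical: `execute_no_literals` and `LiteralInQueryError` are made-up names, and the check is a crude regex rather than a real SQL parser, so it will both miss cases and reject some legitimate queries (for example `LIMIT 10` or numbered placeholders like `?1`).

```python
import re
import sqlite3

# Crude lexical check (an assumption, not a real SQL tokenizer):
# any single quote, or any digit that starts a token, counts as a literal.
_LITERAL = re.compile(r"'|\b\d")


class LiteralInQueryError(ValueError):
    """Raised when the SQL text appears to contain a literal value."""


def execute_no_literals(conn, sql, params=()):
    """Run sql only if every value is supplied through placeholders."""
    if _LITERAL.search(sql):
        raise LiteralInQueryError("literals are banned; pass values as parameters")
    return conn.execute(sql, params)


conn = sqlite3.connect(":memory:")
execute_no_literals(conn, "CREATE TABLE users (name TEXT)")
execute_no_literals(conn, "INSERT INTO users VALUES (?)", ("bob",))

print(execute_no_literals(conn, "SELECT name FROM users WHERE name = ?", ("bob",)).fetchall())

try:
    execute_no_literals(conn, "SELECT name FROM users WHERE name = 'bob'")
except LiteralInQueryError as exc:
    print(f"rejected: {exc}")
```

The point of the design is that the lazy path (pasting a value into the string) fails immediately, so the parametrized path is the only one that works at all.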