by nixos 3493 days ago
The problem is that software engineering is hard.

Immensely so.

On a scale of engineering "hardness" (meaning we can predict all side effects of an action), software engineering is closer to medicine than to, say, civil engineering.

We know stresses, materials, and how they interact. We can predict what will happen, and how to avoid edge cases.

Software? Is there any commonly used secure software? Forget about Windows and Linux. What about OpenBSD?

Did it ever have a security hole?

And that's just the OS. What about software?

There are just too many variables.

So what will happen?

There will be "best practices" enshrined in law. Most will be security theater. Most will remove our rights, and most will actually make things less safe.

Right now, the number one problem of IoT security is fragmentation. Samsung puts out an S6, three years later stops updating it, a hole is found, too bad. Game over.

The problem is that "locking firmware" is common "security theater": if there is ever a legal security requirement for IoT, it will require a locked bootloader and firmware.

And you can't make a requirement to "keep code secure", because then the question becomes: for how long? Five years? Ten years?

6 comments

> On a scale of engineering "hardness" (meaning we can predict all side effects of an action), software engineering is closer to medicine than to, say, civil engineering.

This level of hubris is pretty revolting. Software engineering is easy. Writing secure software is easy. The difference between civil engineering or medicine and software engineering is that practitioners of the former are held responsible for their work, and software engineers are not and never have been.

Nothing will improve until there are consequences for failure. It's that simple.

It's not hubris. Software really is hard - that's why it looks more like voodoo than a respectable engineering discipline. It has too many degrees of freedom; most programmers are only aware of a tiny subspace of the states their program can be in.

I agree that lack of consequences is a big part of the problem. But this only hints at a solution strategy; it doesn't describe the problem itself. The problem is that software is so internally complex that it's beyond the comprehension of a human mind. To ultimately solve it and turn programming into a profession[0], we'd need to rein in the complexity - and that would involve actually developing detailed "industry best practices"[1] and sticking to them. This would require seriously dumbing down the whole discipline.

--

[0] - which I'm not sure I want; I like that I can do whatever the fuck I want with my general-purpose computer, and I would hate it if my children couldn't play with a Turing-complete language before they graduate with an engineering degree.

[1] - which we basically don't have now.

> Software really is hard - that's why it looks more like voodoo than a respectable engineering discipline. It has too many degrees of freedom;

No, sorry, software does not inherently have more degrees of freedom than e.g. building a bridge has. The reason other engineering fields are perceived as "limiting" is exactly because they have standards: they have models of what works and what doesn't, and liability for failing to adhere to those standards.

I would argue that the lack of standards is exactly what makes software engineering look like voodoo -- but it is because of immaturity of the field, it's not an inherent property. Part of the reason software is so complex is exactly because engineers afford themselves too many degrees of freedom.

And I disagree that establishing standards constitutes a dumbing down of the discipline, in fact the opposite: software engineering isn't, exactly because every nitwit can write their own shoddy software and sell it, mostly without repercussions. That lack of accountability is part of what keeps software immature and dumbs down the profession. As an example, compare Microsoft's API documentation with Intel's x86 Reference Manual: one of the two is concise, complete, and has published errata. The other isn't of professional quality.

I push engineering methods for software. It really is hard for systems of significant complexity. Just a 32-bit adder takes 4 billion tests to know it will always work. The kind of formal methods that can show heap safety took a few decades to develop. They just did an OS kernel and basic app a few years ago. Each project took significant resources for developing the methods and then applying them. Many failed where the new methods could handle some things but not others. Hardware is a cautionary tale: it has fewer states plus an automatable logic, and they still have errata in CPUs despite tons of verification.
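
For a sense of scale, here is a rough sketch of brute-force adder testing in Python. It is only an illustration: the function names are made up for this example and it is not taken from any real verification flow. It models a ripple-carry adder bit by bit and checks every input pair at a small width. Sweeping a single 32-bit operand is already about 4 billion (2^32) cases; covering every pair of 32-bit operands is 2^64.

```python
def ripple_carry_add(a: int, b: int, width: int) -> int:
    """Add two unsigned integers bit by bit, as a ripple-carry adder would."""
    result = 0
    carry = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        s = x ^ y ^ carry                    # sum bit of a full adder
        carry = (x & y) | (carry & (x ^ y))  # carry-out of a full adder
        result |= s << i
    return result  # final carry is dropped: addition modulo 2**width


def exhaustive_check(width: int) -> int:
    """Check every (a, b) pair against a reference; return the number of cases."""
    mask = (1 << width) - 1
    cases = 0
    for a in range(1 << width):
        for b in range(1 << width):
            assert ripple_carry_add(a, b, width) == (a + b) & mask
            cases += 1
    return cases


def input_pairs(width: int) -> int:
    """Number of distinct (a, b) input pairs for a two-operand adder."""
    return (1 << width) ** 2
```

`exhaustive_check(8)` finishes quickly because it only has 65,536 pairs to try. The same loop at width 32 is out of reach, which is why people reach for formal methods instead of exhaustive testing.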

So, it's definitely not easy. The people that pull it off are usually quite bright, well paid, have at least one specialist, and are given time to complete the task. The introduction of regulations might make this a baseline with lots of reusable solutions. We'd lose a lot of functionality that's too complex for full verification, and we'd get slower development and slower equipment, though. The market would fight that.

Agreed, I never meant to imply that it was easy. I just meant that a "professional" software engineering discipline is neither a pipe dream, nor undesirable.
> Nothing will improve until there are consequences for failure. It's that simple.

Of course it's not that simple. Clearly you've never written much, if any, real software.

You want to make an SSL connection to another web site in your backend. You use a library. If that library is found to contain a vulnerability that allows your site to be used in a DDoS, where do the "consequences for failure" lie? You used a library.

Do you think people will write free libraries if the "consequences" fall back on them? If not, have you even the slightest understanding of how much less secure, less interoperable and more expensive things will be if every developer needs to implement every line themselves to cover their backs? Say goodbye to anyone except MegaCorps being able to write any software.

Where does this end? Would we need to each write our own OSes to cover ourselves against these "consequences", our own languages?

The same could be said for any industry.

Anyone can practise carpentry, but if someone is going to do so professionally and build structures that can cause injury or damage if they fail, then they should be accountable for the consequences. This is why indemnity insurance exists.

In software, a lack of rigour is fine for toy applications, but when livelihoods and safety become involved, we need to be mindful of the consequences and prepared to take responsibility, just like everyone else in society is expected to do.

The problem is identifying potential risks. It's obvious if I build a building it might fall down. It's not obvious if you sell web cams they might be used to take part in massive DDoS attacks.
Well now it is obvious, and honestly it has been so for a while. The reason we have shitty security is not because the risks are unknown.
Here are some risks:

1. Your system might be hacked if connected to a hostile network. Avoid that by default.

2. If connected, use a VPN and/or deterministic protocols for the connections. Include ability to update these. No insecure protocols listening by default. Sane configuration.

3. Certain languages or tools allow easy code injection. Avoid them where possible.

4. Hackers like to rootkit the firmware, OS, or application to maintain persistence. Use an architecture that prevents that or just boot from ROM w/ signed firmware if you can't.

5. DDoS detection, rate-limiting, and/or shutdown at ISP level. Penalties for customers that let it happen too often, like how insurance does with wrecks.

That's not a big list even though it covers quite a lot of hacks. I'm with the other commenter thinking all the unknowns may not be causing our current problems.
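
To show that item 5 is not exotic, here is a minimal token-bucket rate limiter in Python. It is a sketch under assumptions: the class name, the parameters, and the idea of passing the clock in explicitly are all choices made for this example, and a real ISP-level limiter would live in network gear, not in a script.

```python
import time
from typing import Optional


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens/second."""

    def __init__(self, rate: float, capacity: float, now: Optional[float] = None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.last = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True if one request may pass at time `now` (in seconds)."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        self.last = now
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With `TokenBucket(rate=1.0, capacity=3)`, a device can send three requests back to back and is then held to one per second, which is the kind of ceiling that blunts a compromised webcam.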

> You use a library.

On what basis did you choose that library? Did robustness of the software come into your evaluation? Did you request a sample from the supplier, and perform stress testing on it? Did you check for certifications/audits of the code you were including in your project?

> If that library is found to contain a vulnerability that allows your site to be used in a DDoS, where do the "consequences for failure" lie?

With you, unless you have a contract with your supplier stating otherwise.

> On what basis did you choose that library? Did robustness of the software come into your evaluation? Did you request a sample from the supplier, and perform stress testing on it? Did you check for certifications/audits of the code you were including in your project?

Even if you did everything on this list, you could still get a library that has a potential bug, because software is just that complex. Microsoft puts millions of dollars into security and it still has regular vulnerabilities discovered.

And even if you implement a rigorous code audit, that means you can't update, because you have to go through the same audit rigmarole each time a bug is found. By the time you audit your software, a new vulnerability will probably have been discovered.

Not to mention this essentially makes open source software nonviable.

There's a finite number of error classes that lead to the code injection that causes our biggest problems. Some languages prevent them by default, some tools prove their absence, some methods react when they happen, and some OS strategies contain the damage. There are also CPU designs for each of these. Under regulations, companies can just use stuff like that to vastly simplify their production and maintenance of software with stronger security.
I disagree that there is a finite number of error classes that lead to attackers disrupting your software/hardware. Code injection is just one of many possible ways to gain control of your computer.
Writing secure software is far from easy. It's super, super hard. The fact that you are saying this makes me wonder if you have ever attempted to write secure software.
Do you write code?
Regarding secure software, there are at least some efforts to make writing formally verified software more approachable.

The seL4 project has produced a formally verified microkernel, open sourced along with end-to-end proofs of correctness [0].

On the web front, Project Everest [1] is attempting to produce a full, verified HTTPS stack. The miTLS sub-project has made good headway in providing development and reference implementations of 'safe' TLS [2].

These are only a few projects, but imo they're a huge step in the right direction for producing software solutions that have a higher level of engineering rigor.

[0] https://wiki.sel4.systems/FrequentlyAskedQuestions

[1] https://project-everest.github.io

[2] n.b. I'm not crypto-savvy, so I can't comment on what is or isn't 'safe' as any more than an interested layperson.

I don't really think the main problem is that software engineering in general is hard. I think the problem we're facing right now is that writing secure software using the tools we have available now isn't realistically feasible.

We need to ruthlessly eradicate undefined behavior at all levels of our software stacks. That means we need new operating systems. We need new programming languages. We need well-thought-out programming models for concurrency that don't allow the programmer to introduce race conditions accidentally. We need carefully designed APIs that are hard or impossible to mis-use.
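
As a small illustration of the concurrency point, here is a Python sketch of a model where only one thread owns the state and everyone else sends it messages. The function name and structure are made up for this example. Note the caveat: Python does not enforce this discipline, so the sketch shows the programming model, not a language-level guarantee.

```python
import queue
import threading


def run_counter(num_workers: int, increments_per_worker: int) -> int:
    """Count increments from many threads without any shared mutable state."""
    inbox = queue.Queue()   # messages to the owner thread
    result = queue.Queue()  # carries the final total back to the caller

    def owner() -> None:
        total = 0                  # only this thread ever touches `total`
        while True:
            msg = inbox.get()
            if msg is None:        # sentinel: no more messages will arrive
                break
            total += msg
        result.put(total)

    def worker() -> None:
        for _ in range(increments_per_worker):
            inbox.put(1)           # send a message instead of mutating shared data

    owner_thread = threading.Thread(target=owner)
    owner_thread.start()
    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    inbox.put(None)                # all workers are done; tell the owner to stop
    owner_thread.join()
    return result.get()
```

Because no worker ever reads or writes `total`, there is no lock to forget. The goal described above is a language or API where the unsafe version cannot be written at all.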

Rust is promising. It's not the final word when it comes to safety, but it's a good start.

An interesting thought experiment is what would we have left if we threw out all the C and C++ code and tried to build a usable system without those languages? For me, it's hard to imagine. It eliminates most of the tools I use every day. Maybe those aren't all security critical and don't all need to be re-written, but many of them do if we want our systems to be trustworthy and secure. That's a huge undertaking, and there's not a lot of money in that kind of work so I don't know how it's going to get done.

Can we remove undefined features? We can get rid of the GCC optimizations which rely on the premise of undefined behavior to break code to win a speed prize or something, but undefined behavior exists for a reason:

It depends on the CPU.

The problem is that C was designed to be as close as possible to hardware, and in some places (RTOS? Kernel?) speed is critical.

We can abstract the CPU away. However, undefined behavior is just the tip of the iceberg. You can fix it all you want, but we'll be stuck with logic bugs, side channel attacks, info leaks, bad permissions & misconfigured servers, poor passwords, outdated & broken crypto schemes, poor access control schemes and policies, human error or negligence, etcetera.

There is a huge number of ways security can go haywire even with perfectly defined behavior. Make no mistake, I love watching as unsafe behavior is slowly getting fixed, but I think language nerds are too fixated on the UB to see that it's not the big deal and won't get rid of our problems.

Another problem language nerds miss is that we can adapt existing code and tools (in "unsafe" languages) to weed out problems with undefined behavior. It's just that people aren't interested enough for it to be mainstream practice. Yet the bar is much lower than asking everybody to rewrite everything in a whole new programming language. So why do they keep proposing that a new programming language is going to be the solution? And if people just don't care about security, well, we would have all the "defined behavior" security flaws in the new code written in the new shiny programming language.

I don't think that better languages will fix all the security problems. (One can, after all, create a CPU simulator to execute compiled C programs in any reasonably powerful "safe" language.) I just think that C and C++ are specifically unsuitable for building secure systems, and we won't make much meaningful progress as long as we're dependent on enormously complex software written in languages that don't at least have some degree of memory safety as a basic feature.
This is only partially right. Software engineering is hard. But trust is harder. Much much harder. And most things you have to trust people with just don't matter.

However, in the future where software can do everything, there is no such thing as "limited trust." If you trust someone to operate on your car, you are trusting them with everything the car interacts with. Which... quickly explodes to everything.

software itself isn't intractable, it's that the field is young, and we are stuck with choices made when nothing was understood, and it's gonna take a while to turn the ship. but i think we have a pretty good idea of where we are trying to go wrt writing secure software.
> it's that the field is young

The opposite. When the field was in its infancy, one was able to keep whole stacks in his head.

How complicated were CPUs in the 1960s?

How many lines of assembler were in the LM?

How many lines is the Linux or FreeBSD kernel? Now add libc.

Now you have a 1970s C compiler.

Now take into account all the optimizations any modern C compiler does. Now make sure there are no bugs _there_.

Now add a Python stack.

Now you can have decent, "safe" code. Most hacks don't target this part. The low-hanging fruit is lower.

You need a math library. OK, import that. You need some other library. OK, import that.

Oops, there's a bug in one module. Or the admin setup wasn't done right. Or something blew.

Bam. You have the keys to the kingdom.

And this is all deterministic. Someone _could_ verify that there are no bugs here.

But what about Neural Networks? The whole point of training is that the programmers _can't_ write a deterministic algorithm to self drive, and have to have a huge NN do the heavy lifting.

And that's not verifiable.

_This_ is what's going to be running your self-driving car.

That's why I compared software engineering to biology, where we "test" a lot, hope for the best, and have it blow up in our face a generation later.

The need to hold whole stacks in the head is the problem. That's not abstraction. That's not how math works. The mouse doesn't escape the wheel by running faster.
I'd say the main problem is developers' carelessness and incompetence.

New SQL injection vulnerabilities are being introduced every day. Passwords being hashed with MD5. Array boundaries being sourced from client data. I mean there are perhaps 5 to 10 coding errors that are generating most of the vulnerabilities.
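
To make those three concrete, here is a hedged Python sketch using only the standard library (`sqlite3`, `hashlib`, `hmac`). The table layout, function names, and iteration count are assumptions made for this example, not a recommendation for any particular system.

```python
import hashlib
import hmac
import os
import sqlite3


def find_user_unsafe(conn, name):
    # Vulnerable: attacker-controlled text becomes part of the SQL statement.
    return conn.execute(f"SELECT name FROM users WHERE name = '{name}'").fetchall()


def find_user_safe(conn, name):
    # Parameterized: `name` is passed as data and is never parsed as SQL.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


def hash_password(password, salt=None, iterations=200_000):
    """Return (salt, digest) using salted PBKDF2-HMAC-SHA256 instead of bare MD5.

    The iteration count is an illustrative assumption, not tuned advice.
    """
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password, salt, digest, iterations=200_000):
    _, candidate = hash_password(password, salt, iterations)
    return hmac.compare_digest(candidate, digest)  # constant-time comparison


def get_item(items, index_from_client):
    """Validate a client-supplied index before using it."""
    if not isinstance(index_from_client, int) or not 0 <= index_from_client < len(items):
        raise ValueError("index out of range")
    return items[index_from_client]
```

With a `users` table containing a couple of rows, the input `x' OR '1'='1` makes `find_user_unsafe` return every row, while `find_user_safe` returns nothing. None of the fixes is longer than the bug it replaces.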

That's not the only problem. We also need to trust the users, who are either careless or malicious. But I'd like at the very least to be able to trust our systems.