by alex_smart 482 days ago
> But as they get larger you often can't keep track of every last piece simultaneously

Sure, but then you divide the larger system into smaller components where each team is responsible for one or few of these individual pieces and the chief architect is responsible for making sure of how the pieces are put together.

> At least in theory the tests are supposed to cover the observable behavior that matters. So I figure if the tests pass all is well. If I still find something broken then I need to add a test case for it.

But you sure as hell hope that the engineer implementing your database has a decent mental model of the thread safety of his code and does not introduce subtle concurrency bugs just because his tests are still green. You also hope that he understands that he needs to call fsync to actually flush data to disk instead of going yolo (systems never crash and disks never fail). How are you supposed to cover the user-observable behavior in this case? Do you cut the power to your system or unplug your disk while writing to the database and assert that all the statements that got committed actually persisted? And how many times do you repeat that test to convince yourself that you are not leaving behind a bug that will only show up in production, say, once every three years?
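For concreteness, here is a rough sketch of what "call fsync" can look like in Python. The helper name `durable_write` is made up for illustration, and the directory fsync assumes a POSIX system. It only shows the calls involved and does not guarantee durability on every filesystem or device.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and ask the OS to flush it to stable storage.

    Hypothetical helper for illustration only; real databases do much more
    (write-ahead logging, checksums, error handling around failed fsyncs).
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's userspace buffer down to the OS
        os.fsync(f.fileno())  # ask the kernel to flush its cache to the device

    # On POSIX, also fsync the containing directory so that the new
    # directory entry itself survives a crash.
    if os.name == "posix":
        dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

A test that writes a file with this helper and reads it back passes whether or not the fsync calls are there, which is the point: the difference only shows up after a crash.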

I am only giving databases and multithreading as examples because they are the most obvious, but I think the principle applies more generally. Take the simplest piece of code everyone learns to write first thing in uni: quicksort. If you don't have a sufficient mental model of how that algorithm works, how many tests will you write to convince yourself that your implementation is correct?
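To make the quicksort question concrete, here is a minimal sketch: a simple (not in-place) quicksort plus a randomized check against Python's built-in `sorted`. The function names, the trial count, and the input ranges are arbitrary choices for illustration. Passing this check raises confidence but proves nothing about inputs that were never generated.

```python
import random


def quicksort(xs):
    """Return a new sorted list (simple three-way partition, not in-place)."""
    if len(xs) <= 1:
        return list(xs)
    pivot = xs[len(xs) // 2]
    less = [x for x in xs if x < pivot]
    equal = [x for x in xs if x == pivot]
    greater = [x for x in xs if x > pivot]
    # 'equal' is never empty, so both recursive calls get strictly smaller lists.
    return quicksort(less) + equal + quicksort(greater)


def check_against_reference(trials=1000, seed=0):
    """Compare quicksort with sorted() on random small integer lists."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 30))]
        assert quicksort(xs) == sorted(xs), xs
    return True
```

Knowing that the recursion terminates because `equal` always contains the pivot is the kind of argument that comes from the mental model, not from the thousand green trials.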


> then you divide the larger system into smaller components where each team is responsible for one or few of these individual pieces and the chief architect is responsible for making sure of how the pieces are put together

And then you have ravioli code in the large. It is not going to make it easier to understand the bigger system, but it will make it harder to debug.

https://en.wikipedia.org/wiki/Spaghetti_code#Ravioli_code

If you can reproduce an error, you can fix it. Do that.

If you cannot reproduce it after a day of trying and it doesn’t happen often, don’t fix it.

Off topic but when I first saw ravioli code it was in a positive light, as a contrast to lasagna code. But then somewhere along the line people started using it in a negative manner.

There is some optimal level of splitting things up so that it's understandable and editable without overdoing it on the abstraction front. We need a term for that.

Pizza code? Central components are mostly clearly distinguishable and glued together in a fairly consistent manner.

Probably not the best examples.

SQLite famously has more test-related code than database code.

Multithreading correctly is difficult enough that multiple specialized modeling languages exist and those are cumbersome enough to use that most people don't. In practice you avoid bugs there by strictly adhering to certain practices regarding how you structure your code.
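As one possible example of such a practice (an assumption about what is meant here, not the only option): confine mutable state to a single owner thread and have every other thread talk to it through a queue. The sketch below uses Python's standard `threading` and `queue` modules; `run_counter` and its parameters are made-up names for illustration.

```python
import queue
import threading


def run_counter(num_workers=4, increments_per_worker=1000):
    """Sum increments from several threads without sharing mutable state."""
    inbox = queue.Queue()  # thread-safe channel into the owner thread
    result = []

    def owner():
        total = 0          # only this thread ever touches 'total'
        while True:
            msg = inbox.get()
            if msg is None:  # sentinel: no more messages will arrive
                break
            total += msg
        result.append(total)

    def worker():
        for _ in range(increments_per_worker):
            inbox.put(1)   # send a message instead of mutating shared data

    owner_thread = threading.Thread(target=owner)
    owner_thread.start()
    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    inbox.put(None)        # sent only after every worker has finished
    owner_thread.join()
    return result[0]
```

The structure removes a whole class of bugs by construction, since there is no shared counter to race on, rather than relying on a test to catch the race.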

You mention fsync but that famously does not always behave as you expect it to. Look up fsyncgate just for starters. It doesn't matter how well you think you understand your code if you have faulty assumptions about the underlying system.

Generally you come across to me as overconfident in your ability to get things right by carefully reasoning about them. Of course it's important to do that but I guarantee things are still going to break. You will never have a perfect understanding of any moderately large system. If you believe you do then you are simply naive. Plan accordingly (by which I mean, write tests).