by PaulKeeble 1228 days ago
The idea of fixing a whole class of problems is common in safety-critical software. When you find the cause of a bug, it's not just about fixing that bug but about looking for the same pattern of failure everywhere and fixing it, then understanding what led to this class of bugs to begin with and eliminating that. It's just good engineering to solve the class of problems, not just the bug in front of you.

But I have also been part of a team that was replaced by another because we weren't heroic enough, because we had no bugs in our software and there was no drama for the business to get what it wanted and needed. Management rarely values this type of engineering.


I like to rate engineers across various categories, but really I found that there were two general classes of great engineers: fast and slow. For marketing tests, etc., the whole team would be fast engineers. For payments, the whole team would be slow. Everything else I would try to have tension -- a mix of the two so they learn and appreciate each other.

This is condensing a multidimensional vector into just a line, but effective enough to explain to non-engineers.

If you are a precision (slow) engineer, then you are more likely on the backend, more likely to write tests, more likely to avoid costly errors. This will be wasted on some kinds of tasks (trying out new things, for example, but as a class these tasks can generate a lot of management interest). The only kind of management interest in the precision stuff is failure, and it is usually doomsday failure.

That said, security is always present, and I am noodling ideas right now on how to eliminate whole classes of security and privacy issues while making it even easier and faster for all engineer types.

> two general classes of great engineers: fast and slow ... Everything else I would try to have tension -- a mix of the two

To add to this, if you are one of these engineers a great career hack is to find an engineer you respect on the other end of the spectrum and partner with them.

As a "slow" engineer I make sure our key interfaces and abstractions are correct, and my "fast" partner ensures that everything gets shipped and that I don't sweat the small stuff. This has led to a lot more successful and impactful projects than I could manage on my own or with another "slow" engineer.

Interesting. I never thought about dividing engineers this way. I do know that the two best engineers I've ever met were slow. Not just slow in working, but slow in everything. Speaking, moving, etc. This frustrated some people, but these engineers saved everyone else much time and frustration because their work was rock solid, well thought-out, and complete.

And, once you accounted for time spent bug-fixing and validating things, they weren't actually slow.

> And, once you accounted for time spent bug-fixing and validating things, they weren't actually slow.

The constant battle between instant gratification and delayed satisfaction.

“Delayed satisfaction”—I love that.
> I found that there were two general classes of great engineers: fast and slow

There is a third type: those who can adapt to what is appropriate under the circumstances.

Yeah, I gotta say, it seems like one of the key skills that differentiates a mid-level dev from a senior one is knowing whether the current task calls for being a slow engineer vs being a fast engineer.
May I propose terminology with fewer negative connotations? How about "quick" vs "deliberate"?
Yes, I sometimes say fast and precise
Focus-work that is broad and shallow versus narrow and deep.
As a slow engineer I always liked the term “careful”. As in:

Move carefully and fix things; vs

Move fast and break things.

I’d say that most experienced tradespeople are capable of operating in either mode or somewhere along the continuum.

Personality traits, beliefs, perspective, motivations, attitudes, age, and experiences might predispose someone to favor one side or the other.

Since I consider "Move fast and break things" to be one of the worst and most destructive philosophies to find adoption in software engineering, I guess it's obvious which side I favor.
It depends on what you're building, tbh. If you run a large b2c site and want to outpace your competition then you default to shipping. If the costs of failures are much higher, you are better off defaulting to a slow and steady pace.
True, from a certain business point of view. As a customer, though, I want nothing to do with such software and I don't want to work on software that I wouldn't personally use. That's why I say it's obvious which side I fall on.
Some of that is deeply ingrained, but it’s useful to be able to swap between different kinds of problem solving.

Ultimately, there isn’t a single best approach to software development, just different tradeoffs. Being able to whip up a bug-ridden happy path that only works for some set of data is useful when trying to build an understanding of some new system. At the other end, even if most of what you do is short-lived demos, creating a few rock-solid building blocks can save you a great deal of pain.

That's how I see brains work differently: fast and slow. One brain can quickly associate and find information; another brain is slow to load, but in the end you get 100% of what you need. I appreciate slow brains.
This is an interesting observation. I like to think of myself as a slow programmer. Do you think it's a good idea to discuss this in an interview, before you actually get the job, to evaluate whether you are a fit for the role?
Sure, just use the word “meticulous” instead of slow.
:D
I saw this often.

The heroes who are up late, solving a page or mitigating an outage are often the ones remembered and rewarded.

Meanwhile, a dependable, resourceful, and independent IC who “picks up trash on the floor”, promotes good work habits, and is dead reliable - no praise, they are just “doing their job”

Managers and leaders like drama, and most staff and senior engineers gravitate to drama and love talking about it. Their day-to-day work is often subpar, and they are often not team players.

Yes, this is my problem. I'm seen as slow, because I take my time to deliver my stuff. And I'm rarely the hero of fixing some outage or some critical bug. Because my stuff works, quietly. The heroes are always busy: fixing that bug, extinguishing that fire, and delivering the new feature in a day (as soon as they clean up the latest incident). What people see is that I take three days to deliver the same feature. What they don't see is that I don't have to spend a total of two weeks (spread over time) fixing bugs and cleaning up corrupt data for the same feature.
Reminds me of my time as a sysadmin managing over 100 individual Linux boxes by myself. Nothing ever broke (in dramatic fashion, I mean) and new stuff was delivered on a steady schedule. I didn't spend all my time fighting fires, so the perception was that I wasn't doing anything amazing. I had to constantly point out the uptime, and that in every single case that needed it we had data recovery, but that was just considered part of my job.
I once knew a soldier who had reconfigured his tent heater to be out of the way by placing it against the side of his tent. The tent caught fire. He received an award for putting out the fire.

I have seen this pattern repeated a few times throughout my career. People are rewarded for putting out fires they created (metaphorically), but people who are diligent and don't create problems to solve are overlooked or seen as less than capable.

Everyone is selling something, and stories are how you do it. Can’t have a story without some drama
This is why I need a different job. This pattern sucks energy out of me. I need military minded coders.
I’m interested in understanding what it means to be a military minded coder.
Slow is smooth. Smooth is fast.
Festina lente
On the field there's no time for false drama or fluff, you have resources (time, energy, devices) and you keep doing the best you can at every step. And if you don't think enough about how you plan your operations, you die.

I don't want my teammates to feel on the verge of death, but I really, really work better if I'm operating at a high pace and density and if the team also does that, like a swarm of people attacking all problems at all levels on the job.

It’s likely a reference to the story in the linked article
Yeah the only way to get rewarded for working on resiliency like this is if you do it very loudly. If you spend an extra day or two refactoring a feature so that it is more robust you will be looked at poorly versus the engineer who slaps a few more conditionals onto it to keep it working.

Raise the issue with higher ups, maybe create some fancy charts about lost engineering time in the future, spin up specific tickets for refactoring, turn it into a two week project and you will get recognition. Management loves a chart about improving X by N% almost more than a shiny feature.

Most of the places I've worked weren't like this, but a couple were. I quit those positions on the grounds that I was clearly a poor cultural fit.
If they don't like that you took more time to refactor instead of applying a band-aid, they aren't going to respond well to you boasting about it for weeks.
> because we had no bugs in our software and there was no drama for the business to get what it wanted and needed

I’m confused here. The business needed your software to have bugs for the drama? Surely this isn’t the whole story.

When the org that I was working for was doing TSP (https://segoldmine.ppi-int.com/node/67631), our coach told me a story about another team and an engineer I knew very well. He got high praise for all the late nights and weekends he worked to get a product out the door. But on analysis of what he was working on and the bugs he had to deal with, if he had taken the TSP approach that my team was using, most of those bugs would never have been created in the first place and the product would have been finished much sooner.

Drama gets noticed, just quietly ticking along, producing high-quality output really doesn't.

Exactly. I've worked at software companies where the executives wouldn't believe any work was being done unless they could visibly see activity, and hear the "buzz" of "people doing things" and feel the drama of production emergencies and heroics. It felt like they were listening for intense movie-like typing on keyboards, watching for theatrics in front of whiteboards, project leads calling for standups, and so on. Those were the teams truly DoingThings™ and those teams were rewarded for their performance art. The team silently plugging away at their desks in chat, while they calmly deployed another build that passed all test automation--I'm not sure if leadership even knew who they were.

EDIT: These folks almost certainly overlap with the ones pining for Return To Office instead of remote work: They miss the "hum" and "buzz" of SeriousBusiness™ happening all around them in the physical office.

It's exactly this. Their approach produced lots of weekends and late nights and broken releases, and they could be seen to be fixing things and responding like lightning to every issue. My team, on the other hand, was 9 to 5: everything just worked when we released it, with few bugs, and we and the business worked like normal human beings. The problem is that it's invisible and there are no heroics; it's just good solid engineering. Which, considering it was a back-end system for a bank, is the right way for things to be.

Management likes people who make a heroic effort, even if they are the cause of needing it: they are visibly working hard even if they are making less progress.

Management also often doesn't track the amount of drama that products created by "fast" teams cause later in production. Because the negative impact is often delayed, those problems are rarely attributed to the original authors of the code, who have often moved on to new projects by that time. I've seen it so many times: a "hero" gets praised for writing a software component in a day and putting it into production quickly, despite massive evidence showing that the maintenance of the previous N projects done by that person turned out to be literally a PITA and a constant source of drama later.

I'd love to see engineering bonuses / promotions work similar to how hiring bonuses work. You don't get a hiring bonus immediately when you recommend a new hire, but you get it once the new hire stays for N months. You shouldn't get a bonus/praise/promotion for just delivering software quickly. You should get it after it runs consistently and painlessly in production for N months.

Management can be assured of worker productivity by evidence of activity or evidence of output.

In some situations (probably a lot of software engineering situations) output is difficult to measure, and so the habit of tuning in to activity is adopted instead. Some may even forget the difference.

> output is difficult to measure

But in software engineering, it isn't difficult to measure at all. We're developing a deliverable. You can measure if the deliverable happens on time and with acceptable quality.

Not exactly. Your approach works if everyone knows ahead of time the exact amount of effort that something will take. In software, it’s rarely the case that a project's complexity is fully understood from the beginning; at best, you can make a ballpark estimate.

If you have a rock solid approach and design a system with little drama, which is released on time and with high quality, then you look like it wasn’t a very ambitious project. Or easier than expected. Maybe your team padded the estimates a lot and didn’t need to work that hard.

If you are putting in late nights and weekends and constantly fighting to get features working, maybe management thinks that the project was way harder than expected. They’re so lucky to have someone as hardworking as you or this project never would’ve been done!

Obviously there is a flaw in the logic — it’s possible that those assumptions are correct, and person 1 really was under-ambitious and person 2 is an incredible and dedicated engineer working on crazy hard problems. But it’s also possible that the first engineer was just better, and the second had terrible system design skills and constant spaghetti code that made a simple project seem complex.

It can be really hard to tell the difference. Even if both end up delivering on time, the second looks more ambitious, like they’re taking on harder problems.

> Your approach works if everyone knows ahead-of-time the exact amount of effort that something will take.

No, it really doesn't require that. But we're getting into the topic of project planning, which is a larger subject than we can tackle in the comments here. Fortunately, this is a topic discussed in great detail elsewhere.

> But it’s also possible that the first engineer was just better, and the second had terrible system design skills and constant spaghetti code that made a simple project seem complex.

Right, I was intending to cover that with my "acceptable quality" conditional.

Where is project planning discussed well, in your opinion?
It comes down to "the squeaky wheel gets the grease". If you sit quietly in the corner and get things done, and there are no problems with the work you do, you will be invisible to everyone outside those you work with directly.

The person who makes a lot of noise, good or bad, is noticed.

Everyone says they want well-engineered solutions, but in practice it's not what makes an impact.

I think OP was trying to say something like, "Management replaced our team because we were quiet and didn't rally around problems like other teams". I do think it was poorly worded and a bit jaded.
Just wanted to say I feel your pain. My team was recently disbanded and spread across other teams because our product had no issues and we were considered too slow. Fast forward: we just released a shiny new front end that is buggy and slow, with 4 hot fixes in a month. At my review they said, "See, getting you on a new team was all it took to speed you up." Well, yeah, I guess: no unit testing and no documentation. At least I got the team to start using the linter. Another benefit is that I have more family time and have been working on a side project to ensure I don’t forget how to do things correctly lol. I always think this place is going to be different, but timelines are king and those who hit them will be rewarded. Also, you can’t get much faster than 4 releases in a month, which is all the boss sees on paper…
I agree. I have written a blog article about this: https://fabianzeindl.com/posts/the-codequality-pyramid#testa...
Excellent article; thank you. I have written similarly in my notes, focused on the code maintainability (or component quality) part, but not in a shareable format yet. It is a sort of maturity model (I hope eventually to turn it into a grid with practices on the left side and levels across the top, so the RH column is near-ideal behaviors personally and organizationally). But the concept begins with a self-code review for each commit or PR, followed by a code review from someone else (who ideally was not part of the design discussion) who asks questions like: Is this code easily understood by a new maintainer? Would I want to maintain this? How was it tested? Is any coordination in order with other stakeholders, or does documentation need to be updated? Is it clear why the code exists and why the tests do what they do?

And then improving over time. This came up for me after repeatedly encountering code that no one could understand but the original author ("bus count of one", or if one person gets hit by a bus we would be in trouble), or that was needlessly complex for the enjoyment of complexity (really), or that had tests where no one knew why they tested for certain behavior (until fortunately someone else came back from vacation after I was about to commit a change to the test). Etc, all for comprehensibility and maintainability, and reliability.

Ps: overall, this is one of the most enjoyable HN discussions I have seen.

pps: I also once started encouraging our team and others to maintain a set of wiki pages, listing all projects for which we were responsible, then for each one at least one simple page of documentation listing things like who are the stakeholders, where is the source code, any odd build or deployment steps, the key (one sentence) inputs and outputs, where it runs, who does backups of what, etc. Before that it was haphazard. This is short of a real ops manual and could be done better of course, and would change as other org. practices change, but it was far better than nothing, and could be created in 10 minutes from a template. Great for bringing new teammates on board. I think an organization should have something like that for the whole org, listing teams and each team having such a page, as well as listing everything essential that a new person should know rather than relying on haphazard cultural transmission. We had rules like "no new debt" that were well-adopted and began to slip out of the culture as new people joined.

Thank you for your kind words. Let me know if you could use my system.

I haven't mentioned documentation yet, but this is certainly something important that I might include in a future revision.

Yes, there are often a lot of organizational barriers that inhibit doing the right thing. Most organizations like to talk a big game about how innovative they are, but they really just want to keep doing the same things that they've always done.

I have been in the situation where over a period of time I inherited a lot of code that was written by my boss and absolutely critical to our business. Over time I began to see that they had made a number of poor design choices that made it very difficult to work with some of these internal frameworks. For example, there was a lot of global state passed around that kind of worked in production, but made it impossible to run tests in parallel (as well as making some of them flaky even when run serially). I introduced an internal dependency injection framework that leveraged some unusual host language features to remove the global state without painfully having to add a bunch of parameters to every function/class that would have to be passed around. This allowed the tests to run in parallel and completely removed the flaky failures. At the time I introduced the changes, they reduced the time to run all the tests from about 3 minutes to 20 seconds with no flaky failures. This was a major quality of life improvement not only for me, but for everyone else working on this part of the system.

Was this work rewarded? No. What I didn't consider when undertaking the refactoring was that I was implicitly making my boss look bad by fixing the systemic problems that they'd introduced. Even worse, I used techniques that were unfamiliar to them which was a further blow to their ego, though they wouldn't have admitted this at the time. Instead, they complained that they didn't understand some of my techniques rather than seriously try to learn what I'd done and appreciate the benefits that it brought. Ultimately, it became soul crushing to realize that they were more invested in doing things the way they'd always done them than to learn how to do things better (or at least offer constructive feedback beyond "I don't superficially understand this so it must be bad"). When you are in this situation, advancement becomes almost impossible because you have now become a threat to your boss, who will never make the mistake of allowing you to be promoted to their level or beyond.

I still think we should all strive to do the best work that we can because ultimately you should feel proud of the work that you've done. But this often comes with a major cost (which may ultimately be that you are forced to leave the organization).