by smadge 2243 days ago
> Businesses are paying exorbitant salaries and providing ping pong tables

Businesses are paying market rate salaries to employees who generate them revenue. If anything, engineers should be the ones questioning the exorbitant compensation and perks at the top of the org structure.

> why shouldn't they have visibility into what's being done?

Because they shouldn't be micromanaging. They should be setting high-level goals and expectations and giving teams autonomy and trust. Intrusive surveillance and low-level metrics create perverse incentives that detract from those high-level goals.


> Businesses are paying market rate salaries to employees who generate them revenue.

You use this argument to defend engineers but then question C-level executive salaries. The exact same principles apply but at a multiplicative level.

> You use this argument to defend engineers but then question C-level executive salaries. The exact same principles apply but at a multiplicative level.

That’s simply not true; workers and managers are qualitatively different. There is no shortage of examples of managers whose companies are losing money and market share, yet who are paid bonuses every year and then leave with a golden parachute.

> but at a multiplicative level.

Why? That multiplier is often what's at issue, and why "the exact same principles" don't apply.

Trust, but verify. I work on the services side, and we are frequently in the position of begging for more surveillance. If you are really that confident in the speed and quality of your delivery (my org borders on cocky), then the biggest risk you face is being misaligned with client expectations due to poor communication. We always ask for their explicit sign-off on story criteria and for feedback on features as soon as they are finished, to make sure we're aligned and to let them see exactly what pace we're setting.

I see what you're getting at, but the Russian proverb that Reagan made famous - "Trust, but verify" https://en.wikipedia.org/wiki/Trust,_but_verify - raises a wry smile each time I read it because it's such an obvious oxymoron. You either trust, or you verify.

There's an earlier version attributed to Mohammed - "Trust in God. But tie your camel first." - so the sentiment has been around for a while.

It feels like it's just a way of reducing cognitive dissonance, which is useful I suppose, but I wish people wouldn't use it because it allows a feeling of resolution without a real resolution of the tension between trusting and verifying.

I don’t think either of these sayings is in itself contradictory. The first could be translated as “I trust your work ethic, but I also know that your human nature will lead you to perform better under supervision”. The second basically says: “God has my back on a number of things, but She also expects me to do my part”.

I work in services, so for us it's like "Trust, but write explicit assumptions and disclaimers into the contract". When you commit to paying $2M for a custom piece of software, you expect to pay the fee and get the product, not to spy on your vendor and then sue for your money back. But you have to be prepared to protect your investment.

And before that: "God helps those who help themselves"

https://en.m.wikipedia.org/wiki/God_helps_those_who_help_the...

But how can they set goals without any visibility into what a team can typically accomplish in a given time period? How can they identify better performers and worse performers?

I'm not saying agile is the right solution (it probably isn't), but expecting higher-ups to fund a black-box team is kind of naive.

Ye olde SV Scrum clip[1]

"This just became a JOB..." -Gilfoyle

[1] https://www.youtube.com/watch?v=oyVksFviJVE (S01E05, Scrum scene)

I do like Agile; the problem is the typical one of managers misusing it to micromanage, chase useless metrics, etc.

Measure the deliverable, not the deliverers.

Reporting on random internal stuff instead of the actual problem at hand is the #1 problem I see with corporate reporting, everywhere. I see various combinations of people wasting time measuring:

1) The thing that is easy to measure, typically money or time.

2) The things they "understand", typically people for HR, compliance for legal, money for finance, etc...

3) The things their manager wants to know, no matter how irrelevant that is to executing their own job well.

Meanwhile, what they should be measuring is the quality of the end product or the overall result for the external customer.

It doesn't matter one iota if Bob the Developer Guy missed an internal 3-day deadline that John the Manager made up on the spot if the end product is a winner in the market and makes the users ecstatically happy to part with their money.

This happens everywhere, with everybody. For an IT-centric example, the common one I see is:

Helpdesk: "The users are complaining that the app is slow"

Admins: "The load is only 10%, but fine, we'll add more capacity!"

Helpdesk: "The app is still slow!"

Admins: "The load is only 5%! They should have no reason to complain!"

Do you see the issue? No, seriously: do you? Because practically nobody does, in my experience. Take a minute.

What happened here is that the admins measured the thing that is easy for them to measure: the load. There's a cute little bar graph in VMware, or a chart in their network appliance, or whatever. What they should have been measuring is latency from the end-user perspective, but that's hard to measure and practically no product tells you this number out of the box. So their entire process, their reporting, their troubleshooting, their forms, requests, everything becomes focused on the thing that they can see and control. Even if it's pointless, ineffective, and basically a waste of everyone's time and effort.

This happens with developers in exactly the same manner. Software quality is stupid hard to measure. Long term supportability is borderline impossible to measure without a time machine. Technical debt is hard to even explain to a manager, let alone keep tabs on in terms of numbers. So what's easy to measure? Time! Deadlines, sprints, release dates, etc... That's super easy.

That's why inevitably the unimportant internal time metrics become critical to everybody, but the actually important metrics aren't even measured and become invisible to management until it's far too late.

As a manager, I want to measure leading indicators of success and failure. Absolutely, feedback control based on the real output is important. I must measure that! But I’m always looking for ways to predict that, so I can steer more gently. What’s a leading indicator of a crisis? A late team struggling to make a deadline. What’s a leading indicator of that? Mismatch between estimates and performance. I need to know about bad point estimates because if I don’t fix it (I mean fix the PM’s misalignment), they’re going to push the team into something dangerous.

The problem is that these "leading indicators of success and failure" aren't. A late team struggling to meet a deadline might be a sign of imminent failure, or it might be the team working hard to do something that is genuinely difficult to do.

The core problem with Agile (and most forms of software management) is that it massively overweights "first mover" advantage. I keep hearing, as a justification of agile, that software needs to be delivered quickly so that the company can go to market first and gain market share while its competitors are still floundering. But, in practice, that's hardly ever true. I can't name a single product that succeeded solely because it was on the market first. I can name many products that were first to market and failed because they were clunky and difficult to use.

Heck, Apple's entire business model consists of being second to market with a product that is more polished and easier to use than its competition.

Yes, if a developer or team is well and truly stuck (as in spinning their wheels on the same problem, week after week), that's a problem. But you don't need Agile to tell you that. A simple weekly status meeting with incremental demos is sufficient. The only thing Agile does is create a bunch of graphs that allow management to comfort themselves with "story points" and "velocity" so that they don't have to confront the hard reality that they have no idea what it is they want to build.

What about maintaining an environment where the team feels safe to communicate to the PM that there is something wrong?

And for the PM to feel safe enough to communicate to you that something is wrong.

This feels like Taylorism. Knowledge work isn’t factory work.

I totally understand keeping tabs on delivery speed to enable the team to benchmark themselves, but the act of identifying a problem from that (if there is one) should be the team's responsibility, IMO.

As a manager, my job is to enable other people to do their job the best they can.

Part of the reason that DevOps culture became a professional movement was to treat both as an end-to-end problem.

Which stops misunderstandings like throwing the code to “Ops” and expecting the VMware admin to understand how an app behaves.

User response time is a fairly simple metric to measure if you’re a developer. They should expose that, and other metrics, that the customer values.
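To make the "expose response time" point concrete, here is a minimal sketch, assuming a Python service. The names `timed`, `handle_request`, and `LATENCIES_MS` are hypothetical, and a real app would send these numbers to whatever metrics system it already uses rather than keep them in memory.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical in-memory store: operation name -> list of durations in ms.
LATENCIES_MS = defaultdict(list)

@contextmanager
def timed(operation):
    """Record how long the wrapped block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        LATENCIES_MS[operation].append(elapsed_ms)

def handle_request(payload):
    # Hypothetical handler standing in for real application work.
    with timed("handle_request"):
        return payload.upper()
```

Calling `handle_request("hello")` appends one duration to `LATENCIES_MS["handle_request"]`, which is the kind of number the ops side never sees from a load graph.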

This is one of the reasons I’ve enjoyed some “true” agile teams with a strong product owner: they encourage an end-to-end focus on outcomes.

I feel like Jonathan Blow's jai language looks to do things like this as part of the language, at least to the developer.

"You can't manage what you don't measure"

vs.

"Not everything that can be counted counts, and not everything that counts can be counted"

I think both statements are true.

People manage things using their judgement and qualitative observations, all the time. We can accuse them of bias, but that has to be weighed against the fidelity of the metrics.

This is a good breakdown of this really common dynamic.

It is recognizable to many people, some of whom use it for their benefit, which can be very effective. When I learned that last fact a lot of things made more sense.

This makes important points.

But time and money are the things by which companies live and die. Not keeping tabs on them is suicidal.

OTOH measuring time and money consumption of some technical internal steps is just uninformative.

> What happened here is that the admins measured the thing that is easy for them to measure: the load.

Nah, you have it completely backwards. If the users said “this specific job took 5 minutes today but was only 1 minute yesterday”, that’s actionable; you can, e.g., look at what changes were deployed overnight.

But users always say “the system is slow”, even if they have only the vaguest idea of what “the system” is, and even if it’s actually faster than yesterday. It’s not really clear what any sysadmin can do other than spending hours every day painfully extracting the details from the user only to find nothing is wrong. Every day, forever.

> It’s not really clear what any sysadmin can do

That's not true. It's just that most sysadmins don't bother to upskill to find out what they can and should be doing.

> painfully extracting the details from the user

Asking users for any information is a recipe for disaster. Much like witnesses to a murder who can't agree on the most basic details, users inevitably conflate totally unrelated things. E.g.:

"Citrix is slow?"

"Okay, how so... are button presses slow to respond to a click?"

"I couldn't log on. Something to do with my password. It's slow."

"ಠ_ಠ"

So don't ask. Don't rely on your users at all. Build synthetic transaction tests that act like users. Measure end-to-end latency. Sit down with them and watch them work. Don't rely on their verbal feedback, use your own eyes. Use your tools. Measure. Then measure some more.
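A synthetic transaction test can be very small. The following is a rough sketch, assuming Python, not a description of any particular monitoring product. `run_probe`, `summarize`, and `fake_login` are hypothetical names, and `fake_login` stands in for a real scripted login or page load.

```python
import statistics
import time

def run_probe(transaction, runs=20):
    """Run a scripted transaction repeatedly.

    Returns (latencies in seconds for successful runs, number of failures).
    """
    latencies, failures = [], 0
    for _ in range(runs):
        start = time.perf_counter()
        try:
            transaction()
        except Exception:
            failures += 1
            continue
        latencies.append(time.perf_counter() - start)
    return latencies, failures

def summarize(latencies):
    """Median and nearest-rank 95th percentile; needs at least one sample."""
    ordered = sorted(latencies)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return {"median": statistics.median(ordered), "p95": ordered[rank - 1]}

def fake_login():
    # Hypothetical stand-in for a real end-to-end user action.
    time.sleep(0.001)
```

Run something like `summarize(run_probe(fake_login)[0])` on a schedule from where the users actually sit, and watch the p95 and the failure count rather than the CPU graph.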

Conversely, capacity metrics are largely irrelevant in the era of 10 Gbps networks and 64-core server CPUs. Focus on latency. Look for delays. Timeouts. Deadlocks. Firewall packet drops. That kind of thing.

> only to find nothing is wrong. Every day, forever.

Of course something is wrong! Something is practically always wrong, that's why the users are complaining!

Here's a fun rule of thumb for you: For every 1 user that complained, there are between 100 and 1,000 that had the same issue but shrugged it off and didn't call support.

I got that from a scientific paper. I couldn't believe it, so I measured it in a large 10K user system. The error-to-call ratio was about 500-800 in ours. It blew my mind, and it blew the minds of a lot of people in IT management.

We started gathering every error, tracking every possible latency measurement we could, and it was a horror show. 30K app crashes per day. I shit you not. That's about 3 per user per day! Data loss. Hangs. Login failure rate of nearly 50%.
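For anyone who wants to check the arithmetic, here is a quick sketch using only the figures quoted above. Applying the 500-800 error-to-call ratio to the crash count is my own assumption for illustration, and `expected_calls` is a hypothetical helper.

```python
USERS = 10_000            # size of the system described above
CRASHES_PER_DAY = 30_000  # app crashes logged per day

crashes_per_user_per_day = CRASHES_PER_DAY / USERS

def expected_calls(errors, errors_per_call):
    """Support calls expected if only one in `errors_per_call` errors gets reported."""
    return errors / errors_per_call

print(crashes_per_user_per_day)             # 3.0
print(expected_calls(CRASHES_PER_DAY, 800),  # 37.5
      expected_calls(CRASHES_PER_DAY, 500))  # 60.0
```

So under that assumption, 30K crashes a day would surface as only a few dozen helpdesk calls, which is why the problem stays invisible until you measure it directly.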

It took months to triage the issues, push patches, and apply workarounds. We had to rewrite several components. We eventually got the errors down to fewer than a hundred per day. Believe me, that was a real achievement.

Users were so happy they were begging to be migrated to the new system instead of pushing back and refusing to upgrade.

If the users are complaining, something is probably very wrong and you just don't know it. Go look.

> For every 1 user that complained, there are between 100 and 1,000 that had the same issue but shrugged it off and didn't call support.

I wish I had your user community. Here in Wales the ratio is reversed, I guarantee it.

By assigning problems to specific responsible individuals, and noticing the existence / quality of the solutions they produce? If anything, Agile obscures individual performance, in that it treats everyone as fungible and every part of the system a commons.

Eh, trust is earned (in both directions, of course). I’ve been badly burned by devs and managers alike over the course of my career.