by tguvot 1965 days ago
i am a very senior architect astronaut. it will take me some effort to compute the average of a list of numbers in c/c++ as i haven't touched those languages in many years.

from the other side, while reviewing designs (not code) of various components I can spot locations where code handles sockets incorrectly, or a couple of dozen scenarios under which the system will fall apart because engineers don't think that far.

the value that I bring is not in averaging a list of numbers. my value is 25 years of making mistakes, learning from them and knowing how to avoid them.


Algorithm interviews should always let the candidate choose the language to avoid these bad excuses, or in the worst case even fall back to pseudo-code. If you can't calculate the average of a list of numbers in pseudo-code, then you are out regardless of how many years of mistakes you are bringing to the table.
Exactly. Maybe there are companies facing problems so unimaginably vast that architects who can’t write any code are critically needed. (Not who don’t which would be okay for a lot of companies, but who can’t.)

To me, this seems more like an executive chef who doesn’t know (or has forgotten) how a knife works rather than an Elon Musk or Henry Ford who doesn’t know how to weld a body panel.

my current job is at a company with 1B in sales, with the code base of one of the main products started about 20 years ago, a few additional acquired and semi-integrated companies/products, and a high hiring rate of young and eager developers who know how to code very fast but don't know how to keep things from blowing up in a production environment.

what i do spans from feature set and scope negotiations with product (because devs don't have either the patience or the knowledge to do so), design reviews of any semi-major changes or additions, helping with designs, revamping sre/devops practices and company infrastructure, figuring out what land mines we planted in our systems in the past and how to remove them, and more generally what changes in architecture are needed in order to carry the company forward given its rapid growth.

overall I participate in or oversee the activities of a few hundred developers across a few continents and an equal number of sre and infra/tool people, and use three spoken languages in order to do so (could use another one, but never got to learn it.)

hence, i am not a musk or ford who doesn't know how to weld. i know welding, it's just not the best way to use me. my job is to make sure that the rest of the assembly process will be smooth and the car won't explode when it hits the road (and that's not something that a welder does).

Oh, I am sure that you do lots of work. And your company is organized in such a way that your role is critical path for lots of things.

But if a company the size of Google can get by without software architects, why does your company need to be organized in such a way that you require that role?

Back when I worked there, one of the universal shocks when people go to work for Google is that the overall architecture was light years ahead of anywhere else. The person usually given credit for that is https://research.google/people/jeff/. Based on results, he may well be the best software architect alive.

If you want to learn from his example, I recommend that you start with "a running prototype beats a whiteboard design".

>hence, i am not musk or ford that don't know how to weld.

What sheer irony.

BOTH Musk and Ford knew how to weld, and considered that knowledge essential to being able to do their jobs. Sure, they didn't do a lot of welding day to day. But how could they make correct decisions about how to build new machines if they didn't know how those machines are put together?

>But if a company the size of Google can get by without software architects, why does your company need to be organized in such a way that you require that role?

I have a few notes here:

0. Google may be an exception. But i never worked at google, so i don't know how good it really is.

1. Not all companies can pay the same wages and get the same talent as Google.

2. Every time there is talk about GCP/AWS/Azure, a lot of people say that all of those services seem to be designed by completely different people and don't mesh together nicely.

3. I worked in one of the FAANGs, and while it's widely known for its excellent engineering, it was a massive dumpster fire. There were excellent people who did amazing things at the lower infra level, but the moment you moved a bit higher up it was a disorganized mess of multiple teams not knowing how to collaborate properly, which resulted in outages, sometimes rather big, that mostly went unnoticed from outside because the size of the production environments allowed shifting users to still-functioning parts of the system. Those outages were mostly the result of not having somebody who would look at the system end to end and identify potential cascading failures or weak points in general.

>BOTH Musk and Ford knew how to weld, and considered that knowledge essential to being able to do their jobs. Sure, they didn't do a lot of welding day to day. But how could they make correct decisions about how to build new machines if they didn't know how those machines are put together?

I didn't say that I don't know how to code. I said that I won't pass a coding interview. The companies that I work in have coders much better than me, but I am much better than them at understanding how a system is supposed to work end to end and how to prevent it from collapsing under various unpleasant scenarios.

I did work at Google, and their engineering architecture is insanely good. Their product design, not so much. Both aspects are visible when you use their site.

I acknowledge that Google is able to hire incredible people. They also got a lucky break in hiring Jeff Dean early. However it is my considered belief that when you call Google an exception, you're reversing cause and effect. When people who come up with the architecture have to write code, work with their system, and see from experience what is wrong with it, they produce better architectures.

It is not whether coding produces more value than architecture. It is that continuing to code informs their architectural decisions.

For a similar idea in a very different industry, when Robert Townsend was CEO of Avis back in the 1960s he turned the company around. He made it profitable, and made it grow.

One of the things he did that he says made a huge difference was to make it a rule that everyone worked the rental desk. Didn't matter whether you were the CEO, VP, big manager or whatever. One day a month you stood behind the rental desk and had to deal with live customers. And having that regular experience meant that problems in the organization that would otherwise go unnoticed became instantly visible. Just like how actually having to code and debug things in your architecture makes architecture problems visible that otherwise you'd discount. It made Robert Townsend a better CEO. The corresponding exercise would make you a better architect.

See https://hbr.org/2010/02/make-the-change-be-an-undercov for more.

in pseudo i can. or in python. avg(list) :)

anything more complex, not sure. i last touched linked lists in 2001. never learned algorithms and in fact didn't even graduate from school. and guess what - everybody is okay with all of it. because this is not the value that I am expected to deliver to the company.

edit: when people are interviewed for system architect positions, algo/coding interviews are not part of the process. because it's not the skill set that the company is looking for.
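
For reference, a minimal sketch of the averaging exercise the thread keeps coming back to. Note that Python has no built-in `avg`; the standard-library spelling is `statistics.mean`. The `average` helper and the sample `values` below are made up for illustration.

```python
from statistics import mean


def average(numbers):
    """Hand-written average; refuses an empty list instead of dividing by zero."""
    if not numbers:
        raise ValueError("average of an empty list is undefined")
    return sum(numbers) / len(numbers)


# Illustrative data only.
values = [2, 4, 6, 8]

print(average(values))  # the whiteboard version
print(mean(values))     # the standard-library version
```

Both calls agree on the result; the library version also raises an error on empty input, which is the usual argument for not re-implementing it.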

For the record, with that interview, even saying “in python, I’d use the standard library avg(list), because it would be ridiculous to re-implement provided and tested ecosystem functionality” would have served him well in terms of indicating he wasn’t an utter fraud. That alone wouldn’t land the job, but it would be a strong positive.
depends on the company. i once (15 years ago) was interviewing with a company and they essentially asked me to whiteboard some regex-like pattern matching flow. i did it, and after completing it to their satisfaction i said that in the real world i wouldn't do it, as it's a waste of time, and would use lib* whatever. I didn't get this job :) a few years later, i discovered that a few guys on one of the teams were from this very company. what stood out about them is that they always had to reinvent any standard functionality/library, with the peak of it being their own xml-rpc protocol that they came up with, because xml-rpc wasn't good enough. I guess the ethos of this company and all the people that they hired was to reinvent wheels, bicycles, etc. and I didn't fit that ethos well.

this was also the only whiteboard coding interview in my life

To be fair, xml-rpc truly is terrible. To start with, XML is extremely verbose and you add a lot of network overhead for that, plus a lot of parsing overhead, which makes rpcs significantly more expensive than they should be. Use https://capnproto.org/ instead. (It was created by the primary author of Protocol Buffers v2; protobuf is the open-sourced version of what Google uses internally.) As far as I'm concerned, the only valid reason to use xml-rpc is because you're interfacing with someone else's system and they have chosen to use it.
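
A rough illustration of the verbosity point, using only Python's standard library. The `add(2, 3)` call and the compact framing (a one-byte method id plus two 32-bit integers) are invented for this comparison; the framing is not Cap'n Proto's or protobuf's actual wire format.

```python
import struct
import xmlrpc.client

# Hypothetical call: add(2, 3), serialized as an XML-RPC request body.
xml_payload = xmlrpc.client.dumps((2, 3), methodname="add").encode("utf-8")

# A made-up compact framing for the same call: a 1-byte method id
# followed by two little-endian 32-bit signed integers.
ADD_METHOD_ID = 1
binary_payload = struct.pack("<Bii", ADD_METHOD_ID, 2, 3)

print(len(xml_payload), "bytes as XML-RPC")
print(len(binary_payload), "bytes in the compact framing")
```

The gap narrows for large string payloads, but for small, frequent calls the markup dominates the message.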

Moving on, here is an excellent example of an important architectural feature that almost everyone gets wrong. In any distributed system you should transparently support having rpc calls carry an optional tracing_id that causes them to be traced. That means you log the details of the call with the tracing id, and cause all rpcs emitted from that one to carry the same id. You then have a small fraction of your starting requests set that flag, and collect the logs afterwards in another system so that you can, live, see everything that happened in a traced rpc. To make this easy for the programmer, you build it into the rpc library so that programmers don't even have to think about it.

You then flag a small random fraction of rpcs at the source for tracing. This minimizes the overhead of the system. But now when there is a production problem that affects only a small percentage of RPCs you just look to see if you have a recent traced RPC that shows the issue, look at the trace, and problems 3 layers deep are instantly findable.

Very few distributed systems do this. But those that do quickly discover that it is a critical piece of functionality. This is part of the secret sauce that lets Google debug production problems at scale. But basically nobody else has gotten the memo, and no standard library will do this.

Now I don't know why they reinvented xml-rpc for themselves. But if they had that specific feature in it, I am going to say that it wasn't a ridiculous thing to do. And the reason why not becomes obvious the first time you try to debug an intermittent problem in your service that happens because in some other service a few calls away there is an issue that happens 1% of the time based on some detail of the requests that your service is making.

It happened 13 years ago and it was exactly the same but with different xml syntax :) It was way before microservices, etc. purely point to point.
The engineers you work with have to write code that does the equivalent of averaging a list of numbers, many times per day. If you can't (in any language) handle this then I question your ability to understand whether your choices make it easy or hard for the engineers to do their jobs.
in some languages i will write the code. more complex algos not, because i don't know them, and never cared to know them. in reality the engineers that i work with write code that does whatever_library.avg(list_of_numbers). if i see an engineer who many times a day implements code that averages a list of numbers, I'll have a serious conversation with him about more efficient development practices.

and maybe you are questioning my ability, but the engineering teams that I work with are not. They come to me so i'll help them with the problems that they have.