by Too 1965 days ago
Algorithm interviews should always let the candidate choose the language to avoid these bad excuses, or in the worst case even fall back to pseudo-code. If you can't calculate the average of a list of numbers in pseudo-code, then you are out regardless of how many years of mistakes you are bringing to the table.

Exactly. Maybe there are companies facing problems so unimaginably vast that architects who can’t write any code are critically needed. (Not who don’t, which would be okay for a lot of companies, but who can’t.)

To me, this seems more like an executive chef who doesn’t know (or has forgotten) how a knife works rather than an Elon Musk or Henry Ford who doesn’t know how to weld a body panel.

my current job is at a company with 1B in sales, with the code base of one of the main products started about 20 years ago, a few additional acquired and semi-integrated companies/products, and a high hiring rate of young and eager developers who know how to code very fast but don't know how to keep things from blowing up in a production environment.

what i do spans feature set and scope negotiations with product (because devs don't have either the patience or the knowledge to do so), design reviews of any semi-major changes or additions, helping with designs, revamping sre/devops practices and company infrastructure, figuring out what land mines we have in our systems from the past and how to remove them, and more generally what changes in architecture are needed to carry the company forward given its rapid growth.

overall I participate in or oversee the activities of a few hundred developers across a few continents and an equal number of sre and infra/tool people, and use three spoken languages to do so (could use another one, but never got to learn it).

hence, i am not musk or ford that don't know how to weld. i know welding, it's just not the best way to use me. my job is to make sure that the rest of the assembly process will be smooth and the car won't explode when it hits the road (and that's not something a welder does).

Oh, I am sure that you do lots of work. And your company is organized in such a way that your role is critical path for lots of things.

But if a company the size of Google can get by without software architects, why does your company need to be organized in such a way that you require that role?

Back when I worked there, one of the universal shocks when people go to work for Google is that the overall architecture was light years ahead of anywhere else. The person usually given credit for that is https://research.google/people/jeff/. Based on results, he may well be the best software architect alive.

If you want to learn from his example, I recommend that you start with "a running prototype beats a whiteboard design".

>hence, i am not musk or ford that don't know how to weld.

What sheer irony.

BOTH Musk and Ford knew how to weld, and considered that knowledge essential to being able to do their jobs. Sure, they didn't do a lot of welding day to day. But how could they make correct decisions about how to build new machines if they didn't know how those machines are put together?

>But if a company the size of Google can get by without software architects, why does your company need to be organized in such a way that you require that role?

I have a few notes here:

0. Google may be an exception. But i never worked at Google, so i don't know how good it really is.

1. Not all companies can pay the same wages and get the same talent as Google.

2. Every time there is talk about GCP/AWS/Azure, a lot of people say that all of those services seem to be designed by completely different people and don't mesh together nicely.

3. I worked in one of the FAANGs, and while it's widely known for its excellent engineering, it was a massive dumpster fire. There were excellent people who did amazing things at the lower infra level, but the moment you moved a bit higher up it was a disorganized mess of multiple teams not knowing how to collaborate properly, which resulted in outages, sometimes rather big. These mostly went unnoticed from outside due to the size of the production environments, which allowed shifting users to still-functioning parts of the system. Those outages were mostly the result of not having somebody who would look at the system end to end and identify potential cascading failures or weak points in general.

>BOTH Musk and Ford knew how to weld, and considered that knowledge essential to being able to do their jobs. Sure, they didn't do a lot of welding day to day. But how could they make correct decisions about how to build new machines if they didn't know how those machines are put together?

I didn't say that I don't know how to code. I said that I won't pass a coding interview. Companies that I work in have coders much better than me, but I am much better than them at understanding how a system is supposed to work end to end and how to prevent it from collapsing under various unpleasant scenarios.

I did work at Google, and their engineering architecture is insanely good. Their product design, not so much. Both aspects are visible when you use their site.

I acknowledge that Google is able to hire incredible people. They also got a lucky break in hiring Jeff Dean early. However it is my considered belief that when you call Google an exception, you're reversing cause and effect. When people who come up with the architecture have to write code, work with their system, and see from experience what is wrong with it, they produce better architectures.

It is not whether coding produces more value than architecture. It is that continuing to code informs their architectural decisions.

For a similar idea in a very different industry, when Robert Townsend was CEO of Avis back in the 1960s he turned the company around. He made it profitable, and made it grow.

One of the things he did that he says made a huge difference was to make it a rule that everyone worked the rental desk. Didn't matter whether you were the CEO, VP, big manager or whatever. One day a month you stood behind the rental desk and had to deal with live customers. And having that regular experience meant that problems in the organization that would otherwise be missed became instantly visible. Just like how actually having to code and debug things in your architecture makes architecture problems visible that otherwise you'd discount. It made Robert Townsend a better CEO. The corresponding exercise would make you a better architect.

See https://hbr.org/2010/02/make-the-change-be-an-undercov for more.

It seems to me that in most organisations, the equivalent of the 'rental desk' is facing the customer, not having to write some code. So we should be taking developers/architects/etc. and bringing them to interact with end users.

>I acknowledge that Google is able to hire incredible people. They also got a lucky break in hiring Jeff Dean early. However it is my considered belief that when you call Google an exception, you're reversing cause and effect. When people who come up with the architecture have to write code, work with their system, and see from experience what is wrong with it, they produce better architectures.

Totally agree with this. I personally started with installing trumpet winsock and netscape navigator on win3.11. I did customer support while being a developer. I went through most of the positions that are usually part of a software development org and dealt with most aspects of software. And I consider myself lucky to have been able to do so.

But the unfortunate truth is that in the majority of companies that develop software (let's get outside of Silicon Valley for a moment), developers tend to be very far removed from production environments and from clients. As an example, I was hired to work on a new, from-scratch product at one of the biggest suppliers of software for telecoms (8B in sales). Over there "core engineering" was tasked with developing "core functionality", and there was a totally separate delivery organization that would take this code, adapt it to client needs, write installers/deployment systems, and figure out how it actually integrates, runs, etc. Clients weren't accessible to core engineering, and internal product management couldn't talk to them either.

Also, the overall approach was that it's perfectly normal for software to crash and leave everything in an inconsistent mess, hence there was mandatory tooling to be developed for manual operations to "fix things".

I spent the first few months arguing with my boss that it's not normal and that we must deliver software that can be installed and work without crashing left and right. I got to build the first devops org in the company, the first private cloud lab, pretty much the first integrated ci/cd in the company, jira+confluence (because the company used internally developed "agile" tools that were unusable, and i actually got a visit from internal auditors questioning me why we spent money on something that already exists in the company), lobby to create a track for engineers to interact with clients so they would understand the real operational environment of the software they are writing and what the real requirements are, so they would be able to do a better job, and get training for developers so they would learn that there is something outside of "enterprise java" (as an anecdote, when i presented rabbitmq as part of the architecture, i got asked which java standard it corresponds to, and when i said none, i was told that they don't really understand why I would use it). The project was an amazing success and delivered more in less time than was common in the company. (The manager got fast-tracked to vp/svp/general manager overseeing 2B in business within 4 years. I got a relocation to the states :).

my point, i guess, is that most software development in most companies is a disorganized mess, with developers being completely detached from the realities of production environments. And when I am hired, pardon the pathos, my job is to bring order into the chaos or light into the darkness or whatever. In cases when I am out of my depth in some area (i am well aware of the limitations of my knowledge), i usually have enough pull to hire some domain specialists to whom I can outsource designs/implementation.

in pseudo i can. or in python. avg(list) :)

anything more complex, not sure. i last touched linked lists in 2001. never learned algorithms and in fact didn't even graduate from school. and guess what - everybody is okay with all of it. because this is not the value that I am expected to deliver to the company.

edit: when people are interviewed for system architect positions, algo/coding interviews are not part of the process. because it's not the skill set that the company is looking for.
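
For reference, Python has no built-in `avg`; the closest standard library spelling is `statistics.mean`. A minimal sketch of both the library call and the hand-written version an interviewer would presumably accept (the `average` helper name is made up for illustration):

```python
from statistics import mean


def average(numbers):
    """Hand-rolled arithmetic mean; rejects an empty list explicitly."""
    if not numbers:
        raise ValueError("average of an empty list is undefined")
    return sum(numbers) / len(numbers)


values = [2, 4, 6, 8]
print(mean(values))     # standard library version
print(average(values))  # same value, written out by hand
```

Either form answers the interview question; the only real decision is what to do with an empty list.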

For the record, with that interview, even saying “in python, I’d use the standard library avg(list), because it would be ridiculous to re-implement provided and tested ecosystem functionality” would have served him well in terms of indicating he wasn’t an utter fraud. That alone wouldn’t land the job, but it would be a strong positive.

depends on the company. i once (15 years ago) was interviewing with a company and they essentially asked me to whiteboard some regex-like pattern-matching flow. i did it, and after completing it to their satisfaction i said that in the real world i wouldn't do it, as it's a waste of time, and would use lib* whatever. I didn't get this job :) a few years later, i discovered that a few guys on one of the teams were from this very company. what stood out about them is that they always had to reinvent any standard functionality/library, with the peak of it being their own xml-rpc protocol that they came up with because xml-rpc wasn't good enough. I guess the ethos of this company, and of all the people they hired, was to reinvent wheels, bicycles, etc., and I didn't fit that ethos well.

this was also the only whiteboard coding interview in my life

To be fair, xml-rpc truly is terrible. To start with, XML is extremely verbose and you add a lot of network overhead for that, plus a lot of parsing overhead, which makes rpcs significantly more expensive than they should be. Use https://capnproto.org/ instead. (Based on protobuf, which is an open sourced version of what Google uses internally.) As far as I'm concerned, the only valid reason to use xml-rpc is because you're interfacing with someone else's system and they have chosen to use it.

Moving on, here is an excellent example of an important architecture feature that almost everyone gets wrong. In any distributed system you should transparently support having rpc calls carry an optional tracing_id that causes them to be traced. Which means that you log the details of the call with a tracing id, and cause all rpcs that get emitted from that one to carry the same flag. You then have a small fraction of your starting requests set that flag, and collect the logs up afterwards in another system so that you can, live, see everything that happened in a traced rpc. To make this easy for the programmer, you build it into the rpc library so that programmers don't even have to think about it.

You then flag a small random fraction of rpcs at the source for tracing. This minimizes the overhead of the system. But now when there is a production problem that affects only a small percentage of RPCs you just look to see if you have a recent traced RPC that shows the issue, look at the trace, and problems 3 layers deep are instantly findable.

Very few distributed systems do this. But those that do quickly discover that it is a critical piece of functionality. This is part of the secret sauce that lets Google debug production problems at scale. But basically nobody else has gotten the memo, and no standard library will do this.

Now I don't know why they reinvented xml-rpc for themselves. But if they had that specific feature in it, I am going to say that it wasn't a ridiculous thing to do. And the reason why not becomes obvious the first time you try to debug an intermittent problem in your service that happens because in some other service a few calls away there is an issue that happens 1% of the time based on some detail of the requests that your service is making.

It happened 13 years ago and it was exactly the same but with different xml syntax :) It was way before microservices, etc. purely point to point.

>It was way before microservices, etc.

It was not way before microservices at Google. But it was before there was much general knowledge about them.

Sadly the internal lessons learned by Google have not seeped into the outside world. Here are examples.

1. What I just said about how to make requests traceable through the whole system without excess load.

2. Every service should keep statistics and respond at a standard URL. Build a monitoring system which has scrapes of that operational data as a major input.

3. Your alerting system should support rules like, "Don't fire alert X if alert Y is already firing." That is, if you're failing to save data, don't bother waking up the SRE for every service that is failing because you have a backend data problem. Send them an email for the morning, but don't page people who can't do anything useful anyways.

4. Every service should be stood up in multiple data centers with transparent failover. At Google the rule was n+2/n+1. Meaning that your service had to globally be in at least 2 more data centers than it needed for normal load, and in every region it had to be in at least one extra data center. With the result that if any data center goes out, no service should be interrupted, and if any 2 data centers go out the only externally visible consequence should be that requests might get slow.
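
Points 3 and 4 are simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the rule table, the alert names, and both function names are made up, and the capacity check assumes the global need is just the sum of the per-region needs, which the description above does not actually say.

```python
# Hypothetical rule table: alert -> alerts that, if firing, suppress its page.
INHIBITED_BY = {
    "frontend_errors": {"storage_down"},
    "api_errors": {"storage_down"},
}


def route_alerts(firing):
    """Split firing alerts into pages (wake someone) and emails (morning)."""
    firing = set(firing)
    pages, emails = [], []
    for alert in sorted(firing):
        if INHIBITED_BY.get(alert, set()) & firing:
            emails.append(alert)  # the root cause is already paging someone
        else:
            pages.append(alert)
    return pages, emails


def meets_n_plus_2(deployed, needed):
    """Check the n+2 global / n+1 per-region rule.

    deployed and needed map region name -> number of data centers.
    Assumption: the global need is the sum of the regional needs.
    """
    global_ok = sum(deployed.values()) >= sum(needed.values()) + 2
    regional_ok = all(
        deployed.get(region, 0) >= count + 1
        for region, count in needed.items()
    )
    return global_ok and regional_ok


print(route_alerts(["frontend_errors", "api_errors", "storage_down"]))
print(meets_n_plus_2({"us": 3, "eu": 2}, {"us": 2, "eu": 1}))
```

The first call pages only for the storage problem and queues the dependent alerts as email; the second checks a two-region deployment against the redundancy rule.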

Now compare that to what people usually do with Docker and Kubernetes. I just have to shake my head. They're buzzword compliant but are failing to do any of what they need to do to make a distributed system operationally tractable. And then people wonder why their "scalable system" regularly falls over and nobody can fix it.