Hacker News new | ask | show | jobs
by 01100011 2655 days ago
In my "real world" we normally don't care about things like big-O complexity. We worry about doing dumb things and not taking more time than we have available. I'm not saying big-O is useless or CS wizards are never helpful. It's just that you need one or two of them for a large team of normies, IME.

I have a problem with this notion that knowledge of algorithms is required to be a good engineer though. Case in point: Senior algorithm nerd on my project is going nuts over algorithmic complexity and delaying an early delivery to another team so they can begin using our new feature. In our case, n is fairly small no matter what, so 2n vs n^2 really doesn't matter to us. The code he's optimizing isn't even the long pole in the tent. It calls other routines which take 2/3 of the total time anyway. We could just deliver now and improve later when we have real world data on how people want to use our feature, but nope, we're wasting time on endless rewrites in search of perfection which may not matter if the other team can't deploy our stuff in time.

6 comments

>> Senior algorithm nerd on my project is going nuts over algorithmic complexity

This is me, but luckily where I work I have people who can keep me in check because we generally do design reviews before anything big is built.

However, I have been in situations at previous companies where big(o) was ignored to take short cuts up front, because the "data was small" and suddenly scaling to even just 100 users starts to break things because of poor design decisions when it gets into production.

I guess the lesson here is more the importance of design reviews. Also n^2 is HUGE even for small data if there is IO or api calls involved. Any public api you provide, that is n^2 is not a good idea because you never know who may end up using it for what.

> if there is IO or api calls involved

Right. In my case, the operation was an extra memory comparison. For something already in the cache.

Sure, constraints can change and your assumptions about n<10k may prove unwise, but that's our call to make as engineers. YAGNI. If you know n is never going to grow, then why waste time on it? We're not paid to write pristine code. We're paid to solve problems while hopefully not creating new ones. Pragmatism and all that.

> In my “real world” we normally don’t care about things like big-O complexity. We worry about doing dumb things and not taking more time than we have available.

Big-O is just a formal word for how much time you’re going to take, a way to figure out if you’re likely to take more time than available. So, it sounds like you do care about it, you’re just saying it doesn’t matter if you’re formal about it?

If your team’s n^2 worst case really is in the noise, then you’re absolutely right, the senior is prematurely optimizing.

But without more context, I have to assume there could be reasons your senior “nerd” might be right. Is there a noticeable difference to the user in run time with the largest n you can possibly have? Is the other team you’re delivering to going to call your n^2 routine from one of their own, one that has it’s own O(n) or O(n^2) outer loop?

I’d say big-O notation isn’t difficult, doesn’t take long to learn, doesn’t require school, and knowledge of it doesn’t make someone a wizard. It’s easy. It’s a low enough bar, that I do expect everyone on my team to feel comfortable looking at a function with loops and being able to tell whether something is O(n^2) or O(n^3).

The difficult cases are determining complexity of an architecture full of lots of interconnecting O(1..n) functions, when the complexity isn’t all in one place. I watch unnecessary O(n^3) algorithms get made even on teams of people who all know big-O.

> Big-O is just a formal word for how much time you’re going to take, a way to figure out if you’re likely to take more time than available

This is a common misunderstanding about big-O. It's not about how much time you're gonna take but it's actually a measure of how complexity affects time growth as the data grows.

"Big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. "

From: https://en.wikipedia.org/wiki/Big_O_notation

I generally don't like wikipedia, but they got this one right.

The quote from Wikipedia is factually correct, but it seems you misunderstood my comment.

> It’s not about how much time you’re gonna take

It most definitely is about predicting running time, or memory consumption. Big-O is literally a straightforward way to write the formula for best/avg/worst case time or memory of a program, modulo a constant. The second part of your sentence contradicts the first part.

I’m not sure I buy that there’s any common misconception about Big-O either, I’ve never seen much evidence of that. Anyone who’s learned Big-O in school, or has used it at work, or read past the first sentence on Wikipedia, or commented about it on HN, knows what Big-O means and that for small n, O(n) can be bigger than O(n^2). It’s just not that tricky.

I like Wikipedia a lot, and I agree this one’s right. Why do you dislike it, and why is that relevant?

I'm not the person you're replying to, but it seems to me that there may be an example of a common misconception about Big-O in your comment here:

> Big-O is literally a straightforward way to write the formula [describing] best/avg/worst case time or memory of a program

Technically, Big-O notation should only be used to provide an upper bound on the best/avg/worst case[0]. And typically people only use it to discuss the worse case or average case, just because talking about the upper bound on an algorithm when its inputs are restricted to its best possible inputs is typically much less useful.

But providing an upper bound is not quite "writing the formula" (giving a complete picture), because you haven't specified the lower bound (Big/Little-Omega) - very loosely speaking, if you only ever talk in terms of Big/Little[1]-O, you've only provided half the picture. Of course to be fair, Big-O of an algorithm is the side of the picture that is way more likely to bite you in production!

Disclaimer: not a mathematician, so salt to taste :)

0: https://stackoverflow.com/a/12338937/775982 1: https://en.wikipedia.org/wiki/Big_O_notation#Family_of_Bachm...

Ugh.

The “worst case” is the part referring to the upper bound. Big-O is referring to the “order” of the run time, meaning the largest term in the formula for large n.

Again, everyone knows this if they know Big-O at all. I am intentionally not being picky and pedantic with my words, because the context of this thread was people complaining about unnecessary formality and unnecessary and unhelpful over-concern about academic correctness rather than practicality. There is no widespread misconception, but there are people who like to wax philosophical and try to demonstrate their superior knowledge...

My side intention was to detail why @rafiki6 might have been incorrect without knowing it, since they claimed correctness and complained about downvotes.

Not entirely sure why I am getting downvoted here. It's factually correct and relevant to the discussion. I'm starting to think someone is actively trying to downvote my comments.
If you really do need to know that stuff, you can read up on it when it's needed. Been programming professionally for over 15 years and have had to use this stuff once that i can remember. Knowing how to design and index a database well is way more useful.
To give the other side of the coin here- you at least need to know the basic logic behind it if you're going to be able to recognize when reading up on it might be needed.

If you have never considered the benefits of caching certain computational results in a recursive algorithm, than you probably wouldn't be as quick to recognize when that technique would be useful.

Exactly. And also, even if you somehow recognized that "you need to use a DP solution here", it's not as easy as "read up on it" (like GP says). DP is not a trick that you can just learn once and then apply everywhere. It's not a npm package that you can install that solves your problem for you.
> In my "real world" we normally don't care about things like big-O complexity. We worry about doing dumb things and not taking more time than we have available.

Any work in any kind of scale setting has to implicitly deal with this. You might not realize it immediately, but if you are dealing with a network of 100k+ nodes, or some database with 10M+ entries, or anything of that sort, there is a huge difference between doing something in O(N) or in O(N^2) or O(2^N). Just a nested loop where for each node you need to gather information from each other node is completely out of the question (quadratic vs. linear). Or where you try all combinations of something (exponential). Or where you look up something (linear search vs. logarithmic). I deal with such problems every day. And my job isn't anything special. You may call those "dumb things", but under the hood it's just asymptotic complexity, aka "big-O".

It could also be that your "real world" does not contain scale settings. But then please don't generalize; the industry is full of scale problems to be solved on a day-to-day basis.

> Any work in any kind of scale setting has to implicitly deal with this.

Obviously. And most experienced engineers know this. But a lot of us never deal with n > 10k or so. I've worked a lot in embedded, and even when n is large, it's usually just moving around a payload. Even when I've dealt with n>>10k, say writing a network server, I've rarely been concerned with complexity. I focus on doing the minimum amount of work possible. It's basically the same thing, without the academic formalism. The main rule of engineering seems to be "don't do stupid things()."

- except when it doesn't matter.

When n is guaranteed to be very small it does not matter. Even an exponential solution would be fine in these cases. The 'algo nerd' on your team ought to know this.
Sadly, Perfectionists share a trait when it comes to building software, obsession to their craft at the expense of everythng else, shipping times included.
I think we all derive joy from writing 'perfect' software...but then the rules are changed and perfection becomes garbage :)

Like the other piece of software I was working on for this project. Nice little test framework with a beautiful OOP structure... which became a noose as soon as I wanted to add a new feature. Now it's this frankenbeast with a weird appendage glued on.