Hacker News new | ask | show | jobs
by baobabKoodaa 1762 days ago
> Almost everything you will ever need is already implemented in a library for you to reuse, and even reaching for that library is pretty rare.

Developers who have no interest towards algorithms will not even have the skills required to identify opportunities where algorithms should be used, much less the skills required to identify which algorithm should be used. For example, I had a conversation here on HN a few weeks back where somebody said that they will not need to learn anything about sort algorithms, because they can just google bubble sort and copypaste it if they need sorting at work. And that's how you end up with 5-minute load times for GTA.

2 comments

> somebody said that they will not need to learn anything about sort algorithms, because they can just google bubble sort and copypaste it if they need sorting at work

In this example, the missing skillset is not memorizing a book of algorithms as you seem to imply.

If performance isn't good for the use case, they should be profiling the code for root causes. Which will lead them to identifying the sort operation as a problem. At which point they can search for faster sorting algorithms which will quickly lead them to a better solution and libraries containing them (or even how to implement, if they're operating in a constrained environment where no libraries exist).

I don't want to hire software engineers who have memorized the algorithm book, that's not useful to me. I want to hire those who can apply discipline to the work, such as (in this example) be aware of performance and know how to analyze it and find solutions.

> In this example, the missing skillset is not memorizing a book of algorithms as you seem to imply.

No it's not. As I said in the sibling comment, competitive programming (or algorithm skills in general) isn't about memorizing algorithms, it's about developing skills that allow you to understand and create algorithms to solve problems. In this case, it would be sufficient to understand why looking up a phone number in a phone book should inherently be a O(log n) operation, not a O(n) operation.

> I don't want to hire software engineers who have memorized the algorithm book, that's not useful to me. I want to hire those who can apply discipline to the work, such as (in this example) be aware of performance and know how to analyze it and find solutions.

So we're discussing a hypothetical developer who has "no interest towards algorithms", and you're imagining that this person is going to run a profiler to troubleshoot a performance issue and subsequently optimize algorithms to fix said performance issue? This sounds like a fantasy to me. If you want skilled software engineers who have the ability to troubleshoot and optimize algorithms, you probably need someone who has some kind of interest towards algorithms.

> So we're discussing a hypothetical developer who has "no interest towards algorithms", and you're imagining that this person is going to run a profiler to troubleshoot a performance issue and subsequently optimize algorithms to fix said performance issue? This sounds like a fantasy to me.

Our experiences are vastly different then. Nothing fantasy or hypothetical about this. Every great software engineer I've worked with is like this.

Profiling and identifying hot spots is an every-week occurrence in the field, very pragmatic and valuable work. Theoretical algorithms only happen on interview whiteboards, not on the job.

You're arguing that someone who doesn't care to learn anything about algorithms is going to do a good job at optimizing algorithms.
No, not at all!

I am saying that being able to regurgitate an implementation of quicksort (to keep on the sorting example) onto a whiteboard is not a useful skill. There will never be a need to do this in a real life job.

The truly useful skill is being able (and interested!) in using a profiler to find bottlenecks, identify what is being done there (let's say, sorting) and being able to find a better replacement algorithm in the literature and change to code to use a well-optimized implementation (which exists already in just about every ecosystem, no need to reimplement it from memory).

> I am saying that being able to regurgitate an implementation of quicksort (to keep on the sorting example) onto a whiteboard is not a useful skill.

In that case I'm not sure why you're arguing against me, because I've repeatedly expressed my dislike towards algorithm memorization. I don't believe memorizing algorithms is useful at all. It might give a very small edge in competitive programming, but it's not what competitive programming is about either. The best competitors have amazing skills at creating algorithms to solve problems ("create" as opposed to "memorize").

If you look up the parent thread, you started arguing against me when I said "Developers who have no interest towards algorithms will not [be good at some things]".

>And that's how you end up with 5-minute load times for GTA.

was it actually the reason?

Different game, different problems, but I've spent a fair amount of time looking at the source for Civ IV's core game DLL (released by Firaxis for modding).

It was fairly simple to get significant performance gains (without changing AI behavior or results) just by applying straight forward changes. Eliminating macro's hiding chained method calls. Saving the result of nested calls, e.g. save a pointer to the game map instead of making the same call repeatedly in a tight loop.

There's a huge difference between "premature optimization" and just sloppy coding. Too many take the idea too far that the (C++) optimizer is better than you'll ever be and that you should let it do it's job. Yes, the optimizer is/can be very good and can make surprising and unintuitive changes to make code faster. But, it's not a miracle worker. Optimizers work best on clear, concise, code with minimal data dependencies.

FWIW, back on my Core I2 Duo, a single late-game turn on a huge map could take upwards of 30 minutes to complete, most of that spent in AI's turns. I was able to get it down to about 5 minutes without too much effort. Along the way, RAM usage also plummeted. If I had full access to the source, I could have made even greater gains that would have broken the DLL ABI...

Sadly, I never released my changes anywhere (and I only had local SVN at the time), and they haven't survived my PC upgrades since.

No they weren't using bubble sort. When parsing JSON, they were calling sscanf (which internally was calling strlen). In addition, those parsed objects were stored in an array which was iterated through on each insert to check for duplicates.

There was an article [0] and subsequent discussion on HN [1] about reverse engineering and patching the binaries.

[0] https://nee.lv/2021/02/28/How-I-cut-GTA-Online-loading-times...

[1] https://news.ycombinator.com/item?id=26296339

No? If you check your own sources, you will find that this is precisely the type of case where a developer is using an O(n^2) algorithm out of ignorance when they should have been using a faster algorithm, and that the high load time was caused by this mistake. The time complexity for bubble sort is also O(n^2), compared to O(nlogn) for reasonable sort algorithms. I don't really understand how you take all this information and then conclude "no".
The way accidental O(N²) comes about isn't "I'm intentionally using an algorithm that's O(N²)" (e.g., bubble sort), but "this method that I think is O(1) is actually O(N)" (whether because it's badly designed or just badly documented).

It's really not in the same vein as using bubblesort instead of a mergesort: failures aren't in algorithm design but in the actual mechanics of the implementation that is ancillary to the algorithm itself.

> failures aren't in algorithm design but in the actual mechanics of the implementation that is ancillary to the algorithm itself.

Strongly disagree. The GTA developer basically designed an algorithm which loops items in an array, and for each iteration in the loop, they parse the entire JSON that contains all items, and they do lookups for duplicates from an array instead of from a set. This is like a quintessential example of an algorithm design failure, so it seems weird to me that you're not attributing it to a failure in algorithm design, but instead you pass the buck to creators of the library that the developer was using. Sure, the library methods could have been documented better, or implemented differently, or whatever, but that seems like an entirely unrelated discussion.

> "this method that I think is O(1) is actually O(N)"

I'm fairly certain the developer in question does not understand / care what big-O notation means. Therefore, this almost certainly is not the thought process that lead to the slow load times in GTA. More likely the developer was just mashing things together until "it works".

Bubble sort provides O(n^2) time complexity for a task that can be easily solved with O(nlogn) time complexity. The 5-minute load time for GTA was caused by a developer using an algorithm with O(n^2) time complexity when they could have used an algorithm with O(nlogn) time complexity. So, yes, I would say it actually was the reason (unless you are specifically asking if the bad algorithm choice was bubble sort, in which case the answer is no).
https://nee.lv/2021/02/28/How-I-cut-GTA-Online-loading-times...

tl:dr; O(n^2) string parsing and O(n^2) duplicate removal