| HN Mirror


	by tomerv 1666 days ago
	Practically speaking, it is not good advice to just treat log(n) as if it is constant. While it's true that log(n) is always small in practice, the conclusion should not be to ignore it, but rather to notice the constant factor. And in practice, data structures with O(log n) complexity usually also have a bigger hidden constant. For example, std::unordered_map is much faster in practice than std::map. Of course, this is not strictly correct, it's just a heuristic. Quicksort with its O(log n) [Edit: O(n log n)] complexity is a counter-example to this.
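
One rough way to check this heuristic is to time a hash lookup against an O(log n) search over the same keys. The sketch below is a Python analog, not the C++ containers named above: `dict` stands in for the hash table, and binary search over a sorted list (`bisect`) stands in for the ordered structure. The function names and sizes are made up for illustration, and the timings will vary by machine.

```python
import bisect
import random
import timeit


def build(n, seed=0):
    """Return (sorted_keys, table, probes) for n distinct integer keys."""
    rng = random.Random(seed)
    keys = sorted(rng.sample(range(10 * n), n))
    table = {k: i for i, k in enumerate(keys)}  # hash table: key -> index
    probes = [rng.choice(keys) for _ in range(1000)]
    return keys, table, probes


def hash_lookup(table, key):
    """Expected O(1): hash the key and probe the table."""
    return table[key]


def tree_like_lookup(keys, key):
    """O(log n): binary search over a sorted list."""
    i = bisect.bisect_left(keys, key)
    if i == len(keys) or keys[i] != key:
        raise KeyError(key)
    return i


def compare(n):
    """Time both lookups over the same probes; returns seconds for each."""
    keys, table, probes = build(n)
    t_hash = timeit.timeit(
        lambda: [hash_lookup(table, k) for k in probes], number=20)
    t_tree = timeit.timeit(
        lambda: [tree_like_lookup(keys, k) for k in probes], number=20)
    return t_hash, t_tree


if __name__ == "__main__":
    for n in (1_000, 100_000):
        t_hash, t_tree = compare(n)
        print(f"n={n:>7}: dict {t_hash:.4f}s, bisect {t_tree:.4f}s")
```

Both lookups return the same index for a given key, so any difference in the printed times comes from the lookup strategy and its constant factors, not from the work done with the result.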

4 comments

colanderman 1666 days ago

To be clear, that is not the advice I'm giving -- but rather, when your performance looks like p*log n + q, if q is much greater than p/40 -- that is, the constant term dwarfs the logarithmic term -- then it is safe to consider it constant.

xdavidliu 1666 days ago

> p*log n + q, if q is much greater than p/40 -- that is, the constant term dwarfs the logarithmic term

I think you meant to say "if q is much greater than p TIMES 40".

colanderman 1665 days ago

Ah good catch, yes you are correct.
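
To put numbers on the corrected rule: for any input that fits in practice, log2 n stays at or below roughly 40 (2^40 is about a trillion elements), so p*log n is at most about 40*p. A minimal sketch, assuming base-2 logarithms and made-up values for p and q; `log_share` is a hypothetical helper, not something from the thread:

```python
import math


def log_share(p, q, n):
    """Fraction of the total cost p*log2(n) + q that comes from the log term."""
    log_term = p * math.log2(n)
    return log_term / (log_term + q)


# Illustrative values only: q is 100 times the bound 40*p.
p, q = 1.0, 4000.0
for exp in (10, 20, 30, 40):
    n = 2 ** exp
    print(f"n = 2^{exp}: log term is {log_share(p, q, n):.2%} of total")
```

With these values the log term stays under 1% of the total across the whole range. When q is only equal to 40*p, the log term reaches half of the total at n = 2^40, which is why the rule asks for q to be much greater than 40*p rather than merely greater.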

kadoban 1666 days ago

That seems like a pretty good argument _for_ treating it as constant though and just shifting your focus to how large the constants actually are.

jltsiren 1665 days ago

In a virtual memory system, random access to an array of size n takes O(log n) time, and the constant factors in that O(log n) are also nontrivial. Algorithms that do O(log n) computation with O(log n) independent elements tend to take O(log^2 n) time, while those that do O(log n) computation with O(log n) contiguous elements or O(log n) iterations with O(1) elements still take O(log n) time. If the constant factors are small enough, it can be hard to distinguish the latter two from algorithms doing O(1) computation with O(1) elements.

HALtheWise 1665 days ago

In practice, for any memory system with caches and limited by the speed of light, random (unpredictable) access to an array of size n takes much closer to O(sqrt(n)), not O(log(n)). There's an excellent article discussing this that you can search for, and it holds both in empirical tests on modern hardware and in the theoretical physical limit.

jltsiren 1665 days ago

That depends on your perspective. If multiple levels of memory hierarchy are relevant (such as when scaling from 1 MiB to 1 GiB), you will see something resembling O(sqrt(n)). If you remain within the same level (e.g. from 1 GiB to 1 TiB), the scaling resembles O(log n) more closely. Or, in other words, it depends on whether you assume that cache size grows with n or is independent of it.
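
The two size ranges can be compared with plain arithmetic. The sketch below only evaluates the growth models sqrt(n) and log2(n) over those ranges; it does not measure real memory latency, and `growth` is a hypothetical helper name.

```python
import math

MiB, GiB, TiB = 2 ** 20, 2 ** 30, 2 ** 40


def growth(f, small, large):
    """How many times f grows when the size goes from small to large."""
    return f(large) / f(small)


ranges = (("1 MiB -> 1 GiB", MiB, GiB), ("1 GiB -> 1 TiB", GiB, TiB))
for label, lo, hi in ranges:
    print(f"{label}: sqrt model x{growth(math.sqrt, lo, hi):.2f}, "
          f"log model x{growth(math.log2, lo, hi):.2f}")
```

Under these models, each 1024x increase in size multiplies sqrt(n) by 32, while log2(n) grows by 1.5x over the first range and about 1.33x over the second. Measured access times that grow by a large factor over a range point toward the sqrt-like regime; times that barely move point toward the log-like one.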

kwertyoowiyop 1666 days ago

And of course, nothing is important until you’ve profiled your code and measured that it is.
