by dgfitz 551 days ago
I opened the link and just started reading. I have a really dumb question that may expose common knowledge I don’t have, about this quote:

> The total amount of space needed to represent this collection of strings is O(k n^2).

I haven’t ever seen O-notation used to represent RAM usage, just algorithm complexity. Is this common?

2 comments

Yes, very common. You've seen "time complexity"; it's very common to talk about "space complexity" as well.
Fun bonus: they can often be traded against each other, e.g. using more space to reduce time.
And any operation that takes n bits as input can be trivially turned into an O(1) time and O(2^n) space algorithm through tabulation.
Assuming you ignore or amortize the time necessary to create the table in the first place, of course.
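A minimal sketch of the tabulation idea, assuming an 8-bit input and using bit counting as the stand-in operation (the names `N_BITS`, `popcount_slow`, `POPCOUNT_TABLE` and `popcount_fast` are made up for illustration):

```python
N_BITS = 8  # assumed input width; the table has 2**N_BITS entries


def popcount_slow(x: int) -> int:
    """Count set bits one at a time: O(n) time in the number of bits."""
    count = 0
    while x:
        count += x & 1
        x >>= 1
    return count


# Precompute the answer for every possible n-bit input: O(2^n) space.
# Building the table is the cost that gets ignored or amortized.
POPCOUNT_TABLE = [popcount_slow(x) for x in range(2 ** N_BITS)]


def popcount_fast(x: int) -> int:
    """Answer by a single table lookup: O(1) time per query."""
    return POPCOUNT_TABLE[x]
```

Every extra input bit doubles the table, which is why this only works in practice for small n.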

This is the basis for rainbow tables: precomputed tables for mapping hashes to passwords, with a space-saving “hash chaining” trick to effect a constant factor reduction in table size. Such tables are the reason why passwords must be hashed with a unique salt when stored in a database.
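A toy sketch of the chaining idea, not a real cracking tool. Everything here is an assumption for illustration: the password space is 4-digit PINs, the hash is unsalted SHA-256, and `CHAIN_LEN`, `reduce_to_pin`, `build_table` and `lookup` are made-up names. Only the start and end of each chain are stored, so one table entry stands in for up to `CHAIN_LEN` passwords.

```python
import hashlib

CHAIN_LEN = 50   # assumed chain length: the constant-factor space saving
SPACE = 10_000   # toy password space: PINs "0000" .. "9999"


def h(pw: str) -> bytes:
    """Unsalted hash of a password."""
    return hashlib.sha256(pw.encode()).digest()


def reduce_to_pin(digest: bytes, step: int) -> str:
    """Map a digest back into the password space; depends on the chain step."""
    return f"{(int.from_bytes(digest[:8], 'big') + step) % SPACE:04d}"


def build_table(starts):
    """Store only chain endpoints: end -> list of starts."""
    table = {}
    for start in starts:
        pw = start
        for step in range(CHAIN_LEN):
            pw = reduce_to_pin(h(pw), step)
        table.setdefault(pw, []).append(start)
    return table


def lookup(table, target: bytes):
    """Try each possible chain position for the target hash."""
    for pos in range(CHAIN_LEN - 1, -1, -1):
        # Assume the target sits at position `pos` and finish the chain.
        pw = reduce_to_pin(target, pos)
        for step in range(pos + 1, CHAIN_LEN):
            pw = reduce_to_pin(h(pw), step)
        # If the endpoint is known, replay that chain from its start
        # and verify, since a matching endpoint can be a false alarm.
        for start in table.get(pw, []):
            cand = start
            for step in range(CHAIN_LEN):
                if h(cand) == target:
                    return cand
                cand = reduce_to_pin(h(cand), step)
    return None
```

With a per-user salt mixed into `h`, a table built ahead of time no longer matches the stored hashes, which is the point made above.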

Yes, but total time is never going to be less than total space, when expressed in big-O notation.
I’m not sure this definition of Big-O for space complexity is universal. When I’ve seen/used it, the size of the initial data wasn’t relevant, it was more about the additional memory required for the algorithm.

For example, an in-place algorithm like Bubble Sort would have an O(1) space complexity, because it needs only a constant amount of extra memory no matter how large the input is. Merge sort on the other hand is O(n) because it always uses additional memory for its intermediate stages, and that additional memory scales with n.
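To make the contrast concrete, here is a rough sketch of both sorts (textbook versions, with function names chosen for illustration). The bubble sort only ever holds a few loop variables; the merge sort builds new lists whose total size grows with the input.

```python
def bubble_sort(a: list) -> None:
    """Sort in place: only a handful of scalar variables, O(1) extra space."""
    for end in range(len(a) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        if not swapped:
            break  # already sorted


def merge_sort(a: list) -> list:
    """Return a new sorted list: the merged buffers take O(n) extra space."""
    if len(a) <= 1:
        return a[:]
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    merged = []  # this buffer grows to len(a) elements
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```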

After a quick Google search, the first few sites I found seem to use a similar understanding: https://www.geeksforgeeks.org/time-and-space-complexity-anal...

The confusion is around space complexity vs auxiliary space complexity (extra space).

Space complexity is O(n), but auxiliary space complexity uses Theta notation instead.

But people aren't too picky on the notation and usually say something like "O(1) extra space" instead of using theta.

https://en.m.wikipedia.org/wiki/Space_complexity

That’s not quite accurate. Big O notation and Theta notation are different ways of expressing the growth rates of functions - the choice of using Big O or Theta is independent of whether you’re trying to specify total space complexity or auxiliary space complexity.

Saying something is O(n) tells you it grows at most linearly, but this would also admit e.g. log n.

Saying something is Theta(n) tells you it grows exactly linearly: that is, it is not a slower growth rate like log n, nor a faster growth rate like n^2.
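For reference, the standard definitions behind those two statements, written out for functions f and g of the input size n (the constants c, c_1, c_2 and the threshold n_0 are the usual textbook symbols, not anything from the linked article):

```latex
f(n) = O(g(n)) \iff \exists\, c > 0,\ n_0 \ \text{such that}\ 
  0 \le f(n) \le c\, g(n) \quad \text{for all } n \ge n_0

f(n) = \Theta(g(n)) \iff \exists\, c_1, c_2 > 0,\ n_0 \ \text{such that}\ 
  c_1\, g(n) \le f(n) \le c_2\, g(n) \quad \text{for all } n \ge n_0
```

So log n is O(n), since it is eventually below c·n, but it is not Theta(n), because no positive c_1 keeps c_1·n below log n for all large n.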

Why couldn’t it?
Because it takes n time to access n units of memory.

That's heavily simplified because of caches etc., to the point where people sometimes measure in cache misses instead, as that is usually what actually matters.

Let's say you have a lookup table and your algorithm looks up a value and returns it. O(n) space O(1) in time.
What if you allocate a huge chunk of memory and only use a small part of it? For example, checking if a list of numbers contains duplicates using a boolean array.
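A small sketch of the boolean-array example, assuming the numbers are non-negative integers no larger than a known `max_value` (both the function name and that parameter are made up here):

```python
def has_duplicates(nums, max_value: int) -> bool:
    """Detect duplicates among integers in the range 0..max_value."""
    # One flag per possible value: space is O(max_value),
    # even if nums itself is tiny.
    seen = [False] * (max_value + 1)
    for x in nums:  # the scan itself is O(len(nums)) time
        if seen[x]:
            return True
        seen[x] = True
    return False
```

Note that in this Python version, creating the list already touches every slot, so the allocation alone costs time proportional to its size. Whether untouched memory counts towards space depends on the model you are using.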
> Is this common?

Very. For instance, if you look at sorting algorithms on Wikipedia, they pretty much all list performance (best, worst, average) but also worst-case space complexity, in O notation.