| A simple thought experiment suffices here. What is the shape which holds the most physical bits while minimizing the overall latency for random access? It's a sphere. Each bit occupies a space packed within that sphere. The radius of the sphere is the distance that light must traverse, and thus corresponds to latency. "slow" elements of the memory hierarchy are on the outside of the sphere, while faster elements (cache, registers, etc) are layered on the inside, like an onion. Since those spheres are smaller they must, by definition, hold fewer bits, but they are, by definition, faster. The total number of bits you can store is a function of the volume of the sphere. For a given latency level, it's a function of the surface area of the sphere at a given radius. The volume of a sphere is 4/3pir^3. Because latency is a function of the radius (how far it takes light to bounce to the edge of the sphere and back) that means that latency must rise as at least the cube root of the number of bits you want to store. That is the best possible bound. This implies that no algorithm is ever O(1) time for an asymptotically large number of elements accessed randomly--not even hash tables or pointer dereferences. They're at best O(n^1/3). |
O(1) is about number of operations required by algorithm to finish for given data size, not about the time. So latency doesn't matter.
Also: if the amount of information that can be kept in universe is finite (most probably it is) - then you can make algorithm that takes the same amount of operations no matter data size (just always add dummy data to fill up the data to the physical limit). Thus every algorithm is technically O(1).
Proof: let N be the number of bits that we can keep in memory. Every deterministic algorithm either does infinite loop, or finishes the execution after at most 2^N changes of state (otherways it is 2 times in the same state with different follow-up, and he can't, cause it's deterministic). So if we design an algorithm, that for every data fitting into memory calculates the result and then does busy loop for the remaining steps until the step 2^N - this algorithm is O(1) no matter what it does.
There's probably a hole in my understanding somewhere, cause algorithmic complexity would be a really useless definition if that was true :)