Hacker News new | ask | show | jobs
by daveFNbuck 1033 days ago
> Edit: it is trivial to construct an algorithm with this property.

It's actually impossible to construct an algorithm that has its runtime decrease as the input size grows. You can do this for finitely many examples, but we're talking about asymptotics here. With discrete run-times and a lower-bound of 1 operation, the run-time will have to stop decreasing at some point and just stay at the same constant value for all but finitely-many exceptions. This makes it a constant-time algorithm.

A constant run-time is a counter-example to that caption though, as the problem doesn't take longer as the input grows. An example would be checking if a number is divisible by 2. You only need to check 1 bit, so it doesn't take any longer as you add more bits that the algorithm doesn't touch.

3 comments

I guess even if you allow probabilistic algorithms and expected runtime, you still can't have an algorithm that runs faster as the input size grows. Because... it takes O(log n) time to calculate any random variable whose expected value has a denominator of size n? I'm not entirely sure but that seems right to me.
With probabilistic algorithms, you still have the fundamental limitation that there's a lower-bound to how fast an algorithm can run. You can't keep getting faster indefinitely.
You could get faster in expected value. What if the time on input of length n was an expected value of 100 + 1/n?

That isn't possible, but it might not quite be obvious why that isn't possible.

Having a constant runtime means there's a constant bound on the runtime, not that the runtime is an exact constant value. 100 + 1/n would still be constant, as it's bounded above and below by constants.

To have sub-constant time it would have to go to 0 as n increases, as otherwise we could bound the time below by a constant. This isn't possible, as a computer will take at least 1 time unit on every input. So the expected time will always be at least 1.

That's for randomly sampling a large input?

If each piece of input is already independent, then you don't have to calculate that, you can just use the input in order.

Maybe not! With O(log N) such as binary search you are assuming some sort of structure to the data. It is not necessary for an algorithm to scan (otherwise O(N) is as good as it would get)

What if we devise a data structure that becomes more efficient as it grows to do a certain (perhaps useless - but that is OK!) operation.

For example we have a data structure at N=0 with a million disconnected nodes and one of them being the target node.

As N increases add a node with a pointer to the target node. The algorithm to find the target node is a random search through nodes until it finds one with a pointer then halts. As N increases, time on average decreases with complexity O(1-N/(N-K)) where K=1000000

You can't have that because when N=K that's not a valid expression and when N>K you're putting a negative number in the big-O notation. You can't have negative runtime.

All algorithms have a lower-bound on runtime of 1. If a sequence is decreasing and bounded below, it converges to a value [1]. If the sequence is discrete, that means it reaches exactly that limit and never deviates from it.

[1] https://en.wikipedia.org/wiki/Monotone_convergence_theorem

It should have been N+K not N-K
I'm not sure what your big-O expression is supposed to mean, but if a sequence is decreasing and bounded below (which runtime is) then it has a limit. Since the sequence is discrete, it can't get arbitrarily close to the limit without reaching it. So at some point the runtime will stop decreasing.

You can approximate the runtime with a function that decreases forever, but the actual runtime will be constant for large enough n.

It is for a typical case not a specific case. The mean time, if you like.
The bit I glossed over is picking a random number as per sister comment. More thinking needed!
A cheeky way to do it is this, reuse the randomness that we are assuming about the input data:

1. Data structure in pseudo code:

    max = 0
    items = []
    def add(x: int32):
       max = max(x, max)
       items.push(x)

2. Algorithm:

    def doit()
        if max == int32.max return 0
        sleep (1000)
        return 0
As you add items to this data structure, on average over all possible data inputs the time taken for doit() will tend from K+1000ms to a constant K.
That's what I'm saying. The runtime will go down to a constant. It can't keep decreasing forever.
“Tend to” is a mathematical term mean get closer to. Formally, for any e as small as you like there exists an N within e of the value being approached.

This is possible because we are sampling a probability distribution. A dice is discrete but the probability of rolling a 1 as the number of sides increases tends to zero even if there is a whole “1” in a given dice.

An algorithm can't have that property for its expected runtime, as it can only look at a sub-constant amount of the input of it runs in sub-constant time.

it's not possible for it to behave differently as n increases because it doesn't have enough time to distinguish between different values of n once it's large enough.

You could use just the asymptote and call it "constant time".

But that's an extremely limited and misleading analysis, so you should not do that. If the time taken goes from a quadrillion to 7 as the problem size grows, don't call it constant.

"constant time" in complexity theory just means there's a constant bound on runtime. It doesn't have to actually have the exact same runtime down to the instruction for every input. Here, the bound would be a quadrillion.

Of course, showing a constant upper-bound doesn't tell us that it isn't even faster than constant as in the proposition I was responding to. That's why I focused on the constant lower-bound.

I know what it means, and I stand by it being limited to the point of misleading in a case like this.

Running any (halting) algorithm on a human computer is constant time, because you're bound by the number of states you can fit into some terabytes, but nobody should actually try to use that as a final analysis.

> Running any (halting) algorithm on a human computer is constant time, because you're bound by the number of states you can fit into some terabytes, but nobody should actually try to use that as a final analysis.

That's the sort of thing I would usually say to explain why we do things like use asymptomatic analysis and ignore polynomial factors. If you have a different solution, that's great, but you're no longer taking about the kind of runtime the article is about.

It depends on how abstract you want to be. You shouldn't always shove the lever all the way to the most abstract single value.

My suggested solution is "don't delete all nuance". If the runtime doesn't behave nicely, give that a mention. Most things do behave nicely, so that's not much of a burden.

> the kind of runtime the article is about

The article spends plenty of time talking about computational feasibility, which is not about the limits as you approach infinity.

It even equates P with "trivial", which is... not good.

You do if you are a complexity theorist :)