Hacker News new | ask | show | jobs
by tetrazine 1030 days ago
This is an aside. But I twigged on a caption for one of the figures: “Every computational problem takes longer to solve as its input grows larger, but not all problems grow harder at the same rate.”

It’s obvious from CS201 that this phrase is generally true and I have no pedantic objection to its inclusion in the article. However, I’m curious if this is strictly true or if we can find a (nontrivial) counterexample. Are there any algorithms that run faster as input grows? My first intuition is some kind of probabilistic question solved to a given correctness bound.

Edit: it is trivial to construct an algorithm with this property. My precise question is whether any meaningful problems have optimal algorithms with this property.

9 comments

> Edit: it is trivial to construct an algorithm with this property.

It's actually impossible to construct an algorithm that has its runtime decrease as the input size grows. You can do this for finitely many examples, but we're talking about asymptotics here. With discrete run-times and a lower-bound of 1 operation, the run-time will have to stop decreasing at some point and just stay at the same constant value for all but finitely-many exceptions. This makes it a constant-time algorithm.

A constant run-time is a counter-example to that caption though, as the problem doesn't take longer as the input grows. An example would be checking if a number is divisible by 2. You only need to check 1 bit, so it doesn't take any longer as you add more bits that the algorithm doesn't touch.

I guess even if you allow probabilistic algorithms and expected runtime, you still can't have an algorithm that runs faster as the input size grows. Because... it takes O(log n) time to calculate any random variable whose expected value has a denominator of size n? I'm not entirely sure but that seems right to me.
With probabilistic algorithms, you still have the fundamental limitation that there's a lower-bound to how fast an algorithm can run. You can't keep getting faster indefinitely.
You could get faster in expected value. What if the time on input of length n was an expected value of 100 + 1/n?

That isn't possible, but it might not quite be obvious why that isn't possible.

Having a constant runtime means there's a constant bound on the runtime, not that the runtime is an exact constant value. 100 + 1/n would still be constant, as it's bounded above and below by constants.

To have sub-constant time it would have to go to 0 as n increases, as otherwise we could bound the time below by a constant. This isn't possible, as a computer will take at least 1 time unit on every input. So the expected time will always be at least 1.

That's for randomly sampling a large input?

If each piece of input is already independent, then you don't have to calculate that, you can just use the input in order.

Maybe not! With O(log N) such as binary search you are assuming some sort of structure to the data. It is not necessary for an algorithm to scan (otherwise O(N) is as good as it would get)

What if we devise a data structure that becomes more efficient as it grows to do a certain (perhaps useless - but that is OK!) operation.

For example we have a data structure at N=0 with a million disconnected nodes and one of them being the target node.

As N increases add a node with a pointer to the target node. The algorithm to find the target node is a random search through nodes until it finds one with a pointer then halts. As N increases, time on average decreases with complexity O(1-N/(N-K)) where K=1000000

You can't have that because when N=K that's not a valid expression and when N>K you're putting a negative number in the big-O notation. You can't have negative runtime.

All algorithms have a lower-bound on runtime of 1. If a sequence is decreasing and bounded below, it converges to a value [1]. If the sequence is discrete, that means it reaches exactly that limit and never deviates from it.

[1] https://en.wikipedia.org/wiki/Monotone_convergence_theorem

It should have been N+K not N-K
I'm not sure what your big-O expression is supposed to mean, but if a sequence is decreasing and bounded below (which runtime is) then it has a limit. Since the sequence is discrete, it can't get arbitrarily close to the limit without reaching it. So at some point the runtime will stop decreasing.

You can approximate the runtime with a function that decreases forever, but the actual runtime will be constant for large enough n.

It is for a typical case not a specific case. The mean time, if you like.
The bit I glossed over is picking a random number as per sister comment. More thinking needed!
A cheeky way to do it is this, reuse the randomness that we are assuming about the input data:

1. Data structure in pseudo code:

    max = 0
    items = []
    def add(x: int32):
       max = max(x, max)
       items.push(x)

2. Algorithm:

    def doit()
        if max == int32.max return 0
        sleep (1000)
        return 0
As you add items to this data structure, on average over all possible data inputs the time taken for doit() will tend from K+1000ms to a constant K.
That's what I'm saying. The runtime will go down to a constant. It can't keep decreasing forever.
“Tend to” is a mathematical term mean get closer to. Formally, for any e as small as you like there exists an N within e of the value being approached.

This is possible because we are sampling a probability distribution. A dice is discrete but the probability of rolling a 1 as the number of sides increases tends to zero even if there is a whole “1” in a given dice.

You could use just the asymptote and call it "constant time".

But that's an extremely limited and misleading analysis, so you should not do that. If the time taken goes from a quadrillion to 7 as the problem size grows, don't call it constant.

"constant time" in complexity theory just means there's a constant bound on runtime. It doesn't have to actually have the exact same runtime down to the instruction for every input. Here, the bound would be a quadrillion.

Of course, showing a constant upper-bound doesn't tell us that it isn't even faster than constant as in the proposition I was responding to. That's why I focused on the constant lower-bound.

I know what it means, and I stand by it being limited to the point of misleading in a case like this.

Running any (halting) algorithm on a human computer is constant time, because you're bound by the number of states you can fit into some terabytes, but nobody should actually try to use that as a final analysis.

> Running any (halting) algorithm on a human computer is constant time, because you're bound by the number of states you can fit into some terabytes, but nobody should actually try to use that as a final analysis.

That's the sort of thing I would usually say to explain why we do things like use asymptomatic analysis and ignore polynomial factors. If you have a different solution, that's great, but you're no longer taking about the kind of runtime the article is about.

It depends on how abstract you want to be. You shouldn't always shove the lever all the way to the most abstract single value.

My suggested solution is "don't delete all nuance". If the runtime doesn't behave nicely, give that a mention. Most things do behave nicely, so that's not much of a burden.

> the kind of runtime the article is about

The article spends plenty of time talking about computational feasibility, which is not about the limits as you approach infinity.

It even equates P with "trivial", which is... not good.

You do if you are a complexity theorist :)
Finding a set of mututally orthogonal vectors. For a given sparsity, after a large enough dimention, two randomly chosen vector will almost always be orthogonal.
That would be a case of higher probability of finding a solution in one step.

But the solution would still need to be checked and another candidates generated until a solution is found.

Average time would be minimized by generating random vectors each time.

But that would increase the worst case to unbounded (effectively infinite) time since an increasingly vanishingly small but finite chance that a solution has not yet been found will exist after any number of steps.

Some kind of simple search would be vastly more bounded, but in practice require more computation.

Of course. The obvious examples are probabilistic, e.g., Monte Carlo methods, as an increase in the problem space decreases the number of samples needed.
> it is trivial to construct an algorithm with this property

Actually, it is impossible to construct an algorithm with this property, at least under any usual computational model.

Every computational model has discrete time units; therefore the amount of time taken cannot be vanishingly small.

(This is assuming the usual notion of complexity, where only the asymptotic behavior matters. It doesn't matter if it gets increasingly faster over the first million input sizes...it only matters what the behavior is as n -> infinity.)

A program cannot be increasingly faster forever.

Consider a hashmap. Hashmaps give expected constant access time, and amortised constant access time on the condition that their size grows at a sufficient rate and that the hash function is a universal hashing function. This is parametric.

Assume that you use a linkedlist for collisions, then for some growth rates, your cost of access is not constant, and is certainly at most O(n).

The point being that the universal hashing function has 1/m probability of collision; if your growthrate is bad. Then m << n, which turns the expected time from constant to at most O(N) because m grows slower than N.

"Faster as the input grows" is perfectly compatible with a minimum number of time units. If your definition insists the minimum of time units must be 0, with infinitesimal time getting involved, then your definition sucks.
For a function to strictly decrease forever, at least one of two statements is true:

1. It decreases by vanishingly smaller amount.

2. It decreases to negative infinity.

So...which is it? Infinitesimal time units, or negative time?

Strictly speaking it either tends to negative infinity, or tends to so real number X, which could be any number positive or negative.
In that latter case, statement 1 is true.
3. Nobody said "strictly".
They said decreases, not stays constant
Plenty of O(n), O(nlogn), O(n^2) algorithms don't strictly increase.

Rounding the decrease to the nearest time unit some of the time is hardly a disqualifier for "decreasing". (Though specifically they said "faster".)

I could imagine some behaviour along those lines in search engines, specifically if you're searching for similar documents. Which none of them seem to — would be useful! Let me know if I'm wrong! — but imagine a search engine that lets you look for documents 'similar to this document here'. Also imagine that's threshold-based; you want equal levels of similarity regardless of the rarity of the input document.

In that case, as the size of the corpus grows it should get easier to find ones in the right range of similarity.

If your computational model requires that the algorithm reads the input then I don't see how it's possible. Even if above some size the algorithm does nothing but exits the time will still grow linearly with the input from reading.

You could imagine an algorithm that takes some astronomical time for size 1, less for 2, etc.. but for any finite time at size 1 there will be some size N where the reading time finally dominates the computation.

Depends what you mean by larger. The example that occurs to me is the priblem of determining whether or not an integer is prime. This can be done relatively quickly for numbers of the form 2^p-1 where p is prime, but would take much longer for a much smaller prime not of this form.
Those would be effectively different problems.

Adding the constraint of only needing to classify a particular form of prime will always result in an algorithm of equal or lesser complexity order.

If you are willing to settle for expected instead of worst case time, there are many string-related algorithms that are O(m/n). Intuition is that as the string size grows it's "more likely" to have some property on expectation.
Same for hash related algorithms and data structures.
caches? the more input the fewer cache misses