Hacker News new | ask | show | jobs
by mortdeus 2947 days ago
So just to better understand the issue of P=NP, am I right in assuming that an NP problem is like trying to build a neural net for image recognition?

In the sense that it takes a huge amount of time and images to train the thing to become smart (in other words, to go through the entire network of neurons one by one and assign better weight values etc) but when it comes to actually verifying if our network is smart all it takes is to run an image straight through the network?

And the issue of trying to prove N=NP is essentially trying to prove that there isn't a magical way to train a neural network with just one image of training data?

4 comments

No, I don't think building a neural net for image recognition qualifies.

NP problems are essentially problems where the following two properties hold:

1. No better algorithm is known than using brute force -- generating every possible result and checking if it's a valid result.

2. Checking that the given result is valid is doable in polynomial time or better. (This is less critical to understanding, but essentially your `isValidSolution(input, solution)` function needs to take `O(n)`, `O(n^2)`, `O(n^3)`, (etc.) time or better.)

Essentially, a NP problem is a problem where you have to brute force the solution, but it's easy to know when you've found the correct solution.

If P = NP that means that there are no problems where this is true -- it would mean that any problem where it's easy to know if you've found the solution also has an algorithm for finding it more efficiently than brute force.

Your neural network example doesn't apply because training a neural network doesn't require brute forcing the solution space of the neural network weights. That would be crazy. So it's a problem in P.

Re1: the problem here goes a bit deeper, classically the NP problem is a problem which requires a non-deterministic turing machine to solve. Ie a machine that can explore multiple outcomes at once, even the entire problem space to some extend.

A non-deterministic turing machine can interprete multiple instructions (write 5 to tape and go to state 2 OR write 3 to tape and go to state 19) and (depending on interpretation) use the instruction that gives the result fastest or explore all instruction branches at once.

Note that the "OR" in there is not a classical if-else, it means literally the machine can pick one. There is no memory value that tells it which is correct.

Basically, problems that are polynomial on a NDTM will likely be NP on a normal machine (with some exceptions depending on the problem).

Re2: As above, verifying is actually easy as you can then simply follow the execution branch the NDTM picked on a normal deterministic turing machine.

Atleast this above is what I learned in CS course 2nd semester.

I've read a little bit about turing machines and there is also a variation that includes what is called an "oracle" machine. Do you know anything about that?
An oracle machine is usually a turing machine that can tell you things about other turing machines.

An example would be the Halting-Oracle, which can tell you if a program will halt or not without running it. (Of course, this oracle doesn't actually work since your program can have the oracle's output as input and simply do the opposite)

A NP/P Oracle might be an oracle that can tell you if a more efficient algorithm for your problem exists.

Such oracles can be setup to prove certain assumptions (or disprove) based on simply logical statements (see halting problem oracle above).

Oracle machines are also not really turing machines, it's usually assumed they can do the oracling in a single operation and they're usually a black magic box (there is no need to explain how they work, they're an abstract concept).

From "TRAINING A 3-NODE NEURAL NETWORK IS NP-COMPLETE":

We consider a 2-layer, 3-node, n-input neural network whose nodes compute linear threshold functions of their inputs. We show that it is NP-complete to decide whether there exist weights and thresholds for the three nodes of this network so that it will produce output consistent with a given set of training examples.

https://papers.nips.cc/paper/125-training-a-3-node-neural-ne...

but trying to devise a program that can predict somebody's password without using brute force is a NP problem? Since its hard to discover what the password is, but trivial to check if it lets you in the vault?

I get the reasoning why people aren't sure if P = NP.

Here is another question.

What defines time? And what about a solution?

Consider the fact that in the theory of general relativity we have the twin paradox.

What if I sent a computer on a rocket at close to the speed of light and then on earth I had the same machine running the calculation that takes a million years, and then when the other computer gets back from it's interstellar vacation, I just have the computer ask the other what it's part of the solution is.

According to the computer that flew away the amount of time that it takes to solve the problem can be considered ~P in a way.

My intuition is telling me that this is probably the wrong way to think about time complexity though.

Time and solution are usually abstract concepts here.

The time a turing machine takes to solve the problem can be considered local and in absense of general relativity, it doesn't matter much.

When you talk about time in NP/P, you usually talk about the O Notation (Big O and friends) which gives you the driving factor of the time needed to solve.

Ie, O(n) = 2 + 3x + 9x^2 + 9^x is n^x complexity because this part of the equation will grow much quicker than the others. In reality of course, it can take a while for the biggest part to overtake which is why searching an array linearly in CPU cache can be faster than a hashmap.

The solution in NP/P is usually reduced to either a predicate being proven true or the answer being boolean or can be reduced to either. (ie, solving a jig-saw puzzle can be reduced to "is this jig-saw puzzle solvable" which can be proven by presenting a solution)

The problem you present is not in the scope of NP/P. You're already involving multiple computers and generally here, people still assume general relativity doesn't exist. Plus NP still holds because one computer simply accessed a blackbox function (oracle) while the other solved it in NP.

Time-complexity isn't the only complexity either, you also have space-complexity.

SC tells you how much memory you need to solve a problem. Or rather, how quickly that memory need will grow. You can trade a lot of NP problems to P problems if you have "NP" space complexity.

SC complexity won't spit out a byte-accurate estimation of memory but it can tell you that O(n) = 100 + 100*x will grow linearly in size (O = x) and O(n) = 100 + x^2 will grow exponentially in memory usage (O = x^2).

> According to the computer that flew away the amount of time that it takes to solve the problem can be considered ~P in a way.

No, since it can only save a constant factor of time. (Also, dragging random physics into a mathematical model based on a different view of the world isn't really helpful to understanding it. "Time" is computational steps, not seconds)

P = NP is trivially true on machines which compute everything instantly (using paradoxes).
Fuzzy analogies will get you nowhere.

> And the issue of trying to prove N=NP is essentially trying to prove that there isn't a magical way to train a neural network with just one image of training data

No it's not. You can't do reasoning on bullshit inaccurate analogies.

I'm trying to summarize what I remember on the topic, so if anyone finds something wrong, I'd appreciate a correcting comment :)

Without knowing too much about neural networks myself:

Theory stuff:

P, NP, NP-hard and NP-complete only refer to classes of complexity (of algorithms), i.e. you can "categorize" decision problems this way (by currently known solutions or mathematical proofs of lower and upper boundaries for complexity).

P contains any problem for which there is an algorithm that can decide it on a deterministic turing machine (read: abstract definition for a computer) in an amount of time that is growing polynomially with the size (n) of its input.

Example: Checking if a list contains a specific item can be achieved in O(n) time - just iterate through the list. Searching in a pre-sorted list (binary search) can be achieved in O(log(n)) time. This means that if your list is longer by a large factor, both algorithms will take accordingly more time, which means that the amount of computation necessary for the sorted case will grow much less than the generic case. Both problems are within P, though.

NP contains all problems that can be solved by a nondeterministic turing machine in polynomial time. If you think of your problem as an automaton with states, this machine could deal with ambiguous state transitions in an efficient manner, "magically" only picking those options that succeed. This is basically the promise of a quantum computer. P is thus a subset of NP, since the nondeterministic touring machine can do anything a deterministic one could do too.

You can still solve those problems on a normal computer, but not in polynomial time (unless P=NP, which is neither officially proven nor disproven until now).

Then there is the idea that you can solve problems using solutions for other problems merely by transforming the input in polynomial time.

NP hard problems are those which you could use to solve any other problem within NP this way. Those don't necessarily have to be in NP - their complexity could be even worse. There is an intersection with NP, though: Problems within NP that you can use to solve all other problems in NP. Those are called NP complete.

If you would now find a way to transform the input for an NP complete problem into one for any P problem in the same way (remember: in polynomial time for a DTM), then you would have effectively proven that P=NP.

On the NN comment:

I've always understood AI in general and especially trained algorithms to be heuristics. If you talk about P and NP, you're talking about mathematically proven solutions, so the comparison doesn't work out in the strict sense. On the same note, NNs are often used to solve problems that don't have strictly "correct" solutions. If your problem is "does image X depict a goat?", you won't get far with traditional means. A NN, on the other hand, trades accuracy (false positives and false negatives will happen) and up-front work (training with a dataset not provided by the actual problem instance) for speed.

It is related to the discussion though, since a generic trainig algorithm for neural networks actually will have a non-polynomial complexity as far as I understand (at least Google tells me so ;-) ). Note however that training a NN is also not a decision problem: The result is not true or false, but the resulting network.

Edit: Replaced a reference to sorting with searching for an item, since sorting is not a decision problem.

What I am trying to understand is why this is considered a solvable problem, when essentially you need to devise a method to find a better solution to the problem in order to prove that there isn't a solution.

Aren't all proofs flawed in this way?

Like for example, 2 = fish. Prove that there isn't an undiscovered mathematical axiom that makes this not nonsense?