|
You make several points. It appears that you don't like my main point but have no good argument against it or replacement for it; I will respond and try to explain my main point again. One of your points seems to be that the selection of 10 best algorithms is not very good. I would agree: I would have selected heap sort instead of quicksort because, given a positive integer n, the execution time of heap sort in sorting n items is proportional to n ln(n) in both average case and worst case and, thus, heap sort meets the Gleason bound for the fastest possible sort by comparing pairs of keys. The execution time of quicksort on average is n ln(n), and in practice faster than heap sort, but in worst case appears to run in n^2. Quicksort seems to do better on locality of reference for a virtual memory system, but there are ways to improve the locality of reference of heap sort. For my main point, that the list of 10 best algorithms was well chosen is not very important. Here is my main point again: "For reasonably complicated problems, the key is some applied math, and we take a corresponding algorithm seriously only because of the logically prior applied math." So, since you don't like this point, I will try to explain in more detail: First, we are considering 'algorithms'. So, let's agree on what we mean by an algorithm: I will accept running code in a common programming language -- C/C++, Fortran, PL/I, etc. -- or something similar in 'pseudo-code'. So, the issue is, given an algorithm, what do we need to take it "seriously", that is, be sure it does what it is intended to do? Second, briefly let's return to sorting: Quicksort and heap sort are darned clever. But, given the algorithms, in the form I just mentioned, it's easy enough to trace the logic informally and confirm that the algorithms actually do sort. For the running times, finding those is more work but not very notable math. So, net, for these sorting algorithms, it is easy enough for us to take them seriously for their ability to do what is promised -- sort or sort in n ln(n). You also mentioned data structures. Well we can say much the same for AVL trees or the data structures used in a fast implementation of the network simplex algorithm for least cost capacitated network flows, etc. For some of the data structures used in, say, dynamic programming, more is needed to take the algorithms for those data structures seriously. Similarly for some uses of k-D trees. So, for some algorithms and data structures, we can take them seriously just 'by inspection'. Third, consider, as in the list of 10 algorithms, trying to solve a fairly complicated problem. Examples could include the discrete Fourier transform, finding eigenvalues and eigenvectors of a symmetric, positive definite matrix, least cost network flows, matching to minimize the most expensive single match used, linear programming, quadratic programming, iterative solution of large systems of linear equations (e.g., via Gauss-Seidel). Maybe we are trying to solve a problem in non-linear programming and want an algorithm to achieve the Kuhn-Tucker conditions. Given an algorithm for one of these problems, to take the algorithm seriously we need more than just 'inspection'. So, since just 'inspection' no longer works, the question is, how can we take such an algorithm seriously? Actually, there is some fairly tricky math needed for us to take seriously the simplex algorithm for linear programming: For the set of real numbers R, a positive integer n, and Euclidean R^n with the usual topology, let F be the intersection of finitely many closed half spaces of R^n, and let linear function z: R^n --> R. Claim: If z is bounded above on F, then z achieves its least upper bound. For us to take the simplex algorithm seriously, we need to know that this claim is true. Note: The proof is not just usual analysis with converging sequences from Rudin's 'Principles'. For more, given a linear programming problem, it may be feasible or infeasible. If the problem is feasible, then it may be bounded or unbounded. If the problem is feasible and bounded, then we want to know that there is an optimal solution and that the simplex algorithm can find one in finitely many iterations. Since the simplex algorithm considers only extreme point solutions, we would like to know that, if there is an optimal solution, then there is an optimal extreme point solution. In the simplex algorithm, there is a sufficient condition for optimality, but there can be optimal solutions and optimal extreme point solutions without this sufficient condition. So, we need to know that the simplex algorithm can achieve the sufficient condition. The nicest solution to these issues I know of is via Bland's rule by R. Bland, long at SUNY. It's not obvious. Again, let's be more clear: Given an algorithm as above, that is, just code or pseudo-code, for the simplex algorithm with Bland's rule, solution of linear equations with double precision inner product accumulation and iterative improvement (Forsythe and Moler), detecting and handling degeneracy (basic variables with value 0), detecting infeasibility, detecting unboundedness, considering the reduced costs as a sufficient condition for optimality, the code will look like just so much gibberish with no good reason to take it seriously. To take the code seriously, we need some prior math where the algorithm just implements the manipulations specified by the math. Yes, apparently computer science regards the simplex algorithm as an algorithm in computer science: E.g., the algorithm is discussed in Chapter 29 of Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein, 'Introduction to Algorithms, Second Edition', The MIT Press Cambridge. commonly called 'CLRS'. E.g., recently I encountered a question: For some positive integer n, given n client computers and one frequency band with bandwidth x Mbps, design a wireless network to provide all x Mbps to all n client computers simultaneously. Is such possible? Yes. Does this violate Shannon's information theory? No. So, how to take any such proposal seriously? Sure, some math. Basically the solution is one heck of a strange 'antenna' pattern with n nodes with each client getting just the node with just their data. It all boils down just to linear algebra after taking some Fourier transforms. So, again, given a fairly complicated problem, when trying to evaluate an algorithm to solve this problem where just 'inspection' no longer works, the question is, how can we take such an algorithm seriously? Well, to take the algorithm seriously, we need more than just the algorithm, that is, more than just the code or pseudo-code. As I mentioned, one way to take the algorithm seriously is a lot of empirical evidence. Alas, this can be slow and is not very satisfactory. So, what is left? You mentioned nothing. And computer science has nothing. I gave a solution: Start with the real problem. Get a solution via some applied math complete with theorems and proofs. That is, properties of the real problem provide assumptions for the theorems, and the conclusions of the theorems provide the desired solution. Then the software, that is, the 'algorithm', implements the data manipulations specified by the math. We take the math seriously because of the theorems and proofs. Then we take the algorithm seriously because, and only because, we took the math seriously. Net, without the math, there is little reason to take the algorithm seriously. With the math, to evaluate the algorithm, we should evaluate the math and then confirm that the algorithm does what the math specifies. So, for your "Your knowledge of computer science is evidently shallow. Educate yourself and you might think twice before making such ignorant pronouncements." What is relevant here is what I wrote; whether my "knowledge of computer science" is "shallow" or not is irrelevant. I said nothing "ignorant"; you gave no solution to the question of how to take an algorithm seriously when 'inspection' is not sufficient; and I gave apparently the only good solution we have. |
You don't have to convince me that mathematics is important--my background is in pure mathematics. I just don't think CS people are hapless fools who need applied math people like you from the outside to help them out. In my book, the non-systems part of CS is already a part of mathematics.
Since you bring it up, I've always considered numerical methods as firmly a part of applied mathematics. Combinatorial search and optimization problems are in the heartland of CS. The simplex method straddles numerical optimization and combinatorial optimization and is a bit of a hermaphrodite. Low-dimensional LPs are important in computational geometry and that field has produced some beautiful algorithms. Here's Seidel's randomized algorithm: Throw out a random hyperplane and solve the smaller problem recursively. If this optimum satisfies the excluded constraint then we are done. Otherwise the true optimum must lie on the hyperplane, so we can reduce the dimension of the problem by 1.
> Net, without the math, there is little reason to take the algorithm seriously.
An algorithm _is_ mathematics. The reason I said you seemed ignorant of computer science is this apparent belief that computer science is a random bunch of algorithms without any supporting theory.
> Well, to take the algorithm seriously, we need more than just the algorithm
Your premise is based on a strawman. How do you think algorithms are designed? They aren't generated randomly. They are generated based on exactly the kind of insight that generates theorems and proofs. There may not be tight proofs of every property we'd like to know for certain. That is certainly true for the simplex method! Its worst-case running time is exponential (e.g. Klee-Minty), and there is a small cottage industry devoted to proving its running time is polynomial in the average case (for various definitions of 'average case'). People have been using the simplex method very effectively since its inception despite knowing that its pathological behavior can be very bad indeed. Had you wanted to make your point more effectively, you should have picked interior point methods, which have the additional benefit of working for a much wider class of convex problems like SDPs.
> Claim: If z is bounded above on F, then z achieves its least upper bound. Note: The proof is not just usual analysis with converging sequences from Rudin's 'Principles'.
This has nothing to do with linearity or convexity. It's true for any continuous function f : X -> R on a closed subset F of a complete topological space X. If f is bounded above on F then its supremum on F is achieved at some x in X. By the supremum property and continuity of f, you can find a net in F which accumulates around x. This Cauchy net converges to x since X is complete. Because F is closed it contains its own accumulation points. Hence x is in F.
This proof is exactly at the level of Baby Rudin if you drop back the generality a bit and work with metric spaces.
> For us to take the simplex algorithm seriously, we need to know that this claim is true.
Nonsense. Any implementation of the simplex method is already going to introduce round-off error, so the theoretical difference between optimizing on a set and its closure has no practical consequences. If your point is that the achievement of the supremum ensures termination of the simplex method, that is false. The simplex method is hill climbing on a polytope's vertex graph. Whenever there is a local tie between neighboring vertices you need a tie-breaking pivoting rule. Bad pivoting rules (e.g. lexicographic ordering) will lead to cycling and non-termination. In the absence of such local ties, all you need to know to prove termination is that the number of vertices is finite and that the objective function strictly decreases each step. Even the proof of termination when pivoting with Bland's rule is purely combinatorial.
An example of something that's actually important in practice is how strong Lagrange duality supplies non-heuristic stopping criteria for convex programs via dual feasible points. That way we know what we're giving up by being suboptimal. Duality theory is also much richer and subtler than the kind of mindless definition-chasing freshman homework problem you bought up above.