by tel 4325 days ago
That's the intuition used to develop the concept, but it becomes increasingly difficult to apply that intuition in more exotic locales.

Thus, it's important to eventually seek out more abstract ways of characterizing the derivative (and integral). In more advanced mathematics, you usually state that the derivative is any operation which follows two rules:

    1. Linearity, d(ax + by) = a d(x) + b d(y)
    2. The Product Rule, d(xy) = x d(y) + d(x) y
and then try to squeeze things until that operation is defined uniquely.[0]
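
To make the two rules concrete, here is a minimal sketch that checks them for the ordinary formal derivative of polynomials. The coefficient-list representation and the helper names (`trim`, `add`, `scale`, `mul`, `d`) are assumptions made for this illustration only; nothing here is specific to any library.

```python
from itertools import zip_longest

# Polynomials as coefficient lists: [c0, c1, c2] means c0 + c1*x + c2*x^2.

def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    return trim(a + b for a, b in zip_longest(p, q, fillvalue=0))

def scale(c, p):
    return trim(c * a for a in p)

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1) if p and q else []
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def d(p):
    """Formal derivative: the term c_k * x^k becomes k * c_k * x^(k-1)."""
    return trim(k * c for k, c in enumerate(p) if k > 0)

x = [1, 2, 3]       # 1 + 2x + 3x^2
y = [0, 1, 0, 4]    # x + 4x^3
a, b = 5, -2

# Rule 1, linearity: d(ax + by) = a d(x) + b d(y)
linear_ok = d(add(scale(a, x), scale(b, y))) == add(scale(a, d(x)), scale(b, d(y)))

# Rule 2, product rule: d(xy) = x d(y) + d(x) y
leibniz_ok = d(mul(x, y)) == add(mul(x, d(y)), mul(d(x), y))
```

Both flags come out true for this pair. The check only exercises the rules on one example; it does not show that they pin the operation down uniquely.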

Likewise, it's often valuable to define integration as nothing more than the relationship such that

      I(region, derivative(quantity)) 
    = I(boundary(region), quantity)
which is known as the Generalized Stokes Rule. It is basically the "Fundamental Theorem of Calculus" on steroids, and it gives a characterization of integration in terms of nothing more than its algebraic/topological relationship with derivation... which is itself abstracted as mentioned above.

---

Why do all this? Because you can squeeze most of Calculus so that it depends only upon this "abstract interface" and then apply things you learned from calculus all over the place.

---

Finally, note that this is more like a "proposed" derivative than "the" derivative on natural numbers. The author notes that linearity fails, for instance. Thus, some intuition might "port over" but we shouldn't expect too much of it to do so.

Which echoes back to your original question—there's not really a notion of instantaneous change for us to be talking about... so how much sense does it make to talk about a derivative here?

Apparently, more than no sense at all, but less than you might want.

[0] Note that all we need to state this property of the derivative is a notion of multiplication and addition. This structure is, at its most abstract, usually called a ring (but can be made even weaker if needed). An example "exotic" ring might be concurrent processes. If P and Q are two processes then P*Q is P "followed by" Q and P + Q is P and Q "together". Can we write a derivative here? Who knows? (As another comment in this thread suggests, this kind of formulation can be used to consider the "derivative of a grammar" to be a parser! It's also well-known that the derivative of an algebraic data type is its "zipper"!)
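
For a taste of the "derivative of a grammar" remark, here is a minimal sketch of the simplest case I'm sure of: the Brzozowski derivative of a regular expression, used as a matcher. The tuple encoding and the names `nullable`, `deriv`, and `matches` are assumptions for this illustration. Note that the rule for sequencing only resembles the product rule (a nullability check stands in for one factor), so treat the analogy loosely.

```python
# Regexes as tuples: ("empty",) matches nothing, ("eps",) matches "",
# ("chr", c), ("alt", r, s), ("seq", r, s), ("star", r).
EMPTY, EPS = ("empty",), ("eps",)

def nullable(r):
    """Does r match the empty string?"""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "chr"):
        return False
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # seq

def deriv(r, c):
    """Regex matching whatever may follow c in a string matched by r."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "chr":
        return EPS if r[1] == c else EMPTY
    if tag == "alt":
        return ("alt", deriv(r[1], c), deriv(r[2], c))
    if tag == "star":
        return ("seq", deriv(r[1], c), r)
    # seq: shaped like a product rule, gated by nullability of the left factor
    head = ("seq", deriv(r[1], c), r[2])
    return ("alt", head, deriv(r[2], c)) if nullable(r[1]) else head

def matches(r, text):
    """Differentiate by each character in turn, then test for the empty string."""
    for c in text:
        r = deriv(r, c)
    return nullable(r)

# (ab)* | c
pattern = ("alt", ("star", ("seq", ("chr", "a"), ("chr", "b"))), ("chr", "c"))
```

With this pattern, `matches(pattern, "abab")` and `matches(pattern, "c")` hold, while `matches(pattern, "aba")` does not.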


I agree, there are useful generalizations of the derivative. But an important detail is that the operator that is discussed in this post is not linear, i.e. D(x+y) != D(x) + D(y), so it doesn't have all the expected ("intuitive") properties that the usual derivative has.

I don't know the origin of this notation, but I can make up a possible explanation. Think of every prime as a function of an abstract variable x, so 2 = two(x) = 2 + x, 3 = three(x) = 3 + x, 5 = five(x) = 5 + x, ..., each evaluated at x = 0.

Then, for example, 60 = sixty(x) = two(x) * two(x) * three(x) * five(x) = two(x)^2 * three(x) * five(x)

A number is a function of the primes in its factorization. You must change the primes into the functions, but you must leave the exponents alone.

Then D is the standard derivative, plus evaluation.

D(60) = sixty'(x) = 2 * two(x) * three(x) * five(x) + two(x)^2 * five(x) + two(x)^2 * three(x) = 2 * 2 * 3 * 5 + 2^2 * 5 + 2^2 * 3 = 60 + 20 + 12 = 92

(I'm mixing the functions and the values when they are evaluated. It's usually not a good idea. If you do that in a Calculus exam the TA will be rightfully angry. But notation in plain text is horrible, so please forgive the technical details.)
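
The construction above can be sketched in a few lines: turn each prime factor p into the polynomial p + x, multiply them out, and read off the derivative at x = 0, which is just the coefficient of x. The names `factorize`, `as_function`, and `D` are made up for this sketch, and trial division is only there to keep it self-contained.

```python
def factorize(n):
    """Prime factors of n by trial division, listed with repetition."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def as_function(n):
    """Coefficients [c0, c1, ...] of the product of (p + x) over the prime factors of n."""
    poly = [1]
    for p in factorize(n):
        scaled = [p * c for c in poly] + [0]   # p * poly
        shifted = [0] + poly                   # x * poly
        poly = [s + t for s, t in zip(scaled, shifted)]
    return poly

def D(n):
    """Ordinary derivative of as_function(n), evaluated at x = 0."""
    poly = as_function(n)
    return poly[1] if len(poly) > 1 else 0

sixty = as_function(60)   # [60, 92, 51, 12, 1]
```

Here `sixty[0]` recovers 60 (the function evaluated at x = 0) and `D(60)` gives 92, matching the hand computation.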

With this idea, if you have numbers A, B, C such that A = B * C, then A(x) = B(x) * C(x). But if A = B + C, then A(x) != B(x) + C(x). This "explains" why the operator D follows the multiplication rule, but not the sum rule.

For example, ten(x) = two(x) * five(x) and six(x) = two(x) * three(x), so sixty(x) = ten(x) * six(x) = two(x) * two(x) * three(x) * five(x).

On the other hand, fifteen(x) = three(x) * five(x) and ten(x) = two(x) * five(x), while twenty-five(x) = five(x)^2, and clearly three(x) * five(x) + two(x) * five(x) != five(x)^2.

Edit: Don't take this "explanation" too literally; for example, this idea doesn't extend to D^2 directly:

D^2(p^2) = D(p+p) = D(2p) = D(2)p+2D(p) = p+2

D^2(p^2) = D(p+p) != D(p)+D(p) = 2D(p) = 2

D^2(7^2) != (seven(x)^2)'' = (seven(x)+seven(x))' = seven'(x)+seven'(x) = 2seven'(x) = 2

The implicit transformations of numbers into functions and the evaluations at 0 cause many problems.
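
The claims in this comment (product rule holds, sum rule fails, and D^2(p^2) = p + 2) are easy to check numerically. This is a minimal sketch that computes D directly as the sum of n/p over the prime factors p of n, counted with multiplicity; the function name `arithmetic_derivative` is my own choice.

```python
def arithmetic_derivative(n):
    """Sum of n // p over the prime factors p of n, counted with multiplicity."""
    total, m, p = 0, n, 2
    while p * p <= m:
        while m % p == 0:
            total += n // p
            m //= p
        p += 1
    if m > 1:          # one prime factor larger than sqrt(n) may remain
        total += n // m
    return total

D = arithmetic_derivative

# Product rule: 60 = 10 * 6
product_rule_holds = D(60) == D(10) * 6 + 10 * D(6)

# Sum rule: 25 = 15 + 10, but D(25) = 10 while D(15) + D(10) = 8 + 7
sum_rule_holds = D(25) == D(15) + D(10)

# Second derivative of p^2 for p = 7: D(49) = 14 and D(14) = 9 = p + 2
second = D(D(49))
```

Here `product_rule_holds` is true, `sum_rule_holds` is false, and `second` is 9 rather than the 2 that the naive "function" picture would suggest.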

> But an important detail is that the operator that is discussed in this post is not linear, i.e. D(x+y) != D(x) + D(y), so it doesn't have all the expected ("intuitive") properties that the usual derivative has.

I think some care is appropriate here. The property you quote is additivity, not linearity; for linearity, one would like a ground field (or a ground ring if one is discussing linear maps between modules, I suppose). Since one is not considering any vector space / module structure on the natural numbers, this may hint why linearity (or even its weaker sibling additivity) is not thought to be necessary here.

(EDIT: With that said, I like very much your description of transforming numbers to functions.)

> That's the intuition used to develop the concept, but it becomes increasingly difficult to apply that intuition in more exotic locales.

Then maybe it's better to use a different term for things that are different. Maybe it's better to keep the term "derivative" for the rate of change of one thing with respect to another thing, and to let the generalizations that aren't that be called something else.

Perhaps. Would you also say "Maybe it's better to keep the term 'sum' for combining the count of two sets, and to let the generalizations that aren't that (e.g., adding integers, adding vectors, adding polynomials, ...) be called something else"?

I'd say that you're historically wrong - that "sum" was used for adding integers, and then found to extend naturally to adding vectors, polynomials, matrices, real numbers, complex numbers, quaternions, and so on. If you want to extend it to combining the count of two sets (because you're trying to re-found all of mathematics on set theory), then the word still fits.

Just don't try to make the set theory version the "real" version, and then try to deny the use of the word in other places. Those other places were using it first; you don't have the right to hijack the word.

By "count of two sets", I don't mean to invoke set theory in any imposing, modern sense. Just the observation that, historically, we were adding counting numbers (in particular, non-negative ones with such properties as "The sum of x and y is always at least as large as x itself") long before we were adding integers.

Regardless, the point still stands: why would you allow the word "sum" to fit all those uses (disparate, but with a web of family resemblances), but not grant the same to "derivative"?

Because it seems to me that the web of family resemblances for "derivative" should include the rate of change of one thing with respect to another, not just that the product rule is satisfied. That is, it seems to me that the attempts to extend "derivative" are extending it to the point that the web of family resemblances no longer fits.

Unfortunately, all I can say in response is that choosing "rate of change" as your centralizing analogy for "derivative" has been shown through the history of mathematics to be a great start, but a slow finish.

Frequently mathematics benefits a lot from abstracting to algebra because, at this point, it's purely about how to define elements and operations by their apparent behavior instead of by their metaphor or interaction with a larger idea (such as notions of space, continuity, rate, change... all of those require quite a lot of mechanics to get in place, while algebra is very light-weight).

As an example, there have been a lot of attempts to discretize calculus for computers. Usually, the goal here is to create a scheme of discretization which, in the limit, resembles the smooth computations we'd like to perform. This has been a successful program in practice, but it's known to be fraught with weird edge cases. It's easy to create discretized situations which violate intuition.

Much of the reason for these failures is that the schemes attempted to generalize from the notion of "rate of change".

There's also the idea of discrete calculus (not "discretized") which is what you get when you apply the algebraic laws alone to some very standard notions of discrete spaces (oriented simplicial complexes, in particular—the simplest discrete object which "has enough topology" to meaningfully have the algebraic laws of integration applied to it).

What you get in this case is a rich theory of discrete calculus which rederives half of manifold learning and graph theory as special cases. All of the laws follow precisely—and they must, as the entire construction was built to prevent such violations.

Finally, you can examine discrete calculus to find a notion of "rate of change" if you like. But it's alien from that which you might be familiar with from continuous domains. It would have been very difficult to arrive at this point trying to generalize that intuition.

But it's practically inevitable (not to say it's easy, just inevitable) if you say that you want to take the algebraic structure of derivatives and integration and apply it to oriented simplicial complexes.
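
As a very small illustration of what "applying the algebraic laws" looks like, here is a sketch of the one-dimensional case only: an oriented graph, where vertices are 0-cells and directed edges are 1-cells. The encoding (chains as dicts from cells to integer coefficients) and the names `boundary`, `d`, and `integrate` are assumptions for this example; a real simplicial complex would also carry triangles and higher cells.

```python
from collections import defaultdict

def boundary(chain):
    """Boundary of a 1-chain {(u, v): coeff}: each edge contributes +v and -u."""
    out = defaultdict(int)
    for (u, v), c in chain.items():
        out[v] += c
        out[u] -= c
    return {vertex: c for vertex, c in out.items() if c != 0}

def d(f):
    """Discrete derivative of a vertex function: (d f)(u, v) = f[v] - f[u]."""
    return lambda edge: f[edge[1]] - f[edge[0]]

def integrate(chain, cochain):
    """Pair a chain with a cochain: coefficient-weighted sum of its values."""
    return sum(c * cochain(cell) for cell, c in chain.items())

f = {"a": 3, "b": -1, "c": 4, "d": 10}                     # a "quantity" on vertices
path = {("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1}       # the "region"
loop = {("a", "b"): 1, ("b", "c"): 1, ("c", "a"): 1}       # a region with no boundary

lhs = integrate(path, d(f))                       # I(region, derivative(quantity))
rhs = integrate(boundary(path), f.__getitem__)    # I(boundary(region), quantity)
```

For the path both sides come out to 7, which is f["d"] - f["a"]. For the closed loop the boundary is empty, so the integral of d(f) around it is 0.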

It seems to me a sum should be at least as large as each of its summands (or rather, it once did). The world paid no heed, and life trudged on. I don't see a need to pick one particular archetypal trait or another and say the word "derivative" (or any other bit of mathematical jargon) mustn't ever be extended by analogy to a situation no longer directly manifesting that trait. A web of family resemblances doesn't depend on any one fiber running through all of it.

It's not as though the similarity of terminology is chosen with intent to confuse; the intent is to illuminate. The name for the generalization is chosen to match its more familiar relative because it is often _useful_ to think in terms of the analogy, imperfect though it be. [It seems humans are such that we would never find our way to powerful abstractions without such overloading; the combinatorial explosion of names would be too great to comprehend.]

I think the concept of sum predates integers by a substantial margin. It's a bit of a theft to extend it from its true domain of applicability, the (positive) natural numbers.

"Mathematics is the art of giving the same name to different things." - Poincaré

Sometimes it's called "derivation". In Grassmann algebra you call it the exterior derivative. Ultimately these are all motivated by trying to find "the" derivative in different contexts and eventually greater and greater generalities... So despite the mild opportunity for confusion they really are in a sense each named "derivative".

Oh man, are you going to hate operator overloading.

To tie some intuition back, here's how

      I(region, derivative(quantity)) 
    = I(boundary(region), quantity)
is just the Fundamental Theorem of Calculus.

What you need to do is take "region" to be an interval of the real line like [a, b]. Now, the boundary of "region" is the set of the end points, {a, b}. Thus, we've now transformed this equation to

    I([a,b], d(quantity)) = I({a, b}, quantity)
If we see I as being a sum, it's clear that the right-hand side must be the same as a regular sum, though we need to account for the idea that we're summing from a to b. Ultimately, we do this by negating where we're coming from[1]

    I([a,b], d(q)) = -q(a) + q(b)
Then we just have to recognize I([a,b], _) as the definite integral from a to b

    DefiniteIntegral(a, b, d(q)) = q(b) - q(a)
and this is a statement that the definite integral of a function on an interval [a,b] is the difference of the antiderivative of that function evaluated at the endpoints: the standard fundamental theorem of calculus!
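
As a sanity check of the last equation, here is a minimal numerical sketch. The choice of q = sin (so d(q) = cos, supplied by hand), the interval, and the midpoint rule are all assumptions picked for illustration.

```python
import math

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

q = math.sin     # the "quantity"
dq = math.cos    # its derivative, written out by hand
a, b = 0.5, 2.0

interior = integrate(dq, a, b)    # I([a, b], d(q))
boundary = -q(a) + q(b)           # I({a, b}, q), with the orientation signs
```

The two values agree up to the discretization error of the midpoint rule.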

But notice that in this transformation we've destroyed some information. No longer is it so apparent that "boundary" and "derivative" have some kind of kinship. We also cannot easily generalize this notion to higher dimensional spaces (unless we already know the Stokes Law trick).

[1] This is a bit arbitrary. If we chose it the other way the equation would still hold, it'd just not be the standard convention and thus less recognizable. We choose this because calculus, it turns out, depends upon a notion of orientation. We have to know whether we're going with or against "the flow".

What an excellent reply. All spot on, but worth it just for this bit alone:

> Apparently, more than no sense at all

> Likewise, it's often valuable to define integration as nothing more than the relationship such that

    >  I(region, derivative(quantity)) 
    > = I(boundary(region), quantity)
> which is known as the Generalized Stokes Rule.

I like your explanation very much, but I think that calling the integral 'the relationship' (emphasis on the definite article) satisfying Stokes is an overstatement. There are many such; for example, I(_, _) = 0.

Generally my whole argument is heavy on existence and light on uniqueness, but that's a great point to emphasize! :)

Great answer, but on a pedantic note, it's not quite true that the derivative of a data type is its zipper. The derivative is the one-hole context, but the zipper is more like a one-substructure context instead. There's a bit more elaboration here: [1].

[1] http://en.wikibooks.org/wiki/Haskell/Zippers

True, I should have been more precise. The zipper is just a technique for using one-hole contexts to make a path as you drill through a recursive type.
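
To illustrate the distinction, here is a minimal sketch for binary trees. Writing one layer of the tree as F(X) = 1 + X * a * X, its derivative in X is 2 * a * X: a direction, a value, and the sibling left behind. That is the `Crumb` below, and the zipper is a focused subtree plus a path of such crumbs. All names (`Node`, `Crumb`, `Zipper`, `down_left`, `down_right`, `up`, `replace`, `rebuild`) are assumptions for this illustration.

```python
from typing import NamedTuple, Optional

class Node(NamedTuple):
    left: Optional["Node"]
    value: int
    right: Optional["Node"]

class Crumb(NamedTuple):
    """One layer with a hole: which way we went, and what we left behind."""
    went_left: bool
    value: int
    sibling: Optional[Node]

class Zipper(NamedTuple):
    focus: Optional[Node]
    path: tuple   # Crumbs, nearest layer first

def down_left(z):
    n = z.focus
    return Zipper(n.left, (Crumb(True, n.value, n.right),) + z.path)

def down_right(z):
    n = z.focus
    return Zipper(n.right, (Crumb(False, n.value, n.left),) + z.path)

def up(z):
    """Plug the focus back into the nearest hole."""
    crumb, rest = z.path[0], z.path[1:]
    if crumb.went_left:
        return Zipper(Node(z.focus, crumb.value, crumb.sibling), rest)
    return Zipper(Node(crumb.sibling, crumb.value, z.focus), rest)

def replace(z, subtree):
    return Zipper(subtree, z.path)

def rebuild(z):
    while z.path:
        z = up(z)
    return z.focus

tree = Node(Node(None, 1, None), 2, Node(None, 3, None))
z = down_right(Zipper(tree, ()))
edited = rebuild(replace(z, Node(None, 30, None)))
```

After these steps `edited` is the original tree with the right leaf replaced by 30, and `rebuild(z)` gives back `tree` unchanged.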