Hacker News
by nilsimsa 4325 days ago
Shouldn't derivatives be with respect to the change of another variable? For example, d/dx of f(x).
6 comments

That's the intuition used to develop the concept, but it becomes increasingly difficult to apply that intuition in more exotic locales.

Thus, it's important to eventually seek out more abstract ways of characterizing the derivative (and integral). In more advanced mathematics, you usually state that the derivative is any operation which follows two rules:

    1. Linearity, d(ax + by) = a d(x) + b d(y)
    2. The Product Rule, d(xy) = x d(y) + d(x) y
and then try to squeeze things until that operation is defined uniquely.[0]
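
As a rough illustration of the two rules above, here is a minimal sketch that checks them for the ordinary formal derivative of polynomials. The coefficient-list representation and the helper names (`add`, `scale`, `mul`, `d`, `trim`) are assumptions made for this example, not anything from the thread.

```python
from itertools import zip_longest

# Polynomials are coefficient lists, lowest degree first: [1, 0, 3] is 1 + 3x^2.

def add(p, q):
    return [a + b for a, b in zip_longest(p, q, fillvalue=0)]

def scale(c, p):
    return [c * a for a in p]

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def d(p):
    # Formal derivative: the coefficient a of x^k contributes k * a to x^(k-1).
    return [k * a for k, a in enumerate(p)][1:] or [0]

def trim(p):
    # Drop trailing zeros so equal polynomials compare equal as lists.
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

x = [1, 2, 3]      # 1 + 2x + 3x^2
y = [0, -1, 0, 4]  # -x + 4x^3
a, b = 5, -2

# Rule 1: d(ax + by) = a d(x) + b d(y)
linearity_holds = trim(d(add(scale(a, x), scale(b, y)))) == trim(
    add(scale(a, d(x)), scale(b, d(y)))
)
# Rule 2: d(xy) = x d(y) + d(x) y
product_rule_holds = trim(d(mul(x, y))) == trim(add(mul(x, d(y)), mul(d(x), y)))

print(linearity_holds, product_rule_holds)
```

Nothing here uses limits or rates of change; only addition and multiplication are needed to state and test both rules.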

Likewise, it's often valuable to define integration as nothing more than the relationship such that

      I(region, derivative(quantity)) 
    = I(boundary(region), quantity)
which is known as the Generalized Stokes Rule. It basically is the "Fundamental Theorem of Calculus" on steroids, and it gives a characterization of integration in terms of nothing more than its algebraic/topological relationship with derivation... which is itself abstracted as mentioned above.

---

Why do all this? Because you can squeeze most of Calculus so that it depends only upon this "abstract interface" and then apply things you learned from calculus all over the place.

---

Finally, note that this is more like a "proposed" derivative than "the" derivative on natural numbers. The author notes that linearity fails, for instance. Thus, some intuition might "port over" but we shouldn't expect too much of it to do so.

Which echoes back to your original question—there's not really a notion of instantaneous change for us to be talking about... so how much sense does it make to talk about a derivative here?

Apparently, more than no sense at all, but less than you might want.

[0] Note that all we need to state this property of the derivative is a notion of multiplication and addition. This structure is, at its most abstract, usually called a ring (but can be made even weaker if needed). An example "exotic" ring might be concurrent processes. If P and Q are two processes then P*Q is P "followed by" Q and P + Q is P and Q "together". Can we write a derivative here? Who knows? (As another comment in this thread suggests, this kind of formulation can be used to consider the "derivative of a grammar" to be a parser! It's also well-known that the derivative of an algebraic data type is its "zipper"!)

I agree, there are useful generalizations of the derivative. But an important detail is that the operator that is discussed in this post is not linear, i.e. D(x+y) != D(x) + D(y), so it doesn't have all the expected ("intuitive") properties that the usual derivative has.

I don't know the origin of this notation, but I can make up a possible explanation. Think of every prime as a function of an abstract variable x, so 2 = two(x) = 2 + x, 3 = three(x) = 3 + x, 5 = five(x) = 5 + x, ..., each evaluated at x=0.

Then, for example, 60 = sixty(x) = two(x) * two(x) * three(x) * five(x) = two(x)^2 * three(x) * five(x)

A number is a function of the primes in its factorization. You must change the primes into the functions, but you must leave the exponents alone.

Then D is the standard derivative, plus evaluation.

D(60) = sixty'(x) = 2 * two(x) * three(x) * five(x) + two(x)^2 * five(x) + two(x)^2 * three(x) = 2 * 2 * 3 * 5 + 2^2 * 5 + 2^2 * 3 = 60 + 20 + 12 = 92

(I'm mixing the functions and the values when they are evaluated. It's usually not a good idea. If you do that in a Calculus exam the TA will be rightfully angry. But the notation in plain text is horrible, so please forgive the technical details.)
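
The D(60) computation above can be sketched in a few lines. This assumes the definition discussed in the thread (D(p) = 1 for every prime p, extended by the product rule); the names `prime_factors` and `D`, and the convention D(0) = D(1) = 0, are choices made for this example.

```python
def prime_factors(n):
    """Return the prime factors of n >= 2 with multiplicity, e.g. 60 -> [2, 2, 3, 5]."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def D(n):
    """Arithmetic derivative: D(p) = 1 for primes, extended by the product rule."""
    if n < 2:
        return 0  # assumed convention: D(0) = D(1) = 0
    # Differentiating the product one factor at a time replaces a single
    # prime p by D(p) = 1, so each factor contributes n // p.
    return sum(n // p for p in prime_factors(n))

print(D(60))  # 30 + 30 + 20 + 12 = 92
```

Grouping the two factors of 2 gives the same 60 + 20 + 12 = 92 as in the comment.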

With this idea, if you have numbers A, B, C such that A = B * C, then A(x) = B(x) * C(x). But if A = B + C then A(x) != B(x) + C(x). This "explains" why the operator D follows the multiplication rule, but not the sum rule.

For example, ten(x) = two(x) * five(x) and six(x) = two(x) * three(x), so sixty(x) = ten(x) * six(x) = two(x) * two(x) * three(x) * five(x)

But fifteen(x) = three(x) * five(x) and ten(x) = two(x) * five(x), while twenty-five(x) = five(x)^2, and clearly three(x) * five(x) + two(x) * five(x) != five(x)^2.
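
A quick numerical check of both claims, as a sketch: here D is defined recursively straight from the product rule, with D(p) = 1 for primes. The helper `smallest_prime_factor` and the test ranges are assumptions for illustration.

```python
def smallest_prime_factor(n):
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p
        p += 1
    return n  # n itself is prime

def D(n):
    if n < 2:
        return 0  # assumed convention: D(0) = D(1) = 0
    p = smallest_prime_factor(n)
    if p == n:
        return 1  # primes have derivative 1
    m = n // p
    return m + p * D(m)  # product rule: D(p * m) = D(p) * m + p * D(m)

# The multiplication rule holds for every pair tried here ...
product_rule_ok = all(
    D(a * b) == D(a) * b + a * D(b)
    for a in range(1, 40)
    for b in range(1, 40)
)

# ... but the sum rule already fails for 10 + 15 = 25.
sum_rule_counterexample = (D(10) + D(15), D(25))

print(product_rule_ok, sum_rule_counterexample)
```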

Edit: Don't take this "explanation" too literally. For example, this idea doesn't extend to D^2 directly:

D^2 (p^2) = D(p+p) = D(2p) = D(2)p+2D(p) = p+2

D^2(p^2) = D(p+p) != D(p)+D(p) = 2D(p) = 2

D^2(7^2) != (seven(x)^2)'' = (seven(x)+seven(x))' = seven'(x)+seven'(x) = 2seven'(x) = 2

The implicit transformations of numbers into functions and the evaluations at 0 cause many problems.
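
The D^2(p^2) = p + 2 identity is easy to spot-check. The sketch below assumes the same arithmetic derivative (D(p) = 1, product rule) and uses a compact trial-division version of `D`; the list of primes is arbitrary.

```python
def D(n):
    """Arithmetic derivative of n >= 1 by trial division."""
    total, m, p = 0, n, 2
    while p * p <= m:
        while m % p == 0:
            total += n // p  # each prime factor p contributes n / p
            m //= p
        p += 1
    if m > 1:
        total += n // m  # leftover prime factor
    return total

primes = [2, 3, 5, 7, 11, 13]
first = {p: D(p * p) for p in primes}       # D(p^2) = 2p
second = {p: D(D(p * p)) for p in primes}   # D(2p) = D(2) * p + 2 * D(p) = p + 2

print(first)
print(second)
```

The naive "second derivative of seven(x)^2" would give 2 for every prime, whereas the values computed here grow with p.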

> But an important detail is that the operator that is discussed in this post is not linear, i.e. D(x+y) != D(x) + D(y), so it doesn't have all the expected ("intuitive") properties that the usual derivative has.

I think some care is appropriate here. The property you quote is additivity, not linearity; for linearity, one would like a ground field (or a ground ring if one is discussing linear maps between modules, I suppose). Since one is not considering any vector space / module structure on the natural numbers, this may hint at why linearity (or even its weaker sibling additivity) is not thought to be necessary here.

(EDIT: With that said, I like very much your description of transforming numbers to functions.)

> That's the intuition used to develop the concept, but it becomes increasingly difficult to apply that intuition in more exotic locales.

Then maybe it's better to use a different term for things that are different. Maybe it's better to keep the term "derivative" for the rate of change of one thing with respect to another thing, and to let the generalizations that aren't that be called something else.

Perhaps. Would you also say "Maybe it's better to keep the term 'sum' for combining the count of two sets, and to let the generalizations that aren't that (e.g., adding integers, adding vectors, adding polynomials, ...) be called something else"?
I'd say that you're historically wrong - that "sum" was used for adding integers, and then found to extend naturally to adding vectors, polynomials, matrices, real numbers, complex numbers, quaternions, and so on. If you want to extend it to combining the count of two sets (because you're trying to re-found all of mathematics on set theory), then the word still fits.

Just don't try to make the set theory version the "real" version, and then try to deny the use of the word in other places. Those other places were using it first; you don't have the right to hijack the word.

By "count of two sets", I don't mean to invoke set theory in any imposing, modern sense. Just the observation that, historically, we were adding counting numbers (in particular, non-negative ones with such properties as "The sum of x and y is always at least as large as x itself") long before we were adding integers.

Regardless, the point still stands: why would you allow the word "sum" to fit all those uses (disparate, but with a web of family resemblances), but not grant the same to "derivative"?

Because it seems to me that the web of family resemblances for "derivative" should include the rate of change of one thing with respect to another, not just that the product rule is satisfied. That is, it seems to me that the attempts to extend "derivative" are extending it to the point that the web of family resemblances no longer fits.
I think the concept of sum predates integers by a substantial margin. It's a bit of a theft to extend it from its true domain of applicability, the (positive) natural numbers.
"Mathematics is the art of giving the same name to different things." - Poincaré
Sometimes it's called "derivation". In Grassmann algebra you call it the exterior derivative. Ultimately these are all motivated by trying to find "the" derivative in different contexts and eventually greater and greater generalities... So despite the mild opportunity for confusion they really are in a sense each named "derivative".
Oh man, are you going to hate operator overloading.
To tie some intuition back, here's how

      I(region, derivative(quantity)) 
    = I(boundary(region), quantity)
is just the Fundamental Theorem of Calculus.

What you need to do is take "region" to be an interval of the real line like [a, b]. Now, the boundary of "region" is the set of the end points, {a, b}. Thus, we've now transformed this equation to

    I([a,b], d(quantity)) = I({a, b}, quantity)
If we see I as being a sum, it's clear that the right-hand side must be the same as a regular sum, though we need to account for the idea that we're summing from a to b. Ultimately, we do this by negating where we're coming from[1]

    I([a,b], d(q)) = -q(a) + q(b)
Then we just have to recognize I([a,b], _) as the definite integral from a to b

    DefiniteIntegral(a, b, d(q)) = q(b) - q(a)
and this is a statement that the definite integral of a function on an interval [a,b] is the difference of the antiderivative of that function evaluated at the endpoints: the standard fundamental theorem of calculus!
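
A small numerical sanity check of that last equation, as a sketch: the choice of q = sin, the interval, and the midpoint-rule helper `definite_integral` are all assumptions picked for illustration.

```python
import math

def definite_integral(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

q = math.sin    # the "quantity"
dq = math.cos   # its derivative, d(q)
a, b = 0.5, 2.0

lhs = definite_integral(dq, a, b)  # I([a, b], d(q))
rhs = -q(a) + q(b)                 # I({a, b}, q), with the orientation signs

print(abs(lhs - rhs) < 1e-8)
```

The left side sums over the whole interval while the right side only looks at its two boundary points, which is the one-dimensional shadow of the region/boundary relationship.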

But notice that in this transformation we've destroyed some information. No longer is it so apparent that "boundary" and "derivative" have some kind of kinship. We also cannot easily generalize this notion to higher dimensional spaces (unless we already know the Stokes Law trick).

[1] This is a bit arbitrary. If we chose it the other way the equation would still hold, it'd just not be the standard convention and thus less recognizable. The reason we choose this is because calculus, it turns out, depends upon a notion of orientation. We have to know whether we're going with or against "the flow".

What an excellent reply. All spot on, but worth it just for this bit alone:

> Apparently, more than no sense at all

> Likewise, it's often valuable to define integration as nothing more than the relationship such that

    >  I(region, derivative(quantity)) 
    > = I(boundary(region), quantity)
> which is known as the Generalized Stokes Rule.

I like your explanation very much, but I think that calling the integral 'the relationship' (emphasis on the definite article) satisfying Stokes is an overstatement. There are many such; for example, I(_, _) = 0.

Generally my whole argument is heavy on existence and light on uniqueness, but it's a great point to emphasize it! :)
Great answer, but on a pedantic note, it's not quite true that the derivative of a data type is its zipper. The derivative is the one-hole context, but the zipper is more like a one-substructure context instead. There's a bit more elaboration here: [1].

[1] http://en.wikibooks.org/wiki/Haskell/Zippers

True, I should have been more precise. The zipper just is a technique for using one-hole contexts to make a path as you drill through a recursive type.
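
For a feel of what a "one-hole context" is, here is a minimal sketch for a plain sequence, using Python lists as a stand-in for an algebraic data type. The names `one_hole_contexts` and `plug` are made up for this example.

```python
def one_hole_contexts(xs):
    """All ways to punch one hole in xs: (before, removed element, after)."""
    return [(xs[:i], xs[i], xs[i + 1:]) for i in range(len(xs))]

def plug(before, value, after):
    """Fill the hole again to rebuild a full sequence."""
    return before + [value] + after

xs = ["a", "b", "c"]
contexts = one_hole_contexts(xs)

print(len(contexts))  # one context per position
print(all(plug(before, x, after) == xs for before, x, after in contexts))
```

A sequence of n elements has n such contexts, each holding n - 1 elements, which mirrors d/dx of x^n being n x^(n-1).
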
The standard way of generalizing the derivative is to require it to be linear and to satisfy a version of the product rule. See, for instance, derivations:

http://en.wikipedia.org/wiki/Derivation_(differential_algebr...

In the case of this article, the proposed definition is not linear, so it is indeed a bizarre candidate for a derivative.

In a sense this is d/d{primes}. D(8) is how fast 8 changes with respect to 2, while D(60) is how fast 60 changes if 2, 3, and 5 simultaneously increase (at the same rate).
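
That reading can be tested with a finite difference, as a rough sketch: bump every prime factor by the same small amount t and measure how fast the product changes at t = 0. The helper names and the step size `h` are assumptions for illustration.

```python
def bumped_product(primes, t):
    """Product of the prime factors after each one is increased by t."""
    out = 1.0
    for p in primes:
        out *= p + t
    return out

def rate_of_change(primes, h=1e-6):
    # Central difference approximation of d/dt at t = 0.
    return (bumped_product(primes, h) - bumped_product(primes, -h)) / (2 * h)

rate_8 = rate_of_change([2, 2, 2])       # 8 = 2^3
rate_60 = rate_of_change([2, 2, 3, 5])   # 60 = 2^2 * 3 * 5

print(round(rate_8), round(rate_60))
```

Up to floating-point error these agree with D(8) = 12 and D(60) = 92.
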
So... we can use it to determine the age of the universe, if we know at which point in time 6 * 9 = 42 held, by extrapolating back to at what point 6 * 9 = 0? ;-)

[ed: looks like my multiplication signs got eaten by hn]

No, it can be an operator on any kind of (usually) ring or more general algebraic structure. What you refer to is the "usual" derivative of functions of one variable, just one of many derivatives one can define.
You can even take the derivative of a grammar to get a parser!

http://matt.might.net/papers/might2011derivatives.pdf

I implemented Brzozowski's regex derivatives to build a regex implementation back-end. That back-end is used whenever exotic constructs (negation, intersection) appear in the abstract syntax of the regex; in their absence, the implementation falls back on the NFA-graph-based back end.
Yes, this seems to build on the idea of regex derivatives. If regex derivatives can be used to transform a regular expression into a recognizer for strings, why not transform a more general grammar into a recognizer of strings?
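
For readers who haven't seen regex derivatives, here is a minimal sketch of a Brzozowski-style matcher. It is not the implementation described in the comment above, and it omits negation, intersection, and the simplification steps a practical version would need; the class and function names are assumptions for this example.

```python
from dataclasses import dataclass

# Regex AST. Empty matches nothing; Eps matches only the empty string.
@dataclass(frozen=True)
class Empty:
    pass

@dataclass(frozen=True)
class Eps:
    pass

@dataclass(frozen=True)
class Char:
    c: str

@dataclass(frozen=True)
class Alt:
    left: object
    right: object

@dataclass(frozen=True)
class Seq:
    first: object
    second: object

@dataclass(frozen=True)
class Star:
    inner: object

def nullable(r):
    """True if r matches the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    if isinstance(r, Seq):
        return nullable(r.first) and nullable(r.second)
    return False  # Empty, Char

def deriv(r, c):
    """Regex matching every suffix s such that r matches c + s."""
    if isinstance(r, Char):
        return Eps() if r.c == c else Empty()
    if isinstance(r, Alt):
        # "Sum rule": differentiate both alternatives.
        return Alt(deriv(r.left, c), deriv(r.right, c))
    if isinstance(r, Seq):
        head = Seq(deriv(r.first, c), r.second)
        # If the first part can be empty, c may also start the second part.
        return Alt(head, deriv(r.second, c)) if nullable(r.first) else head
    if isinstance(r, Star):
        return Seq(deriv(r.inner, c), r)
    return Empty()  # Empty, Eps

def matches(r, s):
    # Differentiate by each character in turn, then ask whether
    # what remains accepts the empty string.
    for c in s:
        r = deriv(r, c)
    return nullable(r)

# (ab)*c
pattern = Seq(Star(Seq(Char("a"), Char("b"))), Char("c"))
print(matches(pattern, "ababc"), matches(pattern, "abac"))
```

The `Alt` and `Seq` cases are where the resemblance to the sum and product rules shows up.
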
Where would it be used? It seems like declaring that the derivative of every prime number = 1 is entirely arbitrary.
Yes, it is completely arbitrary. However, this arbitrary definition follows certain rules and properties and can therefore be used for certain types of mathematical reasoning.

This is why, when we talk about rings and fields and such we say "multiplication-like" or "addition-like" operators. The operators defined for the algebraic structure may not be exactly like "standard" operators, but they still follow rules and you can still do cool things with them.

Not entirely: it is the only way to do it coherently, but it is not so easy to explain.
>it is the only way to do it coherently

There are many consistent ways to define the derivative of a number.

The way we are all familiar with is to define a number as a zeroth-order polynomial.

If you like, think of this as first turning a number into the unique monic polynomial with negated prime roots whose value at 0 is that number, then taking the derivative of that polynomial at 0.

...Not that there's any particular reason you should like this.
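
The monic-polynomial description can be checked directly, as a sketch. The prime factors of 60 are supplied by hand and the helper name `monic_poly_from_factors` is an assumption for this example.

```python
def monic_poly_from_factors(primes):
    """Coefficients (lowest degree first) of the product of (x + p) over primes."""
    coeffs = [1]
    for p in primes:
        # Multiply by (x + p): new[k] = old[k - 1] + p * old[k].
        shifted = [0] + coeffs
        scaled = [p * c for c in coeffs] + [0]
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs

# 60 = 2 * 2 * 3 * 5, so the polynomial is (x + 2)^2 (x + 3)(x + 5),
# whose roots are the negated primes -2, -2, -3, -5.
coeffs = monic_poly_from_factors([2, 2, 3, 5])

value_at_0 = coeffs[0]       # constant term: the number itself
derivative_at_0 = coeffs[1]  # coefficient of x: the derivative at 0

print(value_at_0, derivative_at_0)  # 60 92
```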

Think of Derivative as a function which inputs a function and outputs a function. When you think "derivative of 5 is 0", you imply "derivative of f(x)=5 is f'(x)=0"

This article seems to define a function called Derivative which inputs a number and outputs a number. In this case, "derivative of 5 is 1" actually translates to "derivative of 5 is 1".