Bellaard.com

Algebraically Manipulating the Differential

By: Gijs Bellaard

I recently came across a paper by Jonathan Bartlett and Asatur Zh. Khurshudyan titled Extending the algebraic manipulation of differentials. I have always been interested in mathematical notation. Great notation makes our work easier and guides us in the right direction; bad notation obscures and creates confusion. This paper shows that we can do a lot better with our notation for differentials.

Let \(f, g\) be two functions and define the variables \(z = g(y)\) and \(y = f(x)\). We have the following two calculus rules, written in both Leibniz and Lagrange notation:

\[ \begin{array}{l|c|c}
 & \text{Lagrange} & \text{Leibniz} \\ \hline
\text{Chain Rule} & (g \circ f)' = (g' \circ f) \cdot f' & \frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx} \\
\text{Inverse Rule} & (f^{-1})'(y) = \frac{1}{f'(x)} & \frac{dx}{dy} = \frac{1}{\frac{dy}{dx}}
\end{array} \]

Leibniz notation has several advantages over Lagrange notation if one allows oneself to algebraically manipulate differentials. Indeed, one can imagine ''cancelling'' the \(dy\) in the numerator and denominator in the chain rule. Similarly, for fractions it is true that \(1/(a/b) = b/a\), which can be used to show the inverse rule. To a hardcore formalist, Leibniz notation is something that should be avoided. They will say things such as ''You can't algebraically manipulate the differentials! They're just symbols!''. They are right, of course: algebraically manipulating the differentials quickly falls apart once we move into the realm of higher derivatives. For example, consider the following non-result: \[ \frac{d^2z}{dx^2} \text{''=''} \frac{d^2z}{dy^2} \cdot \Par{\frac{dy}{dx}}^2 \] which is ''algebraically'' correct but demonstrably false. Take for example \(z = y^3\) and \(y = x^2\), so that \(z = x^6\). We have \(\frac{d^2z}{dx^2} = 30x^4\), \(\frac{d^2z}{dy^2} = 6y\), and \(\frac{dy}{dx} = 2x\). Plugging these into our non-result shows that it is incorrect: \[ 30x^4 \neq 6y \cdot (2x)^2 = 24x^4 \] Even though algebraically manipulating the differentials doesn't work for higher derivatives, one cannot deny that it is an extremely handy tool when working with ''first-order'' problems. It would be nice if we could find some way to make it work generally...
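The counterexample above can also be checked mechanically. What follows is a minimal Python sketch, not something taken from the paper: polynomials in \(x\) are stored as coefficient lists, and the helper names `poly_mul`, `poly_diff` and `trim` are made up for this illustration.

```python
# Polynomials in x are stored as coefficient lists: [c0, c1, c2, ...].
# All helper names here are hypothetical, chosen only for this sketch.

def poly_mul(p, q):
    """Multiply two coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Differentiate a coefficient list with respect to x."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def trim(p):
    """Drop trailing zero coefficients so that lists compare cleanly."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

y = [0, 0, 1]                      # y = x^2
z = poly_mul(poly_mul(y, y), y)    # z = y^3, expanded as a polynomial in x

# Left side: differentiate z twice with respect to x.
lhs = trim(poly_diff(poly_diff(z)))

# Right side of the non-result: (d^2z/dy^2) * (dy/dx)^2, where
# d^2z/dy^2 = 6y is worked out by hand, as in the text.
d2z_dy2 = [6 * c for c in y]
dy_dx = poly_diff(y)
rhs = trim(poly_mul(d2z_dy2, poly_mul(dy_dx, dy_dx)))

print("lhs coefficients:", lhs)
print("rhs coefficients:", rhs)
print("equal?", lhs == rhs)
```

Comparing coefficient lists rather than sampling a few values of \(x\) means the check covers the two sides as whole polynomials.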

It turns out, as shown in Bartlett's and Khurshudyan's paper, that this failure can be fixed! And the only thing we need to change is the notation of the higher derivatives! Before we give this new notation we first loosely ''define'' how the differential operator \(d\) acts algebraically. It is ''helpful'' to not give the symbols any meaning other than algebraic, that is, consider \(x\) and \(y\) as ''any algebraic expression''. We employ the following shorthand notation: \[ dx := d(x), \quad d^2x := d(d(x)), \quad dx^2 := (dx)^2 \] We define the differential operator \(d\) as a linear operator that obeys the product rule, where \(a\) and \(b\) are constants: \[ d(ax + by) = a\,dx + b\,dy\] \[ d(xy) = dx \cdot y + x \cdot dy\] Surprisingly, from just these two axioms everything we need can be deduced. Namely, from the product rule we can show that \(d(1) = 0\): \[ d(1) = d(1\cdot 1) = d(1) \cdot 1 + 1 \cdot d(1) = 2d(1) \implies d(1) = 0 \] Using this, the reciprocal rule can be derived: \[ 0 = d(1) = d(1/y \cdot y) = d(1/y) \cdot y + dy/y \implies d(1/y) = - \frac{dy}{y^2} \] and from the product and reciprocal rules the quotient rule can be derived: \[ d(x/y) = d(x \cdot 1/y) = \frac{dx}{y} - x \frac{dy}{y^2} = \frac{dx \cdot y - x \cdot dy}{y^2}\]
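To see these rules fit together on concrete numbers, here is a small Python sketch. It rests on an assumption of mine rather than anything in the paper: each quantity is stored as a pair of plain numbers, its value and its differential. The class name `Pair` and the sample values are hypothetical.

```python
from fractions import Fraction

class Pair:
    """A quantity x together with its differential dx (first order only).

    Assumption for this sketch: both are represented by exact rationals.
    """

    def __init__(self, value, diff):
        self.value = Fraction(value)
        self.diff = Fraction(diff)

    def __mul__(self, other):
        # Product rule: d(xy) = dx*y + x*dy
        return Pair(self.value * other.value,
                    self.diff * other.value + self.value * other.diff)

    def reciprocal(self):
        # Reciprocal rule from the text: d(1/y) = -dy / y^2
        return Pair(1 / self.value, -self.diff / self.value ** 2)

    def __eq__(self, other):
        return (self.value, self.diff) == (other.value, other.diff)

    def __repr__(self):
        return f"Pair({self.value}, {self.diff})"

x = Pair(3, 5)   # x = 3 with dx = 5 (arbitrary sample numbers)
y = Pair(4, 7)   # y = 4 with dy = 7

# (1/y) * y should be the constant 1, whose differential is 0.
one = y.reciprocal() * y

# x * (1/y) should agree with the quotient rule written out directly.
quotient = x * y.reciprocal()
expected = Pair(x.value / y.value,
                (x.diff * y.value - x.value * y.diff) / y.value ** 2)

print(one, quotient, expected)
```

Exact fractions are used instead of floats so that the comparisons are true equalities, not approximate ones.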

With these rules we are ready to continue. We define the derivative operator \(D_x\) with respect to \(x\) as \[ D_x(y) := \frac{dy}{dx}\] So, as usual, the second derivative operator \( D_x^2 \) is defined as the derivative of the derivative: \[ D_x^2(y) := D_x(D_x(y)) = \frac{d\Par{\frac{dy}{dx}}}{dx}\] If we now apply the quotient rule to the numerator we see that: \[ D_x^2(y) = \frac{d^2y \, dx - dy \, d^2 x}{dx^3} = \frac{d^2y}{dx^2} - \frac{dy}{dx} \frac{d^2x}{dx^2}\] Notice that the way we usually write down the second derivative, namely \(\frac{d^2y}{dx^2}\), appears as the first term. To show that this new notation can be algebraically manipulated, consider the following result. By swapping the roles of \(x\) and \(y\) in the formula above we get: \[ D^2_y x = \frac{d^2x \, dy - dx \, d^2y}{dy^3} = \frac{d^2x}{dy^2} - \frac{dx}{dy} \frac{d^2y}{dy^2}\] The numerators of \(D^2_x y\) and \(D^2_y x\) differ only by a sign, so we see, purely algebraically: \[ D^2_x y = - \frac{dy^3}{dx^3} D^2_y x\] which we can read, in Lagrange notation, as: \[ f''(x) = -(f'(x))^3 \cdot (f^{-1})''(y) \] which, indeed, can be proven to be correct.
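The formulas above can be tried out numerically. The Python sketch below makes an assumption that is mine and not the article's: \(x\) and \(y\) both depend on an auxiliary parameter \(t\), and \(dx\), \(d^2x\), \(dy\), \(d^2y\) are taken to be their first and second derivatives with respect to \(t\). The example uses \(x = t^3\) and \(y = x^2 = t^6\), so \(d^2x\) is not zero; the function name `D2` is hypothetical.

```python
from fractions import Fraction

def D2(y, x):
    """Second derivative of y with respect to x: (d2y*dx - dy*d2x) / dx^3.

    Each argument is a pair (first differential, second differential).
    """
    dy, d2y = y
    dx, d2x = x
    return (d2y * dx - dy * d2x) / dx ** 3

# Assumption: differentials are derivatives with respect to a parameter t.
t = Fraction(2)
x = (3 * t**2, 6 * t)        # x = t^3:       dx = 3t^2, d2x = 6t
y = (6 * t**5, 30 * t**4)    # y = x^2 = t^6: dy = 6t^5, d2y = 30t^4

second = D2(y, x)                 # new notation
naive = y[1] / x[0] ** 2          # the usual d^2y/dx^2 read as a plain fraction
inverse_second = D2(x, y)         # second derivative of x with respect to y
via_inverse = -(y[0] / x[0]) ** 3 * inverse_second

print("D2_x y             :", second)
print("naive d2y/dx2      :", naive)
print("D2_y x             :", inverse_second)
print("-(dy/dx)^3 * D2_y x:", via_inverse)
```

Since \(y = x^2\), the second derivative of \(y\) with respect to \(x\) is 2. Under the stated assumption the full expression returns that value, the first term alone does not, and the inverse relation reproduces it from the second derivative of \(x = \sqrt{y}\).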