I ask because, as a first-year calculus student, I am running into the fact that I didn’t quite get this down when understanding the derivative:
So, a derivative is the rate of change of a function with respect to changes in its variable, this much I get.
Thing is, definitions of ‘differential’ tend to be in the form of defining the derivative and calling the differential ‘an infinitesimally small change in x’, which is fine as far as it goes, but then why bother even defining it formally outside of needing it for derivatives?
And THEN, the bloody differential starts showing up as a function in integrals, where it appears to be ignored part of the time and to function as a variable the rest.
Why do I say ‘practical’? Because when I asked other mathematicians for an explanation, I got one involving the graph of the function and how, given a right-angle triangle, the derivative is one of the other angles, while the differential is the side opposite that angle.
I’m sure that explanation is correct as far as it goes, but it doesn’t tell me what the differential DOES, or why it’s useful, which are the two facts I need in order to really understand it.
Originally, “differentials” and “derivatives” were intimately connected, with the derivative being defined as the ratio of the differential of the function to the differential of the variable (see my previous discussion on the Leibniz notation for the derivative). Differentials were simply “infinitesimal changes” in whatever, and the derivative of y with respect to x was the ratio of the infinitesimal change in y relative to the infinitesimal change in x.
For integrals, “differentials” came in because, in Leibniz’s way of thinking about them, integrals were the sums of infinitely many infinitesimally thin rectangles that lie below the graph of the function. Each rectangle would have height y and base dx (the infinitesimal change in x), so the area of each rectangle would be y dx (height times base), and we would add them all up as ∫ y dx to get the total area (the integral sign was originally an elongated S, for “summa”, or sum).
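You can see Leibniz’s picture in action with a short Python sketch (my own illustration, with y = x² chosen arbitrarily): add up thin rectangles of height y and base dx, and watch the sum close in on the exact area as the rectangles get thinner.

```python
# Leibniz's picture, made finite: approximate the area under y = x**2 on
# [0, 1] by summing rectangles of height y and base dx. The exact area
# is 1/3, and the sum approaches it as the rectangles get thinner.
def riemann_sum(f, a, b, n):
    dx = (b - a) / n                                  # base of each rectangle
    return sum(f(a + i * dx) * dx for i in range(n))  # sum of height * base

for n in (10, 1_000, 100_000):
    print(n, riemann_sum(lambda x: x * x, 0.0, 1.0, n))  # approaches 1/3
```

Of course, here every rectangle has a perfectly ordinary finite base; the infinitesimals only appear as the idealized limit of this process.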
Infinitesimals, however, cause all sorts of headaches and problems. A lot of the reasoning about infinitesimals was, well, let’s say not entirely rigorous (or logical); some differentials were dismissed as “utterly inconsequential”, while others were taken into account. For example, the product rule would be argued by saying that the change in fg is given by

d(fg) = (f + df)(g + dg) − fg = f dg + g df + df dg,

and then ignoring df dg as inconsequential, since it was made up of the product of two infinitesimals; but if infinitesimals that are really small can be ignored, why do we not ignore the infinitesimal change dg in the first summand? Well, you can wave your hands and huff and puff a lot, but in the end the argument essentially broke down into nonsense, or the problem was ignored because things worked out regardless (most of the time, anyway).
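You can watch the “inconsequential” term at work numerically. In this Python sketch (with f and g chosen just for illustration), the gap between the true change in fg and the two retained terms is exactly df·dg, and it shrinks like dx², much faster than the terms that are kept.

```python
# Compare the exact change in f*g with the two terms Leibniz kept, f dg + g df.
# The leftover is exactly df*dg, which shrinks like dx**2 -- which is why the
# old argument could "drop" it: it vanishes faster than the retained terms.
def f(x): return x ** 2
def g(x): return x ** 3

x = 1.5
for dx in (1e-1, 1e-2, 1e-3):
    df = f(x + dx) - f(x)
    dg = g(x + dx) - g(x)
    exact = f(x + dx) * g(x + dx) - f(x) * g(x)  # true change in f*g
    kept = f(x) * dg + g(x) * df                 # the two retained terms
    print(dx, exact - kept)                      # the dropped term df*dg
```

Each time dx shrinks by a factor of 10, the dropped term shrinks by roughly a factor of 100, while f dg and g df shrink only by a factor of 10.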
Anyway, there was a need for a more solid understanding of just what derivatives and differentials actually are, so that we could really reason about them; that’s where limits came in. Derivatives are no longer ratios; instead, they are limits. Integrals are no longer infinite sums of infinitesimally thin rectangles; now they are limits of Riemann sums (each of which is finite, with no infinitesimals around), etc.
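For instance, the modern definition of the derivative can be sketched numerically (a quick Python illustration of my own): the difference quotient (f(a + h) − f(a))/h settles toward the derivative as h shrinks, with no infinitesimals in sight, only a limit.

```python
import math

# Modern definition: f'(a) is the limit of (f(a + h) - f(a)) / h as h -> 0.
def difference_quotient(f, a, h):
    return (f(a + h) - f(a)) / h

# The quotient settles toward cos(1), the derivative of sin at 1.
for h in (1e-1, 1e-3, 1e-5):
    print(h, difference_quotient(math.sin, 1.0, h))
```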
The notation is left over, though, because it is very useful notation and is very suggestive. In the integral case, for instance, the “dx” is no longer really a quantity or function being multiplied: it’s best to think of it as the “closing parenthesis” that goes with the “opening parenthesis” of the integral (that is, you are integrating whatever is between the ∫ and the dx, just like when you have 2(84+3), you are multiplying by 2 whatever is between the ( and the ) ). But it is very useful, because for example it helps you keep track of what changes need to be made when you do a change of variable. One can justify the change of variable without appealing at all to “differentials” (whatever they may be), but the notation just leads you through the necessary changes, so we treat them as if they were actual functions being multiplied by the integrand because they help keep us on the right track and keep us honest.
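Here is a small numerical check of the change-of-variable bookkeeping (my own example): with u = x², the notation du = 2x dx predicts that ∫₀¹ 2x cos(x²) dx equals ∫₀¹ cos(u) du, and both come out to sin(1).

```python
import math

# Midpoint Riemann sums for both sides of the substitution u = x**2,
# du = 2x dx: the integral of 2x*cos(x**2) over [0, 1] should equal the
# integral of cos(u) over [0, 1], and both should equal sin(1).
def riemann_sum(f, a, b, n=100_000):
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

lhs = riemann_sum(lambda x: 2 * x * math.cos(x * x), 0.0, 1.0)
rhs = riemann_sum(math.cos, 0.0, 1.0)
print(lhs, rhs, math.sin(1.0))   # all three agree to many decimal places
```

The justification is the change-of-variable theorem, proved with limits; but the “dx” and “du” notation is what walks you through the substitution without error.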
But here is an ill-kept secret: we mathematicians tend to be lazy. If we’ve already come up with a valid argument for situation A, we don’t want to have to come up with a new valid argument for situation B if we can just explain how to get from B to A, even if solving B directly would be easier than solving A (old joke: a mathematician and an engineer are subjects of a psychology experiment; first they are shown into a room where there is an empty bucket, a trashcan, and a faucet. The trashcan is on fire. Each of them first fills the bucket with water from the faucet, then dumps it on the trashcan and extinguishes the flames. Then the engineer is shown to another room, where there is again a faucet, a trashcan on fire, and a bucket, but this time the bucket is already filled with water; the engineer takes the bucket, empties it on the trashcan, and puts out the fire. The mathematician, later, comes in, sees the situation, takes the bucket, empties it on the floor, and then says “which reduces it to a previously solved problem.”)
Where were we? Ah, yes. Having to translate all those informal manipulations that work so well and treat dx and dy as objects in and of themselves, into formal justifications that don’t treat them that way is a real pain. It can be done, but it’s a real pain. Instead, we want to come up with a way of justifying all those manipulations that will be valid always. One way of doing it is by actually giving them a meaning in terms of the new notions of derivatives. And that is what is done.
Basically, we want the “differential” of y to be the infinitesimal change in y; this change is closely approximated by the change along the tangent to y, and the tangent has slope y′(a). But because we don’t have infinitesimals, we have to say how much we’ve changed the argument. So we define “the differential of y at a when x changes by Δx”, written d(y,Δx)(a), as d(y,Δx)(a) = y′(a)Δx. This is exactly the change along the tangent, rather than along the graph of the function. If you take the quotient d(y,Δx)/Δx, you just get y′. But we tend to think of Δx, as Δx→0, as being dx, so abuse of notation leads to “dy = (dy/dx) dx”; this is suggestive, but not quite literally true; instead, one can then show that arguments that treat differentials as functions tend to give the right answer under mild assumptions. Note that under this definition you get d(x,Δx) = 1·Δx = Δx, leading to dx = Δx, and hence dy = y′ dx.
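A quick Python sketch (with y = x³ as an arbitrary example) shows the differential y′(a)Δx tracking the actual change y(a + Δx) − y(a): they agree to first order, and the gap between them shrinks like Δx².

```python
# The differential d(y, Δx)(a) = y'(a) * Δx is the change along the tangent
# line at a; compare it with the actual change y(a + Δx) - y(a) along the
# graph. They agree to first order, and the gap shrinks like Δx**2.
def y(x): return x ** 3
def y_prime(x): return 3 * x ** 2

a = 2.0
for dx in (0.1, 0.01, 0.001):
    differential = y_prime(a) * dx    # change along the tangent
    actual = y(a + dx) - y(a)         # change along the graph
    print(dx, differential, actual, actual - differential)
```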
Also, notice an interesting reversal: originally, differentials came first, and they were used to define the derivative as a ratio. Today, derivatives come first (defined as limits), and differentials are defined in terms of the derivatives.
What is the practical difference, though? You’ll probably be disappointed to hear “not much”. Except one thing: when your functions represent actual quantities, rather than just formal manipulation of symbols, the derivative and the differential measure different things. The derivative measures a rate of change, while the differential measures the change itself.
So the units of measurement are different: for example, if y is distance and x is time, then dy/dx is measured in distance over time, i.e., velocity. But the differential dy is measured in units of distance, because it represents the change in distance (and the difference/change between two distances is still a distance, not a velocity).
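As a concrete (made-up) example, take the free-fall distance y(t) = 4.9t² in meters: the derivative is a velocity in m/s, while the differential dy = y′(t)Δt lands back in meters.

```python
# If y(t) is distance (meters) and t is time (seconds), then the derivative
# y'(t) = 9.8t is measured in m/s (a velocity), while the differential
# dy = y'(t) * dt is measured in meters again (a change in distance).
def y(t): return 4.9 * t ** 2     # free-fall distance, in meters

t, dt = 3.0, 0.1
velocity = 9.8 * t                # y'(t): meters per second
dy = velocity * dt                # differential: meters
print(velocity, dy)               # a rate (29.4 m/s) vs. a change (~2.94 m)
```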
Why is it useful to have the distinction? Because sometimes you want to know how something is changing, and sometimes you want to know how much something changed. It’s all well and good to know the rate of inflation (the change in prices over time), but you might sometimes want to know how much more the loaf of bread costs now (rather than the rate at which the price is changing). And being able to manipulate derivatives as if they were quotients can be very useful when dealing with integrals, differential equations, etc., and differentials give us a way of making sure that these manipulations don’t lead us astray (as they sometimes did in the days of infinitesimals).
I’m not sure if that answers your question or at least gives an indication of where the answers lie. I hope it does. Added. I see Qiaochu has pointed out that the distinction becomes much clearer once you go to higher dimensions/multivariable calculus, so the above may all be a waste. Still…
Added. As Qiaochu points out (and I mentioned in passing elsewhere), there are ways in which one can give formal definitions and meanings to infinitesimals, in which case we can define differentials as “infinitesimal changes” or “changes along infinitesimal differences”, and then use them to define derivatives and integrals just like Leibniz did. The standard example of being able to do this is Robinson’s non-standard analysis. Or, if one is willing to forgo looking at all kinds of functions and only look at some restricted type of functions, then you can also give infinitesimals, differentials, and derivatives a substance/meaning much closer to their original conception.