When we last saw our intrepid hero, he was differentiating functions. Here's a review of Chapter 3: First, derivatives are linear. This means d/dx (A + B) = dA/dx + dB/dx.
The chain rule tells us how to differentiate cascaded functions. Suppose you have some function, Y = A(X), and some other function, Y = B(X). Now, since A(X) is just a number, and B(X) takes in a number for its argument, you can imagine cascading them to form some new function, Y = B( A(X) ). That was a little abstract, so here's an example: $Y = -X$ and $Y = e^X$ can be cascaded to form $Y = e^{-X}$. Here's the rule for differentiating this sort of thing: dY/dX = dY/dA * dA/dX. We know how to differentiate $e^x$, the derivative is just $e^x$. And, here's the best part, it doesn't matter what "x" is, it can be any kind of complicated thing you can imagine. We also know how to differentiate Y = -X, the derivative is -1. So, the derivative of $e^{-x}$ is $e^{-x} \cdot d/dx\,(-x) = e^{-x} \cdot (-1) = -e^{-x}$.
4! (4 factorial) is 4*3*2*1. 5! is 5*4*3*2*1. 0! is 1, by definition. Negative numbers don't have factorials. Neither do fractional numbers. Only zero and the positive integers have factorials.
The exponential function, $e^x$, which we read out loud as "e to the x", is:
$$Y = e^X = 1 + X + \frac{X^2}{2!} + \frac{X^3}{3!} + \frac{X^4}{4!} + \frac{X^5}{5!} + \cdots$$
$$\frac{dY}{dX} = e^X$$
This is the definition of $e^x$: the infinite series is $e^x$. The derivative of $e^x$ is $e^x$.
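As a quick check on the series, here is a small Python sketch that adds up the first several terms and compares the total with the library value. Python, the function name `exp_series`, and the choice of 20 terms are all our own assumptions for illustration; none of them come from the text.

```python
import math

def exp_series(x, terms=20):
    """Add up the first `terms` terms of 1 + x + x^2/2! + x^3/3! + ..."""
    total = 0.0
    term = 1.0                # the first term, x^0 / 0!
    for n in range(terms):
        total += term
        term *= x / (n + 1)   # turn x^n / n! into x^(n+1) / (n+1)!
    return total

# Compare the partial sum with the library's exponential.
for x in (1.0, -1.0, 2.5):
    print(x, exp_series(x), math.exp(x))
```

For moderate values of x the two columns should agree to many decimal places; for large x you would need more terms.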
$$e^{\text{anything}} = 1 + \text{anything} + \frac{\text{anything}^2}{2!} + \frac{\text{anything}^3}{3!} + \frac{\text{anything}^4}{4!} + \frac{\text{anything}^5}{5!} + \cdots$$
$$\cosh(x) = \tfrac{1}{2}\,(e^x + e^{-x})$$
$$\sinh(x) = \tfrac{1}{2}\,(e^x - e^{-x})$$
That's everything from Chapter 3.
Why do we do calculus? Well, that's a good question. Perhaps we're a little late in asking it, but, here we are, and better late than never. Here are some interesting things you can do with calculus.
We can find a particular value in a function. Let's return to our hero, the guy who has to get to his mom's house in time for dinner. Suppose it's 3:00, and the guy wants to know where he'll be at 5:00. He's at about 105 miles. We can get his speed off the derivative (he gets it off his speedometer, of course); it's about 75 mph. We can project forward using the derivative. The derivative is dF/dX, the change in F when there is a change in X. If we have a moderate change in X, call it ΔX (read that out loud as delta-x), we multiply the derivative by ΔX to get a ΔF. 2 hours times 75 mph = 150 miles, which is past Mom's house. Our traveller is in good shape, he'll be able to slow down when he gets a bit closer.
You can always project a curve forward a little bit like this, F(X + ΔX) = F(X) + ΔX * dF(X)/dX. This is not exact, of course, but it's pretty close if ΔX isn't too large.
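Here is the same projection as a few lines of Python, using the traveller's numbers. The function name `project` is our own, chosen only for this sketch.

```python
def project(f_now, slope, dx):
    """Straight-line estimate: F(X + dX) = F(X) + dX * dF/dX."""
    return f_now + dx * slope

# At 3:00 he is at about 105 miles, doing about 75 mph.
# Where does the straight-line estimate put him 2 hours later?
print(project(105.0, 75.0, 2.0))   # 255.0 miles
```

The estimate assumes he holds 75 mph for the whole two hours, which is exactly why it gets worse as ΔX gets larger.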
Suppose we want to know exactly what time it was when he had gone 120 miles. Well, we can look at the graph and see that it was at about 3:10, more or less. However, with calculus and a computer, we can do better. We use what is called Newton's method: you pick a point, any point on the graph as a starting point. For example, here we could say we guess he hits 120 miles at about 3:00. We evaluate the function at 3:00, and find he's actually gone about 105 miles - we're about 15 miles short. So, here's what we do. We find the derivative at 3:00, which is about 75 mph. We divide 15 miles by 75 mph, and get 1/5 of an hour, about 12 minutes. Our next guess will be 3:12. This is Newton's method - if we know ΔF, we can use the derivative to estimate ΔX. Here, ΔF is 15 miles, and therefore ΔX is 12 minutes.
Newton's method has what is called quadratic convergence - that means each time you do this, your new error is roughly the square of the preceding error, once you're reasonably close to the answer. This is very cool - a 10% error turns into a .1*.1 = 1% error with one operation. A 1% error turns into a .01% error with one more operation. Computer programmers use Newton's method all the time.
Here's Newton's method written out a bit more formally. Let's call our function F, and our first guess G1. The final answer we're looking for will be A. Then G2, our second guess, will be G2 = G1 + ( A - F(G1) ) / (dF(G1)/dX). In words, our second guess is our first guess plus a correction: the correct answer minus the function value at guess one, divided by the function derivative at guess one. In the problem above, our first guess was 3:00. F(G1) was 105 miles. (A - F(G1)) = (120-105) = 15 miles. The derivative at 3:00 is 75 miles per hour. (A-F(G1))/(dF(G1)/dX) = 15/75 = 1/5 hour = 12 minutes. G1 + 12 minutes = 3:00 + 12 minutes = 3:12.
What Newton's method does is approximate the function by its derivative. This is an approximation, because the derivative is a straight line, and the actual function is curvy. But, as you get closer and closer to the right answer, the approximation gets better and better. In our example above, what we're doing is guessing that the guy's speed at 3:00, 75mph, is his speed for the entire 12 minutes. Of course, even on cruise control his speed will vary a little bit, and on a real freeway he's likely to have to slow down and speed up for traffic. But we can ignore all those small changes and still do very well.
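A minimal Python sketch of the update G2 = G1 + ( A - F(G1) ) / (dF(G1)/dX) follows. The function names are our own. Since we only have the traveller's graph and not a formula for it, the repeated version is shown on a stand-in function we picked ourselves (finding where X*X equals 2).

```python
def newton_step(guess, f_at_guess, slope_at_guess, target):
    """One Newton update: new guess = old guess + (A - F(G)) / (dF/dX at G)."""
    return guess + (target - f_at_guess) / slope_at_guess

def newton_solve(f, dfdx, target, guess, steps=5):
    """Repeat the Newton update a fixed number of times."""
    for _ in range(steps):
        guess = newton_step(guess, f(guess), dfdx(guess), target)
    return guess

# The traveller: guess 3:00 (3.0 hours), 105 miles so far, 75 mph, target 120 miles.
print(newton_step(3.0, 105.0, 75.0, 120.0))   # about 3.2 hours, which is 3:12

# Stand-in example (our assumption): where does X*X equal 2?
print(newton_solve(lambda x: x * x, lambda x: 2 * x, 2.0, 1.0))
```

On the stand-in example, five steps starting from 1 land on the square root of 2 to about as many digits as the computer carries, which is the quadratic convergence described above.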
We recall from the last chapter that $e^x$ is:
$$e^x = \lim_{n \to \infty} \left( 1 + \frac{x}{n} \right)^n$$
Newton's method tells us how to take a very small step - you multiply the derivative by the amount you want to step, and that gets you pretty close to the right answer. When we use Newton's method, we're approximating the curve by the best fit straight line, whose slope is the first derivative. We can imagine improving our guess by using the acceleration too - the second derivative. Actually, we already know how to do this. The small step is dF/dX. The big step is exp( d/dX ) applied to F, which we worked out in the previous chapter as:
$$\exp\!\left( \frac{d}{dX} \right) = 1 + \frac{d}{dX} + \frac{(d/dX)^2}{2!} + \frac{(d/dX)^3}{3!} + \frac{(d/dX)^4}{4!} + \frac{(d/dX)^5}{5!} + \cdots$$
So the first term in the exponential is Newton's method, F(G1) + ΔX * dF(G1)/dX. If we wanted to use two terms from the exponential, we would use $F(G_1) + \Delta X \cdot dF(G_1)/dX + (\Delta X^2 / 2) \cdot d/dX\,( dF(G_1)/dX )$. Notice that the step ΔX is squared in the new term, to go with the second derivative. We can keep on using more and more terms from the exponential, which we recall is called the Taylor Series, to get better and better results. In practice, one rarely does this. It's faster and generally more effective to use Newton's method a couple times than to calculate the higher derivatives.
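To see what the extra term buys us, here is a short Python sketch comparing the one-term and two-term projections. The second-derivative term carries a factor of ΔX squared over 2. The function names and the test case (the exponential, starting at 0, where the function and all of its derivatives equal 1) are our own choices for illustration.

```python
import math

def taylor1(f0, d1, dx):
    """Straight-line projection: F + dX * F'."""
    return f0 + dx * d1

def taylor2(f0, d1, d2, dx):
    """Adds the second-derivative term: F + dX * F' + (dX^2 / 2) * F''."""
    return f0 + dx * d1 + (dx ** 2 / 2) * d2

# Test case (our assumption): e^x at x = 0, where F = F' = F'' = 1.
dx = 0.1
exact = math.exp(dx)
print(taylor1(1.0, 1.0, dx), taylor2(1.0, 1.0, 1.0, dx), exact)
```

The two-term estimate lands noticeably closer to the exact value than the straight-line estimate does.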
Suppose we don't know the function, and we can't calculate the derivative. No problem, if we can evaluate the function, we're still in business. If we don't know dF/dX, we can estimate it using the technique of chapter 1 - we just say
$$\frac{dF(G_1)}{dX} \approx \frac{F(G_1 + .001) - F(G_1)}{.001}$$
So, we can make an approximation to the derivative, and then use that approximation to make an approximation to the function.
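Here is one way that idea might look in Python: estimate the slope with a step of .001, then feed the estimate into the Newton update. The function names are hypothetical, and the cube function is only a stand-in we chose so there is a known answer to check against.

```python
def slope_estimate(f, x, h=0.001):
    """Estimate dF/dX as ( F(x + h) - F(x) ) / h."""
    return (f(x + h) - f(x)) / h

def newton_numeric(f, target, guess, steps=8):
    """Newton's method using the estimated slope instead of a formula."""
    for _ in range(steps):
        guess = guess + (target - f(guess)) / slope_estimate(f, guess)
    return guess

# Stand-in example (our assumption): where does X cubed equal 8?
print(newton_numeric(lambda x: x ** 3, 8.0, 3.0))
```

Because the slope is only approximate, each step is slightly less accurate than true Newton, but the guesses still close in on 2 quickly.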
So, is Newton's method the greatest thing in the world? Well, it's pretty neat, but it has some rather severe limitations. In two dimensions, it works pretty well, but by looking at the graph above we can easily see some issues. If our first guess had been at 2.5, then the derivative there is zero, and we would have divided by zero. So, we need to guard against this. A simple fix for this problem is to say that instead of zero, the derivative at 2.5 is something small, like .000001. However, this means the error at this guess, about 50, would be divided by .000001, which means our next guess would be 50,000,000 hours. This is not a very good second guess. So, Newton's method is very slick if you have a pretty good first guess, but if your first guess is not so good, Newton's method can start shooting you off the edge of the world. But these are programmers' issues, not issues for a first course in calculus. In practice programmers use something called a binary search to get close, then Newton's method to polish up the solution quickly.
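The "binary search first, Newton second" recipe can be sketched like this in Python. This is only an illustration under assumptions of our own: the function is increasing between the two starting points, the answer lies between them, and the name `bisect_then_newton` and the step counts are hypothetical.

```python
def bisect_then_newton(f, dfdx, target, lo, hi, coarse_steps=10, polish_steps=3):
    """Get close with a binary search, then polish with Newton's method.

    Assumes f is increasing on [lo, hi] and f(lo) <= target <= f(hi).
    """
    # Binary search: keep halving the interval that contains the answer.
    for _ in range(coarse_steps):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    guess = (lo + hi) / 2

    # Newton polish, guarding against a zero slope.
    for _ in range(polish_steps):
        slope = dfdx(guess)
        if slope == 0:
            break
        guess = guess + (target - f(guess)) / slope
    return guess

# Stand-in example (our assumption): where does X cubed equal 8, between 0 and 10?
print(bisect_then_newton(lambda x: x ** 3, lambda x: 3 * x * x, 8.0, 0.0, 10.0))
```

The binary search can never run off the edge of the world, because the answer stays trapped between `lo` and `hi`; Newton's method then only has to work from a good first guess, which is where it shines.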
Next, suppose we're trying to find the maximum or minimum of a function. Let's consider the function below:
In Figure 4.2, there's a curve which has a maximum at about 3. The precise curve graphed here is $-x^3/3 + x^2/4 + 7.5x$. How can we find out what the maximum is, precisely, and where it happens? Well, the first thing we notice is that at the maximum, the derivative is zero. After we think about this for just a second, we see this must obviously be true: if the derivative pointed in any direction other than purely horizontal, we could follow the derivative uphill for a little while and be at a higher spot on the curve. Only where the derivative is horizontal, that is zero, are we at a spot where following the slope in either direction doesn't help us move up or down the curve.
So, what we want to do is find the slope of this curve everywhere, and then find out where the slope is zero. We can differentiate the curve pretty easily:
$$Y = -\frac{X^3}{3} + \frac{X^2}{4} + 7.5X$$
$$\frac{dY}{dX} = -X^2 + \frac{X}{2} + 7.5$$
Now, the slope is a quadratic, and we can solve those easily using the quadratic formula: if the quadratic is $aX^2 + bX + c$, the solution is $X = ( -b \pm \sqrt{b^2 - 4ac} ) / 2a$. Here, a is -1, b is 1/2, and c is 7.5. So, the answer is ( -1/2 ± sqrt( 1/4 + 30 ) ) / -2, which is ( -1/2 ± 5.5 ) / -2, which is -2.5 and 3. So, the graph looks like it has a maximum at 3, and indeed it does. At 3, the curve's value is -9 + 9/4 + 22.5 = 15.75.
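The arithmetic above is easy to check by machine. In this Python sketch the function names `curve` and `quadratic_roots` are our own; the numbers are the ones from the text.

```python
import math

def curve(x):
    """The curve from Figure 4.2: -x^3/3 + x^2/4 + 7.5x."""
    return -x ** 3 / 3 + x ** 2 / 4 + 7.5 * x

def quadratic_roots(a, b, c):
    """Both solutions of a*X^2 + b*X + c = 0 (assumes b^2 - 4ac is not negative)."""
    root = math.sqrt(b * b - 4 * a * c)
    return ((-b + root) / (2 * a), (-b - root) / (2 * a))

# The slope is -X^2 + X/2 + 7.5, so a = -1, b = 1/2, c = 7.5.
flat_spots = quadratic_roots(-1.0, 0.5, 7.5)
print(flat_spots)        # the two places where the slope is zero
print(curve(3.0))        # the height of the curve at X = 3
```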
Notice that our curve at x = -300 has a really quite large value - much larger than 15.75. So we see that 15.75 is what we call a local maximum. We also calculated that the curve has a zero derivative at x = -2.5. We know what third order polynomials look like, they have two humps, one up and one down. The point at x = -2.5 will be a local minimum. We know it will be a local minimum because this curve goes down without limit as x increases. This curve at x = 300 has a very large negative value.
You can't tell just from the first derivative if you found a local maximum or a local minimum. You have to look at the curve or look at the second derivative. If it's a local maximum the second derivative will be negative. If it's a local minimum the second derivative will be positive.
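For the curve in Figure 4.2, the second-derivative test works out as follows. This is our own worked check, differentiating the slope $-X^2 + X/2 + 7.5$ once more and plugging in the two places where the slope is zero:

```latex
\begin{align*}
Y''      &= -2X + \tfrac{1}{2} \\
Y''(3)   &= -6 + \tfrac{1}{2} = -5.5 < 0 \quad \text{(local maximum)} \\
Y''(-2.5) &= 5 + \tfrac{1}{2} = 5.5 > 0 \quad \text{(local minimum)}
\end{align*}
```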
What if the curve happens to not be a nice, simple polynomial like in 4.2? For example, in Figure 4.3 below is a curve that I just drew freehand. We can see that there's a maximum near 2.5, but we're not sure where it is precisely. How could we go about finding it?
Well, we know how to find the derivative - we learned that in chapter 1. And, we know how to find a particular value of a function, we learned that about 3 pages ago, using Newton's method. Here's what we could do. Let's call the curve above Y(X). We can find the derivative at any point X by saying dY/dX = ( Y( X+.001) - Y(X) ) / .001. We'll call this new function Y' (pronounced Y-prime). Now, we want to find the place where Y' = 0. We know how to do this with Newton's method: all we need is a first guess, and the derivative. But, remember, here we're trying to find the place where the derivative of Y is 0, so to use Newton's method we need the derivative of Y'. How can we find the derivative of Y'? Again, we can use what we learned in Chapter 1. We'll call dY' / dX = Y" (pronounced Y-double prime). Y"(X) = ( Y'(X + .001) - Y'(X) ) / .001. Now, the only thing we actually know is Y(X), so we need to figure out Y" in terms of Y.
$$Y'(X) = 1000 \, \big( Y(X + .001) - Y(X) \big)$$
$$Y'(X + .001) = 1000 \, \big( Y(X + .002) - Y(X + .001) \big)$$
$$\begin{aligned}
Y''(X) &= 1000 \, \big( Y'(X + .001) - Y'(X) \big) \\
&= 1000 \, \big[ 1000 \, ( Y(X + .002) - Y(X + .001) ) - 1000 \, ( Y(X + .001) - Y(X) ) \big] \\
&= 1{,}000{,}000 \, \big[ Y(X + .002) - 2\,Y(X + .001) + Y(X) \big]
\end{aligned}$$
So, now we can proceed. We start with some initial guess, maybe in this case we'd start at about 2.2, and then just look for a place where Y' = 0. This is just like before. We'll call our first guess G1 = 2.2. The value we're looking for is A = 0. Now, G2, our improved guess, is G2 = G1 + [ A - F(G1) ] / F'(G1). In this case, F is Y', and F' is Y". So, G2 = G1 - Y'(G1) / Y"(G1). G3, our third guess, would be G3 = G2 - Y'(G2) / Y"(G2). Something like three to five guesses would probably be enough to nail down this maximum perfectly well. Remember, when Newton's method works, it works really quickly.
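Here is a Python sketch of that whole procedure, using only values of Y itself. Each new guess is the old guess minus Y' divided by Y". We don't have a formula for the freehand curve, so as a stand-in (our assumption) the sketch uses the polynomial from Figure 4.2, whose maximum we already know is at 3. The function names are hypothetical.

```python
H = 0.001   # the small step used for both estimates

def y_prime(y, x):
    """Estimated first derivative: ( Y(x + H) - Y(x) ) / H."""
    return (y(x + H) - y(x)) / H

def y_double_prime(y, x):
    """Estimated second derivative: ( Y(x + 2H) - 2 Y(x + H) + Y(x) ) / H^2."""
    return (y(x + 2 * H) - 2 * y(x + H) + y(x)) / (H * H)

def find_flat_spot(y, guess, steps=5):
    """Newton's method on Y': each guess is the old guess minus Y' / Y''."""
    for _ in range(steps):
        guess = guess - y_prime(y, guess) / y_double_prime(y, guess)
    return guess

def stand_in_curve(x):
    """Stand-in for the freehand curve: the polynomial from Figure 4.2."""
    return -x ** 3 / 3 + x ** 2 / 4 + 7.5 * x

print(find_flat_spot(stand_in_curve, 2.2))
```

The result comes out a hair below 3 rather than exactly 3, because the estimated slope with a step of .001 is itself slightly off. A smaller step would tighten that up, at the cost of more round-off in the subtraction.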
Now, back to our intrepid traveller. Below is his distance v. time chart, letting us know how he did in his trip to Mom's house. We've already seen that at any point we can find the slope of his driving chart, and that tells us his speed. We can find the slope (speed) everywhere, and graph it. Below, in figure 4.4, I've done just that. You can see at a glance what his speed is at any particular time. The most interesting thing about this is to notice that his speed is just another function, just another graph, and we could differentiate his speed, too. If we differentiate his speed, we get the rate of change of his speed in time, which we call his acceleration.
So, back in chapter 1, we learned that speed = d Distance / dt, and now we've learned that d Speed / dt = acceleration. I suppose you could reasonably wonder what the time derivative of acceleration is called. Well, university professors with tenure have lots of spare time on their hands, so in fact there are proposed names for these things.
Derivative | Name | times Mass |
1st | velocity | momentum |
2nd | acceleration | force |
3rd | jerk | yank |
4th | snap | tug |
5th | crackle | snatch |
6th | pop | shake |
Now, we're going to do something very strange, in fact we're going to break one of our rules for a little while. This is going to look a lot like cheating, but it's not really that bad, as we'll be able to see in about 4 chapters. We're going to use calculus on a circle. This is cheating because we agreed earlier that we could only differentiate functions, and a function has only one Y value for a given X value. Circles, of course, have two Y values for each X value. We're going to get around this problem by ignoring it: we'll only work on the upper half of the circle right now.
Here's our circle:
Now as you can see rather clearly, this circle has a radius of 2, and a diameter of 4, but we're going to pretend we don't know that. We're going to call the radius R. We can figure out pretty easily what the equation of this circle is: the Pythagorean theorem tells us that the distance squared to any point is $X^2 + Y^2$, and all the points on this circle are a distance R from the center, so the circle is $X^2 + Y^2 = R^2$. Thus, our function is $Y = \sqrt{R^2 - X^2}$. The equation for the bottom of the circle is $Y = -\sqrt{R^2 - X^2}$, but we're ignoring that.
Let's try to differentiate this. First, we have to differentiate the square root. If $Y = Z^{1/2}$, then $dY/dZ = \tfrac{1}{2} Z^{-1/2} = 1 / (2 Z^{1/2})$. This is just the same rule we use for polynomials, $dX^n/dX = nX^{n-1}$, where n here is ½. Then the chain rule tells us to multiply by the derivative of what's inside the square root. Here $Z = R^2 - X^2$, so $dZ/dX = -2X$.
$$Y = \sqrt{R^2 - X^2}$$
$$\frac{dY}{dX} = \frac{-2X}{2\sqrt{R^2 - X^2}} = -\frac{X}{Y}$$
If we evaluate this at X = 0, right at the top of the circle, we find the slope is 0 / R, which is zero. This is good, we can see just by looking that the slope is 0 at the top of the circle.
Now, let's differentiate this again. We want to find the second derivative, which corresponds to the acceleration.
$$Y' = \frac{dY}{dX} = -X \, (R^2 - X^2)^{-1/2}$$
Now, we have to learn something new - how to differentiate a product. Here, we have -X times $(R^2 - X^2)^{-1/2}$, and we're not certain just yet how to deal with this. So, suppose we have two functions, Y = A and Y = B, and we want to differentiate their product, Y = A*B. Here's the answer, the product rule for derivatives:
Y = A * B
dY/dX = dA/dX * B + A * dB/dX
Here, we want to differentiate $Y' = [\, -X \,] \cdot [\, (R^2 - X^2)^{-1/2} \,]$. We'll use the formula above with $A = -X$ and $B = (R^2 - X^2)^{-1/2}$. The chain rule gives $dB/dX = (-\tfrac{1}{2}) (R^2 - X^2)^{-3/2} \cdot (-2X) = X \, (R^2 - X^2)^{-3/2}$. So, dY'/dX = Y" =
$$Y'' = \frac{dY'}{dX} = \frac{d(-X)}{dX} \, (R^2 - X^2)^{-1/2} + (-X) \cdot X \, (R^2 - X^2)^{-3/2}$$
$$Y'' = -(R^2 - X^2)^{-1/2} - X^2 \, (R^2 - X^2)^{-3/2}$$
Again, let's look at this when X is zero. The first term, $-(R^2 - X^2)^{-1/2}$, is just -1/R when X is zero. The second term, $-X^2 (R^2 - X^2)^{-3/2}$, is zero when X is zero, because the $X^2$ out in front of the term zeros out everything else.
Now, here's something interesting: the second derivative at the top of the circle, which is the acceleration on the circle, has a size of 1/R, where R is the radius of curvature. The minus sign just tells us the curve bends downward there. So, just as the first derivative tells you how fast the function is changing, the second derivative tells you how curvy the function is. 1 / (second derivative), ignoring the sign, is the radius of the curvature of the function at that point.
This may seem a little unconvincing - after all, I used the magic value X=0 a lot. But, the circle has perfect symmetry, so whatever is true at the top has to be true everywhere. We'll come back to this circle thing in a couple chapters when we have some better machinery developed, and we'll see that all the stuff we did above is in fact correct.
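If you'd like some numerical reassurance in the meantime, here is a small Python sketch that estimates the second derivative at the top of a circle of radius 2 and compares its size with 1/R. The function names are our own, and the estimate uses a symmetric difference (one step to each side), which is a slight variation on the one-sided estimate from chapter 1.

```python
import math

def upper_circle(x, r):
    """Upper half of a circle of radius r: Y = sqrt(r^2 - x^2)."""
    return math.sqrt(r * r - x * x)

def second_difference(f, x, h=1e-4):
    """Symmetric estimate of the second derivative at x."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

R = 2.0
ypp = second_difference(lambda x: upper_circle(x, R), 0.0)

# The estimate is negative because the top of the circle bends downward;
# its size should be close to 1/R.
print(abs(ypp), 1.0 / R)
```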
When we wanted to find numbers, we used the first derivative to approximate the curve for short distances - that is, we said the curve was nearly a straight line so long as we didn't go too far out on the curve. This is the essence of Newton's method. Using the second derivative, we could approximate the function as a combination of a straight line and a circle, which would be a somewhat better approximation.
This statement - the radius of curvature is 1 / (second derivative) = 1 / acceleration - is universally true. Let's look at a somewhat unusual example. Einstein tells us that there is no gravity, that space-time is curved. Let's see if we can make something of this statement with this new understanding of ours.
The acceleration of gravity at the surface of the earth is 32 feet per second per second, which is about 10 meters per second per second. That is, after one second of falling, you're moving at 32 feet per second, which is about 20 miles per hour, and you've fallen about 16 feet, which is one second times your average speed of (0 + 32) / 2. Now, it's an unfortunate fact of history that we think of time and space as different - we measure time in seconds and space in meters or feet. However, Einstein assures us that this is a misunderstanding, and that time and space have the same units, feet or meters. We can use the speed of light to convert from seconds to feet or meters.
The speed of light is $3 \times 10^8$ meters per second. If we divide 10 meters / second / second twice by $3 \times 10^8$ meters per second, we get an acceleration which does not depend on this silly distinction between time and space. The acceleration at the Earth's surface in natural units is $10 / (3 \times 10^8) / (3 \times 10^8) = 10^{-16}$ / meter. Here, we use the little known fact that 3*3 = 10. When we're doing this sort of thing, we're just looking for approximate results, so 3*3 = 10 is just fine.
The radius of curvature of space-time at the Earth's surface is $c^2 / a = 10^{16}$ meters. There are $\pi \times 10^7$ seconds in a year, and $3 \times 10^8$ meters in a light-second, so a light year is $10^{16}$ meters. The radius of curvature of space-time at the Earth's surface is about one light year. Your dining room table probably has a curved edge, and the radius of curvature is probably about 1/2 inch. The tires on your car have a radius of curvature of about 1 1/2 feet. So, it's no wonder we think space looks flat, a radius of curvature of one light year is not a very curvy surface.
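The same estimate in Python, without the 3*3 = 10 shortcut, looks like this. The names are our own, and the rounded inputs (c of 3*10^8 meters per second, a of 10 meters per second per second, π*10^7 seconds in a year) are the ones used in the text.

```python
import math

C_LIGHT = 3.0e8                      # speed of light, meters per second (rounded)
SECONDS_PER_YEAR = math.pi * 1.0e7   # the text's handy approximation

def curvature_radius(accel):
    """Radius of curvature c^2 / a, in meters, for an acceleration in m/s/s."""
    return C_LIGHT ** 2 / accel

light_year = C_LIGHT * SECONDS_PER_YEAR   # meters
radius = curvature_radius(10.0)           # Earth's surface

print(radius, light_year, radius / light_year)
```

The ratio comes out a little under 1, so "about one light year" holds up.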
Let's imagine all the mass of the Earth was scrunched into a very tiny ball. As we get closer and closer to the ball, the curvature of space-time increases, because as we get closer and closer to all that mass, the gravity increases. The radius of the Earth is about 4,000 miles, which is about 6,000,000 meters. Let's imagine the Earth was scrunched into a ball one meter in radius - about six feet across. The acceleration due to gravity goes like $GM / R^2$, and we just got 6,000,000 times closer, so the acceleration just went up by $6{,}000{,}000^2 = 36 \times 10^{12}$. The radius of curvature of space-time at the surface of this six foot Earth would be $c^2 / a = 10^{17} / (10 \times 36 \times 10^{12})$ meters, which is about $10^4 / 36$ meters, about 280 meters.
Newton told us that $F = ma$, and $F = GMm / R^2$. We can cancel m on each side, and we find that $a = GM / R^2$. Now we see that $GM = aR^2$. We know a and R at the surface of the earth, $a = 10\ \text{m/s}^2$ and R = 6,000,000 m. So for the earth $GM = 36 \times 10^{13}$.
More specifically, we'll call the radius of the Earth R. We'll call the radius of curvature of space-time C.
$$C = \frac{c^2}{a}$$
$$C = \frac{c^2}{GM / R^2}$$
$$C = \frac{R^2 c^2}{GM}$$
Let's figure out when R = C. Looking at the last equation, if $R = GM / c^2$, then C = R. For the earth, $GM / c^2 = 36 \times 10^{13} / 10^{17} = 36 \times 10^{-4}$ meters = .36cm = 1/7 inch.
If the Earth were .7cm across, about the size of a pea, the curvature of space-time becomes the same as the curvature of the Earth. We call this size, where the radius of the object is the same as the radius of space-time, the Schwarzschild radius. We would expect things would get pretty weird near such an object, and in fact they do: we just calculated how small the Earth would have to be scrunched to turn it into a black hole - roughly the size of a pea. Because the curvature of space-time is the same as the curvature of the Earth at this size, light would orbit the Earth on the Earth's surface, and therefore could never escape. In fact, the mathematics of General Relativity is far more complex than what we just did, but nevertheless we just got nearly the right answer - if you use the full machinery of General Relativity, after about an hour of work you find the correct answer is $R = 2GM / c^2$. So, the correct answer is exactly twice as large as what we just calculated. If the earth were a black hole, it would be about a half inch across.
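Here is the Earth calculation as a short Python sketch. The function names are our own. It uses c = 3*10^8 meters per second directly rather than the 3*3 = 10 shortcut, so the numbers come out slightly different from the rounded ones above (about .4cm instead of .36cm for the equal-curvature radius).

```python
C_LIGHT = 3.0e8   # speed of light, meters per second (rounded)

def gm_from_surface(accel, radius):
    """GM = a * R^2, from the surface acceleration and the radius."""
    return accel * radius ** 2

def equal_curvature_radius(gm):
    """The radius where R = C, which is GM / c^2."""
    return gm / C_LIGHT ** 2

def schwarzschild_radius(gm):
    """The General Relativity answer quoted in the text, 2GM / c^2."""
    return 2 * gm / C_LIGHT ** 2

gm_earth = gm_from_surface(10.0, 6.0e6)    # a = 10 m/s/s, R = 6,000,000 m
print(gm_earth)                            # about 36 * 10^13
print(equal_curvature_radius(gm_earth))    # meters
print(schwarzschild_radius(gm_earth))      # meters
```

The same three functions can be fed the Sun's numbers from exercise 4.3.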
4.1: Find the minimums and maximums of
a) 4X - 12
b) $3X^2 + 5X - 64$
c) $X^3 - 3X^2 + 5X/3 - 27$
d) $3X^5/5 + 2X^4 - X^3$
answer: none, -5/6, (5/3, 1/3), (0, 1/3, -3)
4.2: Find the derivative of
a) $X^2 \cdot \sqrt{R^2 - X^2}$
b) $X^2 / \sqrt{R^2 - X^2}$
c) $\sqrt{R^2 - X^2} / X^2$
d) $X^2 \cdot \exp(R^2 - X^2)$
e) $X^2 / \exp(R^2 - X^2)$
hints: use the product rule. $1 / \sqrt{A} = A^{-1/2}$. $1 / \exp(A) = \exp(-A)$
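answer (here sqrt means $\sqrt{R^2 - X^2}$ and exp means $\exp(R^2 - X^2)$): $2X \cdot \mathrm{sqrt} - X^3 / \mathrm{sqrt}$; $2X / \mathrm{sqrt} + X^3 / \mathrm{sqrt}^3$; $-1 / (X \cdot \mathrm{sqrt}) - 2\,\mathrm{sqrt} / X^3$; $2X \cdot \mathrm{exp} - 2X^3 \cdot \mathrm{exp}$; $2X / \mathrm{exp} + 2X^3 / \mathrm{exp}$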
4.3: The acceleration at the Sun's surface is about 274 m/s/s. The Sun's radius is about 700,000,000 meters.
a) Find the radius of curvature of space-time at the surface of the sun.
b) Find GM for the sun.
c) Find the Schwarzschild radius of the sun, that is, the size you would have to scrunch the sun down to to make it a black hole.
answer: $3 \times 10^{14}$ meters = 1/30 light year = 12 light days; about $1.3 \times 10^{20}$; about 2,600 meters = 1.6 miles.