Calculus is the mathematical study of how things change. If things aren't changing, calculus is irrelevant.
Calculus was invented independently in the 1660s and 1670s by Sir Isaac Newton and Gottfried Wilhelm Leibniz. Newton was interested in the motion of the planets, and in gravity. He invented calculus to use as a tool to calculate orbits. Newton called his version of calculus the theory of fluxions. Today, pretty much no one remembers this. Leibniz was a mathematician.
Newton's notation, while still in use today from time to time, is unusual. Physicists know and understand Newton's notation because, well, he was pretty much the smartest physicist who ever lived. But almost no one else uses it. Leibniz's notation has better stood the test of time.
Today, most calculus courses are taught by mathematics departments. These courses are very good on what is called mathematical rigor - that is, everything is very carefully proven, almost all of the assumptions are spelled out, and if you like things proven beyond a shadow of a doubt, this is the course for you. I am not a mathematician, and I'm not all that personally interested in mathematical rigor. To me, calculus is just a tool. I'm a lot more interested in applications of calculus: How do I use this thing? When is it useful? What are the limitations?
Calculus has underlying assumptions, which we will get to. Most of these assumptions are violated in one form or another in our actual physical universe. Sometimes we know exactly how the assumption is violated; in Quantum Mechanics, for example, sometimes things are much more complicated than Newton ever imagined, and his methods must be extended far beyond anything he ever dreamed of. Furthermore, in our most advanced theory of physics, Quantum Field Theory, when we use calculus we get nonsensical answers, like electrons weigh an infinite amount. Of course, this is simply wrong. At the time of this writing, we actually don't know the resolution to this problem. Instead, we physicists have simply agreed to sweep this problem under the rug, and proceed as if it doesn't exist. In casual conversations, most physicists will agree that these problems demonstrate that calculus is actually not how things work. But, it's pretty close, so we use it anyway.
We'll start with some very simple examples from very simple physics. In fact, we'll be doing things pretty much like Newton did when he first invented calculus. The first systems we'll consider will be systems that change in time - that is, we don't have to do anything to see a change besides simply wait a little while. For example, if we see a car on a freeway (well, not an LA freeway, let's say some freeway where the cars are actually moving), if we turn away for a second then look back, the car has moved. Later, we'll consider systems that change in other ways. For example, if we're in San Francisco, and we have an altimeter, we quickly notice that as we walk around the altimeter reading changes. If we simply stand in one place and wait, nothing happens. But, if we walk up and down hills, the altimeter reading changes. This is an example of how something can change in space, but not in time. If we pay attention to Alan Greenspan, the chairman of the Federal Reserve Board, we find that whenever he lowers the federal funds rate, the stock market goes up a bit, and whenever he raises the federal funds rate, the stock market goes down a little bit. The amount the stock market changes in reaction to interest rate changes is an application of calculus, and is actually fairly well understood. A lot of money is made and lost trying to predict Mr. Greenspan's actions.
Whenever you have two or more things, and changing one means there will be a change in the other, calculus is the study of how these things correlate. Sometimes what is changing is time: we wait a while. Sometimes what is changing is position: we move to some different place. Sometimes what is changing is something more abstract, like an interest rate, or the age of a student, or the electric charge on some material, or the number of people we have hired to work on an assembly line. There's a phrase we all use from time to time, "all else equal." Whenever you hear or say this phrase, calculus is happening somewhere.
Let's start with a car going precisely 60 miles per hour on a dead straight road - say, Interstate 10 in Texas. The driver is a paid professional, and he has been instructed to keep his speed and direction precisely constant. We'll be standing at the side of Interstate 10 at some particular spot. The driver is crossing Texas from the west, El Paso, to the east, Houston.
Let's also say that at precisely noon the car zooms past us. One hour later, the car is 60 miles away from us to the east. Two hours later, the car is 120 miles east of us. One hour ago, at 11 am, the car was 60 miles west of us, and at 10 am, the car was 120 miles west of us. This is what we mean by 60 miles per hour: in one hour, the car travels 60 miles. Each hour that goes by, the car travels another 60 miles. Since this is an introductory course, this will be a perfect car - it never has to stop for gas, and there's a little McDonald's express inside it so the driver never needs to stop to eat.
Above is a graph of where the car is, compared to us. At noon, it's right next to us. One hour later, it's 60 miles away, etc. Notice there's nothing on this graph that explicitly says the car is going 60 miles per hour. We can notice this pretty easily - after one hour, the car is 60 miles away; after two hours, the car is 120 miles away; after four hours, the car is 240 miles away. 240 miles in four hours = 120 miles in two hours = 60 miles in one hour = 60 miles per hour.
This is calculus. Ok, there's a bit more to learn, and that's why this book runs for several chapters, but this is much of what there is to calculus. The line above, which graphs the distance from us to the car as a function of time, shows the car's speed - all we need to do is divide a distance (read from the Y axis) by a time (read from the X axis), and we have a speed.
The speed of the car is what we call the slope of the line. The slope of the line is the change in Y divided by the corresponding change in X. The process of measuring a change in Y and a corresponding change in X, and dividing, is called finding the slope, or taking the derivative, or finding the rate of change, or, if you're in economics, finding the marginal rate of change. I don't know why we have so many names for such a simple thing. I'll use the words slope and derivative interchangeably for a while.
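As a small sketch of the "change in Y divided by change in X" recipe, here is how the slope could be computed from points on the constant-speed graph. The function name `slope` and the list `points` are just names chosen for this illustration; the points themselves are the positions described above (miles east of us are positive, miles west are negative, and hours are counted from noon).

```python
def slope(x1, y1, x2, y2):
    """Change in Y divided by the corresponding change in X."""
    return (y2 - y1) / (x2 - x1)

# (hours after noon, miles east of us) for the constant-speed car.
points = [(-2, -120), (-1, -60), (0, 0), (1, 60), (2, 120), (4, 240)]

# Compare the first point with every other point on the line.
slopes = [slope(*points[0], *p) for p in points[1:]]
print(slopes)  # [60.0, 60.0, 60.0, 60.0, 60.0]
```

Because the graph is a straight line, it doesn't matter which two points we pick: the slope always comes out to 60 miles per hour.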
This car is moving in a straight line at a constant speed, so we can find the slope pretty easily. That's why I chose this as the first example: it's easy. However, in the process above, to find the speed of the car, I have to wait an hour, and measure out 60 miles. Can we do better? Yes. For example, we could wait one minute, and measure how far the car traveled. 60 miles per hour equals one mile per minute. If we wait one minute and find the car moved one mile, we know the car was moving at 60 miles per hour without waiting an entire hour to find out.
Could we measure the speed in less time? Yes. 60 miles per hour is 1 mile per minute. There are 5280 feet in a mile, so 1 mile per minute is 5280 feet per minute. There are 60 seconds in a minute, so 5280 feet per minute is 5280/60 feet per second, which is 88 feet per second. So, to find the speed of the car, we could wait 1 second, and measure how far the car moved. If the car moved 88 feet, we know it was going 60 miles per hour.
Could we measure the speed in even less time? Yes. We could wait one tenth of a second. If the car moved 8.8 feet in one tenth of a second, we still know it's going 60 mph. We could wait one hundredth of a second, and if the car moved 0.88 feet, that's 60 mph. As I'm sure you see by now, we could wait almost any amount of time you can imagine, no matter how short, and measure how far the car moved. The distance moved divided by the elapsed time is the speed of the car.
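The unit conversion above can be written out step by step. This is only a sketch of the arithmetic in the paragraph; the function name `mph_to_feet_per_second` is made up for this example.

```python
FEET_PER_MILE = 5280
MINUTES_PER_HOUR = 60
SECONDS_PER_MINUTE = 60

def mph_to_feet_per_second(mph):
    # miles per hour -> miles per minute
    miles_per_minute = mph / MINUTES_PER_HOUR
    # miles per minute -> feet per minute
    feet_per_minute = miles_per_minute * FEET_PER_MILE
    # feet per minute -> feet per second
    return feet_per_minute / SECONDS_PER_MINUTE

print(mph_to_feet_per_second(60))  # 88.0
```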
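Here is a rough sketch of that shrinking-interval experiment for the constant-speed car. It assumes the car's position in feet is simply 88 times the elapsed seconds, and it uses Python's `Fraction` type so the arithmetic is exact. The names `position_feet` and `average_speed` are invented for this illustration.

```python
from fractions import Fraction

FEET_PER_SECOND = 88  # 60 mph, as worked out in the text

def position_feet(t_seconds):
    # Constant-speed car: distance = speed * time.
    return FEET_PER_SECOND * t_seconds

def average_speed(t_start, elapsed):
    # Distance moved divided by elapsed time.
    moved = position_feet(t_start + elapsed) - position_feet(t_start)
    return moved / elapsed

# Wait 1 s, 1/10 s, 1/100 s, and one millionth of a second.
intervals = [Fraction(1), Fraction(1, 10), Fraction(1, 100), Fraction(1, 1_000_000)]
speeds = [average_speed(Fraction(0), dt) for dt in intervals]
print([float(s) for s in speeds])  # [88.0, 88.0, 88.0, 88.0]
```

However short the wait, the answer is still 88 feet per second, which is 60 mph.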
Of course, in real life, our cars have speedometers that tell us how fast we're going, and we don't have to wait an hour to find out. Nor do we have to go a constant speed for any appreciable length of time to find out. The speedometer in the car tells us the car's instantaneous speed, that is, how fast the car is going at this instant. The speedometer does not tell us at a single glance if we're speeding up, going at a constant speed, or slowing down - if we want to know that, we have to watch the speedometer for a while, and see if the speedometer is moving.
Now, we'll try something a little harder. Suppose the car above is now being driven by someone else, and this person has no interest in moving at a constant speed. Below is a graph of how far away from us this guy is at various times in his trip.
Above, we can see that in the first hour, this guy travels about 30 miles, so he's going about 30 mph. In the second hour, from 1 to 2, he goes about 60 miles, so he's going about 60 mph. In the third hour, from 2 to 3, he travels about 90 miles, so he's going about 90 mph. In the fourth hour, he travels about 60 miles, so he's going about 60 mph. We also see that he travels 240 miles in four hours, for an average speed of 60 mph.
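The hour-by-hour arithmetic can be sketched as follows. The positions in `miles` are the approximate values quoted above (30, then 60 more, then 90 more, then 60 more); the variable names are just for this example.

```python
# Approximate positions (miles from us) at hours 0, 1, 2, 3 and 4.
hours = [0, 1, 2, 3, 4]
miles = [0, 30, 90, 180, 240]

# Speed over each one-hour stretch: change in distance / change in time.
hourly_speeds = [
    (miles[i + 1] - miles[i]) / (hours[i + 1] - hours[i])
    for i in range(len(hours) - 1)
]

# Average speed over the whole trip.
trip_average = (miles[-1] - miles[0]) / (hours[-1] - hours[0])

print(hourly_speeds)  # [30.0, 60.0, 90.0, 60.0]
print(trip_average)   # 60.0
```

The trip average of 60 mph hides the fact that the speed in individual hours ranged from 30 to 90 mph.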
Now, here's a new idea: his average speed for the trip is 60 mph, so if his mother knows he lives 240 miles away, and he took four hours to get to her house, she's happy. However, the Texas Ranger who spots him going 90 mph from hour 2 to hour 3 is probably not nearly so happy. We see that the shorter the time interval we use to calculate this guy's speed, the better estimate we get of how fast he's really going at any given time.
Below is yet another graph we're going to try to analyze.
This guy starts out really moving, and in about 2/3 of an hour he's gone 60 miles - about 90 mph. At hour 2, he stops for gas and a sandwich for about 15 minutes, then he leaves. At hour 3 he realizes he's behind schedule, and starts zooming, finally getting to his mom's house in four hours. Again, she's happy, thinking 240 miles, four hours, she's done a good job of raising a decent law-abiding citizen, but we know there's more to the story.
How can we precisely calculate his speed at any point on the curve? Let's consider the point at about one half hour into his journey. What we know how to do is calculate the slope of a straight line. We do this by checking how far the line moved in Y when the X coordinate moved by some known amount.
If we go forward about one hour from the point at one half hour, we see that the guy has moved about 30 miles in this hour. On the other hand, if we back up to the beginning, we see that the guy has moved about 50 miles in about one half hour, about 100 mph. So, which is right? Is the guy going 30 mph or 100 mph? Well, neither is correct. As we saw above in Figure 1.2, it's a mistake to wait too long to check this guy's speed. Mom thinks he's gone 60 mph, and on average he has, but we know his speed is all over the place. So, one half hour into the trip, if we average over the next hour, we get 30 mph, and if we average over the previous half hour, we get 100 mph. Obviously, we want to average over a very short time - a second or so - if we want to know how fast he's really going. What we want to know is where he is at time T, and then where he is a short time later, which we call T+ε. This character ε is a Greek letter, which is called epsilon. Epsilon is almost always used in math and science to mean something really small.
So, suppose his position at time T is P1, and his position at time T+ε is P2. Then his speed is
$$
\text{Speed} = \frac{P_2 - P_1}{(T + \varepsilon) - T} = \frac{P_2 - P_1}{\varepsilon}
$$
Now, we've seen that the longer time we take checking this guy's speed, the worse we seem to do, so we want to use a really small time here. If we use one hour to check the guy's speed, we get a speed of 30 mph. If we use a half hour, we find he's gone about 20 miles in a half hour, or about 40 mph. If we use one minute, or better yet one second, we find he's going more like 85 mph. In fact, the shorter the time we use, the better our estimate. Also, we see that our estimate keeps changing as the time we take decreases. However, we have this strong intuition that if we use one hundredth of a second, or one thousandth of a second, or one millionth of a second, the speed we measure isn't going to change much. A car simply doesn't change its speed much in a fraction of a second. But, who knows, maybe this guy has rockets attached to his car or something. To be as accurate as possible, we'd like to use as short a time as possible.
We say that this guy's actual speed is the speed we would get if the elapsed time were almost zero. We call this speed the limit speed. If the elapsed time, ε, is one hour, we get 30 mph. If ε is one half hour, we get 40 mph. If ε is one minute, we get 84 mph. If ε is one second, we get 85 mph. If ε is one hundredth of a second, we get maybe 85.03 mph. As the time we take gets shorter and shorter, we get a more and more precise answer, but we see that once we get down to one second we're getting diminishing returns. We can see that if we used only one billionth of a second, one nanosecond, we'd get a speed very very close to 85 mph. So, this number, 85 mph, we call the limit. We write this as follows:
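To watch an estimate settle down as ε shrinks, here is a sketch using a made-up position curve. Note the assumption: the curve P(t) = 30 t² (miles, with t in hours) is hypothetical and is NOT the curve in the graph, so the numbers differ from the 85 mph above. Only the behavior is the point. The names `position_miles` and `average_speed` are invented for this example.

```python
from fractions import Fraction

def position_miles(t_hours):
    # Hypothetical curve, NOT read from the book's graph: P(t) = 30 * t**2.
    return 30 * t_hours ** 2

def average_speed(t, epsilon):
    # (P2 - P1) / epsilon, with P1 at time t and P2 at time t + epsilon.
    return (position_miles(t + epsilon) - position_miles(t)) / epsilon

T = Fraction(1)  # one hour into this hypothetical trip
# One hour, half an hour, one minute, one second, one thousandth of a second.
epsilons = [Fraction(1), Fraction(1, 2), Fraction(1, 60),
            Fraction(1, 3600), Fraction(1, 3_600_000)]
estimates = [average_speed(T, eps) for eps in epsilons]

for eps, speed in zip(epsilons, estimates):
    print(f"epsilon = {float(eps):.7f} h -> {float(speed):.6f} mph")
```

For this made-up curve the estimates run 90, 75, 60.5, and then ever closer to 60 mph, so 60 mph is the limit speed at T = 1 hour.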
$$
\text{Speed} = \lim_{\varepsilon \to 0} \frac{P_2 - P_1}{\varepsilon}
$$
We read this as Speed is the limit as ε goes to zero of P2 - P1 divided by ε. Now, right away we notice something - if ε is ever zero, we're dividing by zero, which is illegal. This is why we say limit. ε is never allowed to get all the way to zero, it just gets really, really close.
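As a worked illustration of how ε can get "really, really close" to zero without causing trouble, take a made-up position curve, P(t) = 30 t² (this curve is an assumption for the example, not the one in the graph). With P1 = P(T) and P2 = P(T+ε):

```latex
\frac{P_2 - P_1}{\varepsilon}
  = \frac{30(T+\varepsilon)^2 - 30T^2}{\varepsilon}
  = \frac{60T\varepsilon + 30\varepsilon^2}{\varepsilon}
  = 60T + 30\varepsilon
  \qquad (\varepsilon \neq 0)

\text{Speed} = \lim_{\varepsilon \to 0} \left( 60T + 30\varepsilon \right) = 60T
```

The division by ε is carried out while ε is still non-zero. Only afterward do we ask what the result approaches as ε shrinks, so we never actually divide by zero.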
Now, what we're really trying to do here is find the slope of the curve. If the curve is a straight line, as in figure 1.1, this is trivial: you simply take any two points on the line, and find the difference between the two Y values, then divide that by the difference in the corresponding two X values. If the curve is made up of a bunch of line segments, like figure 1.2, it's still simple. You do the same procedure, change in Y divided by change in X, only you make certain that the two points you use are on the same line segment. Otherwise, you get into this strange averaging problem we discussed.
If the curve is actually curvy, as in figure 1.3, now you have to be more tricky. You can pick a point, and hold up a ruler to the point, trying to hold the ruler tangent to the curve. Tangent means the resulting line touches the curve at the particular point you choose, and at that point the line goes in exactly the same direction as the curve at that point.
In Figure 1.5 above, I've drawn a line which is pretty much tangent to the curve at a point. We can see that this line is going up about 60 miles in about one hour, indicating that our guy is driving at about 60 mph at that particular time. So, all this stuff we've been doing with limits and epsilons is all a mathematical way to draw a line and check the slope of the line. Of course, the important thing here is we don't usually have a graph and a piece of paper, so we can't usually draw lines. In fact, in physics, when we're formulating theories, we may know something about the derivative without knowing anything at all about the curve.
For example, Newton's second law is Force = the derivative of Momentum. We don't know just from reading that line what the force or the momentum are, but we know now how they're related. If we later learn that the force is due to gravity, we now know how to calculate the momentum changes. If, on the other hand, someone tells us the momentum, say on a graph, we can deduce the force. In our example above, we know where the car is at any particular time, and from this we can tell the car's speed. The car's momentum is the speed times the car's mass. So, we could draw a graph of the car's speed, and take the derivative of that graph, just as we took the derivative of the car's position and deduced the speed. From the derivative of the car's speed, we could deduce the force on the car, and therefore the horsepower that the guy was using at any given instant. Given how our sample driver likes to drive, we'd likely know the maximum horsepower his car makes, too.
That's it. This is all of calculus. The rest of this book is a bunch of tricks that have been worked out over the years to make calculating faster, and to handle more complicated situations. Of course, as the book goes on, you'll see that a lot has been done over the last 300-plus years.
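Here is a rough sketch of that chain of reasoning, from speed to force to horsepower. Everything numeric in it is an assumption made up for illustration: the car's mass, the speed samples, and the one-second spacing. It also assumes the mass is constant, so the derivative of momentum is just mass times the derivative of speed. It computes only the net force on the car, ignoring drag and friction, which a real engine also has to overcome.

```python
MASS_KG = 1500.0              # hypothetical car mass
WATTS_PER_HORSEPOWER = 745.7  # approximate conversion factor

# Hypothetical speed samples in meters per second, one per second.
times_s = [0.0, 1.0, 2.0, 3.0]
speeds_mps = [20.0, 22.0, 24.0, 24.0]

def rate_of_change(ys, xs):
    # Change in Y over change in X between each pair of neighboring samples.
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]

# Derivative of speed, in m/s per second.
accelerations = rate_of_change(speeds_mps, times_s)  # [2.0, 2.0, 0.0]

# Force = derivative of momentum = mass * derivative of speed (constant mass).
net_forces_n = [MASS_KG * a for a in accelerations]  # [3000.0, 3000.0, 0.0]

# Power = force * speed, using the average speed over each interval.
mid_speeds = [(speeds_mps[i] + speeds_mps[i + 1]) / 2 for i in range(len(accelerations))]
horsepower = [f * v / WATTS_PER_HORSEPOWER for f, v in zip(net_forces_n, mid_speeds)]

print(accelerations)
print(net_forces_n)
print([round(hp, 1) for hp in horsepower])
```

In the last interval the speed is steady, so the derivative is zero and so is the net force.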
1.1: For the graph above, at the times 0, 1 hour, 2 hours, 3 hours, and 4 hours, draw the tangents and figure out the guy's speed. Answer: about 130 mph, 30 mph, 0 mph, 45 mph, 150 mph. Apparently he's got a Corvette, and he's not embarrassed to use it. Mom had better not be waiting in the middle of the street.
1.2: In our limit equation, why don't we just set ε to zero and be done with it?
1.3: If we have an actual graph of time and distance for a real car on a real road, how small do you think ε would have to be to avoid the averaging problem? Would 1 hour be good enough? 1 minute? 1 second? 1 thousandth of a second?