Before we can get to work in this chapter, we need to learn a few new words. Function is a new word for us, so I guess we'd better define it. Here's a simple definition of a function: it's a relationship between two numbers, you can draw a graph, and when you look at the graph there's only one Y value for any given X value. Below in Figure 2.1 are two curves. A is a function, but B is not.
B is not a function because if we read up the X axis at the value 1, we see that the trace B has three Y values which correspond to the X value 1. This same problem happens at X values around 2.6 or so. In calculus, we only work with functions. It actually happens that sometimes things come up which are not functions, and we want to use calculus anyway: when all you have is a hammer, everything looks like a nail. There are complicated tricks we can use to make this work, called Branch Cuts - basically, we make a deal with ourselves that if there's a place where one X value has more than one Y value, we agree to ignore all but one of them. But, we're not going to talk about branch cuts anywhere in this book, so you can just forget about this. From now on, everything we work with will be a function.
Next, we're going to worry about whether or not the function is continuous. Continuous means what you think: the graph doesn't have any holes in it. In figure 2.2 below, our good friend A is continuous, but B is not: B has a gap in it. Why do we care about continuous? Remember the Limit thing? We can't do Limits right at the edge of a gap, because there will be some ε for which the function is not defined. So, we can get one of those average-type derivatives, but we can't take the Limit and get a precise result. Since the result is not precise, it's not mathematically acceptable.
B, however, is called piece-wise continuous. We can see in figure 2.2 above that B has a right side, and a left side, and each side is perfectly reasonable. There's a gap in between, from about 2.4 to about 2.6. We must remember that the function B is not defined in that region, and, sensibly enough, neither are its derivatives. So, we can use calculus on B so long as we remember to stay away from the gaps.
In figure 2.2, B has one gap. How many gaps can we tolerate? Mathematicians actually stay up nights worrying about things like this. The answer is we can tolerate as many gaps as we can count - zero, one, ten, 100, 100,000,000,000 are all ok, we can still do calculus. We cannot tolerate a function which has an uncountably infinite number of gaps. We'll call functions which have an uncountably infinite number of gaps psychotic. Here's an example of a psychotic function: if X is a rational number like 1 or 1/2 or 37/43, then Y is one. If X is an irrational number like π ( = 3.14159...), then Y is zero. Here's the catch: between any two rational numbers you can name, I can name an infinite number of irrational numbers. Between any two irrational numbers you can name, I can name an infinite number of rational numbers. So, this psychotic function is as non-continuous as we can imagine - the Y values jump up and down an infinite number of times over any given interval of X values, no matter how small the X interval. This function has an infinite number of discontinuities between any pair of numbers, no matter how close. We simply can't do calculus on such a function, because the idea of a Limit makes no sense. We need functions where, when the X values get really, really close together, the corresponding Y values also get really close together. So, we just make an agreement: we don't consider psychotic functions. Mathematicians sometimes get all excited about these functions, and they try to figure out just what you can do with them. But, we're not mathematicians, so here's what we're going to do with them: forget about them. Fortunately, it very much appears that when God designed our universe, He forgot about them too.
Do all piece-wise continuous functions have derivatives, except at the edges? No. Have a look at the function in figure 2.3 below. This function has a perfectly reasonable slope up to the point at 3 - the slope is one. It has a perfectly reasonable slope just past 3 - the slope is minus one. But, right at the point 3, the slope is undefined: it's whatever we want to claim. We say that this curve is not differentiable at 3. So, now we see that there are functions which are continuous, and some of these functions are continuously differentiable. The function below in 2.3 is continuous, but is not continuously differentiable.
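If you'd like to see that corner in numbers, here is a minimal Python sketch. It assumes a stand-in for the curve in figure 2.3, namely Y = 3 - |X - 3|; the exact curve in the figure may differ, and only the slopes of 1 and -1 on either side of 3 are taken from the text. The names tent and one_sided_slopes are made up for this illustration.

```python
def tent(x):
    # Assumed stand-in for figure 2.3: slope +1 before X = 3, slope -1 after.
    return 3 - abs(x - 3)


def one_sided_slopes(f, x, eps=1e-3):
    # Average slope just to the left of x, and just to the right of x.
    left = (f(x) - f(x - eps)) / eps
    right = (f(x + eps) - f(x)) / eps
    return left, right


left, right = one_sided_slopes(tent, 3.0)
print("slope from the left: ", left)
print("slope from the right:", right)
```

The two numbers should come out close to 1 and -1, and making eps smaller does not bring them together. That disagreement is what "not differentiable at 3" looks like when you try to calculate it.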
Functions can have names. Above, we had a couple functions called A and B. Sometimes, if we have a lot of functions lying about we not only give them names, we also keep track of what goes in and what comes out. In all our functions so far, what goes in is X and what comes out is Y, so we would call these functions Y = A(X) and Y = B(X). It could be that we have a situation where Y = A(X), and X = B(T). This is ok, it's just names - this is like giving out name tags at a big dinner party. At a convention we might add company names and job titles. It's just some extra stuff to keep track of who's who.
Now that we're naming everything, maybe we should give a name to this process of finding the slope, or finding the derivative. Here's the name Leibniz used, and which we still use today: d / dx. We read this as "the derivative with respect to x", or if we're lazy, "d by d x." In Chapter 1, everything we considered was a function of time, not space, so d/dx would have had no meaning on those functions. Instead, we would have used d/dt, the derivative with respect to time. d is a shorthand for a very small change, so dB/dt means a very small change in B(t) divided by the corresponding change in t. This is exactly how we learned to calculate the slope in Chapter 1. Alternatively, if we had labeled the position of the car as Y, then we could have said Y = B(t), and we could say any of dY/dt, or dB/dt, or dB(t)/dt. They all mean the same thing.
Why do we write d/dx? First, you could think of d as standing for difference, so dY/dx could maybe be read as "the difference in Y divided by the difference in x," which is what we have been doing. In fact, that's not how we read it - we have a different notation for difference. Actually, the d is also supposed to remind us that it's a very small difference, a difference taken to a Limit. We call a very small difference taken to a Limit a differential. So, dY/dx is properly read as "a differential Y divided by a differential X," where differential reminds us that it's a difference but only ε wide.
Another thing we notice is that d/dx does not have a value like 6, nor even a rule for values like $X^2$. d/dx means "take a derivative." It's more like an instruction to us than a piece of algebra. d/dx is an object which we call an operator. Now we know three types of mathematical objects: numbers, functions, and operators. Numbers just are. Functions are rules for associating one number with another, for example $X^2$ associates the number 9 with the number 3. Operators associate functions with other functions.
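For readers who program, the three kinds of objects can be sketched in Python. This is only an illustration under assumptions: the names derivative_operator, square and d_square are invented here, and the "operator" uses a small fixed eps rather than a true Limit, so what it hands back is an approximate slope function.

```python
def derivative_operator(f, eps=1e-6):
    # An operator: takes a function f in, hands a new function back.
    def slope(x):
        # Approximate slope of f at x (small fixed eps, not a true Limit).
        return (f(x + eps) - f(x)) / eps
    return slope


def square(x):
    # A function: a rule associating one number with another.
    return x * x


d_square = derivative_operator(square)

print(square(3))     # a number comes out of a function
print(d_square(3))   # the new function also turns numbers into numbers
```

Notice that derivative_operator never returns a number. You feed it a rule, and you get a rule back; only when you give that new rule a number do you get a number.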
So far we've learned some new words and some ideas to go with them. Mostly the purpose of this is to help us to remember that there are places where we must be careful, and places where we just shouldn't go: don't bring a hammer into a china shop, you're just asking for trouble. But, now it's time to learn some math, instead of just words.
Here's a very simple function: Y = X. This is a straight line that passes through (0,0) and goes upwards to the right with a slope of 1.
Let's practice taking the derivative of this function. We already know the answer: the slope is 1. Here's what we want:
$$
\frac{dY}{dX} = \lim_{\varepsilon \to 0} \frac{(X + \varepsilon) - X}{(X + \varepsilon) - X} = \lim_{\varepsilon \to 0} \frac{\varepsilon}{\varepsilon} = \lim_{\varepsilon \to 0} 1 = 1
$$
Here's what we did. We look at this function at a couple of points, and subtract the value of the function at the two points. This gets us the change in Y. Then, we divide this by the change in X. So, our first point will be at $X_1 = X$, and the function value there is $Y_1 = X$ too. Next, we check out the function a little ways away, say at a second point $X_2 = X + \varepsilon$. The function value here is $Y_2 = X + \varepsilon$. Now, the change in Y is $Y_2 - Y_1$, that is (X+ε) - X, and the change in X is $X_2 - X_1$ = (X+ε) - X. (X+ε) - X is just ε. ε / ε is 1. 1 doesn't depend on ε, so this is a particularly easy Limit to take. The slope of this line is 1. Of course, we can just glance at the graph and see that this is so, but it's reassuring to know that we can calculate it too.
Now we'll try something just a little harder: Y = 3X.
$$
\frac{dY}{dX} = \lim_{\varepsilon \to 0} \frac{3(X + \varepsilon) - 3X}{(X + \varepsilon) - X} = \lim_{\varepsilon \to 0} \frac{3\varepsilon}{\varepsilon} = \lim_{\varepsilon \to 0} 3 = 3
$$
Just like before, we're going to look at this function at a couple of points, and subtract the value of the function at the two points. This gets us the change in Y. Then, we're going to divide this by the change in X. So, our first point will be at $X_1 = X$, and the function value there is $Y_1 = 3X$. Next, we're going to check out the function a little ways away, say at a second point $X_2 = X + \varepsilon$. The function value here is $Y_2 = 3(X + \varepsilon)$. Now, the change in Y is $Y_2 - Y_1$, that is 3(X+ε) - 3X = 3X + 3ε - 3X = 3ε, and the change in X is $X_2 - X_1$ = (X+ε) - X, which is just ε. 3ε / ε is 3. 3 doesn't depend on ε. No matter what the value of ε might be, 3 is always just 3. So this is also a particularly easy Limit to take: no matter what ε is, the value here is 3. The slope of this line is 3.
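The same bookkeeping can be handed to a computer. Here is a minimal Python sketch of the change-in-Y over change-in-X calculation for the two straight lines; the function names and the choice of X = 2 are assumptions made for the illustration, not anything from the figures.

```python
def difference_quotient(f, x, eps):
    # Change in Y divided by change in X between the points x and x + eps.
    return (f(x + eps) - f(x)) / ((x + eps) - x)


def line(x):
    return x          # Y = X


def steeper_line(x):
    return 3 * x      # Y = 3X


x = 2.0
eps_values = (1.0, 0.5, 0.25, 0.125)
slopes_line = [difference_quotient(line, x, eps) for eps in eps_values]
slopes_steeper = [difference_quotient(steeper_line, x, eps) for eps in eps_values]

print(slopes_line)      # one slope per eps for Y = X
print(slopes_steeper)   # one slope per eps for Y = 3X
```

Every entry in the first list should be 1 and every entry in the second should be 3, whichever ε was used. That is the sense in which these Limits are easy: there is nothing left that depends on ε.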
Ok, I know we're not exactly doing heavy lifting here, but you've gotta crawl first. Mixing my polynomials with the same abandon as my metaphors, we'll move on to $X^2$. Our graph is in Figure 2.6 below. Here we notice something interesting: the slope depends on where you are. The slope of the straight line was the same everywhere, so the slope was just a number. Here, the slope is different at different places, so we expect the slope to be some formula that depends on X.
Well, let's get to work. $Y_1 = X^2$. $Y_2 = (X + \varepsilon)^2$.
$$
(X + \varepsilon)^2 = (X + \varepsilon)(X + \varepsilon) = X^2 + 2\varepsilon X + \varepsilon^2
$$
Now, the derivative is:
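$$
\frac{dY}{dX} = \lim_{\varepsilon \to 0} \frac{(X + \varepsilon)^2 - X^2}{(X + \varepsilon) - X} = \lim_{\varepsilon \to 0} \frac{X^2 + 2\varepsilon X + \varepsilon^2 - X^2}{\varepsilon}
$$
$$
\frac{dY}{dX} = \lim_{\varepsilon \to 0} \frac{2\varepsilon X + \varepsilon^2}{\varepsilon} = \lim_{\varepsilon \to 0} \left( 2X + \varepsilon \right)
$$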
Now, this part is the next leap we have to take: 2X + ε is something just a little bigger than 2X, and as ε gets really, really small, we can just ignore it entirely. So,
$$
\frac{dY}{dX} = \lim_{\varepsilon \to 0} \left( 2X + \varepsilon \right) = 2X
$$
So, the slope of the curve $Y = X^2$ is 2X. What does this mean? Well, 2X is just a function, a formula. You plug in an X, you get a slope. For example, if we ask for the slope where X = 1/2, the slope is 1. You can check this for yourself: put a ruler up against the curve at X = 1/2, and see what the slope is. At X = 1, the slope is 2. Again, you can check this.
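If you don't have a ruler handy, a few lines of Python can do the checking. This is a sketch, with the function name square and the list of ε values chosen just for illustration: it computes the change in Y over the change in X for the squared curve at X = 1/2, for smaller and smaller ε, and sets each result beside 2X + ε.

```python
def square(x):
    return x * x


x = 0.5
rows = []
for k in range(1, 11):
    eps = 2.0 ** -k                                   # 1/2, 1/4, 1/8, ...
    quotient = (square(x + eps) - square(x)) / eps    # change in Y over change in X
    predicted = 2 * x + eps                           # what the algebra says we should get
    rows.append((eps, quotient, predicted))
    print(eps, quotient, predicted)
```

The last two columns should agree, and both should creep toward 1 as ε shrinks, which is the slope 2X at X = 1/2.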
What if we wanted the derivative of $17X^2$? We would do the math, just as above, except we would have $17(X + \varepsilon)^2 - 17X^2$ on top, which would have given us $17(2\varepsilon X + \varepsilon^2)$. After dividing by ε and throwing away the leftover ε term, the derivative would have been 17*2X = 34X. Constants just hang around, they don't really change anything.
Now we'll try something a bit harder - $X^3$. It's the same trick as before, we're going to do the epsilon thing, and take a Limit. But, this time the math will be a bit lengthier. The good news is, this is pretty much the last time we're going to do this epsilon / Limit thing. After this, we'll notice some things, and then it will all be a lot easier.
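$$
\frac{dY}{dX} = \lim_{\varepsilon \to 0} \frac{(X + \varepsilon)^3 - X^3}{(X + \varepsilon) - X} = \lim_{\varepsilon \to 0} \frac{(X^2 + 2\varepsilon X + \varepsilon^2)(X + \varepsilon) - X^3}{\varepsilon}
$$
$$
\frac{dY}{dX} = \lim_{\varepsilon \to 0} \frac{X^3 + 3\varepsilon X^2 + 3\varepsilon^2 X + \varepsilon^3 - X^3}{\varepsilon} = \lim_{\varepsilon \to 0} \left( 3X^2 + 3\varepsilon X + \varepsilon^2 \right)
$$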
Ok, what is this Limit? X might be really, really big, so how do we know that εX is small? Remember, we're comparing ε*X to X*X, so as long as ε is smaller than X, ε*X is smaller than X*X. If ε is less than 1, then ε*ε is even less than ε. So, basically we get to throw away all the terms that have an ε in them, and the Limit is $3X^2$. If we wound up with an ε on the bottom, we'd be in real trouble: that's that divide by zero thing. But so far, we've lucked out and the ε terms all wind up on the top, where we can simply throw them away.
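To see the leftover ε terms fade away even when X is big, here is a small Python sketch. It assumes X = 1000 purely as an example of a "really big" X, and it uses exact fractions from Python's standard library so that rounding doesn't muddy the picture. The name leftovers is made up for this illustration.

```python
from fractions import Fraction


def cube(x):
    return x * x * x


x = Fraction(1000)          # an assumed "really big" X
leftovers = []
for k in range(1, 7):
    eps = Fraction(1, 10 ** k)                    # 1/10, 1/100, 1/1000, ...
    quotient = (cube(x + eps) - cube(x)) / eps    # change in Y over change in X
    leftover = quotient - 3 * x * x               # whatever is left besides 3X^2
    leftovers.append(leftover)
    print(f"eps = 1/10^{k}: leftover = {float(leftover)}")
```

Each leftover works out to exactly 3εX + ε², and it shrinks by roughly a factor of ten every time ε does. A big X only means ε has to get smaller before the leftover is negligible; it still goes to zero.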
The slope of the curve $X^3$ is $3X^2$. If X is 1, the slope is 3. If X is 2, the slope is 3*4 = 12.
What if we wanted to differentiate $17X^3$? The 17 is a constant and follows through the entire problem just like before. So the answer is $17 \cdot 3X^2 = 51X^2$.
Now, we can see a general trend here for polynomials. The derivative of $X^4$ involves expanding $(X + \varepsilon)^4$. Now, this is a book on calculus, not on expanding polynomials, so I'll just tell you the answer. The derivative will be:
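$$
\frac{dY}{dX} = \lim_{\varepsilon \to 0} \frac{(X + \varepsilon)^4 - X^4}{(X + \varepsilon) - X} = \lim_{\varepsilon \to 0} \frac{X^4 + 4\varepsilon X^3 + 6\varepsilon^2 X^2 + 4\varepsilon^3 X + \varepsilon^4 - X^4}{\varepsilon}
$$
Just like before, first we subtract away the $X^4$, then divide by ε, leaving:
$$
\frac{dY}{dX} = \lim_{\varepsilon \to 0} \left( 4X^3 + 6\varepsilon X^2 + 4\varepsilon^2 X + \varepsilon^3 \right)
$$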
Just like before, we throw away all the terms with an ε, leaving $4X^3$. And, now we can see what looks like, and is, a general rule. The derivative of $X^n$ is $nX^{n-1}$. Interestingly, it turns out this is pretty close to the only thing we can differentiate: polynomials. Fortunately, it turns out that we can represent almost everything with polynomials.
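The pattern can be checked for other powers without expanding anything by hand. The sketch below leans on the binomial coefficients (Python's math.comb), which is a tool the text has not introduced, so take it as an outside assumption. The function name quotient_terms is invented here. It lists the terms left after subtracting away the leading power of X and dividing by ε, each as (coefficient, power of X, power of ε).

```python
from math import comb


def quotient_terms(n):
    # Terms of ((X + eps)^n - X^n) / eps, each as
    # (coefficient, power of X, power of eps).
    return [(comb(n, k), n - k, k - 1) for k in range(1, n + 1)]


for n in (2, 3, 4, 5):
    terms = quotient_terms(n)
    # Only the term with no eps left in it survives the Limit.
    survivor = [t for t in terms if t[2] == 0]
    print(n, terms, "survivor:", survivor)
```

For every n, exactly one term has no ε in it, and that term has coefficient n and power n-1 on X. Everything else carries at least one ε and gets thrown away.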
2.1: Find the derivatives of all the following functions:
a) $Y = X^5$
b) $Y = 7X^{11}$
c) $Y = 3X^{100}$
d) $Y = 11X^7$
e) Y = 6
answer: $5X^4$, $77X^{10}$, $300X^{99}$, $77X^6$, 0.
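If you want to check answers like these mechanically, the rule fits in a few lines of Python. This is a sketch only: the name power_rule and the (coefficient, exponent) pairs are conventions made up here, and the constant 6 is treated as 6 times X to the power zero.

```python
def power_rule(coefficient, exponent):
    # Derivative of coefficient * X^exponent,
    # returned as (new coefficient, new exponent).
    if exponent == 0:
        return (0, 0)   # a constant has slope zero everywhere
    return (coefficient * exponent, exponent - 1)


# Problems a) through e), written as (coefficient, exponent) pairs.
problems = [(1, 5), (7, 11), (3, 100), (11, 7), (6, 0)]
answers = [power_rule(c, n) for c, n in problems]
print(answers)
```

Each pair in the printed list reads as "coefficient, then power of X", so (77, 10) stands for 77 times X to the tenth.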
2.2: Sketch a circle.
a) Is a circle a function?
b) Is a circle continuous?
c) Is a circle differentiable?
2.3: Absolute value means always positive. So, the absolute value of 6 is 6, and the absolute value of -3 is 3.
a) Draw a sketch of Y = absolute value( X ) from -3 to 3.
b) Is this a function?
c) Is the graph continuous?
d) Is the graph differentiable?
e) What is the derivative? Try to sketch it.
2.4: The Heaviside function Θ( X ) is zero for X less than zero, and 1 for X greater than zero.
Θ( X ) is not defined for X = 0.
a) Draw a sketch of Y = Θ( X ) from -3 to 3.
b) Is this a function?
c) Is the graph continuous?
d) Is the graph differentiable?
e) What is the derivative? Try to sketch it.