PROFESSOR: We're constantly asking how long we have to wait for things. And in the context of probability theory, this turns into a technical question called the expected time to failure-- or the mean time to failure.
Some examples might be that an insurance company wants to know, for a given policyholder, the expected time before that policyholder dies. A mechanical engineer wants to know the expected time before a button that's being pushed once per second fails. And I want to know when the part that my body shop has been waiting for is expected to come in.
The mean time to failure problem can be formalized in terms of flipping coins. So we're going to flip a coin until a head comes up, and we're going to think of a head as being a failure and a tail as a success. Let's assume the probability of getting a head-- the probability of failure-- is p. Note that this is not necessarily a fair coin; it may be biased in either direction.
And let's let F be the number of flips until the first head comes up-- the number of flips until the first failure. And if we're counting flips as time, it's the time to fail. So what we'd like to know is: what's the expectation of F? What's the expected number of flips before a head comes up?
Well, in order to calculate that expectation, we need to know some probabilities. So what's the probability that F equals 1? Well, that's the same as getting a head on the first flip-- it's the probability that you get an H on the very first flip-- and that has probability p.
What's the probability that F equals 2? That's the probability that you flip a tail and then a head, and it has probability q times p, where q is 1 minus p, the probability of a tail. Because we're assuming the flips are independent, it's the probability of a tail-- which is q-- times the probability of a head-- which is p.
Similarly, the probability that F equals 3 is the probability of flipping tail, tail, head-- and it's q squared times p. And in general, the probability density function of F at n-- the probability that you have to flip n times before you get the first head-- is q to the n minus 1 times p. By the way, a random variable whose probability density function has this form is said to have a geometric distribution. They come up all the time.
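As a quick sanity check of this density, here is a short Python sketch of my own (the function name is made up, not from the lecture) that tabulates these probabilities and confirms they sum to about 1:

```python
# Geometric distribution: Pr[F = n] = q^(n-1) * p, where F is the
# number of flips until the first head (the first "failure").
def geometric_pdf(n, p):
    """Probability that the first head appears on flip number n."""
    q = 1 - p
    return q ** (n - 1) * p

p = 1 / 3
probs = [geometric_pdf(n, p) for n in range(1, 200)]
print(sum(probs))           # ~1.0: a head eventually comes up
print(geometric_pdf(1, p))  # = p: head on the very first flip
```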
All right, so what's the formula for the expectation of F? By definition, it's the sum, over all the possible values of F-- which in this case are the integers n greater than 0-- of n times the probability that F equals n. And we figured out that the probability that F equals n is q to the n minus 1 times p.
And now I'm going to observe that we really do know how to evaluate this sum easily enough. I'm going to factor out the p, and it becomes a sum over n greater than 0 of q to the n minus 1 times n. Then I can simplify matters by shifting the index-- replacing n by n plus 1, so that the sum starts at 0-- which gives a q to the n power. So this is equivalent to p times the sum, over n greater than or equal to 0, of n plus 1 times q to the n.
Now this is a familiar generating function. It's simply equal to 1 over the quantity 1 minus q, squared, as we've seen already. So, in short, the expectation of F is p times 1 over the square of 1 minus q. Let's put those together.
Of course, 1 minus q is p, so it's p times 1 over p squared, or 1 over p. And we get this really very clean answer: the expected number of flips before you get a head is 1 over the probability of a head.
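For the record, the series identity used above is easy to recover-- differentiate the geometric series term by term (a standard fact, restated here for completeness):

```latex
\sum_{n \ge 0} q^n = \frac{1}{1-q}
\;\Longrightarrow\;
\sum_{n \ge 0} (n+1)\, q^n
  = \frac{d}{dq}\,\frac{1}{1-q}
  = \frac{1}{(1-q)^2},
\qquad
\operatorname{E}[F] = p \cdot \frac{1}{(1-q)^2} = \frac{p}{p^2} = \frac{1}{p}.
```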
So, for example, with a fair coin, where p is 1/2, the expected number of flips until you get the first head is 2-- it's 1 over 1/2. With a biased coin, where the probability of getting a head is 1/3, the expected number of flips until you get a head is 3.
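To see the 1/p answer come out of actual coin flips, here's a small Monte Carlo check (again my own illustration; the names are made up):

```python
import random

def flips_until_head(p):
    """Flip a coin with head-probability p until the first head."""
    flips = 1
    while random.random() >= p:  # tail -- keep flipping
        flips += 1
    return flips

for p in (1 / 2, 1 / 3):
    trials = [flips_until_head(p) for _ in range(100_000)]
    avg = sum(trials) / len(trials)
    print(f"p = {p:.3f}: average flips = {avg:.2f}, 1/p = {1 / p:.2f}")
```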
OK, that's a nice clean answer. But we got it in a way that doesn't really give much intuition. So let's look at another, naive way to derive the expected time to the first head, without having to worry about generating functions and all that sophisticated stuff about series-- which is easy to forget. Let's look at the outcome tree that describes this experiment of flipping until you get the first head.
So, starting at the root: with probability p, you flip a head immediately and you stop. Or, with probability q, you flip a tail, and then with probability p you finally flip the head and stop. If you haven't flipped a head by the end of the second step, then there's the possibility that, with probability q, you flipped another tail, and then the possibility that you stop after the third step with a head, and so on. That's how this tree goes.
Now, looking at the structure of this tree: it's an infinite tree, but it's very repetitive. In fact, if we call the whole tree B, then what we're seeing is that this whole sub-tree is itself a copy of B. So now I have a nice recursive, but very finite, description of this whole infinite tree: B is a tree with a left branch, taken with probability p, that ends in a leaf, and a right branch, taken with probability q, that is followed by a copy of the tree B.
So now I can apply total expectation to find the expectation of F. The expectation of F is the expected number of steps I take in this tree until I finally take the left branch to an H. How do I figure out the expected time in the tree until I take that left branch-- until I finally flip a head?
Well, using total expectation, what I can do is branch on whether or not the first flip is a head. So the expectation of F, according to total expectation, is going to be the expectation of F, given that the first flip is a head, times the probability p that it is a head, plus the expectation of F, given that the first flip is a tail, times q, the probability that the first flip is a tail.
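Written out as a formula, that total-expectation step is:

```latex
\operatorname{E}[F]
  = \operatorname{E}[F \mid \text{first flip is } H]\cdot p
  + \operatorname{E}[F \mid \text{first flip is } T]\cdot q.
```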
Well, first of all, let's look at this term. What is the expected number of flips until you get a head, given that you got a head? Well, it's 1. So this term is easily taken care of. We understand that one.
What about the other term-- the expected number of flips until you get the first head, given that your first flip was a tail? Well, what that means is that we're sitting here, after the tail, and we're asking: what's the expectation of F-- the number of flips starting at B-- when we've already used one flip to get back to the start of B?
And the answer is, obviously, 1 plus the expected number of flips in B, which is 1 plus the expectation of F. In short, this term-- the expectation of F, given that the first flip is a tail-- is simply 1 plus the expected number of flips that we're trying to figure out. Now look at this-- I have a very simple arithmetic formula: the expectation of F is 1 times p, plus, the quantity 1 plus the expectation of F, times q. There it is.
And now I can solve for E of F. Taking a quick look at this: multiplying out yields a p and a q, and p plus q is 1; what remains is q times E of F. If I pull the E of F over to the left and do a little arithmetic-- I'll leave the rest to you-- the answer is, again, what we had before: the expectation of F is equal to 1 over p.
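For completeness, here is the little bit of arithmetic that was left to you:

```latex
\operatorname{E}[F] = 1 \cdot p + \bigl(1 + \operatorname{E}[F]\bigr)\cdot q
                    = p + q + q\,\operatorname{E}[F]
                    = 1 + q\,\operatorname{E}[F],
\qquad\text{so}\qquad
(1 - q)\,\operatorname{E}[F] = 1
\;\Longrightarrow\;
\operatorname{E}[F] = \frac{1}{1-q} = \frac{1}{p}.
```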
So let's do one more silly example, for fun, to remember what the significance of 1 over p is. Suppose we think about the space station Mir. It's spinning around, and there's a lot of garbage out there that it's likely to hit-- a lot of space junk. And suppose that, based on past statistics about the small stuff that has been hitting Mir and that it could survive, we estimate that there's about a 1 in 150,000 chance that, in any given hour, it's going to suffer an intolerable collision with space junk or a meteor-- a collision that's going to destroy it.
So suppose the space station Mir has a 1 in 150,000 chance of destruction in any given hour. How many hours do we expect until destruction? Well, it's 1 over 1/150,000, which is 150,000 hours-- about 17 years.
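In more familiar units:

```latex
150{,}000 \text{ hours} \;\approx\; \frac{150{,}000}{24 \times 365} \text{ years} \;\approx\; 17 \text{ years}.
```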
So much for silly space station examples. Let's wrap up with an intuitive argument-- one that could be made rigorous, but I'm not going to, because I think it's clearer left in this informal way-- for why you would expect the expected time to failure to be 1 over the probability of failing on a given flip. Well, how many failures do we expect in one try?
Well, by definition, the expected number of failures in one try-- the expected number of heads on one flip-- is p. OK, now if you flip n times, you expect to get n times as many failures as you'd expect in one try. So the expected number of failures in n tries is n times p. That's an intuitive argument. In fact, by linearity of expectation, it's the rigorously correct argument.
Remember that if we flip n times, we're counting the number of heads in n flips-- that's a binomial distribution, and we already figured out, in a couple of ways, that its expectation is np. But never mind that; I think it's intuitively clear that if you expect p heads in one try, and you do n independent tries, you're going to expect to get n times p failures-- that is, heads.
Now, what's the expected number of tries between failures? Well, think about it: I've done n tries and gotten np failures. So if I divide the number of tries by the number of failures, that, by definition, is the average number of tries between failures-- the expected time to a failure. And that's n over np, which is equal to 1 over p. That's an argument that I hope you will remember.
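That ratio argument is easy to check numerically as well (one more sketch of my own):

```python
import random

p = 0.2          # probability of a failure (a head) on any one try
n = 1_000_000    # number of tries

failures = sum(1 for _ in range(n) if random.random() < p)
print(f"failures: {failures} (expect about n*p = {n * p:.0f})")
print(f"tries per failure: {n / failures:.2f} (expect 1/p = {1 / p:.1f})")
```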