Stochastic processes and Markov chains are introduced in this previous post. Transition probabilities are an integral part of the theory of Markov chains. The post preceding this one is a beginning look at transition probabilities. This post shows how to calculate the $n$-step transition probabilities. The Chapman-Kolmogorov equations are also discussed and derived.
Introductory Example
Suppose $\{X_n: n=0,1,2,\ldots \}$ is a Markov chain with the transition probability matrix $\mathbf{P}$. The entries of the matrix are the one-step transition probabilities $P_{ij}$. The number $P_{ij}$ is the probability that the Markov chain will move to state $j$ at time $n+1$ given that it is in state $i$ at time $n$, independent of where the chain was prior to time $n$. Thus $P_{ij}$ can be expressed as the following conditional probability.

$P_{ij}=P[X_{n+1}=j \mid X_n=i] \ \ \ \ \ \ \ \ (1)$
Thus the future state only depends on the state in the period immediately preceding it. This is called the Markov property. Also note that $P_{ij}$ is independent of the time period $n$. Any Markov chain with this property is called a time-homogeneous Markov chain or stationary Markov chain.
The focus of this post is to show how to calculate the probability that the Markov chain will move to state $j$ at time $m+n$ given that it is in state $i$ at time $m$. This probability is denoted by $P_{ij}^n$. In other words, $P_{ij}^n$ is the probability that the chain will be in state $j$ after the Markov chain goes through $n$ more periods given that it is in state $i$ in the current period. The probability $P_{ij}^n$, as a conditional probability, can be notated as follows:

$P_{ij}^n=P[X_{m+n}=j \mid X_m=i] \ \ \ \ \ \ \ \ (2)$
In (2), the $n$ in the $n$-step transition probabilities satisfies $n \ge 0$. Note that when $n=0$, $P_{ij}^0=1$ for $i=j$ and $P_{ij}^0=0$ for $i \ne j$. Including the case for $n=0$ will make the Chapman-Kolmogorov equations work better.
Before discussing the general method, we use examples to illustrate how to compute 2step and 3step transition probabilities. Consider a Markov chain with the following transition probability matrix.
Calculate several two-step transition probabilities, starting with $P_{02}^2$. Then calculate the three-step transition probability $P_{02}^3$ using the two-step transition probabilities.
First, let’s handle $P_{02}^2$. We can condition on the first step. To go from state 0 to state 2 in two steps, the chain must first go to an interim state and then from that state go to state 2.
Note that the above calculation lists out the three possible paths to go from state 0 to state 2 in two steps: from state 0 to state 0 and then from state 0 to state 2; from state 0 to state 1 and then from state 1 to state 2; and from state 0 to state 2 and then from state 2 to state 2. Looking at the calculation more closely, it is basically the first row of $\mathbf{P}$ (the row corresponding to state 0) multiplying the third column of $\mathbf{P}$ (the column corresponding to state 2).
By the same idea, the following gives the other two 2-step transition probabilities.
As discussed, the above two calculations can be viewed as the sum of all the possible paths to go from the beginning state to the end state (conditioning on the interim state in the middle) or as a row in the transition probability matrix $\mathbf{P}$ multiplying a column in $\mathbf{P}$. The following shows all three calculations in terms of matrix calculation.
The view of matrix calculation will be crucial in understanding the Chapman-Kolmogorov equations discussed below. To conclude the example, consider the 3-step transition probability $P_{02}^3$. We can again condition on the first step. The chain goes from state 0 to an interim state in the first step (3 possibilities) and then goes from that state to state 2 in two steps.
The example shows that the calculation of a 3-step transition probability is based on 2-step probabilities. Thus longer-length transition probabilities can be built up from shorter-length transition probabilities. However, it pays to focus on a more general framework before carrying out more calculations. The view of matrix calculation demonstrated above will help in understanding the general framework.
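The conditioning idea in the example can be sketched in a few lines of Python. The matrix below is a hypothetical stand-in (the example's actual matrix is not reproduced here); the point is only the sum over interim states.

```python
# Hypothetical 3-state transition probability matrix (each row sums to 1).
# This is NOT the matrix from the example above; it is a stand-in.
P = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.0, 0.4, 0.6],
]

# Two-step probability of going from state 0 to state 2: condition on the
# interim state k after the first step, then sum over the three paths.
p_02_two_step = sum(P[0][k] * P[k][2] for k in range(3))
# 0.5*0.2 + 0.3*0.3 + 0.2*0.6 = 0.31
```

The generator expression is exactly "row 0 of $\mathbf{P}$ times column 2 of $\mathbf{P}$," which is the matrix view emphasized above.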
Chapman-Kolmogorov Equations
The examples indicate that finding $n$-step transition probabilities involves matrix calculation. Let $\mathbf{P}^{(n)}$ be the $n$-step transition probability matrix. The goal now is to have a systematic way to compute the entries $P_{ij}^n$ in the matrix $\mathbf{P}^{(n)}$. The computation is based on the Chapman-Kolmogorov equations. These equations are:

$\displaystyle P_{ij}^{n+m}=\sum \limits_{k=0}^{\infty} P_{ik}^n \times P_{kj}^m \ \ \ \ \ \ \ \ (3)$
for $n, m \ge 0$ and for all $i$ and $j$. For a finite-state Markov chain, the summation in (3) does not go up to infinity and would have an upper limit. The number $P_{ik}^n$ is the probability that the chain will be in state $k$ after taking $n$ steps if it is currently in state $i$. The following gives the derivation of (3).

$\displaystyle \begin{aligned} P_{ij}^{n+m}&=P[X_{n+m}=j \mid X_0=i] \\&=\sum \limits_{k} P[X_{n+m}=j, X_n=k \mid X_0=i] \\&=\sum \limits_{k} P[X_{n+m}=j \mid X_n=k] \times P[X_n=k \mid X_0=i] \\&=\sum \limits_{k} P_{ik}^n \times P_{kj}^m \end{aligned}$
Here’s the idea behind the derivation. The path from state $i$ to state $j$ in $n+m$ steps can be broken down into two paths, one from state $i$ to an intermediate state $k$ in the first $n$ transitions, and the other from state $k$ to state $j$ in the remaining $m$ transitions. Summing over all intermediate states $k$ yields the probability that the process will go from state $i$ to state $j$ in $n+m$ transitions.
The entries in the matrix $\mathbf{P}^{(n+m)}$ can then be computed by (3). There is an interesting and useful interpretation of (3). The following is another way to state the Chapman-Kolmogorov equations:

$\mathbf{P}^{(n+m)}=\mathbf{P}^{(n)} \times \mathbf{P}^{(m)} \ \ \ \ \ \ \ \ (4)$
A typical element of the matrix $\mathbf{P}^{(n)}$ is $P_{ik}^n$ and a typical element of the matrix $\mathbf{P}^{(m)}$ is $P_{kj}^m$. Note that $P_{ik}^n$ with $k$ varying is a row in $\mathbf{P}^{(n)}$ (the row corresponding to the state $i$) and that $P_{kj}^m$ with $k$ varying is a column in $\mathbf{P}^{(m)}$ (the column corresponding to the state $j$).
The product of the above row and column is the transition probability $P_{ij}^{n+m}$.
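As a quick numerical sanity check of the row-times-column view, the sketch below (using a hypothetical matrix, not one from the post) verifies that an entry of the three-step matrix equals a row of the one-step matrix times a column of the two-step matrix.

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical transition probability matrix (stand-in for the examples).
P = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.0, 0.4, 0.6],
]

P2 = matmul(P, P)    # two-step transition probabilities
P3 = matmul(P2, P)   # three-step transition probabilities

# Chapman-Kolmogorov with n = 1 and m = 2: entry (0, 2) of the
# three-step matrix equals row 0 of P times column 2 of P2.
lhs = P3[0][2]
rhs = sum(P[0][k] * P2[k][2] for k in range(3))
assert abs(lhs - rhs) < 1e-12
```

The assertion holds because both sides count the same set of three-step paths, split at different intermediate times.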
Powers of the OneStep Transition Probability Matrix
Let $\mathbf{P}$ be the one-step transition probability matrix of a Markov chain. Let $\mathbf{P}^{(n)}$ be the $n$-step transition probability matrix, which can be computed by using the Chapman-Kolmogorov equations. We now derive another important fact.
The $n$-step transition probability matrix $\mathbf{P}^{(n)}$ is obtained by multiplying the one-step transition probability matrix $\mathbf{P}$ by itself $n$ times, i.e. $\mathbf{P}^{(n)}$ is the $n$th power of $\mathbf{P}$.
This fact is important in terms of calculation of transition probabilities. Compute $\mathbf{P}^n$, the $n$th power of $\mathbf{P}$ (in terms of matrix calculation). Then $P_{ij}^n$ is simply an entry in $\mathbf{P}^n$ (in the row that corresponds to state $i$ and in the column that corresponds to state $j$). If the matrix size is large and/or $n$ is large, the matrix multiplication can be done using software or by using an online matrix calculator (here’s one matrix calculator).
Of course, the above fact is also important for Markov chain theory in general. Almost all mathematical properties of Markov chains are at root based on this basic fact.
We can establish this basic fact using an induction argument. Clearly $\mathbf{P}^{(1)}=\mathbf{P}$. Suppose that the fact is true for $n-1$, i.e. $\mathbf{P}^{(n-1)}=\mathbf{P}^{n-1}$. Based on (4), $\mathbf{P}^{(n)}=\mathbf{P}^{(n-1)} \times \mathbf{P}^{(1)}$. Continuing the derivation: $\mathbf{P}^{(n)}=\mathbf{P}^{(n-1)} \times \mathbf{P}^{(1)}=\mathbf{P}^{n-1} \times \mathbf{P}=\mathbf{P}^n$.
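The induction argument translates directly into an iterative computation: start with the identity matrix (which is $\mathbf{P}^0$, per the $n=0$ convention above) and multiply by $\mathbf{P}$ repeatedly. A minimal sketch, again with a hypothetical matrix:

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matrix_power(P, n):
    """n-th power of P; the 0th power is the identity matrix."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)]
              for i in range(size)]
    for _ in range(n):
        result = matmul(result, P)
    return result

# Hypothetical transition probability matrix.
P = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.0, 0.4, 0.6],
]

P4 = matrix_power(P, 4)
# Every power of a transition matrix is again a transition matrix,
# so each row of P4 must sum to 1.
for row in P4:
    assert abs(sum(row) - 1.0) < 1e-9
```

For large $n$, software would use repeated squaring, but the straight loop matches the induction step exactly.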
The Row Vector and the Column Vector
As demonstrated in the preceding section, $\mathbf{P}^n$, the $n$th power of the matrix $\mathbf{P}$, is precisely the $n$-step transition probability matrix. Let’s examine the matrix $\mathbf{P}^n$ more closely. Suppose that the Markov chain has a state space $\{0,1,2,\ldots \}$. The following shows the matrix $\mathbf{P}^n$ with special focus on a generic row and a generic column.
Now look at the row corresponding to the state $i$ and the column corresponding to the state $j$. They are separated from the matrix $\mathbf{P}^n$ below.
The row vector (the row of $\mathbf{P}^n$ corresponding to state $i$) is a conditional distribution. It gives the probabilities of the process transitioning (after $n$ transitions) into one of the states given the process starts in state $i$. If it is certain that the process begins in state $i$, then this row gives the probability function for the random variable $X_n$ since $P[X_n=j]=P[X_n=j \mid X_0=i]=P_{ij}^n$.
On the other hand, if the initial state is not certain, then the column vector (the column of $\mathbf{P}^n$ corresponding to state $j$) gives the probabilities of the process transitioning (after $n$ transitions) into state $j$ from any one of the possible initial states. Thus a column from the matrix $\mathbf{P}^n$ gives the probability of the process being in a given state at the $n$th period from all possible initial states. The following gives the probability distribution of $X_n$ (the state of the Markov chain at time $n$).

$\displaystyle P[X_n=j]=\sum \limits_{i} P[X_0=i] \times P_{ij}^n \ \ \ \ \ \ \ \ (5)$
The calculation in (5) can be viewed as matrix calculation. If we put the initial probabilities $P[X_0=i]$ in a row vector, then the product of this row vector with the column of $\mathbf{P}^n$ corresponding to state $j$ would be $P[X_n=j]$. The following is (5) in matrix multiplication.
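Formula (5) in code: put the initial distribution in a row vector and multiply it into the $n$-step matrix. The matrix and initial distribution below are hypothetical illustrations, not values from the post.

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical transition probability matrix.
P = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.0, 0.4, 0.6],
]

# Hypothetical initial distribution: equally likely to start in state 0 or 1.
initial = [0.5, 0.5, 0.0]

P2 = matmul(P, P)  # two-step transition probabilities

# Formula (5): P[X_2 = j] = sum over i of P[X_0 = i] * (P^2)_{ij}.
dist = [sum(initial[i] * P2[i][j] for i in range(3)) for j in range(3)]

# The result is a genuine probability distribution over the states.
assert abs(sum(dist) - 1.0) < 1e-9
```

Each entry of `dist` uses one column of `P2`, exactly the column-vector view described above.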
Additional Comments on ChapmanKolmogorov
Even though the matrix calculation for $\mathbf{P}^n$ should be done using software, it pays to understand the orientation of the matrix $\mathbf{P}^n$. In the preceding section, we learned that a row of $\mathbf{P}^n$ is the conditional distribution of $X_n$ (the state of the process at time $n$) given the initial state. We also learned that a column in the matrix $\mathbf{P}^n$ gives the probabilities of the process landing in a fixed state from any one of the initial states.
The Chapman-Kolmogorov equations in (3) tell us that an entry in the matrix $\mathbf{P}^{n+m}$ is simply the product of a row in $\mathbf{P}^n$ and a column in $\mathbf{P}^m$. This observation makes it possible to focus just on the transition probability that is asked in a given problem rather than calculating the entire matrix. For example, to calculate an entry $P_{ij}^{n+m}$ of the matrix $\mathbf{P}^{n+m}$, multiply a row of $\mathbf{P}^n$ (the row corresponding to the state $i$) and a column of $\mathbf{P}^m$ (the column corresponding to state $j$). There is no need to calculate the entire matrix if the goal is just one entry.
To calculate an entry of $\mathbf{P}^4$, for instance, there is a considerable amount of variation in multiplication approaches. For example, multiply a row of $\mathbf{P}$ and a column of $\mathbf{P}^3$, or multiply a row of $\mathbf{P}^2$ and a column of $\mathbf{P}^2$. Or multiply a row of $\mathbf{P}^3$ and a column of $\mathbf{P}$.
If a row of transition probabilities multiplies a transition probability matrix, the result is a row in a higher transition probability matrix. For example, if a row in $\mathbf{P}^3$ multiplies the matrix $\mathbf{P}$, the result is a row in $\mathbf{P}^4$. More specifically, the first row in $\mathbf{P}^3$ multiplying the matrix $\mathbf{P}$ gives the first row of $\mathbf{P}^4$. The second row of $\mathbf{P}^3$ multiplying the matrix $\mathbf{P}$ gives the second row of $\mathbf{P}^4$, etc.
On the other hand, when a transition probability matrix multiplies a column of transition probabilities, the result is a column in a higher transition probability matrix. For example, when the matrix $\mathbf{P}$ multiplies a column in $\mathbf{P}^3$, the result is a column in $\mathbf{P}^4$. More specifically, when the matrix $\mathbf{P}$ multiplies the first column in the matrix $\mathbf{P}^3$, the result is the first column of $\mathbf{P}^4$. When the matrix $\mathbf{P}$ multiplies the second column in $\mathbf{P}^3$, the result is the second column in $\mathbf{P}^4$, etc.
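These row and column shortcuts are easy to check numerically. The sketch below (hypothetical matrix again) obtains one row and one column of $\mathbf{P}^4$ from $\mathbf{P}^3$ and $\mathbf{P}$ without forming the full product, then compares against the full $\mathbf{P}^4$.

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matrix_power(P, n):
    """n-th power of P by repeated multiplication from the identity."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)]
              for i in range(size)]
    for _ in range(n):
        result = matmul(result, P)
    return result

# Hypothetical transition probability matrix.
P = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.0, 0.4, 0.6],
]

P3 = matrix_power(P, 3)
P4 = matrix_power(P, 4)  # computed fully, only to verify the shortcuts

# Row shortcut: (first row of P^3) times P gives the first row of P^4.
row0 = [sum(P3[0][k] * P[k][j] for k in range(3)) for j in range(3)]
for j in range(3):
    assert abs(row0[j] - P4[0][j]) < 1e-9

# Column shortcut: P times (first column of P^3) gives the first column of P^4.
col0 = [sum(P[i][k] * P3[k][0] for k in range(3)) for i in range(3)]
for i in range(3):
    assert abs(col0[i] - P4[i][0]) < 1e-9
```

When only one row or column is needed, either shortcut replaces a full matrix product with a single vector-matrix product.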
Examples
We now give some examples of how to calculate transition probabilities.
Example 1
Consider a Markov chain with the following transition probability matrix.
Determine $P_{01}^2$ and $P_{01}^3$.
Of course, the most straightforward way would be to calculate $\mathbf{P}^2$ and $\mathbf{P}^3$. Then $P_{01}^2$ would be the (0,1)th entry of $\mathbf{P}^2$ and $P_{01}^3$ would be the (0,1)th entry of $\mathbf{P}^3$. In fact, it is a good practice to use an online matrix calculator for this task. Doing so produces the following matrices.
From the matrices $\mathbf{P}^2$ and $\mathbf{P}^3$, we can read off the values of $P_{01}^2$ and $P_{01}^3$. Note that $P_{02}^3$ is calculated in the introductory example earlier.
Example 2
For the transition matrix in Example 1, use the ideas discussed in the section Additional Comments on Chapman-Kolmogorov to compute the first row and the first column in the transition matrix $\mathbf{P}^4$.
As discussed earlier, the first row of $\mathbf{P}^4$ is obtained by multiplying the first row of $\mathbf{P}^3$ with the matrix $\mathbf{P}$. The first column of $\mathbf{P}^4$ is obtained by multiplying the matrix $\mathbf{P}$ with the first column of the matrix $\mathbf{P}^3$.
Example 3
A particle moves among states 0, 1 and 2 according to a Markov process with the following transition probability matrix.
Let $X_n$ be the position of the particle after the $n$th move. Suppose that at the beginning, the particle is in state 1. Determine the probability $P[X_n=j]$ for the given $n$ and $j$.
Since the initial state of the process is certain, $P[X_n=j]=P[X_n=j \mid X_0=1]=P_{1j}^n$. Thus the problem is to find the second row of $\mathbf{P}^n$ (the row corresponding to state 1).
Example 4
Consider a Markov chain with the following transition probability matrix.
Suppose that initially the process is equally likely to be in state 0 or state 1. Determine $P[X_n=0]$ and $P[X_n=1]$.
To find $P[X_n=0]$, we need the first column of $\mathbf{P}^n$. To find $P[X_n=1]$, we need the second column of $\mathbf{P}^n$. Using an online calculator produces the matrix $\mathbf{P}^n$.
The following calculation gives the answers.
Example 5
Consider a Markov chain with the following transition probability matrix.
At time 0, the Markov process is in state 0. Let $T=\min \{n \ge 0: X_n=2 \}$. Note that state 2 is called an absorbing state. The Markov process will eventually reach and be absorbed in state 2 (it stays there forever once the process reaches state 2). Thus $T$ is the first time period in which the process reaches state 2. Suppose that the Markov process is being observed and that absorption has not taken place. Then we would be interested in this conditional probability: the probability that the process is in state 0 or state 1 given that the process has not been absorbed. Determine $P[X_4=0 \mid T>4]$.
The key idea here is that for the event $T>4$ to happen, $X_4=0$ or $X_4=1$. Thus

$\displaystyle P[X_4=0 \mid T>4]=\frac{P[X_4=0]}{P[X_4=0]+P[X_4=1]}$.
Note that since the initial state is certain to be state 0, $P[X_4=0]=P_{00}^4$ and $P[X_4=1]=P_{01}^4$. Using an online calculator gives the following matrix $\mathbf{P}^4$.
The following gives the desired result.
Thus if absorption has not taken place by the 4th period, the process is in state 0 about 71% of the time.
Example 6
Continue with the Markov chain in Example 5. Determine $P[T=4]$.
Note that $P[T=4]=P[T>3]-P[T>4]$. The probability $P[T>4]=P_{00}^4+P_{01}^4$ is already determined using $\mathbf{P}^4$. To determine $P[T>3]$, we need the top row of $\mathbf{P}^3$.
Thus $P[T>3]=P_{00}^3+P_{01}^3$ and $P[T=4]=P[T>3]-P[T>4]$. Thus absorption takes place in the 4th period about 12% of the time.
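The computations in Examples 5 and 6 can be sketched as follows, using a hypothetical absorbing chain (state 2 absorbing) rather than the matrix from the examples; the structure of the calculation is what carries over.

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matrix_power(P, n):
    """n-th power of P by repeated multiplication from the identity."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)]
              for i in range(size)]
    for _ in range(n):
        result = matmul(result, P)
    return result

# Hypothetical chain in which state 2 is absorbing (its row is [0, 0, 1]).
P = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.0, 0.0, 1.0],
]

P3 = matrix_power(P, 3)
P4 = matrix_power(P, 4)

# Example 5's calculation: starting in state 0, the event T > 4 means
# X_4 is 0 or 1 (absorption has not happened by time 4).
not_absorbed_4 = P4[0][0] + P4[0][1]   # P[T > 4]
cond = P4[0][0] / not_absorbed_4       # P[X_4 = 0 | T > 4]

# Example 6's calculation: P[T = 4] = P[T > 3] - P[T > 4].
not_absorbed_3 = P3[0][0] + P3[0][1]   # P[T > 3]
p_T_equals_4 = not_absorbed_3 - not_absorbed_4

assert 0.0 < cond < 1.0
assert 0.0 <= p_T_equals_4 <= 1.0
```

The subtraction works because the absorbing state can never be left: the events $T>n$ are decreasing in $n$, so $P[T=4]$ is the difference of consecutive survival probabilities.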
Exercises
We give a few practice problems to reinforce the concepts and calculations discussed here. Further practice problems are in a companion blog (here and here).
Exercise 1
Consider a Markov chain with the following transition probability matrix.
Determine these conditional probabilities.
Exercise 2
A particle moves among states 1, 2 and 3 according to a Markov process with the following transition probability matrix.
Let $X_n$ be the position of the particle after the $n$th move. Determine the indicated probabilities.
Exercise 3
Consider a Markov chain with the following transition probability matrix.
Suppose that initially the process is equally likely to be in state 0, state 1, or state 2. Determine the probability distribution of $X_n$.
Exercise 4
Continue with Example 5 and Example 6. Work these two examples assuming that the Markov chain starts in state 1 initially.
Answers to Exercises
Exercise 1
Exercise 2
Exercise 3
Exercise 4
2017 – Dan Ma