# Introductory examples on first step analysis

This post gives some examples to demonstrate the useful technique called first step analysis. A great number of problems involving Markov chains can be evaluated by this technique. As we demonstrate here, the general idea of the method is to break down the possibilities resulting from the first step (first transition) in the Markov chain. Then use the law of total probability and Markov property to derive a set of relationship among the unknown variables. This topic will be further developed in subsequent posts. We start with the following examples.

Example 1
Consider a Markov process that cycles through the states 0, 1 and 2 according to the following transition probability matrix.

$\mathbf{P} = \bordermatrix{ & 0 & 1 & 2 \cr 0 & 1 & 0 & 0 \cr 1 & P_{10} & P_{11} & P_{12} \cr 2 & 0 & 0 & 1 \cr } \qquad$

Once the process reaches state 0 or state 2, it stays there forever. Hence these two states are called absorbing states. If the process starts at state 1, the process may stay in state 1 for a duration but eventually the process will leave state 1 and enter one of the absorbing states. Here are two natural questions.

• Which state is more likely (state 0 or state 2) for the process to enter into if the process starts at state 1?
• On average how long does it take to reach one of the absorbing states?

To answer these questions, we define the following random variable.

$T=\{ n \ge 0: X_n=0 \text{ or } X_n=2 \}$

The random variable $T$ is the amount of time it takes for the process to be absorbed (to enter into one of the absorbing states). In other words, $T$ is the time to absorption. Then the two questions are answered by finding the following two quantities.

$\displaystyle U=P[X_T=0 \lvert X_0=1]$

$\displaystyle V=E[T \lvert X_0=1]$

The event $X_T=0$ is the event that the process is absorbed into state 0. The quantity $U$ is the probability of being absorbed into state 0 if the starting state is 1. The quantity $V$ is mean time to absorption if the starting state is 1. Both are conditional quantities with $U$ being a conditional probability and $V$ being a conditional mean. Of course, once $U$ is known, $1-U$ would be the probability $P[X_T=2 \lvert X_0=1]$.

First, we calculate the quantity $U$. Consider the first step in the process. There are three possibilities, which are $X_1=0$, $X_1=1$ and $X_1=2$, with probabilities $P_{10}$, $P_{11}$ and $P_{12}$, respectively. In the first case, $T=1$ and $X_T=0$ (this means that the probability of absorption into state 0 at time 1 is 1). In the second case, $T>1$ and the problem repeats itself (this means that the probability of absorption into state 0 at time 1 is $U$ again). In the third case, $T=1$ and $X_T=2$ (the probability of absorption into state 0 at time 1 is 0). Weighting these three cases by their respective probabilities gives the following equation.

$U=P_{10} \times 1+P_{11} \times U+P_{12} \times 0$

Solving for $U$ gives the following.

$\displaystyle U=\frac{P_{10}}{1-P_{11}}=\frac{P_{10}}{P_{10}+P_{12}}$

The quantity $U$ is simply the conditional probability of a transition into state 0 given that a transition into state 0 or state 2 has occurred. So far the result from the first step analysis is encouraging.

To determine the quantity $V$, consider the first step again. Starting at state 1, the time to absorption is at least 1 (it takes at least one step to get to 0 or 2). There are the same three possibilities for the first step – $X_1=0$, $X_1=1$ and $X_1=2$. After the first step is taken, let’s look at the mean time to absorption in each case. In the first case the process is absorbed into state 0 at time 1. So the mean time to absorption at time 1 is 0. In the second case, the process goes back to its starting point. So the mean time to absorption at time 1 is $V$. In the third case, the process is absorbed into state 2. The mean time to absorption at time 1 is 0. Weighting these three cases by their respective probabilities gives the following equation.

$V=1+P_{10} \times 0+P_{11} \times V+P_{12} \times 0$

Solving for $V$ produces the following answer.

$\displaystyle V=\frac{1}{1-P_{11}}$

Note that if $P_{11}$ is large (i.e. it is highly likely that the process keeps staying in state 1), then it will make more steps on average to reach 0 or 2. Thus both $U$ and $V$ make sense.

We would like to comment that in this small example (only one transient state), the problem can be computed directly from the time to absorption $T$. With starting state being 1, each step of the process is like a Bernoulli event – either staying in state 1 or being absorbed into 0 or 2. Consider absorption as a success. The probability of success at each step is thus $1-P_{11}$. Then the number of steps it takes to be absorbed (i.e. the random variable $T$ as described above) has a geometric distribution with mean $V$ is shown above. Thus we can solve this simple problem (with only one transient state) without using first step analysis. But for a larger problem, the time to absorption does not have a simple characterization. The next example further demonstrates how first step analysis is done.

Example 2
Consider a process that cycles through three states (0, 1, 2 and 3) according to the following transition probability matrix.

$\mathbf{P} = \bordermatrix{ & 0 & 1 & 2 & 3 \cr 0 & 1 & 0 & 0 & 0 \cr 1 & P_{10} & P_{11} & P_{12} & P_{13} \cr 2 & P_{20} & P_{21} & P_{22} & P_{23} \cr 3 & 0 & 0 & 0 & 1 \cr } \qquad$

In this 4-state process, state 0 and state 3 are absorbing states. State 1 and state 2 are transient states. We are interested in the same questions: what is the probability of being absorbed into state 0? What is the mean time to absorption? The time to absorption (into 0 or 3) depends on the starting state (1 or 2). To answer these questions, we state the following quantities.

$U_i=P[X_T=0 \lvert X_0=i] \ \ \ \ i=1,2$

$V_i=E[T \lvert X_0=i] \ \ \ \ \ \ \ \ \ \ i=1,2$

The subscripts in the $U$ and $V$ reflect the starting state of the process. Based on the first reasoning, the following gives the equations involving $U_i$ and the equations involving $V_i$.

\displaystyle \begin{aligned}&U_1=P_{10}+P_{11} \times U_1+P_{12} \times U_2 +P_{13} \times 0 \\&U_2=P_{20}+P_{21} \times U_1+P_{22} \times U_2 +P_{23} \times 0 \end{aligned}

\displaystyle \begin{aligned}&V_1=1+P_{10} \times 0+P_{11} \times V_1+P_{12} \times V_2 +P_{13} \times 0 \\&V_2=1+P_{20} \times 0+P_{21} \times V_1+P_{22} \times V_2 +P_{23} \times 0 \end{aligned}

The following is the derivation for the equation for $U_1$.

\displaystyle \begin{aligned} U_1=P[X_T=0 \lvert X_0=1]&=\sum \limits_{k=0}^3 P[X_1=k \lvert X_0=1] \times P[X_T=0 \lvert X_0=1,X_1=k] \\&=\sum \limits_{k=0}^3 P_{1k} \times P[X_T=0 \lvert X_1=k] \\&=P_{10} \times 1 +P_{11} \times U_1+P_{12} \times U_2 +P_{13} \times 0 \end{aligned}

Note that the first line shows the four cases for the first step weighted by their probabilities (the law of total probability). The second line uses the Markov property. The third line lists out the 4 cases. the following shows the derivation of $V_1$.

\displaystyle \begin{aligned} V_1=E[T \lvert X_0=1]&=1+\sum \limits_{k=0}^3 P[X_1=k \lvert X_0=1] \times E[T \lvert X_0=1,X_1=k] \\&=1+\sum \limits_{k=0}^3 P_{1k} \times E[T \lvert X_1=k] \\&=1+ P_{10} \times 0 +P_{11} \times V_1+P_{12} \times V_2 +P_{13} \times 0 \end{aligned}

We carry out the numerical calculation using the following matrix.

$\mathbf{P} = \bordermatrix{ & 0 & 1 & 2 & 3 \cr 0 & 1 & 0 & 0 & 0 \cr 1 & 0.3 & 0.3 & 0.2 & 0.2 \cr 2 & 0.15 & 0.2 & 0.25 & 0.4 \cr 3 & 0 & 0 & 0 & 1 \cr } \qquad$

Plugging in the numerical values gives the following two sets of equations.

\displaystyle \begin{aligned}&U_1=0.3+0.3 \times U_1+0.2 \times U_2 \\&U_2=0.15+0.2 \times U_1+0.25 \times U_2 \end{aligned}

\displaystyle \begin{aligned}&V_1=1+0.3 \times V_1+0.2 \times V_2 \\&V_2=1+0.2 \times V_1+0.25 \times V_2 \end{aligned}

The solutions are:

$\displaystyle U_1=\frac{51}{97}=0.5278$

$\displaystyle U_2=\frac{33}{97}=0.34$

$\displaystyle V_1=\frac{9.5}{4.85}=1.9588$

$\displaystyle V_2=\frac{9}{4.85}=1.85567$

It is more likely to reach state 0 if the starting position is 1 (53% chance). If the process starts from state 2, it is more likely to reach state 3 (only 34% chance to reach state 0). However, the average number of steps it takes to reach state 0 is about the same (either starting from 1 or 2).

One More Example

We present one more example. This example will be used in the next post to demonstrate another angle of first step analysis.

Example 3
Consider the following transition probability matrix.

$\mathbf{P} = \bordermatrix{ & 0 & 1 & 2 & 3 & 4 \cr 0 & 1 & 0 & 0 & 0 & 0 \cr 1 & 0.4 & 0.3 & 0.1 & 0.1 & 0.1 \cr 2 & 0.1 & 0.1 & 0.5 & 0.2 & 0.1 \cr 3 & 0.1 & 0.2 & 0.1 & 0.5 & 0.1 \cr 4 & 0 & 0 & 0 & 0 & 1 \cr } \qquad$

The state space is one state bigger than the two examples shown above. There are two absorbing states as in the two previous examples. We focus on the same two problems. Given a starting state (1, 2 or 3), we want to see how fast the process transition to state 0 (one of the absorbing states). Secondly, we want to see on average how many steps it takes the process to transition to one of the absorbing states.

As before let $T$ the time it takes the process to be absorbed into 0 or 4. More specifically let $T$ be defined by $T=\{ n \ge 0: X_n=0 \text{ or } X_n=4 \}$. As before if the starting state is $i$, let $U_i$ be the probability of the process being absorbed into state 0 and let $V_i$ be the expected time to absorption (into state 0 or state 4). More specifically, define $U_i$ and $V_i$ as follows:

$U_i=P[X_T=0 \lvert X_0=i] \ \ \ \ i=1,2,3$

$V_i=E[T \lvert X_0=i] \ \ \ \ \ \ \ \ \ \ i=1,2,3$

Applying the first step analysis produces the following two sets of equations.

\displaystyle \begin{aligned}&U_1=0.4+0.3 \times U_1+0.1 \times U_2+0.1 \times U_3 \\&U_2=0.1+0.1 \times U_1+0.5 \times U_2+0.2 \times U_3 \\&U_3=0.1+0.2 \times U_1+0.1 \times U_2+0.5 \times U_3 \end{aligned}

\displaystyle \begin{aligned}&V_1=1+0.3 \times V_1+0.1 \times V_2+0.1 \times V_3 \\&V_2=1+0.1 \times V_1+0.5 \times V_2+0.2 \times V_3 \\&V_3=1+0.2 \times V_1+0.1 \times V_2+0.5 \times V_3 \end{aligned}

For a reason that will be made clear in the next post, we rearrange the equations by putting the variables $U_i$ or $V_i$ on one side and the constant terms on the other side.

$\displaystyle \begin{array}{rcr} \displaystyle 0.7 \times U_1-0.1 \times U_2-0.1 \times U_3 & = & 0.4 \\ \displaystyle -0.1 \times U_1+0.5 \times U_2-0.2 \times U_3 & = & 0.1 \\ \displaystyle -0.2 \times U_1-0.1 \times U_2+0.5 \times U_3 & = & 0.1 \end{array}$

$\displaystyle \begin{array}{rcr} \displaystyle 0.7 \times V_1-0.1 \times V_2-0.1 \times V_3 & = & 1 \\ \displaystyle -0.1 \times V_1+0.5 \times V_2-0.2 \times V_3 & = & 1 \\ \displaystyle -0.2 \times V_1-0.1 \times V_2+0.5 \times V_3 & = & 1 \end{array}$

Solving these two systems of equations produces the following solutions.

Starting State Probability of Absorption to State 0 Expected Time to Absorption
1 $\displaystyle U_1=\frac{35}{47}=0.745$ $\displaystyle V_1=\frac{360}{141}=2.55$
2 $\displaystyle U_2=\frac{28}{47}=0.596$ $\displaystyle V_2=\frac{350}{141}=2.48$
3 $\displaystyle U_3=\frac{29}{47}=0.617$ $\displaystyle V_3=\frac{540}{141}=3.83$

The introductory examples here show that first step analysis on time to absorption involves the solving of systems of linear equations. Thus the larger the state space in the Markov chain, the larger of the systems of equations. Using software when the matrix is large is an excellent approach. First step analysis can also be approached in other angles. Applications of first step analysis will be presented. The technique of first step analysis will be further explored in subsequent posts.

Dan Ma Markov chains

Daniel Ma Markov chains

$\copyright$ 2017 – Dan Ma