Collapsing states into an absorbing state

This post discusses the technique of collapsing a set of states into an absorbing state in order to make certain Markov chain problems easier. We motivate the technique with the examples below. The method is to raise the new transition probability matrix (the one with the collapsed absorbing state) to an appropriate nth power (the Chapman-Kolmogorov equations are discussed here). Other applications of Markov chains are discussed here. Markov chains are applied to the occupancy problem in this previous post.

Example 1
On any given day, the weather condition in a city is either sunny, cloudy (but not rainy) or rainy. The weather condition on a given day is identical to that of the previous day with probability 0.5. The weather condition on a given day is equally likely to be one of the other two weather conditions if it is different from the previous day. Suppose that it is sunny today in this city. Find the probability that none of the next five days will be rainy.

Example 2
A fair coin is flipped repeatedly until a run of 4 consecutive heads appears. We want to know how many flips are required to achieve this goal. Specifically, we have the following questions.

  • Find the probability that it takes at most 9 flips to get a run of 4 consecutive heads.
  • Find the probability that it takes exactly 9 flips to get a run of 4 consecutive heads.

Example 3
A maze is divided into 9 areas as shown below.

[Maze diagram omitted; Area 1 = initial position of the mouse]

A mouse is placed in area 1 initially. Suppose that areas 3, 7 and 9 contain food and that the other areas in the maze do not contain food. Further suppose that the mouse moves through the areas in the maze at random. That is, whenever the mouse is in an area that has w exits, the next move is to one of these w areas with probability 1/w.

  • Find the probability that after 5 moves, the mouse is one area away from a food source.
  • Find the probability that the mouse has not found food in its first 7 moves and is in area 6 or area 8 after the 7th move.
  • Find the probability that it takes at most 8 moves for the mouse to find food.
  • Find the probability that it takes exactly 8 moves for the mouse to find food.

Working Example 1

We work Example 1 to illustrate the method. The following is the transition probability matrix for Example 1 where state 0 = sunny weather, state 1 = cloudy weather and state 2 = rainy weather.

    $latex
    \mathbf{P} =
    \bordermatrix{ & 0 & 1 & 2 \cr
    0 & 0.50 & 0.25 & 0.25 \cr
    1 & 0.25 & 0.50 & 0.25 \cr
    2 & 0.25 & 0.25 & 0.50 \cr

    } \qquad$

Since it is currently a sunny day, X_0=0. The probability of not having a rainy day in the next 5 days is the probability P[X_5 \le 1, X_4 \le 1, X_3 \le 1, X_2 \le 1, X_1 \le 1 \lvert X_0=0]. This is the probability that X_j=0 or X_j=1 for each of the next 5 periods given the current state is 0. Because the movement of the chain is restricted for the entire period, using the Chapman-Kolmogorov equations, i.e. finding the 5th power of the matrix \mathbf{P}, does not work. For example, though P_{01}^5 is the probability that the chain will end up in state 1 at time 5 given that it is in state 0 at time 0, this probability includes the possibility of a rainy day (state 2) between time 0 and time 5. The technique is to make state 2 (the state that is not to be visited in the next 5 days) an absorbing state. Doing so produces a new transition probability matrix.

    $latex
    \mathbf{Q} =
    \bordermatrix{ & 0 & 1 & 2 \cr
    0 & 0.50 & 0.25 & 0.25 \cr
    1 & 0.25 & 0.50 & 0.25 \cr
    2 & 0 & 0 & 1 \cr

    } \qquad$

The matrix \mathbf{Q} is obtained by making state 2 in the matrix \mathbf{P} an absorbing state (i.e. the entry in the row for state 2 and in the column for state 2 is 1). It is a valid transition probability matrix since each row sums to 1, so it describes a Markov chain. How does the new Markov chain relate to the original one? The following is the 5th power of \mathbf{Q}.

    $latex
    \displaystyle \mathbf{Q}^5 =
    \bordermatrix{ & 0 & 1 & 2 \cr
    0 & \frac{122}{1024} &\frac{121}{1024} & \frac{781}{1024} \cr
    \text{ } & \text{ } & \text{ } & \text{ } \cr
    1 & \frac{121}{1024} & \frac{122}{1024} & \frac{781}{1024} \cr
    \text{ } & \text{ } & \text{ } & \text{ } \cr
    2 & 0 & 0 & 1 \cr } \qquad
    =
    \bordermatrix{ & 0 & 1 & 2 \cr
    0 & 0.11914 & 0.11816 & 0.76270 \cr
    \text{ } & \text{ } & \text{ } & \text{ } \cr
    1 & 0.11816 & 0.11914 & 0.76270 \cr
    \text{ } & \text{ } & \text{ } & \text{ } \cr
    2 & 0 & 0 & 1 \cr
    } \qquad$

The transition probability Q_{02}^5=0.7627 is the probability that the new process transitions from state 0 at time 0 to state 2 at time 5 (the 5th day). Starting at state 0, there is a 76.27% chance that the new process will be in state 2 at time 5. This means that the original process will have reached state 2 at or before time 5. The complement of this probability is Q_{00}^5+Q_{01}^5=0.2373. So there is a 23.73% chance that the original process never reaches state 2 at any time in the first 5 periods. If the weather is indeed governed by the chain described in \mathbf{P}, it is likely (a 76.27% chance) that there will be a rainy day at some point within the next 5 days (given that today is sunny). In fact, even if today is cloudy, there is also a 76.27% chance of a rainy day within the next 5 days.
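The computation can be checked numerically. The following Python sketch (a verification aid, not part of the original solution) raises \mathbf{Q} to the 5th power using plain nested-list matrix multiplication:

```python
# Matrix Q from Example 1: state 2 (rainy) is made absorbing.
Q = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.00, 0.00, 1.00]]

def mat_mul(A, B):
    # Multiply two matrices given as nested lists.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_pow(A, n):
    # Compute A^n by repeated multiplication (n >= 1).
    result = A
    for _ in range(n - 1):
        result = mat_mul(result, A)
    return result

Q5 = mat_pow(Q, 5)
print(round(Q5[0][2], 4))             # 0.7627, chance of a rainy day within 5 days
print(round(Q5[0][0] + Q5[0][1], 4))  # 0.2373, chance of no rainy day in 5 days
```

The exact values are Q_{02}^5=781/1024 and Q_{00}^5+Q_{01}^5=243/1024, agreeing with the matrix shown above.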

Discussion

Before working Example 2, let’s discuss the idea of collapsing states into an absorbing state. Consider a Markov chain with transition probability matrix \mathbf{P}. Let \mathcal{S} be the state space of this chain. Let \mathcal{M} be a set of states. Suppose that the states in \mathcal{M} are either desirable states to enter or negative situations to be avoided. In the case of \mathcal{M} being desirable or attractive, we wish to find the probability of entering a state in \mathcal{M} within a number of transitions. In the case of \mathcal{M} being negative situations, we wish to find the probability of not entering any state in \mathcal{M} within a number of transitions.

The key to finding these probabilities is to collapse the states in \mathcal{M} into a single state, called M, which is an integer different from all states not in \mathcal{M}. For convenience, we let M be the smallest integer that is greater than all states not in \mathcal{M}. For example, if \mathcal{S}=\left\{0,1,2,3,4,5 \right\} and \mathcal{M}=\left\{4,5 \right\}, then M=4.

Now define a new Markov chain by letting all states not in \mathcal{M} remain the same and by collapsing all states in \mathcal{M} into the new state M. The new state M is made an absorbing state. The state space of the new Markov chain is the original states not in \mathcal{M} along with the new state M. Let \mathbf{Q} be the transition probability matrix of the new Markov chain. The transition probabilities in \mathbf{Q} are defined as follows:

    $latex \displaystyle Q_{ij} = \left\{ \begin{array}{ll}
    \displaystyle P_{ij} &\ \ \ \ \ \ i \notin \mathcal{M}, j \notin \mathcal{M} \\
    \text{ } & \text{ } \\
    \displaystyle \sum \limits_{k \in \mathcal{M}} P_{ik} &\ \ \ \ \ \ i \notin \mathcal{M}, j=M \\
    \text{ } & \text{ } \\
    \displaystyle 1 &\ \ \ \ \ \ i=M, j=M \\
    \end{array} \right.$

The transition probability matrix \mathbf{Q} is derived from and is closely related to the matrix \mathbf{P}. For states i,j not in \mathcal{M}, the transition probabilities are the same in both matrices (the first condition). The second condition says that Q_{iM} is the sum of all P_{ik} where k \in \mathcal{M}. Thus Q_{iM} is the probability of the original process entering one of the states in \mathcal{M} in one step when starting from a state i \notin \mathcal{M}. Since M is an absorbing state, Q_{MM}=1.
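The definition of \mathbf{Q} translates directly into code. The following Python sketch (the function name collapse is ours, not from the post) builds \mathbf{Q} from \mathbf{P} and a set \mathcal{M}, placing the collapsed state M last:

```python
def collapse(P, M):
    # Collapse the states in the set M into a single absorbing state,
    # appended as the last row and column of the new matrix Q.
    keep = [i for i in range(len(P)) if i not in M]
    Q = [[P[i][j] for j in keep] + [sum(P[i][k] for k in M)]  # Q_{iM} = sum of P_{ik}, k in M
         for i in keep]
    Q.append([0.0] * len(keep) + [1.0])  # Q_{MM} = 1: the collapsed state is absorbing
    return Q

# Collapsing state 2 in the matrix P of Example 1 recovers the matrix Q shown there.
P = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]
print(collapse(P, {2}))   # [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.0, 0.0, 1.0]]
```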

To calculate the two probabilities concerning the process for \mathbf{P} indicated earlier, compute powers of the matrix \mathbf{Q}. In the new Markov chain for the matrix \mathbf{Q}, there are two kinds of states: the states i \notin \mathcal{M} and the collapsed state M. Whenever the process is in a state i \ne M, the process is not yet absorbed (into the state M). When the process starts at a state i \ne M (i.e. not yet absorbed), it will eventually be absorbed after some number of transitions, provided the state M is accessible from every non-absorbing state (as is the case in all the examples here).

Consider Q_{iM}^n where i \ne M. This is the probability that the process starts in state i (not yet absorbed) and is absorbed at or before time n. This is the connection between the two processes: being in the absorbing state M in the new process at time n means that the original process enters a state in \mathcal{M} at time n or in some period prior to time n. We have the following summary.

Consider a Markov chain \left\{X_n:n \ge 0  \right\} with the transition probability matrix \mathbf{P}. Collapse \mathcal{M}, a set of states, into an absorbing state called M in the manner described above. Let \mathbf{Q} be the resulting transition probability matrix of the new Markov chain. For i \ne M, Q_{iM}^n is the probability of absorption by time n given that the process starts out not absorbed, i.e. the probability that the original process enters a state in \mathcal{M} at time n or prior to time n given that the process is initially in state i. Equivalently, Q_{iM}^n=P[X_k \in \mathcal{M} \text{ for some } k=1,2,\cdots,n \lvert X_0=i].

The above observation gives a way to interpret the complement 1-Q_{iM}^n. It is the probability of the original process never entering any state in \mathcal{M} during the first n time periods (given that the process starts at i \notin \mathcal{M}). This is the answer to Example 1. We have the following summary.

The complement of Q_{iM}^n is 1-Q_{iM}^n. It is the probability that the original process never enters any of the states in \mathcal{M} during the n time periods (given that the process starts at i \notin \mathcal{M}). Equivalently, 1-Q_{iM}^n can be stated as 1-Q_{iM}^n=P[X_k \notin \mathcal{M} \text{ for all } k=1,2,\cdots,n \lvert X_0=i]. Note that the probability 1-Q_{iM}^n is also the sum \sum_{j \notin \mathcal{M}} Q_{ij}^n. Specifically, for j \notin \mathcal{M}, Q_{ij}^n can be stated as Q_{ij}^n=P[X_n=j, X_k \notin \mathcal{M} \text{ for all } k=1,2,\cdots,n-1 \lvert X_0=i].

The probability Q_{iM}^n is the probability of absorption on or prior to time n. Another probability of interest is the probability of absorption exactly at time n.

As described above, let \mathbf{Q} be the resulting transition probability matrix of the new Markov chain after collapsing some states into an absorbing state. For i \ne M, Q_{iM}^n-Q_{iM}^{n-1} is the probability of absorption at time n given that the process starts out not being absorbed, i.e. the probability that the original process enters a state in \mathcal{M} at time n given that initially the process is in state i. Equivalently Q_{iM}^n-Q_{iM}^{n-1}=P[X_n \in \mathcal{M} \text{ and } X_k \notin \mathcal{M} \text{ for all } k=1,2,\cdots,n-1 \lvert X_0=i].

In a nutshell, collapsing states into an absorbing state yields two probabilities concerning the original process: the probability of entering \mathcal{M} at or before time n, and the probability of entering \mathcal{M} exactly at time n. To illustrate, consider the following example.

Example 0
Consider a Markov chain with the following transition probability matrix.

    $latex
    \mathbf{P} =
    \bordermatrix{ & 0 & 1 & 2 & 3 & 4 & 5 \cr
    0 & 0.6 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 \cr
    1 & 0.5 & 0.10 & 0.10 & 0.10 & 0.10 & 0.10 \cr
    2 & 0.4 & 0.12 & 0.12 & 0.12 & 0.12 & 0.12 \cr
    3 & 0.3 & 0.14 & 0.14 & 0.14 & 0.14 & 0.14 \cr
    4 & 0.2 & 0.16 & 0.16 & 0.16 & 0.16 & 0.16 \cr
    5 & 0.1 & 0.18 & 0.18 & 0.18 & 0.18 & 0.18 \cr
    } \qquad$

Now collapse state 4 and state 5 into a new state called 4. The following is the resulting matrix \mathbf{Q}.

    $latex
    \mathbf{Q} =
    \bordermatrix{ & 0 & 1 & 2 & 3 & 4 \cr
    0 & 0.6 & 0.08 & 0.08 & 0.08 & 0.16 \cr
    1 & 0.5 & 0.10 & 0.10 & 0.10 & 0.20 \cr
    2 & 0.4 & 0.12 & 0.12 & 0.12 & 0.24 \cr
    3 & 0.3 & 0.14 & 0.14 & 0.14 & 0.28 \cr
    4 & 0 & 0 & 0 & 0 & 1 \cr

    } \qquad$

The following is the 6th power of \mathbf{Q}.

    $latex
    \mathbf{Q}^6 =
    \bordermatrix{ & 0 & 1 & 2 & 3 & 4 \cr
    0 & 0.195466 & 0.034574 & 0.034574 & 0.034574 & 0.700812 \cr
    1 & 0.184167 & 0.032578 & 0.032578 & 0.032578 & 0.718099 \cr
    2 & 0.172869 & 0.030582 & 0.030582 & 0.030582 & 0.735386 \cr
    3 & 0.161570 & 0.028586 & 0.028586 & 0.028586 & 0.752673 \cr
    4 & 0 & 0 & 0 & 0 & 1 \cr

    } \qquad$

As a result, Q_{04}^6=0.700812 is the probability that the new process will be in state 4 at time 6 (given the process starts at state 0). This means that it is the probability that the original process will reach state 4 or state 5 by time 6 (at or before time 6) given that the original process starts in state 0. Note that regardless of the initial state, the probability of being absorbed into state 4 or state 5 by time 6 is roughly the same (70% to 75%).

On the other hand, Q_{04}^5=0.631665, so Q_{04}^6-Q_{04}^5=0.700812-0.631665=0.069147. There is thus about a 6.9% chance that absorption takes place exactly at time 6.
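As a check, the absorption probabilities for Example 0 can be reproduced with a short Python sketch (the helper functions are repeated so the snippet stands alone):

```python
# Matrix Q from Example 0, with states 4 and 5 collapsed into absorbing state 4.
Q = [[0.6, 0.08, 0.08, 0.08, 0.16],
     [0.5, 0.10, 0.10, 0.10, 0.20],
     [0.4, 0.12, 0.12, 0.12, 0.24],
     [0.3, 0.14, 0.14, 0.14, 0.28],
     [0.0, 0.00, 0.00, 0.00, 1.00]]

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_pow(A, n):
    result = A
    for _ in range(n - 1):
        result = mat_mul(result, A)
    return result

Q6, Q5 = mat_pow(Q, 6), mat_pow(Q, 5)
print(round(Q6[0][4], 6))             # 0.700812, absorbed at or before time 6
print(round(Q6[0][4] - Q5[0][4], 6))  # 0.069147, absorbed exactly at time 6
```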

More Examples

We now work Example 2 and Example 3.

Example 2 (Statement of the example given above)
Define a Markov chain with states 0, 1, 2, 3 and 4, where state i \le 3 means that there is currently a run of i consecutive heads (the last i flips are heads) and state 4 means that a run of 4 consecutive heads has already occurred (the last 4 or more flips are heads). Thus state 4 is an absorbing state. The following is the transition probability matrix.

    $latex
    \mathbf{P} =
    \bordermatrix{ & 0 & 1 & 2 & 3 & 4 \cr
    0 & 0.5 & 0.5 & 0 & 0 & 0 \cr
    1 & 0.5 & 0 & 0.5 & 0 & 0 \cr
    2 & 0.5 & 0 & 0 & 0.5 & 0 \cr
    3 & 0.5 & 0 & 0 & 0 & 0.5 \cr
    4 & 0 & 0 & 0 & 0 & 1 \cr

    } \qquad$

Here’s how \mathbf{P} is derived. For example, if there is currently a run of 2 heads (the third row), the next state is either 0 (the next flip is a tail) or 3 (the next flip is a head) with equal probability. To find the two probabilities in question, raise \mathbf{P} to the 8th and 9th powers.

    $latex
    \mathbf{P}^9 =
    \bordermatrix{ & 0 & 1 & 2 & 3 & 4 \cr
    0 & 0.406250 & 0.210938 & 0.109375 & 0.056641 & 0.216797 \cr
    1 & 0.376953 & 0.195313 & 0.101563 & 0.052734 & 0.273438 \cr
    2 & 0.320313 & 0.166016 & 0.085938 & 0.044922 & 0.382813 \cr
    3 & 0.210938 & 0.109375 & 0.056641 & 0.029297 & 0.593750 \cr
    4 & 0 & 0 & 0 & 0 & 1 \cr

    } \qquad$

    $latex
    \mathbf{P}^8 =
    \bordermatrix{ & 0 & 1 & 2 & 3 & 4 \cr
    0 & 0.421875 & 0.218750 & 0.113281 & 0.058594 & 0.187500 \cr
    1 & 0.390625 & 0.203125 & 0.105469 & 0.054688 & 0.246094 \cr
    2 & 0.332031 & 0.171875 & 0.089844 & 0.046875 & 0.359375 \cr
    3 & 0.218750 & 0.113281 & 0.058594 & 0.031250 & 0.578125 \cr
    4 & 0 & 0 & 0 & 0 & 1 \cr

    } \qquad$

The starting state is 0 since the coin flips are counted from the very first one. With P_{04}^9=0.216797, we know that the probability of taking 9 flips or less to get 4 consecutive heads is 0.216797.

With P_{04}^9-P_{04}^8=0.216797-0.187500=0.029297, there is a roughly 3% chance that the run of 4 consecutive heads is achieved at the 9th flip.
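The two answers for Example 2 can be verified with the same matrix-power approach (a Python sketch for checking, not part of the original post):

```python
# Coin-flip chain: state i (i <= 3) = current run of i consecutive heads;
# state 4 (run of 4 achieved) is absorbing.
P = [[0.5, 0.5, 0.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0, 0.0],
     [0.5, 0.0, 0.0, 0.5, 0.0],
     [0.5, 0.0, 0.0, 0.0, 0.5],
     [0.0, 0.0, 0.0, 0.0, 1.0]]

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_pow(A, n):
    result = A
    for _ in range(n - 1):
        result = mat_mul(result, A)
    return result

P9, P8 = mat_pow(P, 9), mat_pow(P, 8)
print(round(P9[0][4], 6))             # 0.216797, 4 heads in a row within 9 flips
print(round(P9[0][4] - P8[0][4], 6))  # 0.029297, run completed exactly at flip 9
```

The exact values are P_{04}^9=111/512 and P_{04}^9-P_{04}^8=15/512.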

Example 3 (Statement of the example given above)
Let X_n be the area the mouse is in after making n moves. Then \left\{X_n: n=0,1,2,\cdots \right\} is a Markov chain. The following is the transition probability matrix. To help see the matrix, the diagram of the maze is repeated.

[Maze diagram omitted; Area 1 = initial position of the mouse]

    $latex
    \mathbf{P} =
    \bordermatrix{ & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \cr
    1 & 0 & 1/2 & 0 & 0 & 0 & 1/2 & 0 & 0 & 0 \cr
    2 & 1/3 & 0 & 1/3 & 0 & 1/3 & 0 & 0 & 0 & 0 \cr
    3 & 0 & 1/2 & 0 & 1/2 & 0 & 0 & 0 & 0 & 0 \cr
    4 & 0 & 0 & 1/3 & 0 & 1/3 & 0 & 0 & 0 & 1/3 \cr
    5 & 0 & 1/4 & 0 & 1/4 & 0 & 1/4 & 0 & 1/4 & 0 \cr
    6 & 1/3 & 0 & 0 & 0 & 1/3 & 0 & 1/3 & 0 & 0 \cr
    7 & 0 & 0 & 0 & 0 & 0 & 1/2 & 0 & 1/2 & 0 \cr
    8 & 0 & 0 & 0 & 0 & 1/3 & 0 & 1/3 & 0 & 1/3 \cr
    9 & 0 & 0 & 0 & 1/2 & 0 & 0 & 0 & 1/2 & 0 \cr
    } \qquad$

States 3, 7 and 9 are desirable states since these are the areas with food. Let \mathcal{M}=\left\{3,7,9 \right\}. We then collapse \mathcal{M} into a new state called 9. The state 9 in the new process is absorbing; being in it means that the original process has reached an area with food (area 3, 7 or 9). The following is the transition probability matrix for the new (collapsed) process.

    $latex
    \mathbf{Q} =
    \bordermatrix{ & 1 & 2 & 4 & 5 & 6 & 8 & 9 \cr
    1 & 0 & 1/2 & 0 & 0 & 1/2 & 0 & 0 \cr
    2 & 1/3 & 0 & 0 & 1/3 & 0 & 0 & 1/3 \cr
    4 & 0 & 0 & 0 & 1/3 & 0 & 0 & 2/3 \cr
    5 & 0 & 1/4 & 1/4 & 0 & 1/4 & 1/4 & 0 \cr
    6 & 1/3 & 0 & 0 & 1/3 & 0 & 0 & 1/3 \cr
    8 & 0 & 0 & 0 & 1/3 & 0 & 0 & 2/3 \cr
    9 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \cr

    } \qquad$

The state 9 in the matrix \mathbf{Q} refers to the original process having reached state 3, 7 or 9. All the questions in the example can then be answered by raising \mathbf{Q} to the appropriate powers: the 5th, 7th and 8th.

    $latex
    \mathbf{Q}^5 =
    \bordermatrix{ & 1 & 2 & 4 & 5 & 6 & 8 & 9 \cr
    1 & 0 & 5/36 & 1/18 & 0 & 5/36 & 1/18 & 11/18 \cr
    2 & 5/54 & 0 & 0 & 7/54 & 0 & 0 & 7/9 \cr
    4 & 1/27 & 0 & 0 & 1/18 & 0 & 0 & 49/54 \cr
    5 & 0 & 7/72 & 1/24 & 0 & 7/72 & 1/24 & 13/18 \cr
    6 & 5/54 & 0 & 0 & 7/54 & 0 & 0 & 7/9 \cr
    8 & 1/27 & 0 & 0 & 1/18 & 0 & 0 & 49/54 \cr
    9 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \cr

    } \qquad$

    $latex
    \mathbf{Q}^7 =
    \bordermatrix{ & 1 & 2 & 4 & 5 & 6 & 8 & 9 \cr
    1 & 0 & 17/216 & 7/216 & 0 & 17/216 & 7/216 & 7/9 \cr
    2 & 17/324 & 0 & 0 & 2/27 & 0 & 0 & 283/324 \cr
    4 & 7/324 & 0 & 0 & 5/162 & 0 & 0 & 307/324 \cr
    5 & 0 & 1/18 & 5/216 & 0 & 1/18 & 5/216 & 91/108 \cr
    6 & 17/324 & 0 & 0 & 2/27 & 0 & 0 & 283/324 \cr
    8 & 7/324 & 0 & 0 & 5/162 & 0 & 0 & 307/324 \cr
    9 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \cr

    } \qquad$

    $latex
    \mathbf{Q}^8 =
    \bordermatrix{ & 1 & 2 & 4 & 5 & 6 & 8 & 9 \cr
    1 & 17/324 & 0 & 0 & 2/27 & 0 & 0 & 283/324 \cr
    2 & 0 & 29/648 & 1/54 & 0 & 29/648 & 1/54 & 283/324 \cr
    4 & 0 & 1/54 & 5/648 & 0 & 1/54 & 5/648 & 307/324 \cr
    5 & 1/27 & 0 & 0 & 17/324 & 0 & 0 & 295/324 \cr
    6 & 0 & 29/648 & 1/54 & 0 & 29/648 & 1/54 & 283/324 \cr
    8 & 0 & 1/54 & 5/648 & 0 & 1/54 & 5/648 & 307/324 \cr
    9 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \cr

    } \qquad$

The statements for the probabilities to be sought are repeated here.

  • Find the probability that after 5 moves, the mouse is one area away from a food source.
  • Find the probability that the mouse has not found food in its first 7 moves and is in area 6 or area 8 after the 7th move.
  • Find the probability that it takes at most 8 moves for the mouse to find food.
  • Find the probability that it takes exactly 8 moves for the mouse to find food.

To obtain the answers, it is a matter of picking the entries from the matrices.

  • Q_{12}^5+Q_{14}^5+Q_{16}^5+Q_{18}^5=7/18=0.3889
  • Q_{16}^7+Q_{18}^7=17/216+7/216=1/9=0.1111
  • Q_{19}^8=283/324=0.8735
  • Q_{19}^8-Q_{19}^7=283/324-7/9=31/324=0.095679

One area from a food source means being in areas 2, 4, 6 and 8. Thus the first probability is Q_{12}^5+Q_{14}^5+Q_{16}^5+Q_{18}^5. The probability of being in areas 6 or 8 after 7 moves is Q_{16}^7+Q_{18}^7. The probability that it will take 8 moves or less to reach food is Q_{19}^8. The probability of reaching food in exactly 8 moves is Q_{19}^8-Q_{19}^7.

With the matrices \mathbf{Q}^5, \mathbf{Q}^7 and \mathbf{Q}^8, we can also compare the probability of reaching food from other starting positions. Just from the diagram, it is clear that the likelihood of reaching food is greater if the initial position is area 6 or area 8. Between these two areas, it is easier to reach food from area 8, since area 8 is adjacent to two food sources. The calculation confirms this. For example, Q_{69}^5=7/9=0.7778 and Q_{89}^5=49/54=0.9074. So it is over 90% certain that the mouse will find food within 5 moves if it is placed in area 8 initially.
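All four answers for Example 3 can be reproduced exactly with Python's fractions module (a verification sketch; the state order 1, 2, 4, 5, 6, 8, 9 matches the matrix \mathbf{Q} above):

```python
from fractions import Fraction as F

# Matrix Q for the maze, states in the order 1, 2, 4, 5, 6, 8, 9
# (areas 3, 7, 9 collapsed into the absorbing food state 9).
half, third, quarter = F(1, 2), F(1, 3), F(1, 4)
Q = [[0,     half,    0,       0,     half,    0,       0],
     [third, 0,       0,       third, 0,       0,       third],
     [0,     0,       0,       third, 0,       0,       2 * third],
     [0,     quarter, quarter, 0,     quarter, quarter, 0],
     [third, 0,       0,       third, 0,       0,       third],
     [0,     0,       0,       third, 0,       0,       2 * third],
     [0,     0,       0,       0,     0,       0,       1]]

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_pow(A, n):
    result = A
    for _ in range(n - 1):
        result = mat_mul(result, A)
    return result

Q5, Q7, Q8 = mat_pow(Q, 5), mat_pow(Q, 7), mat_pow(Q, 8)
print(Q5[0][1] + Q5[0][2] + Q5[0][4] + Q5[0][5])  # 7/18, one area from food after 5 moves
print(Q7[0][4] + Q7[0][5])                        # 1/9, no food yet and in area 6 or 8
print(Q8[0][6])                                   # 283/324, food within 8 moves
print(Q8[0][6] - Q7[0][6])                        # 31/324, food in exactly 8 moves
```

Using exact fractions rather than floats makes the output match the entries of \mathbf{Q}^5, \mathbf{Q}^7 and \mathbf{Q}^8 displayed above digit for digit.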


© 2017 – Dan Ma
