There is a compact data structure for words: automata.
Finite automata can “store” many words, even infinitely many—in the sense that the language of a finite automaton can be infinite.
This powerful feature makes automata ubiquitous in computer science.
Lecture notes by Javier Esparza emphasize this view of automata as data structures.
The most fundamental algorithmic problems on automata are:
- operations, such as computing the union or complement of the languages of automata;
- tests, such as checking their languages for equality.
Less known is the fact that one can efficiently check much wider classes of automata for equivalence, including unambiguous automata, probabilistic automata, and weighted automata over fields. The algorithmic idea is always the same. To illustrate it, in this post I pick labelled Markov chains (LMCs). Here are two LMCs:
An LMC randomly produces an infinite word, by starting in the initial state and then taking transitions randomly according to the transition probabilities and outputting the letters on the traversed transitions. For example, the LMCs $C_1$ and $C_2$ above produce only words that start with $a$, and each subsequent letter is $a$ or $b$, each one picked independently and uniformly at random. Thus, $C_1$ and $C_2$ are equivalent: for each finite prefix $w \in \{a,b\}^*$, the probability that $C_1$ generates $w$ is equal to the probability that $C_2$ generates $w$.
One can check equivalence of LMCs in polynomial time.
Here is how.
For each LMC we take a row vector that indicates the initial state.
For $C_1$:
\[
\alpha_1 \ = \ \begin{pmatrix} 1 & 0 \end{pmatrix}
\]
Further, we take a transition matrix for each letter.
For $C_1$:
\[
M_1(a) \ = \ \begin{pmatrix} 0 & 1 \\ 0 & \frac12 \end{pmatrix} \qquad M_1(b) \ = \ \begin{pmatrix} 0 & 0 \\ 0 & \frac12 \end{pmatrix}
\]
Finally, we take a column vector of $1$s:
\[
\eta_1 \ = \ \begin{pmatrix} 1 \\ 1 \end{pmatrix}
\]
Now, for any finite word $w = \sigma_1 \sigma_2 \cdots \sigma_k \in \{a,b\}^*$ the probability that $C_1$ generates $w$ is:
\[
\alpha_1 M_1(\sigma_1) M_1(\sigma_2) \cdots M_1(\sigma_k) \eta_1
\]
Of course, this holds analogously for the other LMC, $C_2$.
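For concreteness, here is a minimal sketch of this computation in Python with NumPy; the data of $C_1$ is transcribed from above, and the variable and function names are mine:

```python
import numpy as np

# Data of C_1 as defined above
alpha1 = np.array([1.0, 0.0])                 # row vector indicating the initial state
M1 = {
    "a": np.array([[0.0, 1.0], [0.0, 0.5]]),  # transition matrix for letter a
    "b": np.array([[0.0, 0.0], [0.0, 0.5]]),  # transition matrix for letter b
}
eta1 = np.array([1.0, 1.0])                   # column vector of 1s

def prefix_probability(alpha, M, eta, word):
    """Probability that the LMC generates the finite prefix `word`."""
    v = alpha
    for sigma in word:
        v = v @ M[sigma]
    return v @ eta

print(prefix_probability(alpha1, M1, eta1, "a"))    # 1.0
print(prefix_probability(alpha1, M1, eta1, "ab"))   # 0.5
print(prefix_probability(alpha1, M1, eta1, "aab"))  # 0.25
```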
Equivalence checking amounts to checking whether
\[
\begin{aligned}
&\alpha_1 M_1(\sigma_1) M_1(\sigma_2) \cdots M_1(\sigma_k) \eta_1 \\
\ = \ &\alpha_2 M_2(\sigma_1) M_2(\sigma_2) \cdots M_2(\sigma_k) \eta_2
\end{aligned}
\tag{1}
\]
holds for all $\sigma_1 \sigma_2 \cdots \sigma_k \in \{a,b\}^*$.
To this end we “merge the data” of $C_1$ and $C_2$.
Define:
\[
\begin{align*}
\alpha \ := \ &\begin{pmatrix} \alpha_1 & \alpha_2 \end{pmatrix} \\
= \ &\begin{pmatrix} 1 & 0 & 1 & 0 \end{pmatrix}
\end{align*}
\]
and
\[
\begin{align*}
M(a) \ := \ &\begin{pmatrix} M_1(a) & 0 \\ 0 & M_2(a) \end{pmatrix} \\[1mm]
= \ &\begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & \frac12 & 0 & 0 \\ 0 & 0 & \frac12 & \frac12 \\ 0 & 0 & 0 & 0 \end{pmatrix}
\end{align*}
\]
and
\[
\begin{align*}
M(b) \ := \ &\begin{pmatrix} M_1(b) & 0 \\ 0 & M_2(b) \end{pmatrix} \\[1mm]
= \ &\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & \frac12 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & \frac12 & \frac12 \end{pmatrix}
\end{align*}
\]
and
\[
\eta \ := \ \begin{pmatrix} \eta_1 \\ -\eta_2 \end{pmatrix} \ = \ \begin{pmatrix} 1 \\ 1 \\ -1 \\ -1 \end{pmatrix}
\]
Notice that we have put a minus sign in front of $\eta_2$, because in this way Equation $(\text{1})$ becomes
\[
\alpha M(\sigma_1) M(\sigma_2) \cdots M(\sigma_k) \eta \ = \ 0\,. \tag{2}
\]
That is, we want to check whether $(\text{2})$ holds for all $\sigma_1 \sigma_2 \cdots \sigma_k \in \{a,b\}^*$.
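In code, the merge is just a concatenation of the initial vectors, a block-diagonal composition of the transition matrices, and a sign flip on $\eta_2$. Here is a sketch that continues the NumPy snippet above; the data of $C_2$ is read off the picture:

```python
import numpy as np

# Data of C_2, read off the picture above
alpha2 = np.array([1.0, 0.0])
M2 = {
    "a": np.array([[0.5, 0.5], [0.0, 0.0]]),
    "b": np.array([[0.0, 0.0], [0.5, 0.5]]),
}
eta2 = np.array([1.0, 1.0])

# Merged data: concatenate the initial vectors, form block-diagonal
# transition matrices, and negate eta_2 (see Equation (2))
alpha = np.concatenate([alpha1, alpha2])
M = {sigma: np.block([[M1[sigma], np.zeros((2, 2))],
                      [np.zeros((2, 2)), M2[sigma]]])
     for sigma in ("a", "b")}
eta = np.concatenate([eta1, -eta2])
```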
Here comes the trick: this is equivalent to checking whether the vector space, $\mathcal{V} \subseteq \mathbb{R}^4$, spanned by the set of vectors
\[
\left\{\alpha M(\sigma_1) M(\sigma_2) \cdots M(\sigma_k) : \sigma_1 \sigma_2 \cdots \sigma_k \in \{a,b\}^*\right\}\,,
\]
is orthogonal to $\eta$.
Once we have computed a basis of the vector space $\mathcal{V}$, we are basically done!
One can equivalently characterize $\mathcal{V}$ as the smallest vector space that includes $\alpha$ and is closed under multiplication with $M(a)$ and under multiplication with $M(b)$.
So we can compute a basis, $\mathcal{B}$, of $\mathcal{V}$ as follows:
- $\mathcal{B} := \{\alpha\}$
- while there are $v \in \mathcal{B}$ and $\sigma \in \{a,b\}$ such that $v M(\sigma)$ is not in the vector space spanned by $\mathcal{B}$:
$\hspace{10mm} \mathcal{B} := \mathcal{B} \cup \{v M(\sigma)\}$
- return $\mathcal{B}$
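Here is a sketch of this procedure, continuing the NumPy snippets above. For simplicity it uses floating-point arithmetic and a rank test for the span check; the polynomial-time bound refers to exact (rational) arithmetic.

```python
import numpy as np

def in_span(basis, v, tol=1e-9):
    """True iff v lies in the span of the vectors in `basis`."""
    if not basis:
        return np.allclose(v, 0.0, atol=tol)
    B = np.vstack(basis)
    rank_with_v = np.linalg.matrix_rank(np.vstack([B, v]), tol=tol)
    return rank_with_v == np.linalg.matrix_rank(B, tol=tol)

def reachable_space_basis(alpha, M):
    """Basis of the smallest vector space that contains alpha and is closed
    under right-multiplication with each M(sigma)."""
    basis = [alpha]
    changed = True
    while changed:
        changed = False
        for v in list(basis):
            for sigma in M:
                w = v @ M[sigma]
                if not in_span(basis, w):
                    basis.append(w)
                    changed = True
    return basis

# C_1 and C_2 are equivalent iff every basis vector is orthogonal to eta
basis = reachable_space_basis(alpha, M)
print(all(abs(v @ eta) < 1e-9 for v in basis))  # True: C_1 and C_2 are equivalent
```

Since each added vector increases the dimension of the spanned space, the loop adds at most $n_1 + n_2$ vectors (here at most $4$), so only polynomially many span tests are needed.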
This algorithm can be traced back to a Schützenberger paper from 1961. In a recent paper, Nathanaël Fijalkow, Mahsa Shirmohammadi, and I show that one can compare labelled Markov decision processes (LMDPs) in the same way. LMDPs are like LMCs, but a controller can choose actions in each state to influence the transition probabilities. The equivalence checking problem for two LMDPs $D_1, D_2$ is whether for each finite prefix $w \in \{a,b\}^*$, the probability that $D_1$ generates $w$ is equal to the probability that $D_2$ generates $w$, no matter how the controller chooses its actions. This problem is solvable in polynomial time, using a generalization of the algorithm above.
Here is yet another version: given an LMC $C$ and an LMDP $D$, can the controller choose actions in $D$ so that for each finite prefix $w$, the probability that $D$ generates $w$ is equal to the probability that $C$ generates $w$? This version is undecidable.