Skip to main content

Probabilistic Models of Computation

This post has 0 ideas. At least, no ideas that are younger than 50 years or so. I'm talking about the two most fundamental models of probabilistic computation and the two most fundamental results about them, according to my subjective judgement.

Every computer scientist knows finite automata:
State $1$ is the start state, and state $3$ is the only accepting state. Some words are accepted (i.e., have an accepting run), for instance the word $aa$. Other words are rejected (i.e., have no accepting run), for instance the word $ba$. Most questions about finite automata are decidable. For instance, the emptiness problem, which asks whether the automaton accepts at least one word. This problem is decidable in polynomial time: view the automaton as a graph, and search the graph for a path from the initial state to an accepting state. One can also decide whether two given automata are equivalent, i.e., do they accept the same words? This problem is computationally hard though: PSPACE-complete (never mind if you don't know what that means).

Let's slap on probabilities:
This is a probabilistic automaton. Previously each state and each letter "activated" transitions, one of which was nondeterministically taken. A probabilistic automaton, in addition, determines a probability distribution over the activated transitions. There is one such probability distribution for each state and each letter. For instance, if in state $1$ the letter $b$ is input then the probabilistic automaton moves to state $3$ with probability $\frac13$, and remains in state $1$ with probability $\frac23$. Probabilistic automata are reactive: someone inputs letters, and the probabilistic automaton reacts to the input by taking transitions probabilistically. As a consequence, each input word is accepted with some probability. For instance, consider the input word $bb$. The probabilistic automaton takes the path $1 \xrightarrow{b} 1 \xrightarrow{b} 3$ with probability $\frac23 \cdot \frac13 = \frac29$. Similarly, it takes the path $1 \xrightarrow{b} 3 \xrightarrow{b} 3$ with probability $\frac13 \cdot \frac34 = \frac14$. These two paths are accepting, because state $3$ is accepting. The other two $bb$-labelled paths, $1 \xrightarrow{b} 1 \xrightarrow{b} 1$ and $1 \xrightarrow{b} 3 \xrightarrow{b} 2$, are rejecting, because states $1$ and $2$ are rejecting. It follows that the probabilistic automaton accepts $bb$ with probability $\frac29 + \frac14 = \frac{17}{36}$. Whereas a nonprobabilistic automaton either accepts or rejects the input word, a probabilistic automaton accepts the input with some probability.
Given a probabilistic automaton, the emptiness problem asks if there exists a word that is accepted with probability $\ge \frac12$.
The following theorem appears in the 1971 textbook Introduction to Probabilistic Automata by Azaria Paz:
The emptiness problem is undecidable.
The emptiness problem remains undecidable when the nonstrict inequality $\ge$ is replaced by strict inequality $>$, and also when the threshold $\frac12$ is replaced by another number strictly between $0$ and $1$. With threshold $0$ or $1$, the emptiness problem becomes decidable: apply what I wrote above about nonprobabilistic automata.

While probabilistic automata are reactive, labelled Markov chains are generative:
Now for each state, the probabilities of all outgoing transitions sum up to $1$. There are no input words and no accepting states. Instead, in any state, the labelled Markov chain autonomously picks an outgoing transition and emits the letter that the transition is labelled with. For instance, in state $1$, it emits $a$ with probability $\frac14$, and $b$ with probability $\frac12 + \frac14$. This generalizes to longer words in the natural way. For instance, starting in state $1$, the word $bb$ is emitted with probability $\frac12 \cdot \frac12 + \frac12 \cdot \frac14 + \frac14 \cdot \frac13 + \frac14 \cdot \frac16$. In this way, for each number $n$, the labelled Markov chain defines a probability distribution over words of length $n$.
Given two labelled Markov chains, the equivalence problem asks if for each word $w$ the chains emit $w$ with the same probability.
The following theorem goes back to a 1961 paper by Marcel-Paul Schützenberger and was subsequently rediscovered many times:
The equivalence problem is decidable in polynomial time.
The algorithm is based on linear algebra. Loosely speaking, it iteratively discovers equivalent states and, more generally, equivalent linear combinations of states. In this way one can also check two probabilistic automata for equivalence, where equivalence means that for each word $w$ the automata accept $w$ with the same probability. More generally, the equivalence-checking algorithm works for weighted automata, which are similar to labelled Markov chains and probabilistic automata except that the numbers do not have to be probabilities. Equivalence checking is used to check randomized protocols for anonymity and obliviousness, properties that are hard to formalize otherwise.

As a rule of thumb, equality questions about probabilistic models are decidable, and inequality questions are undecidable. More about that in later posts, such as this one.

Comments

Popular posts from this blog

Tell me the price of memory and I give you €100

Markov Decision Processes (MDPs) are Markov chains plus nondeterminism: some states are random, the others are controlled (nondeterministic). In the pictures, the random states are round, and the controlled states are squares: The random states (except the brown sink state) come with a probability distribution over the successor states. In the controlled states, however, a controller chooses a successor state. What does the controller want to achieve? That depends. In this blog post, the objective is very simple: take a red transition . The only special thing about red transitions is that the controller wants to take them. We consider only MDPs with the following properties: There are finitely many states and transitions. The MDP is acyclic, that is, it has no cycles. There are a unique start state from which any run starts (in the pictures: blue, at the top) and a unique sink state where any run ends (in the pictures: brown, at the bottom). No matter what the controller does,

Calibrating Assessments

Last year I wrote a post on how Oxford selects undergraduate students for computer science. Oxford undergraduate admissions are politically charged: the British public suspects that Oxbridge is biased towards privileged applicants. This post is unpolitical though. I'm advocating a better way of solving a typical problem in academia: you have many candidates (think: papers, grant proposals, or Oxford applicants) and few assessors (think: members of program committees, assessment panels, or admission tutors), and the goal of the assessors is to select the best candidates. Since there are many candidates, each assessor can assess only some candidates. From a candidate's point of view: a paper may get 3 reviews, a grant proposal perhaps 5, and a shortlisted Oxford applicant may get 3 interviews. Each such review comes with a score that reflects the assessor's estimate of the candidate's quality. The problem is that different assessors will score the same candidate differ

Short Killing Words

Given a finite set $\mathcal{M}$ of $n \times n$ matrices over the integers, can you express the zero matrix as a product of matrices in $\mathcal{M}$? This is known as the mortality problem. Michael Paterson showed in 1970 that it is undecidable, even for $n=3$. Later it was shown that mortality remains undecidable for $n=3$ and $|\mathcal{M}| = 7$, and for $n=21$ and $|\mathcal{M}| = 2$. Decidability of mortality for $n=2$ is open. If all matrices in $\mathcal{M}$ are nonnegative, the mortality problem becomes decidable, because then it matters only whether matrix entries are $0$ or not. Specifically, view the given matrices as transition matrices of a nondeterministic finite automaton where all states are initial and accepting, and check whether there is a word that is not accepted. This problem is decidable and PSPACE-complete. One can construct cases where the shortest word that is not accepted by the automaton has exponential length. In this post we focus on the nonnegative