### The Price of Memory

In an earlier post I defined the price of memory, $u$, and asked:
Is $u < 1$? Worth €100.
Determine $u$. Worth another €100, even if $u=1$.
I'm happy to announce that there is now a solution:
The price of memory, $u$, is 1.
I don't want to repeat the precise definition of $u$, but here is a short recap: Suppose you have an acyclic finite MDP and a strategy $\sigma$ that guarantees to visit a red transition with probability 1. This strategy $\sigma$ might use memory, i.e., might base its behaviour in a state on the history of the run, as indicated in yellow:
Any such strategy $\sigma$ induces a randomized strategy $\overline\sigma$ that acts like $\sigma$ in average, but is memoryless. In each state, when averaged over all runs, the strategies $\sigma$ and $\overline\sigma$ take the same actions with the same probabilities:
One can show that $\sigma$ and $\overline\sigma$ visit red transitions equally often in average; in particular, $\overline\sigma$ visits at least one red transition in average. The price of memory is defined as the smallest global (independent of the MDP or any strategy) constant $u$ such that the probability that $\overline\sigma$ does not visit a red transition is at most $u$. Trivially, $u \le 1$, and examples as above show that $u \ge \frac12$. Let us now prove that $u=1$:
Consider the following MDP, $M_1$:
In the controlled state in the middle, the strategy $\sigma$ goes left if and only if no red transition has been visited yet (which is the case if and only if the initial random state has chosen the right transition). In this way, $\sigma$ visits exactly one red transition with probability 1. It also follows that in the controlled state in the middle, the randomized strategy $\overline\sigma$ picks either outgoing transition with probability $\frac12$. Overall, the probability in $M_1$ that $\overline\sigma$ does not visit a red transition is $u_1 := \frac12 \cdot \frac12 = \frac14$. It follows that $u \ge u_1 = \frac14$, which is not a strong statement. Consider now the following MDP, $M_2$, which contains two copies of $M_1$:
As before, there is a strategy $\sigma$ that ensures to visit exactly one red transition: in any controlled state, $\sigma$ goes left if and only if no red transition has been visited yet. As before, the resulting randomized strategy $\overline\sigma$ picks, in any controlled state, either outgoing transition with probability $\frac12$. The probability in $M_2$ that $\overline\sigma$ does not visit a red transition is $u_2 := (\frac12 + \frac{1}2u_1)^2 = \frac{25}{64} \approx 0.39$. It follows that $u \ge u_2$.

One can define a sequence of MDPs $M_1, M_2, M_3, \ldots$ in this way:
Defining $u_{i+1} := (\frac12 + \frac{1}2u_i)^2$ and generalizing $\sigma$ in the natural way, we obtain that in $M_i$ the probability that $\overline\sigma$ does not visit a red transition is $u_i$. It is straightforward to check that the sequence $u_1, u_2, u_3, \ldots$ is non-decreasing and converges to the only fixed point of the function $f(x) = (\frac12 + \frac{1}2x)^2$, which is $1$. Since $u \ge u_i$, it follows that $u=1$.
I should award the prize of €200 to the person who found this solution. The only issue is: that would be myself. Was the problem difficult to solve? For me, yes. Could a Maths or CS undergrad have found this solution? Absolutely. Does this result, $u=1$, lead to a paper? It might, but my collaborators and I would have preferred $u \lt 1$. Oh well.

### Popular posts from this blog

An initial draft of this post was about how to make rankings more meaningful, a piece of mathsy computer science, as you expect it from this blog. My motivation is personal: I'm involved in selecting CS undergraduate students. Some people might be curious how Oxford admissions work: it's certainly a favourite subject for UK politicians and news media. Therefore I decided to postpone my original plan and to write this post about Oxford admissions. I hope I'm not playing with too much fire and the blog will not suddenly become famous for the wrong reasons!

The tenor in newspapers and political campaigns is that the Oxbridge admissions process is “opaque” and unfair. Here I am going to offer transparency and some comments on fairness. The views expressed here are my own, and may not necessarily coincide with the university's views.

Oxford's goal is to admit the strongest applicants. One can disagree with this basic premise, but we try to admit those applicants who w…

### Equivalence Checking

There is a compact data structure for words: automata. Finite automata can “store” many words, even infinitely many—in the sense that the language of a finite automaton can be infinite. This powerful feature makes automata ubiquitous in computer science. Lecture notes by Javier Esparza emphasize this view of automata as data structures.

The most fundamental algorithmic problems on automata are:
operations, such as computing set union or complement of the languages of automata; tests, such as checking their languages for equality. In this post I focus on the latter: equivalence checking. In general, equivalence checking for automata is PSPACE-complete, which is excusable considering how compact and flexible data structures they are. One can check two deterministic finite automata for equivalence in polynomial time, by searching their product.

Less known is the fact that one can efficiently check much wider classes of automata for equivalence, including unambiguous automata, probab…

### Unambiguous Automata: From PSPACE to P

Computer science is about leveraging mathematical structure, for instance, in order to accelerate algorithms. More often than we'd like we can't accelerate much: For instance, checking two NFAs for language inclusion or equivalence is PSPACE-complete. Even the problem whether an NFA accepts all words is PSPACE-complete. For DFAs, those problems are easy. This means that for NFAs we can't do much better than determinizing (using the subset construction and incurring an exponential blowup) and solving the resulting DFA problem.

In this post I'll discuss an automata problem that seems to call for determinization and is, in fact, PSPACE-complete for NFAs. But there is an efficient algorithm for unambiguous automata.

Here is an example of an unambiguous automaton:
The notion of accepting states is not used in this post. By unambiguousness I just mean that the automaton has no diamonds, that is, for any states $q, r$, any word $w$ labels at most one path from $q$ to $r$. (…