Luca Marx

The tale of the strange coffee vending machine

Where I want to convince you that coffee vending machines are more interesting than they look.

Alice and Bob are co-workers; in the basement of their office there is an old, battered coffee vending machine

coffee_machine.jpg

it is an outdated single-product model that accepts colored tokens like these:

tokens1.png

The machine delivers a cup of coffee only if one inserts two tokens of different colors (either first the blue then the green, or vice versa). We don't know exactly how the machine works but we can suppose, for now, that it is based on an automaton like this1:

dfa.png
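In code, such a deterministic automaton would look something like this minimal sketch (the state names S, q1, q2, F are my own labels for what the figure shows):

```python
# A deterministic automaton that accepts exactly "BG" and "GB".
def dfa_accepts(w):
    delta = {
        ("S", "B"): "q1", ("S", "G"): "q2",  # remember the first token's color
        ("q1", "G"): "F", ("q2", "B"): "F",  # the second token must differ
    }
    state = "S"
    for c in w:
        state = delta.get((state, c), "DEAD")  # any other move rejects
    return state == "F"

print(dfa_accepts("BG"), dfa_accepts("GB"))  # True True
print(dfa_accepts("BB"), dfa_accepts("GG"))  # False False
```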

Every morning Alice comes to the office and the first thing she wants is a cup of coffee, but invariably she is disturbed by Bob. Here are the facts:

  1. Alice goes to the coffee machine and puts a token in (either the blue or the green one)
  2. as soon as Bob hears Alice he scrambles for the coffee machine and inserts a token at random
  3. Bob doesn't know the color of the token Alice has inserted so one of two things can happen:

    Bob's token is the same color as Alice's
    then the machine does nothing, a disgruntled Bob walks away and Alice can finally get her coffee by inserting her other token
    Bob's token is a different color from Alice's
    then the machine produces a cup of coffee, Bob grabs it and happily walks away

    Bob thus has a 50% probability of stealing Alice's coffee.

Why does Bob do it? Maybe he is just a bad character, maybe he thinks it's funny, … who knows.

Of course Alice is sick of Bob's behaviour and wants to stop it. She then contacts Charlie, the coffee machine producer.

Alice discovers that the machine, contrary to our initial supposition, is not based on a deterministic automaton but on a weighted automaton.

wfa.png

Now, a weighted automaton doesn't just accept or reject a sequence of tokens, it computes a weight2: for example, here the machine computes +1 for the sequence "BG" and -1 for "GB".

But why would anyone construct a vending machine like this?

The answer is: to allow for many kinds of tokens, because a weighted automaton can accept token superpositions!

It works like this: the previous automaton accepts tokens "B" and "G", but what happens if we manufacture two new tokens like these:

\[(B + G) \hspace{1cm} (B - G)\]

Let's try to feed the sequence "(B+G)(B-G)" to the machine:
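The machine acts linearly on each token, so the weight of this sequence expands over the four basic two-token sequences (recall that "BG" weighs +1, "GB" weighs -1, while "BB" and "GG", which the automaton does not accept, weigh 0):

\[\psi\big((B+G)(B-G)\big) = \psi(BB) - \psi(BG) + \psi(GB) - \psi(GG) = 0 - 1 + (-1) - 0 = -2\]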

What happens if you feed "(B+G)(B+G)" instead? If you trace the transitions you will see that when the machine reads the second "(B+G)" the weights cancel out.

Since we will use token superpositions a lot from now on, we represent tokens of the form \(T = \alpha B + \beta G\) with a two-component vector

\[T = \begin{bmatrix}\alpha \\ \beta \end{bmatrix}\]

so for example the blue and green tokens become

\[B = \begin{bmatrix}1 \\ 0\end{bmatrix} \hspace{1cm} G = \begin{bmatrix}0 \\ 1\end{bmatrix}\]

Now we are in a position to implement the automaton in Python:

import numpy as np

B, G = np.array([1.0, 0]), np.array([0, 1.0])

def ψ(w):
    # states:     S  q1 q2 F
    α = np.array([1, 0, 0, 0])    # initial state S has weight 1
    ω = np.array([0, 0, 0, 1])    # final state F has weight 1
    T = np.array([[[0, 1, 0, 0],  # B transition: S → q1
                   [0, 0, 0, 0],
                   [0, 0, 0,-1],  # B transition: q2 → F
                   [0, 0, 0, 0]],
                  [[0, 0, 1, 0],  # G transition: S → q2
                   [0, 0, 0, 1],  # G transition: q1 → F
                   [0, 0, 0, 0],
                   [0, 0, 0, 0]]])
    t = α
    for c in w:
        t = t @ (c[0]*T[0] + c[1]*T[1])  # apply the token's transition matrix
    return t @ ω

print("Blue/Green tokens:")
print(" BG →", ψ([B, G]))
print(" GB →", ψ([G, B]))
print(" BB →", ψ([B, B]))
print(" GG →", ψ([G, G]))

print("Superposition tokens:")
print(" (B+G)(B-G) →", ψ([B+G, B-G]))
print(" (B-G)(B+G) →", ψ([B-G, B+G]))
print(" (B+G)(B+G) →", ψ([B+G, B+G]))
print(" (B-G)(B-G) →", ψ([B-G, B-G]))
Blue/Green tokens:
 BG → 1.0
 GB → -1.0
 BB → 0.0
 GG → 0.0
Superposition tokens:
 (B+G)(B-G) → -2.0
 (B-G)(B+G) → 2.0
 (B+G)(B+G) → 0.0
 (B-G)(B-G) → 0.0

Finally the coffee machine is implemented in this way:

def squared_norm(v):
    return np.round(np.real(np.conj(v) * v), 4)

def coffee(w):
    return "COFFEE" if squared_norm(ψ(w)) == 1 else "REJECT"

print("Blue/Green tokens:")
print(" BG →", coffee([B, G]))
print(" GB →", coffee([G, B]))
print(" BB →", coffee([B, B]))
print(" GG →", coffee([G, G]))
Blue/Green tokens:
 BG → COFFEE
 GB → COFFEE
 BB → REJECT
 GG → REJECT

Ok, now that we have implemented the coffee machine let's discuss the tokens themselves.

The coffee machine producer Charlie has hired a certain professor Bloch as chief of the engineering department. Instead of using the vector components directly, prof. Bloch has devised a way to encode the tokens using two angles \(\theta,\varphi\)3:

\[T = \begin{bmatrix} \cos \frac{\theta}{2} \\ e^{i\varphi} \sin \frac{\theta}{2} \end{bmatrix}\]

similarly he has defined an RGB color code for the tokens:

\[R = 127 \cdot (1 + \sin\theta \cdot \cos\varphi) \hspace{0.5cm} G = 127 \cdot (1 + \sin\theta \cdot \sin\varphi) \hspace{0.5cm} B = 127 \cdot (1 + \cos\theta)\]

in Python:

def bloch_token(θ, φ):
    return np.array([np.cos(θ/2), np.exp(1j * φ) * np.sin(θ/2)])

def bloch_color(θ, φ):
    x = np.sin(θ) * np.cos(φ)
    y = np.sin(θ) * np.sin(φ)
    z = np.cos(θ)

    r,g,b = [int(np.round(127*(1 + t))) for t in [x,y,z]]

    return f"#{r:02x}{g:02x}{b:02x}"

Now let's construct a few token pairs4:

tokens2.png

in Python again:

π = np.pi
Xp, Xm = bloch_token(π/2, 0),   bloch_token(π/2, π)
Yp, Ym = bloch_token(π/2, π/2), bloch_token(π/2, 3*π/2)
Zp, Zm = bloch_token(0,0),      bloch_token(π,0)

Finally let's see what happens when we feed them to the machine:

print("X+X-",coffee([Xp,Xm]),"X-X+", coffee([Xm,Xp]),"X+X+", coffee([Xp,Xp]),"X-X-",coffee([Xm,Xm]))
print("Y+Y-",coffee([Yp,Ym]),"Y-Y+", coffee([Ym,Yp]),"Y+Y+", coffee([Yp,Yp]),"Y-Y-",coffee([Ym,Ym]))
print("Z+Z-",coffee([Zp,Zm]),"Z-Z+", coffee([Zm,Zp]),"Z+Z+", coffee([Zp,Zp]),"Z-Z-",coffee([Zm,Zm]))
X+X- COFFEE X-X+ COFFEE X+X+ REJECT X-X- REJECT
Y+Y- COFFEE Y-Y+ COFFEE Y+Y+ REJECT Y-Y- REJECT
Z+Z- COFFEE Z-Z+ COFFEE Z+Z+ REJECT Z-Z- REJECT

they work just like the original B and G, but if you mix tokens from different pairs the machine always rejects them (try it).
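Here is that check as a self-contained snippet. It restates the definitions in compact form, using the fact (which you can read off the transition matrices) that for a two-token sequence the weight computed by ψ reduces to the determinant \(T_1[0]\,T_2[1] - T_1[1]\,T_2[0]\):

```python
import numpy as np

π = np.pi

def bloch_token(θ, φ):
    return np.array([np.cos(θ/2), np.exp(1j*φ)*np.sin(θ/2)])

# for a two-token sequence the automaton's weight is just a determinant
def ψ2(T1, T2):
    return T1[0]*T2[1] - T1[1]*T2[0]

def coffee(w):
    t = ψ2(*w)
    return "COFFEE" if np.round(np.real(np.conj(t)*t), 4) == 1 else "REJECT"

Xp, Xm = bloch_token(π/2, 0),   bloch_token(π/2, π)
Yp, Ym = bloch_token(π/2, π/2), bloch_token(π/2, 3*π/2)
Zp, Zm = bloch_token(0, 0),     bloch_token(π, 0)

# every combination of tokens taken from two different pairs is rejected
pairs = [(Xp, Xm), (Yp, Ym), (Zp, Zm)]
for i, p in enumerate(pairs):
    for q in pairs[i+1:]:
        for a in p:
            for b in q:
                assert coffee([a, b]) == coffee([b, a]) == "REJECT"
print("all cross-pair combinations: REJECT")
```

Every cross-pair combination has squared weight 1/2 rather than 1, so the machine never delivers a coffee.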


Armed with this new knowledge Alice devises a strategy that destroys the chances of Bob stealing her coffee.

Alice prepares a stash of \(X_{+},X_{-}\) tokens and uses them instead of the usual B, G. Now when Bob inserts his B or G token the machine invariably rejects it:

print("the machine accepts Alice's token:")
print(" Xp Xm →", coffee([Xp, Xm]))
print(" Xm Xp →", coffee([Xm, Xp]))
print("but once Alice has inserted her token it always rejects Bob's:")
print(" Xp B  →", coffee([Xp, B]))
print(" Xp G  →", coffee([Xp, G]))
print(" Xm B  →", coffee([Xm, B]))
print(" Xm G  →", coffee([Xm, G]))
the machine accepts Alice's token:
 Xp Xm → COFFEE
 Xm Xp → COFFEE
but once Alice has inserted her token it always rejects Bob's:
 Xp B  → REJECT
 Xp G  → REJECT
 Xm B  → REJECT
 Xm G  → REJECT

Inserting the first token primes the machine, putting it in a state that will accept only the other token of the pair and nothing else. This is subtler than the deterministic automaton because now both states are active (except in the case of the B, G tokens, when one state has weight zero).
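We can watch this priming happen by inspecting the machine's internal state vector; here is a self-contained snippet restating the automaton's matrices from above:

```python
import numpy as np

π = np.pi

# the automaton's ingredients, restated from ψ above
α = np.array([1, 0, 0, 0])    # start in S
ω = np.array([0, 0, 0, 1])    # accept on F
T = np.array([[[0, 1, 0, 0],  # B: S → q1
               [0, 0, 0, 0],
               [0, 0, 0,-1],  # B: q2 → F
               [0, 0, 0, 0]],
              [[0, 0, 1, 0],  # G: S → q2
               [0, 0, 0, 1],  # G: q1 → F
               [0, 0, 0, 0],
               [0, 0, 0, 0]]])

def bloch_token(θ, φ):
    return np.array([np.cos(θ/2), np.exp(1j*φ)*np.sin(θ/2)])

Xp = bloch_token(π/2, 0)               # Alice's first token
B  = np.array([1.0, 0.0])              # Bob's token

t = α @ (Xp[0]*T[0] + Xp[1]*T[1])      # machine state after Alice's X+
print(np.round(np.real(t), 4))         # both q1 and q2 carry weight 1/√2

w = (t @ (B[0]*T[0] + B[1]*T[1])) @ ω  # Bob completes with B
print(np.round(abs(w)**2, 4))          # → 0.5
```

Both middle states are active with weight \(1/\sqrt{2}\), and Bob's B token reaches F with squared weight 1/2 rather than 1, so the machine rejects it.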


But then Alice thinks: "I can do even better!" Every morning she produces a random pair of tokens in this way:

def make_random_bloch_pair():
    θ, φ = π*np.random.rand(), 2*π*np.random.rand()
    return bloch_token(θ,φ), bloch_token(π-θ,φ-π)

First of all let's check that they behave in the usual way:

for _ in range(1000):
    A1, A2 = make_random_bloch_pair()
    assert(coffee([A1,A2]) == coffee([A2,A1]) == 'COFFEE')
    assert(coffee([A1,A1]) == coffee([A2,A2]) == 'REJECT')

and again let's check that tokens from different pairs are always rejected:

for _ in range(1000):
    A1, A2 = make_random_bloch_pair() # Alice's tokens
    B1, B2 = make_random_bloch_pair() # Bob's tokens
    assert(coffee([A1,B1]) == coffee([A1,B2]) == coffee([A2,B1]) == coffee([A2,B2]) == 'REJECT')

Now, since Bob cannot see the first token Alice has inserted, his only chance is to guess the values of \(\theta,\varphi\) that Alice used when manufacturing her pair, which drastically lowers his success rate.
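How low, exactly? Here is a back-of-the-envelope Monte Carlo sketch (my own experiment, restating the definitions in compact determinant form): Bob blindly inserts a random Bloch token after Alice's first one.

```python
import numpy as np

π = np.pi
rng = np.random.default_rng(0)

def bloch_token(θ, φ):
    return np.array([np.cos(θ/2), np.exp(1j*φ)*np.sin(θ/2)])

def make_random_bloch_pair():
    θ, φ = π*rng.random(), 2*π*rng.random()
    return bloch_token(θ, φ), bloch_token(π-θ, φ-π)

# two-token weight of the automaton (a determinant, as before)
def accepted(T1, T2):
    t = T1[0]*T2[1] - T1[1]*T2[0]
    return np.round(np.real(np.conj(t)*t), 4) == 1

# sanity check: the matching partner is always accepted
A1, A2 = make_random_bloch_pair()
assert accepted(A1, A2)

# Bob blindly guesses a token after Alice's first one
trials, wins = 20_000, 0
for _ in range(trials):
    A1, _ = make_random_bloch_pair()
    guess = bloch_token(π*rng.random(), 2*π*rng.random())
    wins += accepted(A1, guess)

print(f"Bob's success rate: {wins/trials:.5f}")
```

Instead of the 50% of the original scheme, Bob now succeeds essentially never: his guess has to reproduce Alice's \(\theta,\varphi\) almost exactly.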

Alice wins!


This is the end of the story, but you may well ask: "What on earth is this all about?"

You may have noticed that the tokens, their superpositions and prof. Bloch's angles all come straight from quantum mechanics. What I have tried to do is to transpose these concepts into the language of weighted automata (whether it makes sense or not).

Actually this is something we have already done in the post about Automata & Matrix Product States but here I wanted to stress the point that the quantum mechanical concepts of superposition and entanglement have a computational counterpart.

Footnotes:

1

note how this automaton accepts either "Blue-Green" or "Green-Blue" and nothing else

2

again, for a more detailed introduction see my previous post

3

with \(0 \le \theta \le \pi\) and \(0 \le \varphi \le 2\pi\)

4

the last pair is just the old B,G tokens