Luca Marx

Automata & Parallelism

Weighted automata have a peculiar kind of parallelism. In this post we will see how to exploit it to evaluate a function over many arguments at once; this will lead to a "poor-man's" version of the Deutsch-Jozsa algorithm.

A few posts back we saw how to use weighted automata to compute real-valued functions: in that post we encoded the domain of the function into binary strings and took the computed weight as the output of the function. This time I want to discuss a different scheme.

Consider a function \(f(x)\) that takes an integer in the interval \([0, 2^{n}-1]\) and outputs an integer in \([0, 2^{m}-1]\). We can encode \(f\) into a deterministic finite automaton in this way:

  1. encode both \(x\) and \(y=f(x)\) into binary strings
  2. concatenate them and construct the language

    \[L = \{ x \cdot y \, | \, x \in \{0,1\}^n, y = f(x) \}\]

  3. build the deterministic finite automaton that recognizes \(L\)

Let's see how it works with a simple example: \(f(x) = \lfloor x/2 \rfloor\)

f(0) = 0      f(4) = 2
f(1) = 0      f(5) = 2
f(2) = 1      f(6) = 3
f(3) = 1      f(7) = 3

encode \(x, f(x)\) into binary strings

f(000) = 00   f(100) = 10
f(001) = 00   f(101) = 10
f(010) = 01   f(110) = 11
f(011) = 01   f(111) = 11

construct the language

00000         10010
00100         10110
01001         11011
01101         11111

and finally build the automaton that recognizes it. We will use Brzozowski's algorithm (all the code you will need for this post is here1)

from wfa import make_automaton_from_words

L = '00000 00100 01001 01101 10010 10110 11011 11111'
φ = make_automaton_from_words(L)
φ.diag('floor')

the result looks like this

floor.png

Note how you can trace a single path for each pair \((x, f(x))\) in the graph.
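The encoding steps can also be checked with a quick self-contained sketch. Here `language_of` is a hypothetical helper, not part of the `wfa` module: it just performs steps 1 and 2 above for any \(f\).

```python
def language_of(f, n, m):
    # encode each x and y = f(x) as fixed-width binary strings and concatenate them
    words = []
    for x in range(2 ** n):
        y = f(x)
        words.append(format(x, f'0{n}b') + format(y, f'0{m}b'))
    return words

# the words of L for f(x) = floor(x/2) with n = 3, m = 2
print(' '.join(language_of(lambda x: x // 2, 3, 2)))
```

This reproduces exactly the eight words of \(L\) listed above.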

But now that we have smashed together \(x\) and \(f(x)\) can we still evaluate the function?

Yes: to evaluate \(f(x)\) we

  1. use the derivative operator \(\partial_x\)2 over the binary string encoding of \(x\): this is the same as tracing the path for \(x\) up to the state where the output \(f(x)\) begins.

    For example take \(x=5\) ('101'): tracing the path in the graph you'll see that it leads to state \(s9\), just where the output \(y=2\) ('10') begins.

    As expected the resulting automaton \(\partial_{101} \varphi\)

    φ.D('101').diag('f_101')
    
    f_101.png

    matches just the string '10'.

  2. take a sample from \(\partial_{101} \varphi\) to retrieve the value (or the values) from the automaton.

    There are more efficient ways to do it but here we just enumerate all the possible words and keep only those accepted by the automaton.

from itertools import product

# cycle through the possible values of x
for x in product(*[['0','1']]*3):
    # first derive
    y = φ.D(''.join(x))
    # then take a sample
    print(f"f({''.join(x)}) = {y.sample(2)}")
f(000) = 00
f(001) = 00
f(010) = 01
f(011) = 01
f(100) = 10
f(101) = 10
f(110) = 11
f(111) = 11

that's it, we have fully recovered \(f\) from the automaton \(\varphi\).
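The derivative operator itself can be sketched directly on the finite language, without any automaton. This hypothetical `derivative` helper (a stand-in for `φ.D`, not the actual implementation) just keeps the suffixes of the words that start with a given prefix:

```python
def derivative(words, prefix):
    # Brzozowski derivative of a finite language with respect to a string:
    # keep the words that start with the prefix, stripped of the prefix itself
    return {w[len(prefix):] for w in words if w.startswith(prefix)}

L = '00000 00100 01001 01101 10010 10110 11011 11111'.split()
print(derivative(L, '101'))   # only the output of f(5), i.e. '10', survives
```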


Up to this point we haven't seen any parallelism: \(\varphi\) is deterministic, it encodes each pair \(x, f(x)\) into a single path in its graph, so no parallelism here.

But I have been cheating: under the hood make_automaton_from_words actually builds a weighted automaton3.

This means that, as we have done in the last post, we can evaluate \(\varphi\) over superpositions of its arguments.

For example: let's evaluate \(\varphi\) over all of its possible arguments at once. How would we do it?

Instead of evaluating \(\varphi\) over binary strings we construct a wildcard alphabet whose letters are linear combinations of '0' and '1', like this

\[(0+1) \hspace{1cm} (0-1)\]

and feed those to the automaton. Let's see how it is done

import numpy as np
from wfa import zero, one

# the new 'letters' (with normalization)
Xp = (zero + one) / np.sqrt(2) # the (0+1) letter
Xm = (zero - one) / np.sqrt(2) # the (0-1) letter

# evaluate φ as before
all_x_φ = φ.D([Xp, Xp, Xp])
print(f"f((0+1) (0+1) (0+1)) = {all_x_φ.sample(2)}")
f((0+1) (0+1) (0+1)) = 00, 01, 10, 11

and we get all the possible values of \(f\).

We can also mix letters from different alphabets, for example let's evaluate \(\varphi\) over all arguments whose encodings start with a '1'

start_with_1 = φ.D([one, Xp, Xp])
print(f"f(1 (0+1) (0+1)) = {start_with_1.sample(2)}")
f(1 (0+1) (0+1)) = 10, 11

or have a middle '1'

middle_1 = φ.D([Xp, one, Xp])
print(f"f((0+1) 1 (0+1)) = {middle_1.sample(2)}")
f((0+1) 1 (0+1)) = 01, 11

This works because when we feed the automaton with a superposition of two letters we force the automaton to follow different paths simultaneously.

For example suppose the automaton is in the starting state \(s0\): if we feed '(0+1)' we make it follow both transitions \(s0 \to s1\) and \(s0 \to s2\), and the automaton ends up in a linear combination of the states \(s1, s2\).
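This state-splitting can be sketched with plain numpy matrices. The 3-state automaton below is a made-up example (not the floor automaton), chosen so that '0' moves \(s0 \to s1\) and '1' moves \(s0 \to s2\):

```python
import numpy as np

# hypothetical 3-state weighted automaton:
# letter '0' moves s0 -> s1, letter '1' moves s0 -> s2
A0 = np.array([[0., 1., 0.],
               [0., 0., 0.],
               [0., 0., 0.]])
A1 = np.array([[0., 0., 1.],
               [0., 0., 0.],
               [0., 0., 0.]])
alpha = np.array([1., 0., 0.])   # start in s0

# the wildcard letter (0+1) acts as the matrix (A0 + A1) / sqrt(2)
Xp = (A0 + A1) / np.sqrt(2)

# after reading (0+1) the automaton sits in a linear combination of s1 and s2
print(alpha @ Xp)
```

The final state vector has equal weight \(1/\sqrt{2}\) on \(s1\) and \(s2\): the automaton followed both transitions at once.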

For example, in the last automaton we calculated (middle_1) the derivative leaves the automaton in just such a linear combination of states, and the resulting automaton starts from there

middle_1.diag('f_p1p')
f_p1p.png

and matches '01' and '11' as expected.

Consider another case: imagine you want to evaluate \(f\) on the superposition of \(x=1\) ('001') and \(x=7\) ('111'). You could evaluate \(f\) on '(0+1) (0+1) 1' but then you would get all the values corresponding to '001 011 101 111' and not just '001 111'.

In this case we need to take the derivative with respect to another automaton. We do it in this way:

  1. make a new automaton \(\eta\) that matches only the two values \(x=1\) ('001') and \(x=7\) ('111')
  2. take the derivative of \(\varphi\) with respect to \(\eta\): that is use \(\eta\) to filter only the required values of \(x\)4
η = make_automaton_from_words('001 111')

y = φ.D_wfa(η, 3)
y.diag('d_phi_eta')
print(f"f(001 111) = {y.sample(2)}")
f(001 111) = 00, 11

and the resulting automaton is

d_phi_eta.png
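On the finite language the effect of `D_wfa` can be sketched as deriving with respect to a set of prefixes. Again this `derivative_wrt` helper is a hypothetical stand-in, not the actual implementation:

```python
def derivative_wrt(words, prefixes):
    # keep only the words whose x-part lies in `prefixes`, then strip it;
    # this mimics deriving with respect to an automaton matching `prefixes`
    n = len(next(iter(prefixes)))
    return {w[n:] for w in words if w[:n] in prefixes}

L = '00000 00100 01001 01101 10010 10110 11011 11111'.split()
print(derivative_wrt(L, {'001', '111'}))   # the outputs f(1) = '00' and f(7) = '11'
```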

Now for the Deutsch–Jozsa algorithm. The setup is this:

  1. Alice selects a number \(x = 0 \dots (2^n-1)\) and sends it to Bob who is far away
  2. Bob has a binary function \(f\) that is either

    • constant for all values of \(x\)
    • balanced, that is, equal to 1 for half the values of \(x\) and 0 for the other half

    he computes the value \(f(x)\) and sends it back to Alice

  3. Alice must determine whether \(f\) is constant or balanced

Classically Alice would need \(1 + 2^n/2\) evaluations of \(f\) in the worst case; using the quantum Deutsch-Jozsa algorithm she needs just one evaluation.

Now suppose that Bob has encoded his function \(f\) into a weighted automaton:

from wfa import make_automaton_from_f

# the constant f
cst_f = lambda _: '0'
cst = make_automaton_from_f(cst_f, 3)
cst.diag("bob_cst")

# the balanced f (or use 'make_random_balanced_f')
bal_f = lambda w: '1' if w in ['001','010','011','101'] else '0'
bal = make_automaton_from_f(bal_f, 3)
bal.diag("bob_bal")

# tabulate them
for x in product(*[['0','1']]*3):
    x = ''.join(x)
    print(f"cst({x}) = {cst_f(x)}    bal({x}) = {bal_f(x)}")
cst(000) = 0    bal(000) = 0
cst(001) = 0    bal(001) = 1
cst(010) = 0    bal(010) = 1
cst(011) = 0    bal(011) = 1
cst(100) = 0    bal(100) = 0
cst(101) = 0    bal(101) = 1
cst(110) = 0    bal(110) = 0
cst(111) = 0    bal(111) = 0
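A quick sanity check that `bal_f` really is balanced, i.e. that it outputs '1' for exactly four of the eight inputs:

```python
from itertools import product

bal_f = lambda w: '1' if w in ['001', '010', '011', '101'] else '0'

# tally the outputs over all eight 3-bit inputs
values = [bal_f(''.join(x)) for x in product('01', repeat=3)]
print(values.count('1'), values.count('0'))   # 4 4
```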

The constant automaton is

bob_cst.png

The balanced automaton is

bob_bal.png

If we take, as before, the derivative over the string of wildcard letters '(0+1) (0+1) (0+1)' we get the superposition of all the outputs of \(f\).

Then it's easy to discriminate the two cases: just evaluate the resulting automata over the letter \((0-1)\). In the balanced case the final state combines the '0' and '1' output transitions with equal weights, and the \((0-1)\) letter subtracts one from the other, so we get a zero. Conversely in the constant case we just have the state \(s3\) and the single transition \(s3 \to s4\), so there is no weight cancellation.
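The cancellation can be checked numerically without any automaton at all: with the four letters '(0+1) (0+1) (0+1) (0-1)' every path \(x \cdot f(x)\) carries the weight \((1/\sqrt{2})^4 (-1)^{f(x)}\), so the total weight is \(\frac{1}{4}\sum_x (-1)^{f(x)}\). A back-of-the-envelope sketch, assuming unit transition weights as in footnote 3:

```python
def dj_weight(f, n):
    # sum of path weights: each of the n+1 letters contributes 1/sqrt(2),
    # and the (0-1) letter flips the sign whenever f(x) = '1'
    return sum((0.5 ** ((n + 1) / 2)) * (-1) ** int(f(format(x, f'0{n}b')))
               for x in range(2 ** n))

cst_f = lambda _: '0'
bal_f = lambda w: '1' if w in ['001', '010', '011', '101'] else '0'

print(dj_weight(cst_f, 3))   # constant: no cancellation
print(dj_weight(bal_f, 3))   # balanced: the weights cancel
```

This reproduces the 2.0 / 0.0 outputs obtained from the automata below.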


Finally we have the poor-man's Deutsch-Jozsa algorithm:

  1. Alice selects the string '(0+1) (0+1) (0+1) (0-1)' and sends it to Bob who is far away
  2. Bob has a constant/balanced binary function \(f\) that he has converted into an automaton, he evaluates the automaton over Alice's string and sends the result back to her

    x = [Xp, Xp, Xp, Xm]
    
    print(f"constant f case: {cst(x)}")
    print(f"balanced f case: {bal(x)}")
    
    constant f case: 2.0
    balanced f case: 0.0
    
  3. if Alice receives \(0\) she concludes that \(f\) is balanced otherwise \(f\) must be constant

That's it: Alice can discriminate the two cases with just one evaluation.

Footnotes:

1

the only requirements are numpy and graphviz for making diagrams

3

one with all the transition weights set to one

4

see how D_wfa is implemented in the code