
Notes on Randomized Algorithms

James Aspnes

2020-10-24 14:59
CONTENTS viii

11.1.3 Directed cycles in tournaments . . . . . . . . . . . . . 211
11.2 Approximation algorithms . . . . . . . . . . . . . . . . . . . . 212
11.2.1 MAX CUT . . . . . . . . . . . . . . . . . . . . . . . . 212
11.2.2 MAX SAT . . . . . . . . . . . . . . . . . . . . . . . . 213
11.3 The Lovász Local Lemma . . . . . . . . . . . . . . . . . . . . 217
11.3.1 General version . . . . . . . . . . . . . . . . . . . . . . 217
11.3.2 Symmetric version . . . . . . . . . . . . . . . . . . . . 218
11.3.3 Applications . . . . . . . . . . . . . . . . . . . . . . . 219
11.3.3.1 Graph coloring . . . . . . . . . . . . . . . . . 219
11.3.3.2 Satisfiability of k-CNF formulas . . . . . . . 219
11.3.3.3 Hypergraph 2-colorability . . . . . . . . . . . 220
11.3.4 Non-constructive proof . . . . . . . . . . . . . . . . . . 220
11.3.5 Constructive proof . . . . . . . . . . . . . . . . . . . . 223

12 Derandomization 228
12.1 Deterministic vs. randomized algorithms . . . . . . . . . . . . 229
12.2 Adleman’s theorem . . . . . . . . . . . . . . . . . . . . . . . . 230
12.3 Limited independence . . . . . . . . . . . . . . . . . . . . . . 231
12.3.1 MAX CUT . . . . . . . . . . . . . . . . . . . . . . . . 231
12.4 The method of conditional probabilities . . . . . . . . . . . . 232
12.4.1 A trivial example . . . . . . . . . . . . . . . . . . . . . 233
12.4.2 Deterministic construction of Ramsey graphs . . . . . 233
12.4.3 MAX CUT using conditional probabilities . . . . . . . 234
12.4.4 Set balancing . . . . . . . . . . . . . . . . . . . . . . . 235

13 Quantum computing 236
13.1 Random circuits . . . . . . . . . . . . . . . . . . . . . . . . . 236
13.2 Bra-ket notation . . . . . . . . . . . . . . . . . . . . . . . . . 239
13.2.1 States as kets . . . . . . . . . . . . . . . . . . . . . . . 240
13.2.2 Operators as sums of kets times bras . . . . . . . . . . 240
13.3 Quantum circuits . . . . . . . . . . . . . . . . . . . . . . . . . 241
13.3.1 Quantum operations . . . . . . . . . . . . . . . . . . . 242
13.3.2 Quantum implementations of classical operations . . . 243
13.3.3 Representing Boolean functions . . . . . . . . . . . . . 245
13.3.4 Practical issues (which we will ignore) . . . . . . . . . 245
13.3.5 Quantum computations . . . . . . . . . . . . . . . . . 246
13.4 Deutsch’s algorithm . . . . . . . . . . . . . . . . . . . . . . . 246
13.5 Grover’s algorithm . . . . . . . . . . . . . . . . . . . . . . . . 247
13.5.1 Initial superposition . . . . . . . . . . . . . . . . . . . 248
13.5.2 The Grover diffusion operator . . . . . . . . . . . . . . 248
Chapter 13

Quantum computing

Last updated 2019. Some material may be out of date.

Quantum computing is a currently almost-entirely-theoretical branch
of randomized algorithms that attempts to exploit the fact that probabilities
at a microscopic scale arise in a mysterious way from more fundamental
probability amplitudes, which are complex-valued and can cancel each
other out where probabilities can only add. In a quantum computation, we
replace random bits with quantum bits—qubits for short—and replace
random updates to the bits with quantum operations.
To explain how this works, we’ll start by re-casting our usual model of a
randomized computation to make it look more like the standard quantum
circuits of Deutsch [Deu89]. We’ll then get quantum computation by
replacing all the real-valued probabilities with complex-valued amplitudes.

13.1 Random circuits


Let’s consider a very simple randomized computer whose memory consists of
two bits. We can describe our knowledge of the state of this machine using a
vector of length 4, with the coordinates in the vector giving the probability
of states 00, 01, 10, and 11. For example, if we know that the initial state is
always 00, we would have the (column) vector
\[
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}.
\]

CHAPTER 13. QUANTUM COMPUTING 237

Any such state vector x for the system must consist of non-negative
real values that sum to 1; this is just the usual requirement for a discrete
probability space. Operations on the state consist of taking the old values of
one or both bits and replacing them with new values, possibly involving both
randomness and dependence on the old values. The law of total probability
applies here, so we can calculate the new state vector x' from the old state
vector x by the rule
\[
x'_{b'_1 b'_2} = \sum_{b_1 b_2} x_{b_1 b_2} \Pr\left[ X_{t+1} = b'_1 b'_2 \,\middle|\, X_t = b_1 b_2 \right].
\]

These are linear functions of the previous state vector, so we can summarize
the effect of our operation using a transition matrix A, where x' = Ax.¹
We imagine that these operations are carried out by feeding the initial
state into some circuit that generates the new state. This justifies calling
this model of computation a random circuit. But the actual implementation
might be an ordinary computer that is just flipping coins. If we can
interpret each step of the computation as applying a transition matrix, the
actual implementation doesn't matter.
For example, if we negate the second bit 2/3 of the time while leaving
the first bit alone, we get the matrix
\[
A = \begin{pmatrix}
1/3 & 2/3 & 0 & 0 \\
2/3 & 1/3 & 0 & 0 \\
0 & 0 & 1/3 & 2/3 \\
0 & 0 & 2/3 & 1/3
\end{pmatrix}.
\]

One way to derive this matrix other than computing each entry directly is
that it is the tensor product of the matrices that represent the operations
on each individual bit. The idea here is that the tensor product of A and B,
written A ⊗ B, is the matrix C with C_{ij,kℓ} = A_{ik} B_{jℓ}. We're cheating a little
bit by allowing the C matrix to have indices consisting of pairs of indices,
one for each of A and B; there are more formal definitions that justify this
at the cost of being harder to understand.
¹ Note that this is the reverse of the convention we adopted for Markov chains in
Chapter 9. There it was convenient to have P_{ij} = p_{ij} = Pr[X_{t+1} = j | X_t = i]. Here we
defer to the physicists and make the update operator come in front of its argument, like
any other function.

In this particular case, we have
\[
\begin{pmatrix}
1/3 & 2/3 & 0 & 0 \\
2/3 & 1/3 & 0 & 0 \\
0 & 0 & 1/3 & 2/3 \\
0 & 0 & 2/3 & 1/3
\end{pmatrix}
=
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\otimes
\begin{pmatrix} 1/3 & 2/3 \\ 2/3 & 1/3 \end{pmatrix}.
\]

The first matrix in the tensor product gives the update rule for the first bit
(the identity matrix—do nothing), while the second gives the update rule for
the second.
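The Kronecker-product construction can be checked mechanically. Here is a minimal numpy sketch; the variable names `I2` and `B` are ours, not from the text:

```python
import numpy as np

# Per-bit update rules: do nothing to the first bit; negate the second
# bit with probability 2/3.
I2 = np.eye(2)
B = np.array([[1/3, 2/3],
              [2/3, 1/3]])

# The two-bit transition matrix is their tensor (Kronecker) product.
A = np.kron(I2, B)

# Each column sums to 1, so A is stochastic.
assert np.allclose(A.sum(axis=0), 1.0)
```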
Some operations are not decomposable in this way. If we swap the values
of the two bits, we get the matrix
\[
S = \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix},
\]
which maps 00 and 11 to themselves but maps 01 to 10 and vice versa.


The requirement for all of these matrices is that they be stochastic. This
means that each column has to sum to 1, or equivalently that 1A = 1, where
1 is the all-ones row vector. This just says that our operations map probability
distributions to probability distributions; we don't apply an operation and
find that the sum of our probabilities is now 3/2 or something. (Proof: If
1A = 1, then ‖Ax‖₁ = 1(Ax) = (1A)x = 1x = ‖x‖₁.)
A randomized computation in this model now consists of a sequence
of these stochastic updates to our random bits, and at the end performing a
measurement by looking at what the values of the bits actually are. If we
want to be mystical about it, we could claim that this measurement collapses
a probability distribution over states into a single unique state, but really
we are just opening the box to see what we got.
For example, we could generate two bias-2/3 coin-flips by starting with
00 and using the algorithm flip second, swap bits, flip second, or in matrix
terms:
\[
x_{\text{out}} = ASAx_{\text{in}}
= \begin{pmatrix}
1/3 & 2/3 & 0 & 0 \\
2/3 & 1/3 & 0 & 0 \\
0 & 0 & 1/3 & 2/3 \\
0 & 0 & 2/3 & 1/3
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
1/3 & 2/3 & 0 & 0 \\
2/3 & 1/3 & 0 & 0 \\
0 & 0 & 1/3 & 2/3 \\
0 & 0 & 2/3 & 1/3
\end{pmatrix}
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}
= \begin{pmatrix} 1/9 \\ 2/9 \\ 2/9 \\ 4/9 \end{pmatrix}.
\]

When we look at the output, we find 11 with probability 4/9, as we'd
expect, and similarly for the other values.
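The whole computation above can be replayed numerically. A numpy sketch, with matrix names following the text:

```python
import numpy as np

A = np.kron(np.eye(2), np.array([[1/3, 2/3],
                                 [2/3, 1/3]]))  # flip second bit w.p. 2/3
S = np.array([[1, 0, 0, 0],   # swap the two bits: 01 <-> 10
              [0, 0, 1, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1]])

x_in = np.array([1.0, 0.0, 0.0, 0.0])  # deterministic start in 00
x_out = A @ S @ A @ x_in               # flip second, swap, flip second
print(x_out)                           # ~ [1/9, 2/9, 2/9, 4/9]
```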

13.2 Bra-ket notation


A notational oddity that scares many people away from quantum mechanics in
general and quantum computing in particular is the habit of practitioners in
these fields of writing basis vectors and their duals using bra-ket notation,
a kind of typographical pun invented by the physicist Paul Dirac [Dir39].
This is based on a traditional way of writing an inner product of two
vectors x and y in "bracket form" as ⟨x|y⟩. The interpretation of this is
⟨x|y⟩ = x*y, where x* is the conjugate transpose of x.
For real-valued x the conjugate transpose is the same as the transpose.
For complex-valued x, each coordinate x_i = a + bi is replaced by its complex
conjugate x̄_i = a − bi. Using the conjugate transpose makes ⟨x|x⟩ equal
to ‖x‖₂² when x is complex-valued.
For example, for our vector x_in above that puts all of its probability on
00, we have
\[
\langle x_{\text{in}} | x_{\text{in}} \rangle
= \begin{pmatrix} 1 & 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}
= 1. \tag{13.2.1}
\]

The typographic trick is to split in half both ⟨x|y⟩ and its expansion. For
example, we could split (13.2.1) as
\[
\langle x_{\text{in}} | = \begin{pmatrix} 1 & 0 & 0 & 0 \end{pmatrix},
\qquad
| x_{\text{in}} \rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}.
\]

In general, wherever we used to have a bracket ⟨x|y⟩, we now have a bra
⟨x| and a ket |y⟩. These are just row vectors and column vectors, and ⟨x| is
always the conjugate transpose of |x⟩.
The second trick in bra-ket notation is to make the contents of the bra or
ket an arbitrary name. For kets, this will usually describe some state. As an
example, we might write x_in as |00⟩ to indicate that it's the basis vector that
puts all of its weight on the state 00. For bras, this is the linear operator
that returns 1 when applied to the given state and 0 when applied to any
orthogonal state. So ⟨00| |00⟩ = ⟨00|00⟩ = 1 but ⟨00| |01⟩ = ⟨00|01⟩ = 0.

13.2.1 States as kets


This notation is useful for the same reason that variable names are useful.
It's a lot easier to remember that |01⟩ refers to the distribution assigning
probability 1 to state 01 than that (0, 1, 0, 0)^⊤ does.
Other vectors can be expressed using a linear combination of kets. For
example, we can write
\[
x_{\text{out}} = \frac{1}{9} |00\rangle + \frac{2}{9} |01\rangle + \frac{2}{9} |10\rangle + \frac{4}{9} |11\rangle.
\]
This is not as compact as just writing out the vector as a matrix, but it has
the advantage of clearly labeling what states the probabilities apply to.

13.2.2 Operators as sums of kets times bras


A similar trick can be used to express operators, like the swap operator S.
We can represent S as a combination of maps from specific states to specific
other states. For example, the operator
\[
|01\rangle\langle 10| =
\begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}
\begin{pmatrix} 0 & 0 & 1 & 0 \end{pmatrix}
=
\begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\]

maps |10⟩ to |01⟩ (proof: |01⟩⟨10| |10⟩ = |01⟩ ⟨10|10⟩ = |01⟩) and sends all
other states to 0. Add up four of these mappings to get
\[
S = |00\rangle\langle 00| + |10\rangle\langle 01| + |01\rangle\langle 10| + |11\rangle\langle 11|
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.
\]

Here the bra-ket notation both labels what we are doing and saves writing a
lot of zeros.
The intuition is that just like a ket represents a state, a bra represents a
test for being in that state. So something like |01⟩⟨10| tests if we are in the
10 state, and if so, sends us to the 01 state.
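The outer-product construction of S can be sketched in numpy (the `kets` dictionary and `ketbra` helper are our names, not from the text):

```python
import numpy as np

# Basis kets for two bits, as column vectors indexed 00, 01, 10, 11.
kets = {s: np.eye(4)[i].reshape(4, 1)
        for i, s in enumerate(['00', '01', '10', '11'])}

def ketbra(a, b):
    """|a><b|: tests for state b and, if it is present, sends it to state a."""
    return kets[a] @ kets[b].T

# The swap operator as a sum of four ket-bra terms, as in the text.
S = ketbra('00', '00') + ketbra('10', '01') + ketbra('01', '10') + ketbra('11', '11')
assert np.array_equal(S @ kets['10'], kets['01'])  # sends 10 to 01
```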

13.3 Quantum circuits


So how do we turn our random circuits into quantum circuits?
The first step is to replace our random bits with quantum bits (qubits).
For a single random bit, the state vector represents a probability distribution
\[
p_0 |0\rangle + p_1 |1\rangle,
\]
where p_0 and p_1 are non-negative real numbers with p_0 + p_1 = 1.


For a single qubit, the state vector represents amplitudes
\[
a_0 |0\rangle + a_1 |1\rangle,
\]
where a_0 and a_1 are complex numbers with |a_0|² + |a_1|² = 1.² The reason for
this restriction on amplitudes is that if we measure a qubit, we will see state 0
with probability |a_0|² and state 1 with probability |a_1|². Unlike with random
bits, these probabilities are not mere expressions of our ignorance but arise
² The absolute value, norm, or magnitude |a + bi| of a complex number is given by
√(a² + b²). When b = 0, this is the same as the absolute value for the corresponding real
number. For any complex number x, the norm can also be written as √(x̄x), where x̄ is
the complex conjugate of x. This is because (a + bi)(a − bi) = a² − (bi)² = a² + b².
The appearance of the complex conjugate here explains why we define ⟨x|y⟩ = x*y; the
conjugate transpose means that for ⟨x|x⟩, when we multiply x̄_i by x_i we are computing a
squared norm.

through a still somewhat mysterious process from the more fundamental
amplitudes.³
With multiple bits, we get amplitudes for all combinations of the bits,
e.g.
\[
\frac{1}{2} \left( |00\rangle + |01\rangle + |10\rangle + |11\rangle \right)
\]
gives a state vector in which each possible measurement will be observed
with equal probability (1/2)² = 1/4. We could also write this state vector as
\[
\begin{pmatrix} 1/2 \\ 1/2 \\ 1/2 \\ 1/2 \end{pmatrix}.
\]

13.3.1 Quantum operations


In the random circuit model, at each step we pick a small number of random
bits, apply a stochastic transformation to them, and replace their values
with the results. In the quantum circuit model, we do the same thing, but
now our transformations must have the property of being unitary. Just as
a stochastic matrix preserves the property that the probabilities in a state
vector sum to 1, a unitary matrix preserves the property that the squared
norms of the amplitudes in a state vector sum to 1.
Formally, a square, complex matrix A is unitary if it preserves inner
products: ⟨Ax|Ay⟩ = ⟨x|y⟩ for all x and y. Alternatively, A is unitary if
A*A = AA* = I, where A* is the conjugate transpose of A, because if this
holds, ⟨Ax|Ay⟩ = (Ax)*(Ay) = x*A*Ay = x*Iy = x*y = ⟨x|y⟩. Yet another
³ In the old days of "shut up and calculate," this process was thought to involve the
unexplained power of a conscious observer to collapse a superposition into a classical
state. Nowadays the most favored explanation involves decoherence, the difficulty of
maintaining superpositions in systems that are interacting with large, warm objects with lots
of thermodynamic degrees of freedom (measuring instruments, brains). The decoherence
explanation is particularly useful for explaining why real-world quantum computers have a
hard time keeping their qubits mixed even when nobody is looking at them. Decoherence
by itself does not explain which basis states a system collapses to. Since bases in linear
algebra are pretty much arbitrary, it would seem that we could end up running into a
physics version of Goodman's grue-bleen paradox [Goo83], but there are apparently ways
of dealing with this too, using a mechanism called einselection [Zur03] that favors classical
states over weird ones. Since all of this is (a) well beyond my own limited comprehension
of quantum mechanics and (b) irrelevant to the theoretical model we are using, these issues
will not be discussed further.

way to state this is that the columns of A form an orthonormal basis: this
means that ⟨A_i|A_j⟩ = 0 if i ≠ j and 1 if i = j, which is just a more verbose
way to say A*A = I. The same thing also works if we consider rows instead
of columns.
The rule then is: at each step, we can operate on some constant number
of qubits by applying a unitary transformation to them. In principle, this
could be any unitary transformation, but some particular transformations
show up often in actual quantum algorithms.⁴
The simplest unitary transformations are permutations on states (like
the operator that swaps two qubits) and rotations of a single state. One
particularly important rotation is the Hadamard operator
\[
H = \sqrt{\frac{1}{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.
\]
This maps |0⟩ to the superposition √(1/2) |0⟩ + √(1/2) |1⟩; since this superposition
collapses to either |0⟩ or |1⟩ with probability 1/2, the state resulting from
H |0⟩ is the quantum-computing equivalent of a fair coin-flip. Note that
H |1⟩ = √(1/2) |0⟩ − √(1/2) |1⟩ ≠ H |0⟩. Even though both yield the same
probabilities, these two superpositions have different phases and may behave
differently when operated on further. That H |0⟩ and H |1⟩ are different is
necessary, and indeed a similar outcome occurs for any quantum operation:
all quantum operations are reversible, because any unitary matrix U has
an inverse U*.
If we apply H in parallel to all the qubits in our system, we get the n-fold
tensor product H^⊗n, which (if we take our bit-vector indices as integers
0 … N − 1 = 2ⁿ − 1 represented in binary) maps |0⟩ to
\[
\sqrt{\frac{1}{N}} \sum_{i=0}^{N-1} |i\rangle.
\]
So n applications of H effectively scramble a deterministic initial state into a
uniform distribution across all states. We'll see this scrambling operation
again when we look at Grover's algorithm in §13.5.
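Both facts are easy to confirm numerically: that H is unitary, and that the n-fold tensor product produces the uniform superposition. A numpy sketch (n = 3 is an arbitrary choice for the demo):

```python
import numpy as np
from functools import reduce

H = np.sqrt(1/2) * np.array([[1, 1],
                             [1, -1]])

# H is unitary; since H is real and symmetric, H* = H, so H H* = H @ H.
assert np.allclose(H @ H, np.eye(2))

# The n-fold tensor product maps |0...0> to the uniform superposition.
n = 3
Hn = reduce(np.kron, [H] * n)
ket0 = np.zeros(2**n)
ket0[0] = 1.0
psi = Hn @ ket0
assert np.allclose(psi, np.full(2**n, np.sqrt(1 / 2**n)))
```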

13.3.2 Quantum implementations of classical operations


One issue that comes up with trying to implement classical algorithms in the
quantum-circuit model is that classical operations are generally not reversible:
if I execute x ← x ∧ y, it may not be possible to reconstruct the old state
of x. So I can't implement AND directly as a quantum operation.
⁴ Deutsch's original paper [Deu89] shows that repeated applications of single-qubit
rotations and the CNOT operation (described in §13.3.2) are enough to approximate any
unitary transformation.

The solution is to use more sophisticated reversible operations from which
standard classical operations can be extracted as a special case. A simple
example is the controlled NOT or CNOT operator, which computes the
mapping (x, y) ↦ (x, x ⊕ y). This corresponds to the matrix (over the basis
|00⟩, |01⟩, |10⟩, |11⟩)
\[
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0
\end{pmatrix},
\]
which is clearly unitary (the rows are just the standard basis vectors). We
could also write this more compactly as |00⟩⟨00| + |01⟩⟨01| + |11⟩⟨10| +
|10⟩⟨11|.
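The CNOT matrix and its classical behavior can be verified directly; a numpy sketch:

```python
import numpy as np

# CNOT over the basis |00>, |01>, |10>, |11>: (x, y) -> (x, x XOR y).
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

# Unitary: the columns (and rows) are orthonormal.
assert np.allclose(CNOT.T @ CNOT, np.eye(4))

# Check the classical mapping on every basis state.
for x in range(2):
    for y in range(2):
        out = CNOT @ np.eye(4)[2 * x + y]
        assert out[2 * x + (x ^ y)] == 1
```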
The CNOT operator gives us XOR, but for more destructive operations
we need to use more qubits, possibly including junk qubits that we won't look
at again but that are necessary to preserve reversibility. The Toffoli gate or
controlled controlled NOT gate (CCNOT) is a 3-qubit gate that was
originally designed to show that classical computation could be performed
reversibly [Tof80]. It implements the mapping (x, y, z) ↦ (x, y, (x ∧ y) ⊕ z),
which corresponds to the 8 × 8 matrix
\[
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}.
\]
By throwing in some extra qubits we don't care about, Toffoli gates
can implement basic operations like NAND ((x, y, 1) ↦ (x, y, ¬(x ∧ y))),
NOT ((x, 1, 1) ↦ (x, 1, ¬x)), and fan-out ((x, 1, 0) ↦ (x, 1, x)).⁵ This gives
⁵ In the case of fan-out, this only works with perfect accuracy for classical bits and not
superpositions, which run into something called the no-cloning theorem. For example,
applying CCNOT to (1/√2) |010⟩ + (1/√2) |110⟩ yields (1/√2) |010⟩ + (1/√2) |111⟩. This works, sort of,
but the problem is that the first and last bits are still entangled, meaning we can't operate
on them independently. This is actually not all that different from what happens in the
probabilistic case (if I make a copy of a random variable X, it's correlated with the original
X), but it has good or bad consequences depending on whether you want to prevent people
from stealing your information undetected or run two copies of a quantum circuit on
independent replicas of a single superposition.

a sufficient basis for implementing all classical circuits.
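A sketch of the Toffoli gate as a permutation matrix, checking the NAND, NOT, and fan-out special cases from the text (the `apply` helper is our name):

```python
import numpy as np

# Toffoli (CCNOT): (x, y, z) -> (x, y, (x AND y) XOR z), as an 8x8 permutation
# matrix over the basis |000>, ..., |111>. Only |110> and |111> are exchanged.
CCNOT = np.eye(8)
CCNOT[[6, 7]] = CCNOT[[7, 6]]

def apply(gate, x, y, z):
    """Feed a classical basis state through the gate; return the output bits."""
    out = gate @ np.eye(8)[4 * x + 2 * y + z]
    i = int(np.argmax(out))
    return (i >> 2) & 1, (i >> 1) & 1, i & 1

# Classical operations as special cases, with constant helper inputs:
assert apply(CCNOT, 1, 1, 1) == (1, 1, 0)  # NAND: (x, y, 1) -> (x, y, NOT(x AND y))
assert apply(CCNOT, 0, 1, 1) == (0, 1, 1)  # NOT:  (x, 1, 1) -> (x, 1, NOT x)
assert apply(CCNOT, 1, 1, 0) == (1, 1, 1)  # fan-out: (x, 1, 0) -> (x, 1, x)
```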

13.3.3 Representing Boolean functions


Suppose we have a quantum circuit that computes a Boolean function f .
There are two conventions for representing the output of f :
1. We can represent f(x) by XORing it with an extra qubit y: |x, y⟩ ↦
|x, y ⊕ f(x)⟩. This is the approach taken by the CNOT (f(x) = x) and
CCNOT (f(x_1 x_2) = x_1 ∧ x_2) gates.

2. We can represent f(x) by changing the phase of |x⟩: |x⟩ ↦ (−1)^f(x) |x⟩.
The actual unitary operator corresponding to this is
\[
\sum_x (-1)^{f(x)} |x\rangle\langle x|,
\]
which in matrix form looks like a truth table for f expressed as ±1
values along the diagonal, e.g.:
\[
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]
for the XOR function.
This has the advantage of requiring one fewer qubit, but we can't
observe the value of f(x) directly, because the amplitude of |x⟩ is
unchanged. Where this comes in handy is when we can use the change in
phase in the context of a larger quantum algorithm to get cancellations
for some values of x.

The first representation makes more sense for modeling classical circuits.
The second turns out to be more useful when f is a subroutine in a larger
quantum algorithm. Fortunately, the operator with matrix
\[
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\]
converts the |0⟩–|1⟩ representation into the ±1 representation, so we can
build any desired classical f using |0⟩–|1⟩ logic and convert it to ±1 as
needed.
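The phase representation can be built mechanically from any truth table. A numpy sketch (the `phase_oracle` name is ours):

```python
import numpy as np

def phase_oracle(f, n):
    """Diagonal operator sum_x (-1)^f(x) |x><x| for a Boolean f on n bits."""
    return np.diag([(-1.0) ** f(x) for x in range(2**n)])

# XOR of two bits reproduces the +-1 truth table shown in the text.
U = phase_oracle(lambda x: (x >> 1) ^ (x & 1), 2)
print(np.diag(U))  # [ 1. -1. -1.  1.]
```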

13.3.4 Practical issues (which we will ignore)


The big practical question is whether any of these operations—or even non-trivial
numbers of independently manipulable qubits—can be implemented
in a real, physical system. As theorists, we can ignore these issues, but in
real life they are what would make quantum computing science instead of
science fiction.⁶

13.3.5 Quantum computations


Putting everything together, a quantum computation consists of three stages:

1. We start with a collection of qubits in some known state x_0 (e.g.,
|000…0⟩).

2. We apply a sequence of unitary operators A_1, A_2, …, A_m to our qubits.

3. We take a measurement of the final superposition A_m A_{m−1} ⋯ A_1 x_0
that collapses it into a single state, with probability equal to the square
of the amplitude of that state.

Our goal is for this final state to tell us what we want to know, with
reasonably high probability.
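The three stages can be sketched as a tiny simulator: apply the unitaries in sequence, then sample a basis state from the squared amplitudes. A numpy sketch, using the fair coin-flip from H |0⟩ as the example (the `run` helper and seed are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed

def run(x0, operators):
    """Apply a sequence of unitaries to x0, then measure: sample one basis
    state with probability equal to its squared amplitude."""
    psi = x0
    for A in operators:
        psi = A @ psi
    return rng.choice(len(psi), p=np.abs(psi) ** 2)

H = np.sqrt(1/2) * np.array([[1, 1], [1, -1]])
ket0 = np.array([1.0, 0.0])

# H|0> measured repeatedly behaves like a fair coin-flip.
samples = [run(ket0, [H]) for _ in range(10000)]
print(np.mean(samples))  # close to 0.5
```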

13.4 Deutsch’s algorithm


We now have enough machinery to describe a real quantum algorithm.
Known as Deutsch's algorithm, this computes f(0) ⊕ f(1) while evaluating
f once [Deu89]. The trick, of course, is that f is applied to a superposition.
Assumption: f is implemented reversibly, as a quantum computation
that maps |x⟩ to (−1)^f(x) |x⟩. To compute f(0) ⊕ f(1), evaluate
\[
\begin{aligned}
HfH|0\rangle &= Hf \sqrt{\tfrac{1}{2}} \left( |0\rangle + |1\rangle \right) \\
&= \sqrt{\tfrac{1}{2}}\, H \left( (-1)^{f(0)} |0\rangle + (-1)^{f(1)} |1\rangle \right) \\
&= \frac{1}{2} \left( \left( (-1)^{f(0)} + (-1)^{f(1)} \right) |0\rangle + \left( (-1)^{f(0)} - (-1)^{f(1)} \right) |1\rangle \right).
\end{aligned}
\]
Suppose now that f(0) = f(1) = b. Then the |1⟩ terms cancel out and
we are left with
\[
\frac{1}{2} \cdot 2 \cdot (-1)^b |0\rangle = (-1)^b |0\rangle.
\]
⁶ Current technology, sadly, still puts quantum computing mostly in the science fiction
category.

This puts all the weight on |0⟩, so when we take our measurement at the
end, we'll see 0.
Alternatively, if f(0) = b ≠ f(1), it's the |0⟩ terms that cancel out,
leaving (−1)^b |1⟩. Again the phase depends on b, but we don't care about the
phase: the important thing is that if we measure the qubit, we always see 1.
The result in either case is that with probability 1, we determine the
value of f (0) ⊕ f (1), after evaluating f once (albeit on a superposition of
quantum states).
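Deutsch's algorithm is small enough to simulate exactly. A numpy sketch using the ±1 phase representation of f; note the simulator evaluates f on both inputs just to build the oracle matrix, while the algorithm applies that oracle once (`deutsch` is our name for the routine):

```python
import numpy as np

H = np.sqrt(1/2) * np.array([[1, 1], [1, -1]])

def deutsch(f):
    """Compute f(0) XOR f(1) with a single application of the oracle for f."""
    Uf = np.diag([(-1.0) ** f(0), (-1.0) ** f(1)])  # |x> -> (-1)^f(x) |x>
    psi = H @ Uf @ H @ np.array([1.0, 0.0])
    # All amplitude now sits on |f(0) XOR f(1)> (up to an overall phase).
    return int(np.argmax(np.abs(psi)))

# Check all four Boolean functions on one bit.
for f0 in (0, 1):
    for f1 in (0, 1):
        assert deutsch(lambda x, a=f0, b=f1: a if x == 0 else b) == f0 ^ f1
```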
This is kind of a silly example, because the huge costs involved in building
our quantum computer probably swamp the factor-of-2 improvement we got
in the number of calls to f. But a generalization of this trick, known as the
Deutsch-Jozsa algorithm [DJ92], solves the much harder (although still a bit
contrived-looking) problem of distinguishing a constant Boolean function on
n bits from a function that outputs one for exactly half of its inputs. No
deterministic algorithm can solve this problem without computing at least
2ⁿ/2 + 1 values of f, giving an exponential speed-up.⁷

13.5 Grover’s algorithm


Grover’s algorithm [Gro96] is one of two main exemplars for the astonishing
power of quantum computers.8 The idea of Grover’s algorithm is that if we
have a function f on N = 2n possible inputs whose value is 1 for exactly

one possible input w, we can find this w with high probability using O( N )
quantum evaluations of f . As with Deutsch’s algorithm, we assume that f
is encoded as an operator (conventionally written Uw ) that maps each |xi to
(−1)f (x) |xi.
The basic outline of the algorithm:
q
1. Start in the superposition |si = 1
x |xi = H ⊗n |0i.
P
N

2. Alternate between applying the Grover diffusion operator D =



2 |si hs| − I and the f operator Uw = 2 |wi hw| − I. Do this O( n)
times (the exact number of iterations is important and will be calculated
below).
3. Take a measurement of the state. It will be w with high probability.

Making this work requires showing that (a) we can generate the original
superposition |s⟩, (b) we can implement D efficiently using unitary operations
on a constant number of qubits each, and (c) we actually get w at the end
of this process.

⁷ The speed-up compared to a randomized algorithm that works with probability 1 − ε
is less impressive. With randomization, we only need to look at O(log 1/ε) values of f to
see both a 0 and a 1 in the non-constant case. But even here, the Deutsch-Jozsa algorithm
does have the advantage of giving the correct answer always.

⁸ The other is Shor's algorithm [Sho97], which allows a quantum computer to factor
n-bit integers in time polynomial in n. Sadly, Shor's algorithm is a bit too complicated to
talk about here.

13.5.1 Initial superposition


To get the initial superposition, start with |0ⁿ⟩ and apply the Hadamard
transform to each bit individually; this gives √(1/N) Σ_x |x⟩ as claimed.

13.5.2 The Grover diffusion operator


We have the definition D = 2|s⟩⟨s| − I.
Before we try to implement this, let's start by checking that it is in fact
unitary. Compute
\[
\begin{aligned}
DD^* &= (2|s\rangle\langle s| - I)^2 \\
&= 4|s\rangle\langle s| |s\rangle\langle s| - 4|s\rangle\langle s| + I^2 \\
&= 4|s\rangle\langle s| - 4|s\rangle\langle s| + I \\
&= I.
\end{aligned}
\]
Here we use the fact that |s⟩⟨s| |s⟩⟨s| = |s⟩ ⟨s|s⟩ ⟨s| = |s⟩(1)⟨s| = |s⟩⟨s|.
Recall that |s⟩ = H^⊗n |0ⁿ⟩, where H^⊗n is the result of applying H to
each of the n bits individually. We also have that H* = H and HH* = I,
from which H^⊗n H^⊗n = I as well.
So we can expand
\[
\begin{aligned}
D &= 2|s\rangle\langle s| - I \\
&= 2H^{\otimes n} |0^n\rangle \left( H^{\otimes n} |0^n\rangle \right)^* - I \\
&= 2H^{\otimes n} |0^n\rangle \langle 0^n| H^{\otimes n} - I \\
&= H^{\otimes n} \left( 2 |0^n\rangle \langle 0^n| - I \right) H^{\otimes n}.
\end{aligned}
\]
The two copies of H^⊗n involve applying H to each of the n bits, which
we can do. The operator in the middle, 2|0ⁿ⟩⟨0ⁿ| − I, maps |0ⁿ⟩ to |0ⁿ⟩ and
maps all other basis vectors |x⟩ to −|x⟩. This can be implemented as a NOR
of all the qubits, which we can do using our tools for classical computations.
So the entire operator D can be implemented using O(n) qubit operations,
most of which can be done in parallel.
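Both the unitarity of D and its factorization through H^⊗n are easy to confirm numerically. A numpy sketch (n = 3 is an arbitrary choice):

```python
import numpy as np
from functools import reduce

n = 3              # arbitrary size for the demo
N = 2**n
H = np.sqrt(1/2) * np.array([[1, 1], [1, -1]])
Hn = reduce(np.kron, [H] * n)

s = Hn @ np.eye(N)[0]               # uniform superposition |s>
D = 2 * np.outer(s, s) - np.eye(N)  # diffusion operator 2|s><s| - I

# D is unitary: it is real and symmetric, so D D* = D @ D.
assert np.allclose(D @ D, np.eye(N))

# D factors as H^n (2|0^n><0^n| - I) H^n; the middle matrix is
# diag(1, -1, ..., -1).
middle = -np.eye(N)
middle[0, 0] = 1.0
assert np.allclose(D, Hn @ middle @ Hn)
```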

13.5.3 Effect of the iteration


To see what happens when we apply U_w D, it helps to represent the state in
terms of a particular two-dimensional basis. The idea here is that the initial
state |s⟩ and the operation U_w D are symmetric with respect to any basis
vectors |x⟩ that aren't |w⟩, so instead of tracking all of these non-|w⟩ vectors
separately, we will represent all of them by a single composite vector
\[
|u\rangle = \sqrt{\frac{1}{N-1}} \sum_{x \neq w} |x\rangle.
\]
The coefficient √(1/(N−1)) is chosen to make ⟨u|u⟩ = 1. As always, we like
our vectors to have length 1.
Using |u⟩, we can represent
\[
|s\rangle = \sqrt{\frac{1}{N}} |w\rangle + \sqrt{\frac{N-1}{N}} |u\rangle. \tag{13.5.1}
\]
A straightforward calculation shows that this indeed puts √(1/N) amplitude
on each |x⟩.
Now we're going to bring in some trigonometry. Let θ = sin⁻¹ √(1/N), so
that sin θ = √(1/N) and cos θ = √(1 − sin²θ) = √((N−1)/N). We can then rewrite
(13.5.1) as
\[
|s\rangle = (\sin\theta) |w\rangle + (\cos\theta) |u\rangle. \tag{13.5.2}
\]

Let’s look at what happens if we expand D using (13.5.2):

D = 2 |si hs| − I
= 2 ((sin θ) |wi + (cos θ) |ui) ((sin θ) hw| + (cos θ) hu|) − I
= (2 sin2 θ − 1) |wi hw| + (2 sin θ cos θ) |wi hu| + (2 sin θ cos θ) |ui hw| + (2 cos2 θ − 1) |ui hu|
= (− cos 2θ) |wi hw| + (sin 2θ) |wi hu| + (sin 2θ) |ui hw| + (cos 2θ) |ui hu|
" #
− cos 2θ sin 2θ
= ,
sin 2θ cos 2θ

where the matrix is over the basis (|wi , |ui).



Multiplying by U_w negates all the |w⟩ coefficients. So we get
\[
U_w D = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} -\cos 2\theta & \sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{pmatrix}
= \begin{pmatrix} \cos 2\theta & -\sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{pmatrix}. \tag{13.5.3}
\]

Aficionados of computer graphics, robotics, or just matrix algebra in
general may recognize (13.5.3) as the matrix that rotates two-dimensional
vectors by 2θ. Since we started with |s⟩ at an angle of θ, after t applications
of this matrix we will be at an angle of (2t + 1)θ, or in state
\[
(\sin(2t+1)\theta) |w\rangle + (\cos(2t+1)\theta) |u\rangle.
\]
Ideally, we pick t so that (2t + 1)θ = π/2, which would put all of the
amplitude on |w⟩. Because t is an integer, we can't do this exactly, but
setting t = ⌊(π/2θ − 1)/2⌋ will get us somewhere between π/2 − 2θ and π/2. Since
θ ≈ √(1/N), this gives us a probability of seeing |w⟩ in our final measurement
of 1 − O(√(1/N)) after O(√N) iterations of U_w D.
Sadly, this is as good as it gets. A lower bound of Bennett et al. [BBBV97]
shows that any quantum algorithm using U_w as the representation for f must
apply U_w at least Ω(√N) times to find w. So we get a quadratic speedup
but not the exponential speedup we'd need to solve NP-complete problems
directly.
