Exercise 1: Majorization and Random Permutations: CSE 599d Quantum Computing Problem Set 1 Solutions
Exercise 1: Majorization and Random Permutations: CSE 599d Quantum Computing Problem Set 1 Solutions
Author: Dave Bacon (Department of Computer Science & Engineering, University of Washington)
Due: January 20, 2006
Why is the uniform vector p the only probability vector which has the property of being ma-
jorized by all probability vectors (this is what I asked you to show)? Suppose that there was another
such probability vector x which is not uniform but which is majorized by all probability vectors. Then
we will show that it is not majorized by the uniform vector p. A probability vector which is not
uniform must have at least one element xi > n1 . But this implies that at least the first element of x
must satisfy x↓1 > n1 = p1 . But this implies that there is a vector, namely the uniform vector, which x
is not majorize by. This is a contradiction.
Pn
An n × n matrix A = (aij ) is called doubly stochastic if aij ≥ 0 for all i and j,
(b) P i=1 aij = 1 for all j and
n
a
j=1 ij = 1 for all i. Show that every convex combination of a doubly stochastic matrices is a doubly stochastic
Pm
matrix (recall that a convex combination of matrices A1 , A2 , . . . , Am is a sum of these matrices, j=1 qj Aj with
Pm
qj ≥ 0 and j=1 qj = 1.)
Denote the matrix elements Pm of Ak by (ak )ij . A convex combination of these elements is a matrix
C with entries cij = k=1 q k (ak )ij . Since qk ≥ 0 (from definition of convex) and (ak )ijP≥ 0 from
n
definition
Pn Pm of doubly stochastic,
Pm this implies that
P n Pmcij ≥ 0. Next we take the column sums i=1 cij =
i=1 k=1 qk (ak )ij = k=1 qk i=1 (ak )ij = k=1 qk = 1, where we have used the fact that each Ak
is doubly stochastic in the second to last equality Pn and the
Pn convexity
Pm in the last
Pequality.
PnThis holds
m
for all j. A similar sum holds for each row, j=1 cij = j=1 k=1 qk (ak )ij = k=1 qk j=1 (ak )ij =
Pm
k=1 qk = 1 for all i. Thus we have shown that cij satisfies all of the properties of being doubly
stochastic.
(c) Prove that if Ax ≺ x for all x then A must be doubly stochastic (hint consider the vector from part (a) as well as
vectors like (0, 0, 1, 0, . . . , 0).)
If Ax ≺ x for all x then it must also do so for arbitrary vectors x and in particular for all probability
vectors x. Consider the case where x has components xi =P δik , i.e. only one component, the kth is
n
nonzero. Now the components of the vector Ax is (Ax)i = j=1 Aij xi = Aik . The requirement that
Pn Pn Pn
the sum over all the elements must be the same is then i=1 (Ax)i = i=1 Aik = i=1 x = 1. This
must hold for all k.
Ax ≺ x must also hold for the uniform probability vector x with components xi = n1 . But from above
we know that the only probability vector which the uniform Pvector can majorize is itself. And in this
n Pn
case all our inequalities become strick equalities. (Ax)i = j=1 Aij xj = j=1 Aij n1 = xi = n1 . Thus
Pn
j=1 Aij = 1 for all i.
2
Finally we need to show that each PnAij ≥ 0. To show this consider again the vector x with
components xi = δik . Then (Ax)i = j=1 Aij xj = Aik . Now if we sort (Ax)i and xi we will obtain the
largest value of (Ax)i first. The first inequality which must be satisfied is therefore (Ax)↓1 ≤ 1. The
next inequality will be (Ax )↓2 + (Ax)↓1 ≤ 1. The last inequality will be i=1 (Ax)↓i ≤ 1. But the equality
Pn−1
will be i=1 (Ax)↓i = 1. This implies that i=1 (Ax)↓i = 1 − (Ax)↓n , so the last inequality becomes
Pn P n−1
↓
1 − (Ax) ≤ 1 or (Ax)n ≥ 0 which is just Ank ≥ 0. Similarly we can turn the second to last inequality
Pn−2 n ↓
i=1 (Ax)i ≤ 1 into 1 − (Ax)n−1 − (Ax)n ≥ 1. Using these inequalities we obtain A
Pn−k n−1,k
≥ 0. It is
↓
easy to see that we can use induction to prove that Ai,k ≥ 0 (the inequality i=1 (Ax)i ≤ 1 can via
the inductive hypothesis be turned into 1 − i=n−k (Ax)↓i ≤ 1 and then using the previous inequalities
Pn
we obtain the desired result.)
Thus we have have shown if Ax ≺ x, then A must satisfy all of the properties of a doubly
stochastic matrix.
(d) Prove that if A is doubly stochastic then Ax ≺ x for all vectors x.
For any vector x we can redefine A such that the columns of A are reordered so that x is a decreasing
vector and the rows of A are reordered so that Ax is a decreasing vector. Without loss of generality we
will use this A.
Pk
We will prove the kth inequality. Define gj = Aij . Then because A is doubly stochastic,
i=1 P
Pn n Pk
0 ≤ gj ≤ 1. Notice also that j=1 gj = k. Define yi = j=1 Aij xj . We consider will i=1 (yi − xi ).
By definitions this is equal to
k
X X n
k X k
X
(yi − xi ) = Aij xj − xi (1)
i=1 i=1 j=1 i=1
The notion of majorization occurs naturally in various contexts. For example, in economics if x1 , . . . , xn and
y1 , . . . , yn denote incomes of individuals 1, . . . , n, then x ≺ y would mean that there is a more equal distribution
of incomes in the state x than in y. We will encounter majorization later when we talk about transformations on
entangled quantum states.
Elements of the Pauli group are made up of tensor products of n of the Pauli matrices, along with a phase ik , where
k ∈ {0, 1, 2, 3}. Elements of the Pauli group can be parameterized by two n bit strings, a and b along with k = 0, 1, 2, 3
as
P (a, b, k) = ik (X a1 Z b1 ) ⊗ (X a2 Z b2 ) ⊗ · · · ⊗ (X an Z bn )
(a) Show that all elements of the Pauli group on n qubits, P (a, b, k), are unitary.
First we need to calculate P (a, b, k)† . Note that the adjoint of the tensor product of two matrices is the
tensor product of the adjoint of these two matrices: (A ⊗ B)† = A† ⊗ B † . Further adjoint of a complex
number times a matrix is the conjugate of that number times the adjoint of the matrix: (cA)† = c∗ A† .
Thus
P (a, b, k)† = (ik (X a1 Z b1 )⊗(X a2 Z b2 )⊗· · ·⊗(X an Z bn ))† = (ik )∗ (X a1 Z b1 )† ⊗(X a2 Z b2 )† ⊗· · ·⊗(X an Z bn )†
(10)
4
Now use the fact that when you multiply tensor products they multiply tensor wise and a scalar multi-
plies separately (cA ⊗ B)(dC ⊗ D) = cd((AC) ⊗ (BC)). This implies
= ik+l (X a1 Z b1 X c1 Z d1 ) ⊗ · · · ⊗ (X an Z bn X cn Z dn )
= ik+l ((−1)b1 c1 +a1 d1 mod 2 X c1 Z d1 X a1 Z b1 ) ⊗ · · · ⊗ ((−1)bn cn +an dn mod 2
X cn Z dn X an Z bn )
Pn
= ik+l (−1) i=1 bi ci +ai di mod 2
(X c1 Z d1 X a1 Z b1 ) ⊗ · · · ⊗ (X cn Z dn X an Z bn )
Pn
bi ci +ai di mod 2
= (−1) i=1 (il (X c1 Z d1 ) ⊗ · · · ⊗ (X cn Z dn ))(ik (X a1 Z b1 ) ⊗ · · · ⊗ (X an Z bn ))
Pn
bi ci +ai di mod 2
= (−1) i=1 P (c, d, l)P (a, b, k) (12)
Actually this is what I wanted you to show, but I wrote it incorrectly: there is a phase ik and il which
commute
Pn between both expressions. Thus it is easy to see that this is also equal to P (a, b, k)P (c, d, l) =
(−1) i=1 bi ci +ai di mod 2 P (c, d, k)P (a, b, l)
(c) Consider operators on n qubits of the form R(P (a, b, k)) = √12 (I + iP (a, b, k)) where P (a, b, k) is an element of the
Pauli group. Show that if P (a, b, k) is hermitian, then R(P (a, b, k)) is unitary. For the later parts of this problem,
assume that R(P (a, b, k)) is indeed one of these unitary gates.
R(P (a, b, k))† = √12 (I + iP (a, b, k))† = √12 (I † + i∗ P (a, b, k)† ). If P (a, b, k) is hermitian, then this is
equal to R(P (a, b, ik))† = √12 (I − iP (a, b, k)). Next check it if is unitary: R(P (a, b, k))R(P (a, b, k)† =
√1 (I + iP (a, b, k)) √1 (I − iP (a, b, k)) = 1 (I + iP (a, b, k) − iP (a, b, k) + i(−i)P (a, b, k)P (a, b, k)) =
2 2 2
1 2
2 (I + P (a, b, k) ). But since P (a, b, k) is hermitian, and we have shown that the P (a, b, k) are uni-
tary: P (a, b, k)P (a, b, k)† = I and so P (a, b, k)2 = P (a, b, k)P (a, b, k)† = I. Thus we see that
R(P (a, b, k))R(P (a, b, k)† = 12 (I + I) = I, so R(P (a, b, k)) is indeed unitary.
(d) Show that R(P (a, b, k))P (c, d, l)R(P (a, b, k))† = P (c, d, l) if P (a, b, k) commutes with P (c, d, l).
If P (a, b, k) commutes with P (c, d, l), then R(P (a, b, k)) commutes with P (c, d, l) (because if a ma-
trix commutes two matrices then it commutes with their sum, and P (c, d, l) commutes with both
I and P (a, b, k).) Thus R(P (a, b, k))P (c, d, l)R(P (a, b, k))† = P (c, d, l)R(P (a, b, k))R(P (a, b, k))† =
P (c, d, l)I = P (c, d, l) where we have used the unitarity of R(P (a, b, k)).
(e) Show that R(P (a, b, k))P (c, d, l)R(P (a, b, k))† = iP (a, b, k)P (c, d, l) if P (a, b, k) anti-commutes with P (c, d, l).
R(P (a, b, k))P (c, d, l)R(P (a, b, k))† = √1
2
(I + iP (a, b, k)) P (c, d, l) √12 (I − iP (a, b, k)) We can expand
this as
1
R(P (a, b, k))P (c, d, l)R(P (a, b, k))† = (P (c, d, l) + P (a, b, k)P (c, d, l)P (a, b, k)
2
+iP (a, b, k)P (c, d, l) − iP (c, d, l)P (a, b, k)) (13)
5
. But since P (a, b, k)P (c, d, l) = −P (c, d, l)P (a, b, k) and P (a, b, k)2 = I (see solution for part (c)),
P (a, b, k)P (c, d, l)P (a, b, k) = −P (c, d, l), so
1
R(P (a, b, k))P (c, d, l)R(P (a, b, k))† = (iP (a, b, k)P (c, d, l) − iP (c, d, l)P (a, b, k)) (14)
2
1
= (iP (a, b, k)P (c, d, l) + iP (a, b, k)P (c, d, l)) = iP (a, b, k)P (c, d, l)
2
where we have use the anti-commutivity of P (a, b, k) and P (c, d, l) once more.
(f) The Toffoli gate on 3 qubits is a controlled-controlled-NOT gate. In the computational basis, it acts as
1 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0
0 0 1 0 0 0 0 0
0 0 0 1 0 0 0 0
CCX =
0 0 0 0 1 0 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 1
0 0 0 0 0 0 1 0
Show that no sequence of unitary operations R(P (a, b, k)) on 3 qubits can be used to reproduce the Toffoli gate
(hint consider the effect of conjugating a Pauli operator by the Tofolli gate: (CCX )P (a, b, k)(CCX )† )
†
Consider CCX (X ⊗ I ⊗ I)CCX . This is the monster matrix multiplication
1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0
(15)
0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0
0 0 1
0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0
This is the gate X ⊗ CX which acts as X on the first qubit and CX on the second and third qubits.
But CX is not a member of the Pauli group. To see this note that the trace (sum of diagonal matrix
elements) of all elements of Pauli group are zero or 2n , but CX has trace 2 and does not act on one
qubit. Further X ⊗ CX is also not a member of the Pauli group, since the Pauli group requires that all
†
qubits should be acted upon by Pauli matrices. Thus we have seen that CCX (X ⊗ I ⊗ I)CCX produces
an element not in the Pauli group. But by the previous two parts, we have seen that the R(P (a, b, k))
elements, when conjugated about a Pauli group member always produce a Pauli group member. So
any sequence of such R(P (a, b, k)) elements must also, when conjugated about a Pauli group member
always produce a Pauli group member. But the Toffoli gate doesn’t do this. This is a contradiction with
the idea that we can express the Toffoli as a product of R(P (a, b, k)) gates. Thus we cannot construct
Toffoli from such a product of R(P (a, b, k)) gates.
What you’ve demonstrated in the last part of this problem is that there is no way to use the R gates alone to perform
a Tofolli gate. In fact, we will learn when we get to quantum error correction that the R gates are what are called
Clifford gates and that they do not form a universal set of quantum gates.
6
FE
α|0i + β|1i U V
where V is an arbitrary unitary. Prove that it is impossible to choose a V and input α|0i + β|1i such that it is
always possible to distinguish with perfect certainty which Pauli, I, X, Y , or Z, the black box U implements.
This isn’t as hard as it sounds.
There are four different Pauli’s but our measurement outcome has only two outcomes! For each mea-
surement outcome we need to guess one of the four Pauli’s (if we want to succeed with certainty.) But
there is no way to do this if we have only two measurement outcomes.
(b) Show that (P ⊗ I) √12 (|00i + |11i), where P is one of the four single qubit Pauli matrices I, X, Y , or Z are
orthogonal for the four different P s.
Define |ψP i = (P ⊗ I) √12 (|00i + |11i). Then
1 1 1
hψQ |ψP i = √ (h00| + h11|)(Q† ⊗ I)(P ⊗ I) √ (|00i + |11i) = (h00| + h11|)((QP ) ⊗ I)(|00i + |11i) (17)
2 2 2
But this is equal to
1
hψQ |ψP i = (h00|((QP ) ⊗ I)|00i + h00|((QP ) ⊗ I)|11i + h11|((QP ) ⊗ I)|00i + h11|((QP ) ⊗ I)|11i) (18)
2
But using the fact that h00|M ⊗I|11i = h0|M |1ih0|I|1i = 0 (and similar expressions for the other terms)
we see that this is equal to
1 1
hψQ |ψP i = (h0|QP |0i + h1|QP |1i) = Tr[QP ] (19)
2 2
where Tr is the trace (sum of diagonal elements of matrix.) Now it is easy to calculate that Tr[X] =
Tr[Y ] = Tr[Z] = 0. Further since multiplying different elements from {I, X, Y, Z} produces a matrix
proportional to an element in {X, Y, Z} this implies that hψQ |ψP i = 21 Tr[QP ] = 0 if Q 6= P .
(c) Now suppose that instead of the restricted circuit used in part (a), you are now allowed to input two-qubit states
into the black box, then perform a two qubit gate after the evolution of the black box, and then perform a
measurement in the computational basis. In other words the general circuit considered is now
FE
U
α|00i + β|01i + γ|10i + δ|11i V
FE
where V is now a two-qubit unitary. Show that it is now possible to distinguish all for single qubit Paulis from
each other with certainty by choosing the appropriate two qubit input state and two qubit unitary V .
Suppose we choose the two qubit input state to be √12 (|00i + |11i). Then the state after the Pauli will
be one of the |ΨQ i states defined above, where Q = U . Since these states are orthogonal we should be
able to choose a unitary V which rotates this basis into the computational basis. A measurement will
then distinguish these Paulis with certainty. One way to choose this matrix is
1
√1
√
2
0 0 2
0 √1 √1 0
V =
2 2
(20)
1 1
0 √2 − √2 0
√1 0 0 − √12
2
7
One can check that V is indeed unitary. Then if we apply V a measure, if U = I we will obtain |00i, if
U = X we will obtain |01i, if U = Y we obtain |10i, and if U = Z we obtain |11i.
What you’ve just demonstrated is that it is possible to use entangled quantum states to help distinguish between
different unknown unitary gates. This idea, generalized, is one way to think about how quantum algorithms work.