Xiaolu Hou • Jakub Breier

Cryptography and Embedded Systems Security
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland
AG 2024
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Foreword
novel attack surfaces that can cripple even the best cryptographic algorithms if suitable
countermeasures are not implemented alongside them.
The contribution of this book is to address these aspects of secure crypto-design
and to provide a vivid description that develops an end-to-end understanding. The designs
of cryptographic algorithms and their analysis are often based on mathematical and
statistical tools. The book starts with a nice summary of important mathematical
principles, which are needed to comprehend the cipher constructions and their attack
analysis. Subsequently, the book provides a summary of both classical and modern
cryptosystems. The following chapters also stress implementations of these
modern cryptosystems, before delving into various forms of physical attacks on the
implementations. The book discusses techniques for side-channel analysis of both
symmetric-key and public-key cryptosystems, along with suitable countermeasures.
The book then presents a contemporary summary of various forms of fault attacks
on cryptosystems, and countermeasures against them. The book concludes with
practical aspects of physical attacks, providing much-needed details of physical
setups, useful to develop practical setups for hardware security research.
Engaging and informative, this book is fine reading for anyone fascinated by
the intricate realm of embedded security and cryptographic engineering. It offers
a compelling glimpse into the workings of attacks on cryptosystems in embedded
devices and provides actionable strategies for mitigation. Enjoy the journey into the
captivating world of security engineering!
This book provides a contemporary summary of techniques for attacks and
countermeasures. There are many good examples provided: I encourage all readers
of this book to pay particular attention to these and implement and extend as many
as possible. The best way to understand the foundational aspects of any field is by
active learning: do as much as you can yourself!
Acknowledgment
We would like to thank Debdeep Mukhopadhyay and Elisabeth Oswald for writing
a nice and motivating foreword for this book.
Our thanks also go to Mladen Kovačević, Romain Poussier, and Dirmanto Jap for
proofreading an earlier version of the book and for their detailed and constructive
comments.
We would also like to acknowledge the editorial team of Springer Nature,
especially Bakiyalakshmi R M and Charles Glaser.
We are also grateful for the unwavering support and encouragement from our parents,
and especially from our son, Aurel, who turned our writing process into a wild adventure.
This project has received funding from the European Union’s Horizon 2020
Research and Innovation Programme under the Programme SASPRO 2 COFUND
Marie Sklodowska-Curie grant agreement No. 945478.
Contents
A Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
A.1 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
A.2 Invertible Matrices for the Stochastic Leakage Model . . . . . . . . . . . . . . . 448
List of Figures
Fig. 4.14 t-Values (Eq. 4.18) for all time samples .1, 2, . . . , 3600
computed with Fixed dataset A and Random plaintext
dataset. The signal is given by the plaintext value, and the
fixed versus random setting is chosen. Blue dashed lines
correspond to the threshold .4.5 and .−4.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
Fig. 4.15 t-Values (Eq. 4.18) for all time samples .1, 2, . . . , 3600
computed with 50 traces from Fixed dataset A and 50
traces from Random plaintext dataset. The signal is given
by the plaintext value, and the fixed versus random setting
is chosen. Blue dashed lines correspond to the threshold
.4.5 and .−4.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
Fig. 4.16 t-Values (Eq. 4.17) for all time samples 1, 2, . . . , 3600
computed with traces from Random dataset. T1 contains
M1 = 634 traces and T2 contains M2 = 651 traces. The signal is
given by the 0th Sbox output, and the fixed versus fixed
setting is chosen. Blue dashed lines correspond to the
thresholds 4.5 and −4.5 . . . 235
Fig. 4.19 t-Values (Eq. 4.17) for all time samples .1, 2, . . . , 3600
computed with traces from Random dataset. Both .T1 and
.T2 contain 50 traces (i.e., .M1 = M2 = 50). The signal is
given by the 0th Sbox output, and the fixed versus random
setting is chosen. Blue dashed lines correspond to the
threshold .4.5 and .−4.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
Fig. 4.20 Sample variance of the signal for each time sample,
computed using Random dataset. The signal is given by
the exact value of the 0th Sbox output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
Fig. 4.21 Sample variance of the noise for each time sample,
computed using Random dataset. The signal is given by
the exact value of the 0th Sbox output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
Fig. 4.22 SNR for each time sample, computed using Random
dataset. The signal is given by the exact value of the 0th
Sbox output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
Fig. 4.23 Sample variance of the signal for each time sample,
computed using Random dataset. The signal is given by
the Hamming weight of the 0th Sbox output . . . . . . . . . . . . . . . . . . . . . . . 240
Fig. 4.24 SNR for each time sample, computed using Random
dataset. The signal is given by the Hamming weight of
the 0th Sbox output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
Fig. 4.25 Sample variance of the noise for each time sample,
computed using Random dataset. The signal is given by
the Hamming weight of the 0th Sbox output . . . . . . . . . . . . . . . . . . . . . . . 240
Fig. 4.26 Sample variance of the signal for each time sample,
computed using Random dataset. The signal is given by
the 0th bit of the 0th Sbox output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Fig. 4.27 SNR for each time sample, computed using Random
dataset. The signal is given by the 0th bit of the 0th Sbox
output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Fig. 4.28 Sample variance of the noise for each time sample,
computed using Random dataset. The signal is given by
the 0th bit of the 0th Sbox output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Fig. 4.29 Sample correlation coefficients .ri,t (.i = 1, 2, . . . , 16) for
all time samples .t = 1, 2, . . . , 3600. Computed following
Eq. 4.21 with the identity leakage model and the Random
plaintext dataset. The blue line corresponds to the correct
key hypothesis .k̂10 = 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
Fig. 4.30 Sample correlation coefficients .r10,t (corresponds to
the correct key hypothesis 9) for all time samples
.t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with
the identity leakage model and the Random plaintext dataset . . . . . . 248
Fig. 4.31 Sample correlation coefficients .r1,t (corresponds
to a wrong key hypothesis 0) for all time samples
.t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with
the identity leakage model and the Random plaintext dataset . . . . . . 249
Fig. 4.32 Sample correlation coefficients .r5,t (corresponds
to a wrong key hypothesis 4) for all time samples
.t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with
the identity leakage model and the Random plaintext dataset . . . . . . 249
Fig. 4.33 Sample correlation coefficients .r14,t (corresponds
to a wrong key hypothesis D) for all time samples
.t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with
the identity leakage model and the Random plaintext dataset . . . . . . 249
Fig. 4.34 Sample correlation coefficients .ri,t (.i = 1, 2, . . . , 16)
for all time samples .t = 1, 2, . . . , 3600. Computed
following Eq. 4.21 with the Hamming leakage model and
the Random plaintext dataset. The blue line corresponds
to the correct key hypothesis .k̂10 = 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
Fig. 4.35 Sample correlation coefficients r10,t (corresponds to
the correct key hypothesis 9) for all time samples
t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with
the Hamming leakage model and the Random plaintext dataset . . . 251
Fig. 4.36 Sample correlation coefficients .r1,t (corresponds
to a wrong key hypothesis 0) for all time samples
.t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with
the Hamming leakage model and the Random plaintext dataset . . . 251
Fig. 4.37 Sample correlation coefficients .r5,t (corresponds
to a wrong key hypothesis 4) for all time samples
.t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with
the Hamming leakage model and the Random plaintext dataset . . . 252
Fig. 4.38 Sample correlation coefficients .r14,t (corresponds
to a wrong key hypothesis D) for all time samples
.t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with
the Hamming leakage model and the Random plaintext dataset . . . 252
Fig. 4.39 Sample correlation coefficients r_{i,POI}^{M̂_p} (i = 1, 2, . . . , 16)
for POI = 392. Computed following Eq. 4.23 with the
identity leakage model and the Random plaintext dataset.
The blue line corresponds to the correct key hypothesis
k̂10 = 9 . . . 257
Fig. 4.40 Sample correlation coefficients r_{i,POI}^{M̂_p} (i = 1, 2, . . . , 16)
for POI = 392. Computed following Eq. 4.23 with
the Hamming weight leakage model and the Random
plaintext dataset. The blue line corresponds to the correct
key hypothesis k̂10 = 9 . . . 257
Fig. 4.41 Sample correlation coefficients r_{i,POI}^{M̂_p} (i = 1, 2, . . . , 16)
for POI = 392. Computed following Eq. 4.23 with the
stochastic leakage model and the Random plaintext
dataset. The blue line corresponds to the correct key
hypothesis k̂10 = 9 . . . 261
Fig. 4.42 Probability scores (Eq. 4.33) for each key hypothesis
computed with different numbers of traces from Random
plaintext dataset. The target signal is given by the exact
value of v, the 0th Sbox output. Three POIs (time samples
392, 218, 1328) were chosen. The blue line corresponds
to the correct key hypothesis k̂10 = 9
Fig. 4.43 Probability scores (Eq. 4.33) for each key hypothesis
computed with different numbers of traces from Random
plaintext dataset. The target signal is given by .wt (v), the
Hamming weight of the 0th Sbox output. Three POIs
(time samples .392, 1309, 1304) were chosen. The blue
line corresponds to the correct key hypothesis .k̂10 = 9 . . . . . . . . . . . . 266
Fig. 4.44 SNR for each time sample, computed using Random
dataset. The signal is given by the exact value of the 1st
Sbox output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Fig. 4.45 Probability scores (Eq. 4.33) for each key hypothesis
computed with different numbers of traces from Random
plaintext dataset. The target signal is given by the exact
value of the 1st Sbox output. One POI (time samples 404)
was chosen. The blue line corresponds to the correct key
hypothesis .8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
Fig. 4.46 Probability scores (Eq. 4.33) for each key hypothesis
computed with different numbers of traces from Random
plaintext dataset. The target signal is given by the exact
value of the 1st Sbox output. One POI (time samples 464)
was chosen. The blue line corresponds to the correct key
hypothesis 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
Fig. 4.47 Sample correlation coefficients r_{i,POI}^{M̂_p} (i = 1, 2, . . . , 16)
for POI = 392. Computed following Eq. 4.23 with the
identity leakage model and the Random plaintext dataset
arranged in reverse order. The blue line corresponds to the
correct key hypothesis k̂10 = 9 . . . 269
Fig. 4.48 Sample correlation coefficients r_{i,POI}^{M̂_p} (i = 1, 2, . . . , 16)
for POI = 392. Computed following Eq. 4.23 with
the Hamming weight leakage model and the Random
plaintext dataset arranged in reverse order. The blue line
corresponds to the correct key hypothesis k̂10 = 9 . . . 269
Fig. 4.49 Estimations of success rate computed following
Algorithm 4.1 for profiled DPA attacks based on the
stochastic leakage model, the identity leakage model, and
the Hamming weight leakage model using the Random
plaintext dataset as attack traces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
Fig. 4.50 Estimations of guessing entropy computed following
Algorithm 4.1 for profiled DPA attacks based on the
stochastic leakage model, the identity leakage model, and
the Hamming weight leakage model using the Random
plaintext dataset as attack traces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
R0_{i+1}, R1_{i+1}, R2_{i+1}, R3_{i+1} are in orange, blue, green,
and red colors, respectively . . . 332
Fig. 4.88 t-Values (Eq. 4.17) for all time samples .1, 2, . . . , 3600
computed with 50 traces from Masked fixed dataset A and
50 traces from Masked fixed dataset B. The signal is given
by the plaintext value, and the fixed versus fixed setting is
chosen. Blue dashed lines correspond to the threshold .4.5
and .−4.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
Fig. 4.89 SNR computed with Masked random dataset. The signal
is given by the exact value of the 0th Sbox output . . . . . . . . . . . . . . . . . 338
Fig. 4.90 Estimations of guessing entropy computed following
Algorithm 4.1 for template-based DPA attacks on the
Masked random plaintext dataset (in black) and on the
Random plaintext dataset (in red) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
Fig. 4.91 Estimations of guessing entropy computed following
Algorithm 4.1 for template-based DPA attacks on the
Masked random plaintext dataset (in black) and on the
Random plaintext dataset (in red) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
Fig. 5.1 An illustration of DFA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
Fig. 5.2 Visual illustration of how the fault propagates when a
fault is injected at the beginning of one AES round (not
the last round) in byte .s00 . Blue squares correspond to
bytes that can be affected by the fault . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
Fig. 5.3 Visual illustration of how the fault propagates when a fault
is injected at the beginning of one AES round in bytes:
(a) .s00 , s11 , (b) .s00 , s11 , s22 , and (c) .s00 , s11 , s22 , s33 . Blue
squares correspond to bytes that can be affected by the fault. . . . . . . 362
List of Tables
Table 3.5 Left and right part of the intermediate values in DES key
schedule after PC1. The 1st bit of the left part comes
from the 57th bit of the master key (input to PC1) . . . . . . . . . . . . . . . . 138
Table 3.6 Number of key bits rotated per round in DES key schedule . . . . . . 138
Table 3.7 PC2 in DES key schedule. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
Table 3.8 Specifications of Rijndael design, where blue-colored
values are adopted by AES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
Table 3.9 AES Sbox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Table 3.10 Inverse of AES Sbox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Table 3.11 PRESENT Sbox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Table 3.12 PRESENT pLayer. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Table 3.13 The Boolean function .ϕ0 takes input .x and outputs the
0th bit of SB.PRESENT (x). The second last row lists the
output of .ϕ0 for different input values. The last row lists
the coefficients (Eq. 3.10) for the algebraic normal form
of .ϕ0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
Table 4.1 Difference distribution table for PRESENT Sbox
(Table 3.11). The columns correspond to input
difference .δ, and the rows correspond to output
difference .Δ. The row for .Δ = 0 is omitted since it is empty . . . . 277
Table 4.2 In the first column, we list the possible values of α such
that the following entries of AES Sbox DDT are nonempty:
(0E·α, 4F), (09·α, 8F), (0D·α, 21), (0B·α, 9F).
The corresponding hypotheses for
k00 ⊕ 4C, k11 ⊕ AA, k22 ⊕ 10, k33 ⊕ 90 are
Table 5.1 Part of the difference distribution table for SB1 of DES (Table 3.3) . . . 355
Table 5.2 Part of the difference distribution table for AES Sbox
(Table 3.9) corresponding to output differences 0C, 69,
8C, and ED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
Table 5.3 Fault distribution tables for fault models: (a) stuck-at-0,
(b) bit flip, and (c) random fault . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
Table 5.4 Fault distribution tables for fault models: (a) stuck-at-0
with probability .0.5 and (b) random-AND with .δ, where
.δ follows a uniform distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
Table 5.5 Lookup table for carrying out XOR between .a, b
(.a, b ∈ F2 ) using 01 as the codeword for 0 and 10 as the
codeword for 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
Table 5.6 Lookup table for error-correcting code based
computation of AND between .a, b (.a, b ∈ F2 ), using the
3-repetition code .{000, 111}. 000 is the codeword for 0,
and 111 is the codeword for 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386
Table C.1 Sboxes in DES (Sect. 3.1.1) round function . . . . . . . . . . . . . . . . . . . . . . . 455
Table D.1 The Boolean function .ϕ1 takes input .x and outputs the
1st bit of SB.PRESENT (x). The second last row lists the
output of .ϕ1 for different input values. The last row lists
the coefficients (Eq. 3.10) for the algebraic normal form
of .ϕ1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Table D.2 The Boolean function .ϕ2 takes input .x and outputs the
2nd bit of SB.PRESENT (x). The second last row lists the
output of .ϕ2 for different input values. The last row lists
the coefficients (Eq. 3.10) for the algebraic normal form
of .ϕ2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Table D.3 The Boolean function .ϕ3 takes input .x and outputs the
3rd bit of SB.PRESENT (x). The second last row lists the
output of .ϕ3 for different input values. The last row lists
the coefficients (Eq. 3.10) for the algebraic normal form
of .ϕ3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Table E.1 Table .TSG , estimated signals for each integer between
00 and FF with Hamming weight 6, computed with the
stochastic leakage model obtained in Code-SCA Step
6 from Sect. 4.5.1.1. The first (resp. second) column
contains the hexadecimal (resp. binary) representations
of the integers. The last column lists the corresponding
estimated signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
Table E.2 Sorted version of .TSG from Table E.1 such that the
estimated signals (values in the last column) are
in ascending order. The hexadecimal (resp. binary)
representations of the corresponding integers are in the
first (resp. second) column. Words highlighted in blue
constitute the chosen binary code with Algorithm 4.5 . . . . . . . . . . . 463
List of Algorithms
1.1 Preliminaries
Before we start with the math, let us first introduce the basic notation.
1.1.1 Sets
Two sets are said to be equal if they contain the same elements.
• For S = { 2, 3 }, the power set of S is 2^S = { ∅, { 2 } , { 3 } , S }.
The union of two sets A and B, denoted .A ∪ B, is the set that contains all elements
from A or B.
A ∪ B := { x | x ∈ A or x ∈ B } .
The intersection of .A and .B, denoted .A ∩ B, is the set that contains elements in both
A and B.
A ∩ B := { x | x ∈ A and x ∈ B } .
⋃_{i=1}^{n} A_i := { a | a ∈ A_i for some i } ,     ⋂_{i=1}^{n} A_i := { a | a ∈ A_i for all i } .
The difference between set A and set B is the set of all elements of A that are
not in B:
A − B := { a | a ∈ A, a ∉ B } .     (1.1)
A^c := S − A = { s | s ∈ S, s ∉ A } .
The Cartesian product of A and B is the set of ordered pairs (a, b) such that
a ∈ A and b ∈ B:
A × B := { (a, b) | a ∈ A, b ∈ B } .
∏_{i=1}^{n} A_i := { (a_1 , a_2 , . . . , a_n ) | a_i ∈ A_i for all i } .
A × B = { (2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5), (6, 1), (6, 3), (6, 5) } ,
B × A = { (1, 2), (3, 2), (5, 2), (1, 4), (3, 4), (5, 4), (1, 6), (3, 6), (5, 6) } ≠ A × B.
1.1.2 Functions
Functions (also called maps) will be used extensively in the rest of the book. Here we
provide the formal definition.
Definition 1.1.1 A function/map f : S → T is a rule that assigns to each element
s ∈ S a unique element t ∈ T .
For a subset A ⊆ T , the preimage of A under f is
f^{−1}(A) := { s ∈ S | f (s) ∈ A } .
f : R → R
x ↦ x² ,
where R is the set of real numbers. Then f has domain R and codomain R.
Let A = { 1 } ⊆ R; the preimage of A under f is given by
f^{−1}(A) = { −1, 1 } .
Definition 1.1.2
• A function f : S → T is called onto or surjective if given any t ∈ T , there
exists s ∈ S such that t = f (s).
• A function f : S → T is said to be one-to-one (written 1-1) or injective if for
any s1 , s2 ∈ S such that s1 ≠ s2 , we have f (s1 ) ≠ f (s2 ).
• f is called a 1-1 correspondence or bijective if f is both 1-1 and onto.
Example 1.1.5
• Define f
f : R → R≥0
x ↦ x² ,
and g
g : R → R
x ↦ x.
If f is not 1-1, there exist s1 ≠ s2 with f (s1 ) = f (s2 ) = t, which means f^{−1}(t) is not a unique element. However,
when f is bijective, f^{−1} : T → S is a function: it assigns to each t ∈ T a unique
element s ∈ S. In such a case, f^{−1} is called the inverse of f .
Example 1.1.6 Define f
f : R → R
x ↦ x³ .
f^{−1} : R → R
x ↦ ∛x .
When the domain of one function is the codomain of another function, we can define
the composition of those two functions.
Definition 1.1.3 For two functions .f : T → U , .g : S → T , the composition of f
and g, denoted by .f ◦ g, is the function
f ◦ g : S → U
s ↦ f (g(s)).
Define f
f : R → R
x ↦ x²
and g
g : R → R
x ↦ x³ .
Then
f ◦ g : R → R
x ↦ (x³)² = x⁶ .
For a function whose domain and codomain are the same, say .f : S → S, we can
define .f ◦f ◦· · ·◦f in a similar way. For simplicity, we write .f n for the composition
of n copies of f . When f : S → S is bijective, f^{−1} is a function, and we write
f^{−n} for the composition of n copies of f^{−1} .
f : R → R
x ↦ x² ,
then
f^n : R → R
x ↦ x^(2^n) .
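The n-fold composition f^n is easy to experiment with in code. A minimal sketch (the helper name compose_n is ours, not from the text):

```python
def compose_n(f, n):
    """Return the n-fold composition f ∘ f ∘ ... ∘ f (n copies of f)."""
    def composed(x):
        for _ in range(n):
            x = f(x)
        return x
    return composed

def square(x):
    return x ** 2

# Squaring composed 3 times sends x to x^(2^3) = x^8.
f3 = compose_n(square, 3)
print(f3(2))          # 256
print(2 ** (2 ** 3))  # 256, agreeing with f^n(x) = x^(2^n)
```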
1.1.3 Integers
We deal with integers every day. We write one hundred and twenty-three as
123 because
123 = 1 × 100 + 2 × 10 + 3 × 1.
1 Mathematical and Statistical Background
In general, given a base b ≥ 2, any integer n ≥ 1 can be written uniquely as
n = ∑_{i=0}^{𝓁−1} a_i b^i ,     (1.2)
where 0 ≤ a_i < b (0 ≤ i < 𝓁), a_{𝓁−1} ≠ 0, and 𝓁 ≥ 1. a_{𝓁−1} a_{𝓁−2} . . . a_1 a_0 is called a
base-b representation for n. 𝓁 is called the length of n in base-b representation.
The proof can be found in, e.g., [Kos02, page 81]. To emphasize the base b, we
sometimes put b as a subscript for the representation. When b = 2, a base-2
representation is also called a binary representation, 𝓁 is also called the bit length
of n, a_0 is said to be the least significant bit (LSB) of n, and a_{𝓁−1} is said to be the
most significant bit of n. When b = 16, a base-16 representation is also called a
hexadecimal representation.
The correspondence between decimal numerals and hexadecimal (base .b = 16)
numerals is listed in Table 1.1.
Example 1.1.9
Table 1.1 Correspondence between decimal and hexadecimal (base .b = 16) numerals
Base 10 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Base 16 0 1 2 3 4 5 6 7 8 9 A B C D E F
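The digits of Eq. 1.2 can be computed by repeated division by the base. A small sketch (the function name to_base is ours):

```python
def to_base(n, b):
    """Digits [a_{l-1}, ..., a_1, a_0] of n in base b (Eq. 1.2), most significant first."""
    assert n >= 1 and b >= 2
    digits = []
    while n > 0:
        digits.append(n % b)  # least significant digit first
        n //= b
    return digits[::-1]       # reverse so the most significant digit comes first

print(to_base(123, 10))  # [1, 2, 3]
print(to_base(123, 16))  # [7, 11], i.e., hexadecimal 7B (11 -> B in Table 1.1)
print(to_base(123, 2))   # binary representation; its length is the bit length of 123
```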
Example 1.1.10
• 3 | 6, −2 | 4, 1 | 8, 5 | 5.
• 7 ∤ 9, 4 ∤ 6.
• All the positive divisors of 4 are 1, 2, 4.
• All the positive divisors of 6 are 1, 2, 3, 6.
We can see that there are some common divisors between 4 and 6. The largest of
them will be of importance to us. Formally, we can define the greatest common
divisor between two integers that are not both zero.
Definition 1.1.5 Take m, n ∈ Z with m ≠ 0 or n ≠ 0. The greatest common divisor
of m and n, denoted gcd(m, n), is the integer d ∈ Z such that
• d > 0,
• d | m and d | n,
• if c | m and c | n, then c | d.
Example 1.1.11
• Continuing Example 1.1.10, the common divisors of 4 and 6 are 1 and 2. So
gcd(4, 6) = 2.
• All the positive divisors of 2 are 1 and 2. All the positive divisors of 3 are 1 and
3. So gcd(2, 3) = 1.
It can be proven that the greatest common divisor of two integers (not both zero)
always exists and it is unique. The proof of the theorem can be found in, e.g.,
[Her96, page 23].
Theorem 1.1.3 (Bézout's identity) For any m, n ∈ Z such that m ≠ 0 or n ≠ 0,
gcd(m, n) exists and is unique. Moreover, there exist s, t ∈ Z such that
gcd(m, n) = sm + tn.
The equation gcd(m, n) = sm + tn is usually called Bézout's identity. We note that
the choices of s, t are not unique. Indeed, if gcd(m, n) = sm + tn, then gcd(m, n) =
(s + n)m + (t − m)n.
Example 1.1.12
gcd(4, 6) = 2 = (−1) × 4 + 1 × 6,
gcd(2, 3) = 1 = (−4) × 2 + 3 × 3.
is a multiple of m.
To prove (7), we note that by Bézout's identity, there exist s, t ∈ Z such that
as + mt = 1. Multiplying both sides by n, we get asn + mnt = n. Since a | asn and
a | mnt (because a | mn by assumption), a divides their sum, i.e., a | n.
Finally, we prove (8). Since .m|a, .a = mk for some .k ∈ Z. We have .n|mk. Now
because .gcd(m, n) = 1, by (7), .n|k and so .k = nk ' for some .k ' ∈ Z. Thus .a = mnk '
is divisible by mn. ∎
In general, to find .gcd(m, n), it would be too time-consuming to list all the divisors
of m and n. The following theorem allows us to simplify the computation.
Theorem 1.1.4 (Euclid's division) Given m, n ∈ Z, take q, r such that n = qm + r.
Then gcd(m, n) = gcd(m, r).
Proof We first note that we can find q, r by Theorem 1.1.2. By Lemma 1.1.1 (6),
gcd(m, n) | n − qm, i.e., gcd(m, n) | r. Similarly, we have gcd(m, r) | qm + r, i.e.,
gcd(m, r) | n.
By Definition 1.1.5, gcd(m, n) | gcd(m, r) and gcd(m, r) | gcd(m, n). By
Lemma 1.1.1 (5), gcd(m, r) = ± gcd(m, n). By Definition 1.1.5, gcd(m, r) > 0
and gcd(m, n) > 0. We have gcd(m, n) = gcd(m, r). ∎
Thus, to find gcd(m, n), we can compute Euclid's division repeatedly until we get
r = 0.
The procedure is called the Euclidean algorithm, and the details are provided in
Algorithm 1.1. By Theorem 1.1.4, .gcd(m, n) = gcd(m, r) after each loop from
line 1. In the end, we get .gcd(m, n).
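The repeated Euclid's division can be sketched in a few lines of Python; the variable naming is ours, but the loop mirrors the procedure just described:

```python
def gcd(m, n):
    """Euclidean algorithm: repeat Euclid's division n = qm + r until r = 0."""
    m, n = abs(m), abs(n)
    while m != 0:
        # gcd(m, n) = gcd(r, m), where r = n mod m (Theorem 1.1.4)
        n, m = m, n % m
    return n

print(gcd(120, 35))  # 5
print(gcd(160, 21))  # 1
print(gcd(4, 6))     # 2
```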
Furthermore, with the intermediate results we have from the Euclidean algorithm,
we can also find a pair of .s, t such that .gcd(m, n) = sm + tn (Bézout’s identity).
Example 1.1.14 Continuing Example 1.1.13, we can find integers s, t such that
gcd(120, 35) = 120s + 35t as follows:
5 = 35 − 15 × 2,   15 = 120 − 35 × 3,
⟹ 5 = 35 − (120 − 35 × 3) × 2 = 120 × (−2) + 35 × 7.
By the extended Euclidean algorithm, we can also find integers s, t such that
gcd(160, 21) = 160s + 21t:
1 = 3 − 2,   2 = 5 − 3,
3 = 8 − 5,   5 = 13 − 8,
8 = 21 − 13,   13 = 160 − 21 × 7.
We have
1 = 3 − (5 − 3) = 3 × 2 − 5 = 8 × 2 − 5 × 3 = 8 × 2 − (13 − 8) × 3
  = 8 × 5 − 13 × 3 = 21 × 5 − 13 × 8 = 21 × 5 − (160 − 21 × 7) × 8
  = (−8) × 160 + 61 × 21.
Note that it suffices to record s: from gcd(m, n) = sm + tn, we can then
compute t using s.
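This back-substitution is exactly what the extended Euclidean algorithm automates. A compact recursive sketch (our own formulation) that returns gcd(m, n) together with Bézout coefficients s, t:

```python
def extended_gcd(m, n):
    """Return (g, s, t) with g = gcd(m, n) = s*m + t*n (Bézout's identity)."""
    if n == 0:
        return m, 1, 0
    g, s, t = extended_gcd(n, m % n)
    # g = s*n + t*(m % n) and m % n = m - (m // n)*n,
    # hence g = t*m + (s - (m // n)*t)*n.
    return g, t, s - (m // n) * t

print(extended_gcd(120, 35))  # (5, -2, 7):  5 = 120*(-2) + 35*7
print(extended_gcd(160, 21))  # (1, -8, 61): 1 = 160*(-8) + 21*61
```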
Definition 1.1.6
• For m, n ∈ Z such that m ≠ 0 or n ≠ 0, m and n are said to be relatively
prime/coprime if gcd(m, n) = 1.
• Given p ∈ Z, p > 1, p is said to be prime (or a prime number) if for any
m ∈ Z, either m is a multiple of p (i.e., p | m) or m and p are coprime (i.e.,
gcd(p, m) = 1).
Every integer n > 1 can be written as a product of primes,
n = ∏_{i=1}^{k} p_i^{e_i} ,
where the exponents e_i are positive integers and p_1 , p_2 , . . . , p_k are pairwise
distinct prime numbers; this factorization is unique up to permutation of the factors.
Proof We prove by contradiction. Assume the theorem is false. Let n ∈ Z (n > 1)
be the smallest integer with two distinct factorizations. We can write
n = ∏_{i=1}^{k} p_i^{e_i} = ∏_{j=1}^{𝓁} q_j^{d_j} .
Since p_1 | ∏_{j=1}^{𝓁} q_j^{d_j} , by Lemma 1.1.2, p_1 | q_j for some j . Without loss of generality,
we assume p_1 | q_1 . Since p_1 and q_1 are prime numbers, we have p_1 = q_1 . Then
the integer n' = ∏_{i=2}^{k} p_i^{e_i} = ∏_{j=2}^{𝓁} q_j^{d_j} has two distinct factorizations and n' < n,
which contradicts the minimality of n. ∎
Example 1.1.17 20 = 2² × 5, 135 = 3³ × 5.
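For small n, such a factorization can be found by trial division. A minimal sketch (the function name is ours; this is not an efficient method for numbers of cryptographic size):

```python
def factorize(n):
    """Prime factorization of n > 1 by trial division, as a list of (p_i, e_i)."""
    assert n > 1
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:  # divide out all copies of the prime p
                n //= p
                e += 1
            factors.append((p, e))
        p += 1
    if n > 1:                  # whatever remains is itself prime
        factors.append((n, 1))
    return factors

print(factorize(20))   # [(2, 2), (5, 1)], i.e., 20 = 2^2 * 5
print(factorize(135))  # [(3, 3), (5, 1)], i.e., 135 = 3^3 * 5
```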
1.2 Abstract Algebra

In this section, we discuss the basics of abstract algebra and get to know a few
abstract structures. Most of us are already familiar with examples of such structures,
probably just not by the name. Those structures will become useful when we discuss
modern cryptographic algorithms.
1.2.1 Groups
• Every g ∈ G has an inverse g^{−1} ∈ G such that g · g^{−1} = g^{−1} · g = e.
When it is clear from the context, we omit .· and say that G is a group.
Example 1.2.1 There are many examples of groups that we are familiar with.
• .(Z, +), the set of integers with addition, is a group. The identity element is 0.
• Similarly, .(Q, +) and .(C, +) are groups.
• (Q, ×) is not a group, because 0 ∈ Q does not have an inverse with respect to multiplication.
• However, (Q \ { 0 }, ×) is a group. The identity element is 1.
Next, we give an example of formally proving that a set with a binary operation is a group. Let G = R⁺ be the set of positive real numbers, and let · be the multiplication of real numbers, denoted ×. We will show that (R⁺, ×) is a group.
1. R⁺ is closed under ×: for any a1, a2 ∈ R⁺, a1 × a2 ∈ R and a1 × a2 > 0, hence a1 × a2 ∈ R⁺.
2. × is associative on R⁺, since multiplication of real numbers is associative.
3. The identity element is 1 ∈ R⁺: a × 1 = 1 × a = a for any a ∈ R⁺.
4. Every a ∈ R⁺ has an inverse: since a > 0, we have 1/a ∈ R⁺ and

a × (1/a) = (1/a) × a = 1,

hence a⁻¹ = 1/a ∈ R⁺.
By definition, we have proved that (R⁺, ×) is a group.
Definition 1.2.2 Let (G, ·) be a group. If · is commutative, i.e.,

∀g1, g2 ∈ G,  g1 · g2 = g2 · g1,

then (G, ·) is called an abelian group.
The name abelian is in honor of the great mathematician Niels Henrik Abel (1802–
1829).
Example 1.2.2 The groups we have seen so far .(Z, +), .(R+ , ×), .(Q\ { 0 } , ×),
.(Q, +), and .(C, +) are all abelian groups.
Example 1.2.3 Let us consider the set of 2 × 2 matrices with coefficients in R. We denote this set by M2×2(R). Recall that matrix addition, denoted by +, is defined componentwise. For any

( a00  a10 )      ( b00  b10 )
( a01  a11 ) ,    ( b01  b11 )    in M2×2(R),

( a00  a10 )   ( b00  b10 )   ( a00 + b00   a10 + b10 )
( a01  a11 ) + ( b01  b11 ) = ( a01 + b01   a11 + b11 ).
Example 1.2.4 Let F2 = { 0, 1 }. Define addition in F2, denoted ⊕ (logical XOR), as follows:

0 ⊕ 0 = 0,  0 ⊕ 1 = 1 ⊕ 0 = 1,  1 ⊕ 1 = 0.
Closure, associativity, and commutativity can be directly seen from the definition.
The identity element is 0, and the inverse of 1 is 1. Hence .(F2 , ⊕) is an abelian
group.
Example 1.2.5 Let E = { a, b }, a ≠ b. Define addition in E as follows:

a + a = a,  a + b = b + a = b,  b + b = a.
Closure, associativity, and commutativity can be directly seen from the definition.
The identity element is a, and the inverse of b is b. Hence .(E, +) is an abelian
group.
Next, we will see a group that is not abelian. To introduce this group, we start by
defining permutations.
Definition 1.2.3 A permutation of a set S is a bijective function .σ : S → S.
Example 1.2.6
• Let S = {0, 1, 2}. Define σ : S → S as follows:

0 ↦ 1,  1 ↦ 2,  2 ↦ 0.
Then .σ is a permutation of S.
• Let S = {◦, Δ, □}. Define τ : S → S as follows:

◦ ↦ Δ,  Δ ↦ □,  □ ↦ ◦.
Then .τ is a permutation of S.
We note that what matters for a permutation is how many objects we have, not the
objects’ nature. We can label a set of n objects with .1, 2, . . . , n. In Example 1.2.6,
we can label .◦ as 0, .Δ as 1, and .□ as 2. Then .σ and .τ are the same permutation.
Now, we take a set S of n elements. Labeling the elements allows us to consider S = {1, . . . , n}. Let Sn denote the set of all permutations of S, and let ◦ denote the composition of functions (see Definition 1.1.3). Then we have the following lemma:

Lemma 1.2.1 (Sn, ◦) is a group.

The proof is easy; we leave it as an exercise for the reader.
We note that the identity element in the group is the identity function σ : S → S, σ(s) = s ∀s ∈ S. Any σ ∈ Sn is bijective (see Definition 1.1.2), and the inverse of σ in the group is its inverse function σ⁻¹. For example, let S = {1, 2, 3} and define

σ1 : S → S,  1 ↦ 2, 2 ↦ 3, 3 ↦ 1;
σ2 : S → S,  1 ↦ 3, 2 ↦ 2, 3 ↦ 1.

Then σ1 ◦ σ2 ≠ σ2 ◦ σ1: for instance (σ1 ◦ σ2)(2) = σ1(2) = 3, while (σ2 ◦ σ1)(2) = σ2(3) = 1. Hence Sn is not abelian for n ≥ 3.
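Composition of the two permutations above can be checked directly; a small sketch with permutations stored as Python dicts (the helper name compose is my own):

```python
def compose(sigma, tau):
    """Composition sigma ∘ tau: apply tau first, then sigma."""
    return {s: sigma[tau[s]] for s in tau}

sigma1 = {1: 2, 2: 3, 3: 1}
sigma2 = {1: 3, 2: 2, 3: 1}

# The two orders of composition give different permutations:
print(compose(sigma1, sigma2))  # {1: 1, 2: 3, 3: 2}
print(compose(sigma2, sigma1))  # {1: 2, 2: 1, 3: 3}
```

Since the two results differ, the group (S3, ◦) is not abelian.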
In a finite group G with identity e, for any g ∈ G there exists a positive integer k such that

g · g · · · g  (k times) = g^k = e.
1.2.2 Rings
In M2×2(R), matrix multiplication, denoted ×, is defined by

( a00  a10 )   ( b00  b10 )   ( a00 b00 + a10 b01    a00 b10 + a10 b11 )
( a01  a11 ) × ( b01  b11 ) = ( a01 b00 + a11 b01    a01 b10 + a11 b11 ).

Matrix multiplication is not commutative; for example,

( 1 0 ) ( 0 0 )   ( 0 0 )        ( 0 0 ) ( 1 0 )   ( 0 0 )
( 0 0 ) ( 1 0 ) = ( 0 0 ) , but  ( 1 0 ) ( 0 0 ) = ( 1 0 ).
Example 1.2.14 In Example 1.2.4 we have shown that .(F2 , ⊕) is an abelian group.
Let us define logical AND, denoted &, in F2 as follows:

0 & 0 = 0,  1 & 0 = 0 & 1 = 0,  1 & 1 = 1.
Closure of .F2 with respect to .&, associativity and commutativity of .&, and the
distributive laws are easy to see from the definitions. The identity element for .& is
1. .(F2 , ⊕, &) is a commutative ring.
Example 1.2.15 In Example 1.2.5 we showed that .(E, +) is an abelian group.
Define multiplication in E as follows:
a · a = a,  a · b = b · a = a,  b · b = b.
Closure of E with respect to .·, associativity of .·, commutativity of .·, and the
distributive laws are easy to see from the definitions. The identity element for .·
is b. Thus .(E, +, ·) is a commutative ring.
Definition 1.2.10 Let (R, +, ·) be a ring with additive identity 0 and multiplicative identity 1. Let a, b ∈ R. If a ≠ 0 and b ≠ 0 but a · b = 0, then a and b are called zero divisors. If a · b = b · a = 1, a (also b) is said to be invertible, and it is called a unit.
Example 1.2.16
• There are no zero divisors in .(Z, +, ×), .(Q, +, ×), .(R, +, ×), or .(C, +, ×).
• Any nonzero element in .(Z, +, ×), .(Q, +, ×), .(R, +, ×), or .(C, +, ×) is a unit.
Example 1.2.17 As shown in Examples 1.2.3 and 1.2.13, (M2×2(R), +, ×) is a ring. The additive identity is the zero matrix

( 0 0 )
( 0 0 ).

Since

( 1 0 ) ( 0 0 )   ( 0 0 )
( 0 0 ) ( 1 0 ) = ( 0 0 ),

by Definition 1.2.10,

( 1 0 )       ( 0 0 )
( 0 0 )  and  ( 1 0 )

are zero divisors.
Definition 1.2.11 An integral domain is a commutative ring with no zero divisors.
Example 1.2.18 .(Z, +, ×), .(Q, +, ×), .(R, +, ×), and .(C, +, ×) are all integral
domains.
1.2.3 Fields
In a field, multiplicative inverses are unique: if b and c are both inverses of a, then

ab = ac = 1  ⟹  b = bab = bac = c.

Moreover, a field has no zero divisors: suppose a ≠ 0 and a · b = 0; then

a⁻¹ · a · b = 1 · b = 0 ⟹ b = 0,

a contradiction. ∎
Example 1.2.19
• (Q, +, ×), (R, +, ×), and (C, +, ×) are all fields.
• (Z, +, ×) is not a field. For example, 2 ∈ Z is not invertible and 2 ≠ 0.
For the rest of this subsection, let F be a field with addition + and multiplication ·.
Definition 1.2.13 A field with finitely many elements is called a finite field.
Example 1.2.20 In Example 1.2.14 we have shown that (F2, ⊕, &) is a commutative ring. The only nonzero element is 1, which has inverse 1 with respect to &. Thus (F2, ⊕, &) is a finite field.
For a positive integer p and a field element a, write

p ⊙ a = a + a + · · · + a  (p times) = Σ_{i=1}^{p} a.

For example, in F2,

2 ⊙ 1 = 1 ⊕ 1 = 0,

and in the field (E, +, ·) from Example 1.2.15,

2 ⊙ b = b + b = a.
where the last part follows from Lemma 1.2.2. As n and m are both strictly smaller than p, we have a contradiction. ∎
Definition 1.2.15 Let E, F be two fields with F ⊂ E. F is called a subfield of E
if the addition and multiplication of E, when restricted to F , are the same as those
in F .
Example 1.2.23 Q is a subfield of R, and R is a subfield of C.
Define f : F2 → E by

f(0) = a,  f(1) = b.

It is easy to see that f is bijective. Also, it can be shown that f preserves both addition and multiplication. For example, f(1 ⊕ 1) = f(0) = a = b + b = f(1) + f(1).
Remark 1.2.2
• We will use Fpn to denote the unique finite field with pn elements.
• Let K be a finite field with characteristic p and multiplicative identity 1. Then K contains the p multiples of 1, namely 1, 2, . . . , p − 1, and 0. Thus, K contains a subfield isomorphic to Fp.
Furthermore, we define the notion of bit formally.
Definition 1.2.17
• Variables that range over F2 are called Boolean variables or bits.
• Addition of two bits is defined to be logical XOR, also called exclusive OR.
• Multiplication of two bits is defined to be logical AND.
• When the value of a bit is changed, we say the bit is flipped.
1.3 Linear Algebra

Most readers are probably very familiar with linear algebra. However, when we
learned about matrices in high school, we focused on the case when the underlying
abstract structure is a field. In Sect. 1.3.1 we will see the general case when the
underlying abstract structure is a commutative ring. Then in Sect. 1.3.2 we recap
concepts for vector spaces.
1.3.1 Matrices
An m × n matrix A over R is written A = (aij), where aij denotes the entry in the ith row and jth column. If aij = 0 for i ≠ j, A is said to be a diagonal matrix. An n-dimensional identity matrix, denoted In, is a diagonal matrix whose diagonal entries are 1, i.e., aii = 1 for i = 0, 1, . . . , n − 1.
A .1 × n matrix is called a row vector. An .n × 1 matrix is called a column vector. An
.n × n matrix is called a square matrix (i.e., a matrix with the same number of rows
and columns).
AB = ( a00        . . .  a0(n−1)      ) ( b00        . . .  b0(r−1)      )
     ( a10        . . .  a1(n−1)      ) ( b10        . . .  b1(r−1)      )
     (  ⋮                 ⋮           ) (  ⋮                 ⋮           )
     ( a(m−1)0    . . .  a(m−1)(n−1)  ) ( b(n−1)0    . . .  b(n−1)(r−1)  )

   = ( c00        . . .  c0(r−1)      )
     ( c10        . . .  c1(r−1)      )                                (1.5)
     (  ⋮                 ⋮           )
     ( c(m−1)0    . . .  c(m−1)(r−1)  ),

where cij is the scalar product of the ith row of A and the jth column of B:

cij = Σ_{k=0}^{n−1} a_{ik} b_{kj},  i = 0, 1, . . . , m − 1,  j = 0, 1, . . . , r − 1.
A square matrix A ∈ Mn×n(R) is said to be invertible if there exists B ∈ Mn×n(R) such that

AB = BA = In.
Proof In Examples 1.2.3 and 1.2.13 we have shown that .M2×2 (R) is a ring. The
proof for the general case is similar.
The closure of .Mn×n (R) with respect to both operations is easy to see.
Associativity and distributive laws for addition and multiplication follow from the
corresponding properties of R.
The additive identity is the zero matrix of size .n × n. The additive inverse of a
matrix A with coefficients .aij (.0 ≤ i, j ≤ n − 1) is given by .−A with coefficients
.−aij , .(0 ≤ i, j ≤ n − 1). The multiplicative identity is .In .
For instance, let A ∈ Mn×n(R) be the matrix whose only nonzero entry is a 1 in position (0, 0), and let B be the matrix whose only nonzero entry is a 1 in position (n − 1, 0). Then

     ( 0 0 . . . 0 )        ( 0 0 . . . 0 )
AB = ( ⋮ ⋮  ⋱   ⋮ ) ,  BA = ( ⋮ ⋮  ⋱   ⋮ ) ,
     ( 0 0 . . . 0 )        ( 1 0 . . . 0 )

i.e., AB is the zero matrix while BA is not.
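This zero-divisor behavior of matrix multiplication can be verified numerically; a minimal sketch with plain lists (the helper mat_mul is my own), using the 2 × 2 matrices from the earlier non-commutativity example:

```python
def mat_mul(A, B):
    """Matrix product over any ring: c_ij = sum_k a_ik * b_kj (Eq. 1.5)."""
    n, m, r = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(r)]
            for i in range(n)]

A = [[1, 0],
     [0, 0]]
B = [[0, 0],
     [1, 0]]
print(mat_mul(A, B))  # [[0, 0], [0, 0]]  -> AB is the zero matrix
print(mat_mul(B, A))  # [[0, 0], [1, 0]]  -> BA is not
```

So A and B are zero divisors in (M2×2(R), +, ×), even though neither is zero.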
det(A) := Σ_{j=0}^{n−1} (−1)^{i0+j} a_{i0 j} det(A_{i0 j}).   (1.6)

We note that the value of det(A) is independent of the choice of i0 in Eq. 1.6 (see Appendix A.1). Similarly, det(A) can also be found by fixing a j0 and computing

det(A) = Σ_{i=0}^{n−1} (−1)^{i+j0} a_{i j0} det(A_{i j0}).
Example 1.3.6 Let n = 2; for any A ∈ M2×2(R), we can write

A = ( a00  a01 )
    ( a10  a11 ).

Take i0 = 0,

det(A) = Σ_{j=0}^{n−1} (−1)^{i0+j} a_{i0 j} det(A_{i0 j}) = Σ_{j=0}^{1} (−1)^{j} a_{0j} det(A_{0j}) = a00 a11 − a01 a10.
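Cofactor expansion as in Eq. 1.6 translates directly into code; a recursive sketch expanding along the first row (i0 = 0; the function name det is my own):

```python
def det(A):
    """Determinant by cofactor expansion along the first row (Eq. 1.6 with i0 = 0)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # minor A_0j: delete row 0 and column j
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

print(det([[1, 2], [3, 4]]))                   # 1*4 - 2*3 = -2
print(det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))  # 24
```

For the 2 × 2 case this reproduces the formula a00·a11 − a01·a10 from Example 1.3.6.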
Theorem 1.3.2 A matrix A ∈ Mn×n(R) is invertible in Mn×n(R) if and only if det(A) is a unit in R.

When det(A) is a unit in R, if n = 1 and A = (a), then A⁻¹ = (a⁻¹). If n > 1, we define the adjoint matrix of A as follows:

adj A := ( (−1)^{0+0} det(A00)          . . .   (−1)^{0+(n−1)} det(A(n−1)0)         )
         (        ⋮                      ⋱              ⋮                           )
         ( (−1)^{(n−1)+0} det(A0(n−1))  . . .   (−1)^{(n−1)+(n−1)} det(A(n−1)(n−1)) ),

and then A⁻¹ = det(A)⁻¹ · adj A.
The identity element for vector addition is 0. Furthermore, for any .a + bi ∈ C, its
inverse with respect to vector addition is given by .−a − bi.
Let Fⁿ = { (v0, v1, . . . , v(n−1)) | vi ∈ F ∀i } be the set of n-tuples over F. Define vector addition and scalar multiplication by elements of F componentwise: for any v = (v0, v1, . . . , v(n−1)) ∈ Fⁿ, w = (w0, w1, . . . , w(n−1)) ∈ Fⁿ, and any a ∈ F,

v + w = (v0 + w0, v1 + w1, . . . , v(n−1) + w(n−1)),   (1.8)
a v = (a v0, a v1, . . . , a v(n−1)).   (1.9)
Theorem 1.3.3 Together with vector addition and scalar multiplication by ele-
ments of F defined in Eqs. 1.8 and 1.9, respectively, .F n = { (v0 , v1 , . . . , vn−1 ) | vi ∈
F ∀i } is a vector space over F .
Proof Take any .v = (v0 , v1 , . . . , vn−1 ), w = (w0 , w1 , . . . , wn−1 ) from .F n and
any .a, b ∈ F .
By Eq. 1.8, it is easy to see that .F n is closed under vector addition. The
associativity and commutativity of vector addition follow from that for addition in
F . The identity element for vector addition is .(0, 0, . . . , 0), where 0 is the additive
identity in F . The inverse of .v ∈ F n is .(−v0 , −v1 , . . . , −vn−1 ), where .−vi is the
additive inverse of .vi in F . Thus .F n with vector addition is an abelian group.
By the definition of scalar multiplication by elements of F (Eq. 1.9), .av ∈ F n .
Properties 1 and 2 in Definition 1.3.5 follow from distributive law in F . Property 3
follows from the associativity of multiplication in F . Property 4 follows from the
definition of the multiplicative identity in F. ∎
Example 1.3.10 Let F = F2, the unique finite field with two elements (see Example 1.2.20 and Theorem 1.2.3). Let n be a positive integer; it follows from Theorem 1.3.3 that F2ⁿ is a vector space over F2. The identity element for vector addition is (0, 0, . . . , 0). For any v = (v0, v1, . . . , v(n−1)) ∈ F2ⁿ, the inverse of v with respect to vector addition is (−v0, −v1, . . . , −v(n−1)) = v.
Recall that variables ranging over .F2 are called bits (see Definition 1.2.17). We have
shown that .(F2 , ⊕, &) is a finite field (see Example 1.2.20), where .⊕ is logical XOR
(see Example 1.2.4), and .& is logical AND (see Example 1.2.14).
Definition 1.3.6 Vector addition in F2ⁿ is called bitwise XOR, also denoted ⊕. Similarly, we define bitwise AND, denoted &, between any two vectors v = (v0, v1, . . . , v(n−1)), w = (w0, w1, . . . , w(n−1)) from F2ⁿ as follows:

v & w = (v0 & w0, v1 & w1, . . . , v(n−1) & w(n−1)).
Remark 1.3.2 Another useful binary operation on F2 is logical OR, denoted ∨, defined as follows:

0 ∨ 0 = 0,  1 ∨ 0 = 1,  0 ∨ 1 = 1,  1 ∨ 1 = 1.
Definition 1.3.7 A vector in .Fn2 is called an n-bit binary string. A 4-bit binary string
is called a nibble. An 8-bit binary string is called a byte.
Example 1.3.12
• 1010, 0011 ∈ F2⁴ are two nibbles. Furthermore, 1010 ⊕ 0011 = 1001 and 1010 & 0011 = 0010.
• 00101100 is a byte.
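Bitwise XOR and AND on binary strings can be sketched as follows (the helper names are my own, not from the text):

```python
def xor_bits(v, w):
    """Bitwise XOR of two equal-length binary strings (vector addition in F_2^n)."""
    return "".join("1" if a != b else "0" for a, b in zip(v, w))

def and_bits(v, w):
    """Bitwise AND of two equal-length binary strings."""
    return "".join("1" if a == b == "1" else "0" for a, b in zip(v, w))

print(xor_bits("1010", "0011"))  # 1001
print(and_bits("1010", "0011"))  # 0010
# Every vector is its own additive inverse in F_2^n:
print(xor_bits("1010", "1010"))  # 0000
```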
We note that 1-(b) and 3 follow from the corresponding properties of V . Thus, to
prove U is a subspace of V , we need to prove 1-(a), 1-(c), 1-(d), and 2.
In case .F = F2 , by Example 1.3.10, 1-(d) is true by default. Furthermore, 2 is
also true as there are only two elements in .F2 : 0 and 1. To show U is a subspace
when .F = F2 , it suffices to prove 1-(a) and 1-(c).
Definition 1.3.9 A linear combination of v1, v2, . . . , vr ∈ V is a vector of the form a1v1 + a2v2 + · · · + arvr, where ai ∈ F ∀i.

The set U of all linear combinations of v1, v2, . . . , vr is a subspace of V. To see this, take any v = Σ_{i=1}^{r} ai vi and u = Σ_{i=1}^{r} bi vi in U and any α ∈ F.
1-(a). U is closed under vector addition:

v + u = Σ_{i=1}^{r} ai vi + Σ_{i=1}^{r} bi vi = Σ_{i=1}^{r} (ai + bi) vi ∈ U.

1-(c). Taking ai = 0 for all i shows that

0 = Σ_{i=1}^{r} ai vi ∈ U.

1-(d). The additive inverse of v is

u := Σ_{i=1}^{r} (−ai) vi ∈ U.

2. For scalar multiplication,

α Σ_{i=1}^{r} ai vi = Σ_{i=1}^{r} (α ai) vi ∈ U. ∎
Definition 1.3.10 Let S = { v1, v2, . . . , vr } ⊆ V;

〈S〉 := { a1v1 + a2v2 + · · · + arvr | ai ∈ F }

is called the (linear) span of S over F. For any subspace U ⊆ V and a subset S of U, if U = 〈S〉, S is called a generating set for U.
We note that if S is a subspace of V , then .〈S〉 = S.
Example 1.3.13 Let V = F2³ and S = { 001, 100 }; then 〈S〉 = { 000, 001, 100, 101 }.
Definition 1.3.11 A set of vectors { v1, v2, . . . , vr } ⊆ V is linearly independent over F if

Σ_{i=1}^{r} ai vi = 0 ⟹ ai = 0 ∀i.

For example, in R³ the vectors (0, 1, 0), (2, 3, 0), (1, 0, 0) are linearly dependent since, for example, we have

3 × (0, 1, 0) − (2, 3, 0) + 2 × (1, 0, 0) = (0, 0, 0).
We also note that if v1, . . . , vr are linearly independent, then every vector in their span has a unique representation as a linear combination:

v = Σ_{i=1}^{r} ai vi = Σ_{i=1}^{r} bi vi ⟹ Σ_{i=1}^{r} (ai − bi) vi = 0 ⟹ ai = bi ∀i.
Example 1.3.15
• Let F = R, V = R³, and B = { (1, 0, 0), (0, 1, 0), (0, 0, 1) }. It is easy to see that vectors in B are linearly independent. For any v = (v0, v1, v2) ∈ R³, we have v = v0(1, 0, 0) + v1(0, 1, 0) + v2(0, 0, 1), hence B is a basis of R³.

Example 1.3.16 More generally, for V = Fⁿ, the vectors v0, . . . , v(n−1), where vℓ has a 1 in position ℓ and 0 elsewhere, form a basis: any u = (u0, u1, . . . , u(n−1)) ∈ Fⁿ can be written as

u = Σ_{ℓ=0}^{n−1} uℓ vℓ.
Lemma 1.3.2 Let B1 = { v1, . . . , v_{r1} } be a basis of V, and let B2 = { w1, . . . , w_{r2} } ⊆ V be linearly independent. Then r2 ≤ r1.

Proof Since B1 spans V, we can write

w1 = Σ_{j=1}^{r1} aj vj

for some aj ∈ F. Moreover, at least one aj ≠ 0, as vectors in B2 are linearly independent. Without loss of generality, let us assume a1 ≠ 0; then

v1 = − Σ_{j=2}^{r1} (aj/a1) vj + (1/a1) w1,

and we have that { w1, v2, . . . , v_{r1} } spans V. Then, we can write

w2 = b1 w1 + Σ_{j=2}^{r1} bj vj,

where bj ∈ F, and at least one bj ≠ 0 for 2 ≤ j ≤ r1, otherwise w2 is a linear combination of w1. Suppose b2 ≠ 0. We have

v2 = −(b1/b2) w1 − Σ_{j=3}^{r1} (bj/b2) vj + (1/b2) w2,

which means { w1, w2, v3, . . . , v_{r1} } spans V.
We can continue in this manner: if r1 < r2, we will deduce that { w1, w2, . . . , w_{r1} } spans V, and then w_{r1+1} can be written as a linear combination of { w1, w2, . . . , w_{r1} }, a contradiction. ∎
We have the following direct corollary.
Corollary 1.3.1 If .B1 and .B2 are bases of V , then .|B1 | = |B2 |.
Proof By Lemma 1.3.2, |B1| ≤ |B2| and |B2| ≤ |B1|. ∎
Definition 1.3.13 The dimension of V over F , denoted .dim(V )F , is given by the
cardinality of B, .|B|, where B is a basis of V over F .
Example 1.3.17 Continuing Example 1.3.16, .dim(F n )F = n.
Lemma 1.3.3 Let F = F2; if dim(V)_{F2} = k, then |V| = 2^k.

Proof Let B = { v1, v2, . . . , vk } be a basis for V. We have discussed in Remark 1.3.5 that every w ∈ V has a unique representation as a linear combination of vectors in B. In other words,

V = { Σ_{i=1}^{k} ai vi | ai ∈ F2, 1 ≤ i ≤ k },

and distinct choices of (a1, . . . , ak) give distinct vectors; hence |V| = 2^k. ∎
The scalar (dot) product of v, w ∈ F2ⁿ is defined as

v · w = Σ_{i=0}^{n−1} vi wi.

For any v, w, u ∈ F2ⁿ,

(v + w) · u = Σ_{i=0}^{n−1} (vi + wi) ui = Σ_{i=0}^{n−1} vi ui + Σ_{i=0}^{n−1} wi ui = v · u + w · u.   (1.10)
Definition 1.3.14
• For any v, w ∈ F2ⁿ, v and w are said to be orthogonal if v · w = 0.
• Let S ⊆ F2ⁿ be nonempty. The orthogonal complement of S, denoted S⊥, is given by

S⊥ = { v | v ∈ F2ⁿ, v · s = 0 ∀s ∈ S }.

• If S = ∅, we define S⊥ = F2ⁿ.
By definition, it is easy to see that 〈S〉⊥ = S⊥.
Lemma 1.3.4 For any S ⊆ F2ⁿ, S⊥ is a subspace of F2ⁿ.

Proof By Remark 1.3.4, we will prove 1-(a) and 1-(c).
1-(a). Take any v, u ∈ S⊥ and any s ∈ S; by Eq. 1.10, we have

(v + u) · s = v · s + u · s = 0,

hence v + u ∈ S⊥.
1-(c). 0 · s = 0 for any s ∈ S. Hence 0 ∈ S⊥. ∎
1.4 Modular Arithmetic

It is common to identify an integer with the remainder when that integer is divided by n. Here we would like to provide a rigorous definition for this association. First, we introduce the notion of equivalence relations.
Definition 1.4.1 A relation ∼ on a set S is called an equivalence relation if for all a, b, c ∈ S the following conditions are satisfied:
• a ∼ a (reflexivity).
• If a ∼ b, then b ∼ a (symmetry).
• If a ∼ b and b ∼ c, then a ∼ c (transitivity).
Let us define a relation ∼ on the set Z as follows:

a ∼ b if and only if n | (b − a).   (1.11)
The equivalence class of a ∈ S is given by

ā := { b | b ∈ S, b ∼ a }.
Proof By Theorem 1.1.2, given any b ∈ Z, we can find q, r ∈ Z such that

0 ≤ r < n and b = qn + r ⟹ b ∼ r.

By Theorem 1.4.1, we have b̄ = r̄. Hence the set { 0̄, 1̄, . . . , (n−1)‾ } contains all the congruence classes of integers modulo n, possibly with some repetitions.
If r̄1 = r̄2 for some 0 ≤ r1, r2 < n, then n | (r1 − r2). Since 0 ≤ r1, r2 < n, we have r1 = r2. Thus 0̄, 1̄, . . . , (n−1)‾ are all distinct. ∎
Remark 1.4.1 ā = b̄ if and only if a ≡ b mod n.
Example 1.4.1 Let n = 5. We have 1̄ = 6̄ = (−4)‾. By Lemma 1.4.1, Z5 = { 0̄, 1̄, 2̄, 3̄, 4̄ }.
We define the addition operation on the set Zn as follows:

ā + b̄ = (a + b)‾.   (1.12)

If ā = ā′ and b̄ = b̄′, we have n | (a′ − a) and n | (b′ − b), therefore

n | ((a′ − a) + (b′ − b)) ⟹ n | ((a′ + b′) − (a + b)) ⟹ (a + b) ∼ (a′ + b′)
⟹ (a + b)‾ = (a′ + b′)‾,

so the addition in Eq. 1.12 is well-defined.
Proof For any ā, b̄ ∈ Zn, ā + b̄ ∈ Zn. Hence Zn is closed under +. The associativity follows from the associativity of the addition of integers. The identity element is 0̄, and the inverse of ā is (n − a)‾:

ā + (n − a)‾ = (n − a)‾ + ā = n̄ = 0̄.

Note also that every element of Zn is a sum of copies of 1̄:

1̄ + 1̄ = 2̄,
1̄ + 1̄ + 1̄ = 3̄,
⋮
1̄ + 1̄ + · · · + 1̄  (n − 1 times) = (n − 1)‾,
1̄ + 1̄ + · · · + 1̄  (n times) = n̄ = 0̄. ∎
We define the multiplication on Zn as follows:

ā · b̄ = (ab)‾.   (1.13)

If ā′ = ā and b̄′ = b̄, then we can write a′ = a + sn, b′ = b + tn for some integers s, t. We have

a′b′ = (a + sn)(b + tn) = ab + n(at + bs + stn).

Hence (a′b′)‾ = (ab)‾, and the multiplication in Eq. 1.13 is well-defined.
Example 1.4.4 Let n = 5. Then

(−2)‾ · (13)‾ = 3̄ · 3̄ = 9̄ = 4̄.
Theorem 1.4.2 .(Zn , +, ·), the set .Zn together with addition defined in Eq. 1.12 and
multiplication defined in Eq. 1.13, is a commutative ring. It is an integral domain if
and only if n is prime.
Proof In Proposition 1.4.1 we have shown that (Zn, +) is an abelian group. Take any ā, b̄ ∈ Zn; (ab)‾ ∈ Zn, hence Zn is closed under ·. Associativity, commutativity of multiplication, and the distributive laws follow from those for the integers. The identity element for multiplication is 1̄. We have proved that (Zn, +, ·) is a commutative ring.

If n is not a prime, let m be a prime that divides n. Then d = n/m is an integer with 1 < d, m < n, so d̄ ≠ 0̄ and m̄ ≠ 0̄. We have

m̄ · d̄ = n̄ = 0̄.

By Definition 1.2.10, d̄ and m̄ are zero divisors in Zn. By Definition 1.2.11, Zn is not an integral domain.

Let n be a prime. Suppose there are ā, b̄ ∈ Zn such that ā ≠ 0̄, b̄ ≠ 0̄, and ā · b̄ = 0̄. By definition, we have n | ab. By Lemma 1.1.2, n | a or n | b, which gives ā = 0̄ or b̄ = 0̄, a contradiction. ∎
For simplicity, we write a instead of ā; to make sure there is no confusion with a ∈ Z, we will specify that a ∈ Zn. In particular, Zn = { 0, 1, 2, . . . , n − 1 }. Furthermore, to emphasize that multiplication or addition is done in Zn, we write ab mod n or a + b mod n.
Lemma 1.4.3 Let a ∈ Zn. Then a has a multiplicative inverse in Zn if and only if gcd(a, n) = 1.

Proof By Bézout's identity (Theorem 1.1.3), gcd(a, n) = sa + tn for some s, t ∈ Z.
⇐ If gcd(a, n) = 1, then sa + tn = 1, i.e., n | (1 − sa). By definition, sa ≡ 1 mod n, so s mod n is a multiplicative inverse of a.
⟹ If a has a multiplicative inverse, there is some s ∈ Zn such that as mod n = 1, which gives n | (as − 1). Hence there is some t ∈ Z such that 1 = as + tn. By Lemma 1.1.1 (6), gcd(a, n) | 1. As gcd(a, n) > 0, we have gcd(a, n) = 1. ∎
Remark 1.4.3 Recall that by the extended Euclidean algorithm (Algorithm 1.2),
we can find integers .s, t such that .gcd(a, n) = sa+tn for any .a, n ∈ Z. In particular,
when .gcd(a, n) = 1, we can find .s, t such that .1 = as +tn, which gives .as mod n =
1. Thus, we can find .a −1 mod n = s mod n by the extended Euclidean algorithm.
Example 1.4.6 We calculated in Example 1.1.15 that .gcd(160, 21) = 1 and .1 =
(−8) × 160 + 61 × 21. We have .21−1 mod 160 = 61.
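Modular inversion via the extended Euclidean algorithm, as described in Remark 1.4.3, can be sketched as follows (the function names are my own; since Python 3.8, the built-in pow(a, -1, n) computes the same inverse):

```python
def egcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) = s*a + t*b."""
    s0, t0, s1, t1 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return a, s0, t0

def mod_inverse(a, n):
    """a^{-1} mod n; requires gcd(a, n) = 1 (Lemma 1.4.3)."""
    g, s, _ = egcd(a, n)
    if g != 1:
        raise ValueError("a is not invertible modulo n")
    return s % n

print(mod_inverse(21, 160))  # 61, matching Example 1.4.6
```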
Example 1.4.7 Let

p = 5,  q = 7.

By the extended Euclidean algorithm,

7 = 5 × 1 + 2,  5 = 2 × 2 + 1,
1 = 5 − 2 × 2 = 5 − (7 − 5) × 2 = 5 × 3 − 7 × 2.

We have 5⁻¹ mod 7 = 3 and 7⁻¹ mod 5 = (−2) mod 5 = 3.

Example 1.4.8 Let

p = 7,  q = 47.

By the extended Euclidean algorithm,

47 = 7 × 6 + 5,  7 = 5 × 1 + 2,  5 = 2 × 2 + 1,
1 = 5 − 2 × 2 = 5 − (7 − 5) × 2 = 5 × 3 − 7 × 2 = (47 − 7 × 6) × 3 − 7 × 2
  = 47 × 3 − 7 × 20.

We have 47⁻¹ mod 7 = 3 and 7⁻¹ mod 47 = (−20) mod 47 = 27.
and b1 ≠ b2. By Lemma 1.4.3, a⁻¹ exists. Multiplying both sides of Eq. 1.14 by a⁻¹, we get b1 ≡ b2 mod n, a contradiction. ∎
We note that when p is prime, .Zp is the unique finite field .Fp up to isomorphism
(see Theorem 1.2.3 and Remark 1.2.2).
Lemma 1.4.3 leads us to the following definition.
Definition 1.4.5 Let Zn* denote the set of congruence classes in Zn which have multiplicative inverses:

Zn* := { a | a ∈ Zn, gcd(a, n) = 1 }.

Euler's totient function, ϕ, is a function defined on the set of integers bigger than 1 such that ϕ(n) gives the cardinality of Zn*:

ϕ(n) = |Zn*|.
Example 1.4.9
• Let n = 3: Z3* = {1, 2}, ϕ(3) = 2.
• Let n = 4: Z4* = {1, 3}, ϕ(4) = 2.
• Let n = p be a prime number: Zp* = Zp − {0} = {1, 2, . . . , p − 1}, ϕ(p) = p − 1.
Lemma 1.4.4 .(Z∗n , ·), the set .Z∗n together with the multiplication defined in .Zn
(Eq. 1.13), is an abelian group.
Proof For any .a, b ∈ Z∗n , .a −1 , b−1 ∈ Z∗n . We note that .(ab)(b−1 a −1 ) = 1; hence
ab has an inverse in .Z∗n and .ab ∈ Z∗n (closure). The associativity follows from that
for multiplications in .Z. The identity element is 1, and Lemma 1.4.3 proves that
every element has an inverse in Zn*. ∎
Recall by the Fundamental Theorem of Arithmetic (Theorem 1.1.5), every integer
n > 1 is either a prime or can be written as a product of primes in a unique way.
We have the following result concerning Euler’s totient function. The proof can be
found in, e.g., [Sie88, page 247].
Theorem 1.4.3 For any n ∈ Z, n > 1,

if n = ∏_{i=1}^{k} pi^{ei}, then ϕ(n) = n ∏_{i=1}^{k} (1 − 1/pi),   (1.15)

where p1, . . . , pk are the distinct prime factors of n.
Example 1.4.10
• Let n = 10 = 2 × 5. We can count the elements in

Z10 = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }

that are coprime to 10, namely 1, 3, 7, 9, so ϕ(10) = 4. This agrees with Eq. 1.15: ϕ(10) = 10 × (1 − 1/2) × (1 − 1/5) = 4.
• In particular, if p = 2,

ϕ(2^k) = 2^{k−1}.
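Euler's totient can be computed from the product formula in Eq. 1.15 after factoring n; a trial-division sketch (the function name is my own, not from the text):

```python
def totient(n):
    """Euler's totient via the product formula (Eq. 1.15)."""
    assert n > 1
    result = n
    m = n
    p = 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:       # remove every copy of the prime p
                m //= p
            result -= result // p   # multiply result by (1 - 1/p)
        p += 1
    if m > 1:                       # leftover prime factor
        result -= result // m
    return result

print(totient(10))    # 4
print(totient(2**5))  # 16, illustrating phi(2^k) = 2^(k-1)
```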
Theorem (Euler's Theorem) For any a ∈ Z with gcd(a, n) = 1, a^{ϕ(n)} ≡ 1 mod n.

Proof By definition, |Zn*| = ϕ(n). If gcd(a, n) = 1, then a ∈ Zn*. The result follows from Theorem 1.2.1. ∎
Example 1.4.11 Let n = 4. We have calculated that ϕ(4) = 2 in Example 1.4.9. And

3² = 9 ≡ 1 mod 4.

Similarly, for n = 10 we have ϕ(10) = 4, and

3⁴ = 81 ≡ 1 mod 10.

Recall also Fermat's little theorem, the special case n = p: if p is a prime and a is an integer not divisible by p, then a^{p−1} ≡ 1 mod p.
Example 1.4.12
• Let p = 3. 2² = 4 ≡ 1 mod 3.
• Let p = 5. 2⁴ = 16 ≡ 1 mod 5.
Corollary 1.4.3 Let p be a prime. Then for any a, b, c ∈ Z such that b ≡ c mod (p − 1), we have

a^b ≡ a^c mod p.

In particular,

a^b ≡ a^{b mod (p−1)} mod p. ∎
Example 1.4.13 Let p = 5, a = 2, b = 6. Then

2⁶ ≡ 2^{6 mod 4} ≡ 2² ≡ 4 mod 5.

Indeed,

2⁶ = 64 ≡ 4 mod 5.
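Reducing the exponent modulo p − 1 before exponentiating, as in this example, can be sketched as follows (the function name is my own; the shortcut is valid when p does not divide a):

```python
def pow_mod_prime(a, b, p):
    """a^b mod p for prime p, reducing the exponent mod (p - 1) first (Corollary 1.4.3).
    Assumes p does not divide a."""
    return pow(a, b % (p - 1), p)

print(pow_mod_prime(2, 6, 5))  # 4
print(pow(2, 6, 5))            # 4, direct computation agrees
```

For cryptographic exponents (hundreds of digits), this reduction is what keeps exponentiation modulo a prime fast.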
Corollary 1.4.4 Let p be a prime and b be an integer coprime to ϕ(p). For any a1, a2 ∈ Zp, if a1 ≠ a2, then a1^b ≢ a2^b mod p.

Proof Suppose a1 ≠ a2 and a1^b ≡ a2^b mod p. Let c = b⁻¹ mod ϕ(p); then by Corollary 1.4.3,

a1 ≡ a1^{bc mod ϕ(p)} ≡ (a1^b)^c ≡ (a2^b)^c ≡ a2^{bc mod ϕ(p)} ≡ a2 mod p,

a contradiction. ∎

For example, let p = 7, a1 = 3, a2 = 4, and b = 5, which is coprime to ϕ(7) = 6. Then a1^b ≡ 3⁵ ≡ 5 mod 7 and a2^b ≡ 4⁵ ≡ 2 mod 7. This agrees with Corollary 1.4.4. On the other hand, if we let b = 2, which is not coprime to ϕ(p), we have

a1^b ≡ 3² ≡ 9 ≡ 2 mod 7,  a2^b ≡ 4² ≡ 16 ≡ 2 mod 7.
In this part, we will discuss how to solve a system of linear congruences in .Zn .
We first consider one linear congruence equation.
Lemma 1.4.5 For any a, b ∈ Z, the linear congruence

ax ≡ b mod n

has a solution if and only if gcd(a, n) | b.

Proof The congruence has a solution x if and only if there exists k ∈ Z such that

ax + kn = b.   (1.16)

If a solution exists, then gcd(a, n), which divides both a and n, divides b. Conversely, suppose gcd(a, n) | b. By Bézout's identity, gcd(a, n) = sa + tn for some s, t ∈ Z. Multiplying by b/gcd(a, n) gives

a (sb/gcd(a, n)) + n (tb/gcd(a, n)) = b.

Thus sb/gcd(a, n) is a solution for Eq. 1.16. ∎
Example 1.4.15 Let .n = 10, .a = 4. Then .gcd(a, n) = 2. By Lemma 1.4.5, the
linear congruence .4x ≡ 1 mod 10 has no solution. Indeed, if we try to multiply any
integer by 4 and divide by 10, we will not get an odd remainder.
On the other hand, the linear congruence .4x ≡ 2 mod 10 has at least one solution.
For example, .x = 3 is a solution (.4 × 3 ≡ 12 ≡ 2 mod 10).
Moreover, when gcd(a, n) = 1, the solution of

ax ≡ b mod n

in Zn is unique: take any two solutions x0, x1 ∈ Zn, and we have ax0 ≡ ax1 mod n; multiplying both sides by a⁻¹ (which exists by Lemma 1.4.3) gives x0 = x1.
x ≡ 2 mod 3
x ≡ 3 mod 5
x ≡ 2 mod 7
x ≡ ?   (1.17)
Before answering the question, we provide the solution for a more general case.
Let us consider a system of simultaneous linear congruences
x ≡ a1 mod m1
x ≡ a2 mod m2
⋮
x ≡ ak mod mk,   (1.18)
where .mi are pairwise coprime positive integers, i.e., .gcd(mi , mj ) = 1 for .i /= j .
Define

m = ∏_{i=1}^{k} mi,  Mi = m/mi,  1 ≤ i ≤ k.   (1.19)
Since .mi are pairwise coprime, .mi and .Mi are coprime. By Lemma 1.4.3, .yi :=
Mi−1 mod mi exists. It can be computed by the extended Euclidean algorithm (See
Remark 1.4.3). Let
Let

x = Σ_{i=1}^{k} ai yi Mi mod m.   (1.20)
We note that

ai yi Mi ≡ ai mod mi,  and  aj yj Mj ≡ 0 mod mi if j ≠ i.

Then,

x ≡ ai yi Mi + Σ_{1 ≤ j ≤ k, j ≠ i} aj yj Mj ≡ ai mod mi for all i.
Now we can answer the question in Eq. 1.17. We have

m1 = 3,  m2 = 5,  m3 = 7,  a1 = 2,  a2 = 3,  a3 = 2,

and

m = 3 × 5 × 7 = 105,  M1 = 35,  M2 = 21,  M3 = 15.

Then y1 = 35⁻¹ mod 3 = 2, y2 = 21⁻¹ mod 5 = 1, y3 = 15⁻¹ mod 7 = 1, and

x = Σ_{i=1}^{3} ai yi Mi mod m = 2 × 2 × 35 + 3 × 1 × 21 + 2 × 1 × 15 mod 105
  = 233 mod 105 = 23 mod 105.
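The construction in Eqs. 1.19 and 1.20 can be sketched as follows (the function name crt is my own; pow(Mi, -1, mi) requires Python 3.8+):

```python
def crt(residues, moduli):
    """Chinese-Remainder solve for pairwise coprime moduli (Eqs. 1.19-1.20)."""
    m = 1
    for mi in moduli:
        m *= mi
    x = 0
    for ai, mi in zip(residues, moduli):
        Mi = m // mi
        yi = pow(Mi, -1, mi)   # y_i = M_i^{-1} mod m_i, exists by Lemma 1.4.3
        x += ai * yi * Mi
    return x % m

print(crt([2, 3, 2], [3, 5, 7]))   # 23
print(crt([2, 1, 5], [5, 7, 11]))  # 302
```

The two printed values match the worked examples in the text.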
Example 1.4.17 Let us solve the following system of simultaneous linear congruences:

x ≡ 2 mod 5
x ≡ 1 mod 7
x ≡ 5 mod 11
x ≡ ? mod 385.

We have

m1 = 5,  m2 = 7,  m3 = 11,  a1 = 2,  a2 = 1,  a3 = 5,
m = 5 × 7 × 11 = 385,  M1 = 77,  M2 = 55,  M3 = 35.

Then

M1 ≡ 77 ≡ 2 mod 5,  M2 ≡ 55 ≡ 6 mod 7,  M3 ≡ 35 ≡ 2 mod 11,

and

y1 = M1⁻¹ mod 5 = 3,  y2 = M2⁻¹ mod 7 = 6,  y3 = M3⁻¹ mod 11 = 6.

And

x = Σ_{i=1}^{3} ai yi Mi mod m = 2 × 3 × 77 + 1 × 6 × 55 + 5 × 6 × 35 mod 385
  = 1842 mod 385 = 302.
Theorem 1.4.7 (Chinese Remainder Theorem) Let m1, m2, . . . , mk be pairwise coprime positive integers. For any integers a1, a2, . . . , ak, the system of simultaneous linear congruences

x ≡ a1 mod m1,  x ≡ a2 mod m2,  . . . ,  x ≡ ak mod mk

has a unique solution modulo m = ∏_{i=1}^{k} mi.
Proof The discussions above have shown the existence of such a solution. To prove the uniqueness, let x1, x2 ∈ Zm be two solutions for the system of simultaneous congruences. Then

x1 ≡ x2 mod m1,  x1 ≡ x2 mod m2,  . . . ,  x1 ≡ x2 mod mk.

By definition, we have

m1 | (x1 − x2),  m2 | (x1 − x2),  . . . ,  mk | (x1 − x2).

Since the mi are pairwise coprime, by Lemma 1.1.1 (8), we can conclude that m = ∏_{i=1}^{k} mi divides x1 − x2. As x1 and x2 are from Zm, we must have x1 = x2. ∎
Example 1.4.18 Let p = 3, q = 5, n = 15, a = 10. We would like to find the unique solution x ∈ Z15 such that

x ≡ 10 mod 3,  x ≡ 10 mod 5.

We have

m1 = p = 3,  m2 = q = 5,  a1 = a2 = a = 10.

Hence

m = n = 15,  M1 = 5,  M2 = 3,  y1 = 5⁻¹ mod 3 = 2,  y2 = 3⁻¹ mod 5 = 2.

And

x = (10 × 2 × 5 + 10 × 2 × 3) mod 15 = 160 mod 15 = 10.
Example 1.4.19 Take two distinct primes p, q, and let n = pq. By Theorem 1.4.7, for any a ∈ Zn, there is a unique solution x ∈ Zn such that

x ≡ a mod p,  x ≡ a mod q.   (1.21)

We have

m1 = p,  m2 = q,  a1 = a2 = a.

And

m = n = pq,  M1 = q,  M2 = p,  y1 = q⁻¹ mod p,  y2 = p⁻¹ mod q.

Then

x = a ((q⁻¹ mod p) q + (p⁻¹ mod q) p) mod n.

By definition,

p | ((q⁻¹ mod p) q − 1)  and  q | ((p⁻¹ mod q) p − 1),

and

n | ((q⁻¹ mod p) q + (p⁻¹ mod q) p − 1) ⟹ (q⁻¹ mod p) q + (p⁻¹ mod q) p ≡ 1 mod n.

Thus x = a.
Corollary 1.4.5 Let p and q be two distinct primes and n = pq. For any a, b ∈ Z, we have

a^b ≡ a^{b mod ϕ(n)} mod n.

Proof Since (p − 1) | ϕ(n), we have b ≡ b mod ϕ(n) (mod p − 1). By Corollary 1.4.3,

a^b ≡ a^{b mod ϕ(n)} mod p,  and similarly  a^b ≡ a^{b mod ϕ(n)} mod q.

By Example 1.4.19, the unique x ∈ Zn with x ≡ a^b mod p and x ≡ a^b mod q is a^b mod n itself; hence a^b ≡ a^{b mod ϕ(n)} mod n. ∎

Example 1.4.20 Let p = 3, q = 5, a = 2, b = 9. Then n = 15 and ϕ(n) = 2 × 4 = 8. And

2⁹ = 512 ≡ 2 mod 15,  2^{9 mod 8} = 2¹ = 2.
Corollary 1.4.6 Let p and q be two distinct primes and n = pq. For any a1, a2 ∈ Zn and b ∈ Z*_{ϕ(n)}, if a1 ≠ a2, then a1^b ≢ a2^b mod n.

Proof Suppose a1^b ≡ a2^b mod n. Let c = b⁻¹ mod ϕ(n); then by Corollary 1.4.5,

a1 ≡ a1^{bc mod ϕ(n)} ≡ (a1^b)^c ≡ (a2^b)^c ≡ a2^{bc mod ϕ(n)} ≡ a2 mod n,

a contradiction. ∎
1.5 Polynomial Rings

For f(x) = Σ_i ai x^i and g(x) = Σ_j bj x^j in F[x], the product is

f(x) ×_{F[x]} g(x) := dn x^n + d(n−1) x^{n−1} + · · · + d0,  where di = Σ_{j=0}^{i} aj b(i−j).   (1.23)

The additive inverse of f(x) = Σ_i ai x^i is given by −f(x) = Σ_i (−ai) x^i, where −ai is the additive inverse of ai in F. For simplicity, we will write f(x)g(x) and f(x) + g(x) instead of f(x) ×_{F[x]} g(x) and f(x) +_{F[x]} g(x).
Example 1.5.2 Let F = R; then R[x] is a ring. The identity element for multiplication is 1. The identity element for addition is 0. Take f(x) = x + 1, g(x) = x in R[x]; then

f(x) + g(x) = 2x + 1,  f(x)g(x) = x² + x,

and the additive inverse of f(x) is −f(x) = −x − 1.
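Polynomial addition and the product formula of Eq. 1.23 can be sketched with coefficient lists (the helper names are my own, not from the text):

```python
def poly_add(f, g):
    """Add polynomials given as coefficient lists [a0, a1, ...]."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def poly_mul(f, g):
    """Multiply polynomials: d_i = sum_{j=0}^{i} a_j * b_{i-j} (Eq. 1.23)."""
    d = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            d[i + j] += a * b
    return d

f = [1, 1]   # 1 + x
g = [0, 1]   # x
print(poly_add(f, g))  # [1, 2]    -> 1 + 2x
print(poly_mul(f, g))  # [0, 1, 1] -> x + x^2
```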
Lemma 1.5.1 For any f(x), g(x) ∈ F[x] such that f(x) ≠ 0, g(x) ≠ 0, we have deg(f(x)g(x)) = deg(f(x)) + deg(g(x)).

Proof Write

f(x) = Σ_{i=0}^{m} ai x^i,  g(x) = Σ_{j=0}^{n} bj x^j,  where am ≠ 0, bn ≠ 0.

By Eq. 1.23, f(x)g(x) = d(x), where the highest power of x in d(x) is m + n, and its coefficient is am bn ≠ 0. We have deg(d(x)) = m + n. ∎
Lemma 1.5.2 F[x] is an integral domain.

Proof For any f(x), g(x) ∈ F[x] such that f(x) ≠ 0, g(x) ≠ 0, we have deg(f(x)) ≥ 0, deg(g(x)) ≥ 0. By Lemma 1.5.1, deg(f(x)g(x)) ≥ 0, and hence f(x)g(x) ≠ 0. ∎
Similar to Euclid's algorithm (Theorem 1.1.2), we have the following theorem. The proof can be found in, e.g., [Her96, page 155].

Theorem 1.5.1 (Division Algorithm) For any f(x), g(x) ∈ F[x] with deg(f(x)) ≥ 1, there exist s(x), r(x) ∈ F[x] such that deg(r(x)) < deg(f(x)) and

g(x) = s(x)f(x) + r(x).
Definition 1.5.2 Let f(x), g(x) ∈ F[x]; if f(x) ≠ 0 and g(x) = s(x)f(x) for some s(x) ∈ F[x], then we say f(x) divides g(x), written f(x) | g(x).

Example 1.5.3 Let F = F5. Take g(x) = 4x⁵ + x³, f(x) = x³ ∈ F5[x]; then

g(x) = (4x² + 1) f(x),

and f(x) | g(x).
Definition 1.5.3 A polynomial f(x) ∈ F[x] of positive degree is said to be reducible (over F) if there exist g(x), h(x) ∈ F[x] such that

deg(g(x)) < deg(f(x)),  deg(h(x)) < deg(f(x)),  and  f(x) = g(x)h(x).

Otherwise, f(x) is said to be irreducible (over F).
Remark 1.5.1
• f(x) ∈ F[x] of degree 2 or 3 is reducible over F if and only if it has a root in F.
• Let f(x) = Σ_{i=0}^{n} ai x^i ∈ F[x]. Then f(0) = a0. Thus f(x) is reducible if a0 = 0.
• Let f(x) = Σ_{i=0}^{n} ai x^i ∈ F2[x]. Then f(1) = Σ_{i=0}^{n} ai. If |{ ai | ai ≠ 0 }| is even, then f(1) = 0 and f(x) is reducible over F2. In other words, any f(x) ∈ F2[x] with an even number of nonzero terms is reducible over F2.
Example 1.5.4
• h(x) = 4x⁵ + x³ ∈ F5[x] has degree 5, and it is reducible since h(x) = x³(4x² + 1).
• g(x) = x² ∈ F2[x] has degree 2, and it is reducible: g(x) = x · x.
Example 1.5.5 Let F = F2.
• All the polynomials of degree 2 are x², x² + 1, x² + x + 1, x² + x. By Remark 1.5.1, the only irreducible polynomial of degree 2 is x² + x + 1.
• All the degree 3 polynomials with an odd number of nonzero terms are x³, x³ + x + 1, x³ + x² + 1, x³ + x² + x. Among those, the polynomials with a0 ≠ 0 are the irreducible polynomials of degree 3:

x³ + x + 1,  x³ + x² + 1.

• Degree 4 polynomials with a0 ≠ 0 and an odd number of nonzero terms are

x⁴ + x + 1,  x⁴ + x² + 1,  x⁴ + x³ + 1,  x⁴ + x³ + x² + x + 1.

By our choice, they are not divisible by degree 1 polynomials. By Lemma 1.5.3, any of them is reducible if and only if it is divisible by x² + x + 1, which can be verified using the Division Algorithm (Theorem 1.5.1). For example,

x⁴ + x + 1 = (x² + x)(x² + x + 1) + 1,

so x⁴ + x + 1 is not divisible by x² + x + 1, while

x⁴ + x² + 1 = (x² + x + 1)(x² + x + 1)

is divisible by x² + x + 1.
Finally, we have all the degree 4 irreducible polynomials over F2:

x⁴ + x + 1,  x⁴ + x³ + 1,  x⁴ + x³ + x² + x + 1.
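The search for irreducible polynomials over F2 can be automated; a brute-force sketch encoding a polynomial as a bitmask, where bit i holds the coefficient of x^i (the helper names are my own):

```python
def poly_mod(a, b):
    """Remainder of a divided by b in F2[x]; polynomials encoded as bitmasks."""
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        # XOR-subtract the aligned divisor, cancelling the leading term of a
        a ^= b << (a.bit_length() - 1 - db)
    return a

def is_irreducible(f):
    """Test irreducibility of f in F2[x] by trial division by all lower-degree polynomials."""
    df = f.bit_length() - 1
    return df > 0 and all(poly_mod(f, g) != 0 for g in range(2, 1 << df))

# Degree-4 polynomials are bitmasks 16..31; the irreducible ones are
# x^4+x+1 (0b10011 = 19), x^4+x^3+1 (25), x^4+x^3+x^2+x+1 (31):
print([f for f in range(16, 32) if is_irreducible(f)])  # [19, 25, 31]
```

This reproduces exactly the list of degree 4 irreducible polynomials found by hand in Example 1.5.5.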
We note that there are many analogies between a polynomial ring .F [x] and the
ring of integers .Z. For example, a polynomial .f (x) corresponds to an integer n. An
irreducible polynomial .p(x) corresponds to a prime p.
For the rest of the section, let us fix a polynomial f(x) ∈ F[x] such that f(x) ≠ 0. The same as in Eq. 1.11, we define a relation ∼ on F[x] as follows:

g(x) ∼ h(x) if and only if f(x) | (g(x) − h(x)).

We have shown that the relation in Eq. 1.11 is an equivalence relation on Z, and a similar proof shows that ∼ is an equivalence relation on F[x]. We can also define congruence in F[x] (cf. Definition 1.4.2).
Definition 1.5.4 For any g(x), h(x) ∈ F[x], if g(x) ∼ h(x), i.e., f(x) | (g(x) − h(x)), we say h(x) is congruent to g(x) modulo f(x), written g(x) ≡ h(x) mod f(x).
The congruence class of g(x) modulo f(x) is given by

[g(x)] = { h(x) ∈ F[x] | h(x) ≡ g(x) mod f(x) }.

A proof similar to that of Lemma 1.4.1 can be applied to prove the following lemma.
Lemma 1.5.4 Suppose f(x) has degree n, where n ≥ 1. Let F[x]/(f(x)) denote the set of all congruence classes of g(x) ∈ F[x] modulo f(x). Then

F[x]/(f(x)) = { Σ_{i=0}^{n−1} a_i x^i | a_i ∈ F for 0 ≤ i < n }

can be identified with the set of all polynomials of degree less than n.
Example 1.5.6 Let f(x) = x^2 + x + 1 ∈ F_2[x]. By Lemma 1.5.4,

F_2[x]/(f(x)) = { 0, 1, x, x + 1 }.

Similarly, for any other g(x) ∈ F_2[x] of degree 2,

F_2[x]/(g(x)) = { 0, 1, x, x + 1 }.

We can see that F_2[x]/(f(x)) and F_2[x]/(g(x)) contain equivalence classes generated by the same polynomials.
Naturally, for any .g(x), h(x) ∈ F [x]/(f (x)), the same as in Eqs. 1.12 and 1.13,
addition and multiplication in .F [x]/(f (x)) are computed modulo .f (x).
52 1 Mathematical and Statistical Background
For

g(x) = Σ_{i=0}^{n−1} a_i x^i,  h(x) = Σ_{i=0}^{n−1} b_i x^i,

we have

g(x) + h(x) mod f(x) = Σ_{i=0}^{n−1} c_i x^i,  where c_i = a_i + b_i mod 2.
Thus the addition computations in .F2 [x]/(f (x)) are the same for all .f (x) of the
same degree.
Example 1.5.8 Let F = F_2, f(x) = x^2 + x + 1 ∈ F_2[x], g(x) = x ∈ F_2[x]/(f(x)), and h(x) = x ∈ F_2[x]/(f(x)). We have

g(x) + h(x) = 2x = 0 mod f(x),  g(x)h(x) = x^2 = x + 1 mod f(x).
We also have the notion of the greatest common divisors between two nonzero
polynomials in .F [x] (cf. Definition 1.1.5). Then, for any .g(x) ∈ F [x], the modified
version of the Euclidean algorithm (Algorithm 1.1) can be applied to find the
greatest common divisor for .g(x) and .f (x), denoted .gcd(g(x), f (x)). Similarly the
extended Euclidean algorithm (Algorithm 1.2) can be applied to find the inverse
of .g(x) modulo .f (x) when .gcd(f (x), g(x)) = 1. More details are presented
in [LX04, Section 3.2].
Example 1.5.10 Let F = F_2 and f(x) = x^2 + x + 1, g(x) = x ∈ F_2[x]. By the Euclidean algorithm, we have

f(x) = (x + 1)g(x) + 1,  gcd(g(x), f(x)) = gcd(g(x), 1) = 1.
Table 1.2 Addition and multiplication in F_2[x]/(f(x)), where f(x) = x^2 + x + 1

  +    | 0     1     x     x+1        ×    | 0   1     x     x+1
  0    | 0     1     x     x+1        0    | 0   0     0     0
  1    | 1     0     x+1   x          1    | 0   1     x     x+1
  x    | x     x+1   0     1          x    | 0   x     x+1   1
  x+1  | x+1   x     1     0          x+1  | 0   x+1   1     x
1.5 Polynomial Rings 53
1 = g(x)(x + 1) + f(x),

so g(x)^{−1} = x + 1 mod f(x).
R[x]/(f(x)) = { a + bx | a, b ∈ R }.

Recall that

C = { a + bi | a, b ∈ R }.
There are p choices for each of the n a_i's. Hence the cardinality of F_p[x]/(f(x)) is p^n. The result follows from Theorem 1.2.3. □
Example 1.5.14 Let f(x) = x^2 + x + 1 ∈ F_2[x]; by Theorem 1.5.3, F_2[x]/(f(x)) ≅ F_{2^2}.
1.5.1 Bytes
b_7 x^7 + b_6 x^6 + b_5 x^5 + b_4 x^4 + b_3 x^3 + b_2 x^2 + b_1 x + b_0 ∈ F_2[x]/(f(x))

b_7 x^7 + b_6 x^6 + b_5 x^5 + b_4 x^4 + b_3 x^3 + b_2 x^2 + b_1 x + b_0 ↦ b_7 b_6 b_5 b_4 b_3 b_2 b_1 b_0
bytes.
Definition 1.5.5 For any two bytes v = v_7 v_6 . . . v_1 v_0 and w = w_7 w_6 . . . w_1 w_0, let g_v(x) = v_7 x^7 + v_6 x^6 + · · · + v_1 x + v_0 and g_w(x) = w_7 x^7 + w_6 x^6 + · · · + w_1 x + w_0. Then

v + w = c_7 c_6 . . . c_1 c_0,  where c_i = v_i + w_i mod 2.
Remark 1.5.2 Recall that a byte is also a vector in F_2^8. We have defined vector addition as bitwise XOR (see Definition 1.3.6), and

v +_{F_2^8} w = u_7 u_6 . . . u_1 u_0,  where u_i = v_i ⊕ w_i.
We note that

(x^6 + x^4 + x^2 + x + 1)(x^7 + x + 1) = x^13 + x^11 + x^9 + x^8 + x^6 + x^5 + x^4 + x^3 + 1,

and

x^8 = x^4 + x^3 + x + 1 mod f(x),
x^9 = x^5 + x^4 + x^2 + x mod f(x),
x^11 = x^7 + x^6 + x^4 + x^3 mod f(x),
x^13 = x^9 + x^8 + x^6 + x^5 mod f(x).

Thus

x^13 + x^11 + x^9 + x^8 + x^6 + x^5 + x^4 + x^3 + 1 = x^11 + x^4 + x^3 + 1 = x^7 + x^6 + 1 mod f(x),

which gives 57_16 × 83_16 = C1_16.
Example 1.5.16 In this example, we would like to compute the formula for a byte multiplied by 02_16 = x. Take any g(x) = b_7 x^7 + b_6 x^6 + · · · + b_1 x + b_0 ∈ F_2[x]/(f(x)). Thus, for any byte b_7 b_6 . . . b_1 b_0, multiplication by 02_16 is equivalent to a left shift by 1, followed by an XOR with 00011011_2 = 1B_16 if b_7 = 1.
Example 1.5.17
• 57_16 = 01010111_2, 02_16 × 57_16 = 10101110_2 = AE_16.
• 83_16 = 10000011_2, 02_16 × 83_16 = 00000110_2 ⊕ 00011011_2 = 00011101_2 = 1D_16.
• D4_16 = 11010100_2, 02_16 × D4_16 = 10101000_2 ⊕ 00011011_2 = 10110011_2 = B3_16.
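The shift-and-reduce rule above can be sketched in code. In this sketch, `xtime` implements multiplication by 02_16 and `gmul` builds full byte multiplication out of it; the function names are mine, but the field F_2[x]/(x^8 + x^4 + x^3 + x + 1) and the reduction constant 1B_16 are as in the text.

```python
# Multiplication in F2[x]/(f(x)) with f(x) = x^8 + x^4 + x^3 + x + 1, bytes
# encoding polynomials (bit i = coefficient of x^i). Reduction uses
# x^8 = x^4 + x^3 + x + 1 mod f(x), i.e., XOR with 0x1B.

def xtime(b: int) -> int:
    """Multiply a byte by 02_16 = x: shift left, XOR 0x1B if b7 was set."""
    b <<= 1
    if b & 0x100:
        b = (b & 0xFF) ^ 0x1B
    return b

def gmul(a: int, b: int) -> int:
    """Multiply two bytes in F2[x]/(x^8 + x^4 + x^3 + x + 1)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a       # add a * x^i for this bit of b
        a = xtime(a)          # a becomes a * x
        b >>= 1
    return result

print(hex(gmul(0x57, 0x83)))  # 0xc1, matching the worked example
```

The call `gmul(0x57, 0x83)` reproduces 57_16 × 83_16 = C1_16, and `xtime` reproduces each bullet of Example 1.5.17.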
Example 1.5.18 Now, let us compute the multiplication of a byte by 03_16 = x + 1. Take any h(x) = b_7 x^7 + b_6 x^6 + · · · + b_1 x + b_0 ∈ F_2[x]/(f(x)). We have

03_16^{−1} = (x + 1)^{−1} mod f(x) = x^7 + x^6 + x^5 + x^4 + x^2 + x = 11110110_2 = F6_16.
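The value F6_16 can be verified independently by a brute-force search over all nonzero bytes. This is only a sketch (the extended Euclidean algorithm mentioned earlier would find the inverse directly); `gf_mul` is an illustrative name.

```python
# Brute-force search for the inverse of 03_16 = x + 1 in
# F2[x]/(x^8 + x^4 + x^3 + x + 1), scanning all 255 nonzero bytes.

def gf_mul(a: int, b: int) -> int:
    """Carry-less multiply, then reduce modulo x^8 + x^4 + x^3 + x + 1."""
    prod = 0
    for i in range(8):
        if (b >> i) & 1:
            prod ^= a << i
    for deg in range(15, 7, -1):          # reduce degrees 15 down to 8
        if (prod >> deg) & 1:
            prod ^= 0x11B << (deg - 8)    # f(x) encoded as 1_0001_1011
    return prod

inv03 = next(b for b in range(1, 256) if gf_mul(0x03, b) == 1)
print(hex(inv03))  # 0xf6
```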
In this section, we give a brief discussion of binary codes, which will be useful for the design of countermeasures against side-channel attacks (Sect. 4.5.1.1) and fault attacks (Sect. 5.2.1).
Let n be a positive integer in the rest of this section. To study binary codes, we
look at the vector space .Fn2 , and we refer to vectors in .Fn2 as words of length n.
Definition 1.6.1
• w = w_0 w_1 . . . w_{n−1} ∈ F_2^n is called a binary word of length n.
Definition 1.6.2 For any v, u ∈ F_2^n, the Hamming distance between v and u, denoted dis(v, u), is defined as follows:

dis(v, u) = Σ_{i=0}^{n−1} dis(v_i, u_i),  where dis(v_i, u_i) = { 1 if v_i ≠ u_i, 0 if v_i = u_i }.  (1.24)
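Eq. 1.24 translates directly into code: the distance is the number of positions at which the two words disagree. A minimal sketch over binary words given as strings (the name `dis` mirrors the text's notation):

```python
# Hamming distance of Eq. 1.24: sum the per-position mismatch indicators.

def dis(v: str, u: str) -> int:
    """Hamming distance between two binary words of the same length."""
    assert len(v) == len(u)
    return sum(1 for vi, ui in zip(v, u) if vi != ui)

print(dis("1101", "1000"))  # 2, as in the error-detection discussion
```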
Proof (1)–(3) are easy to see. We provide the proof for (4). By Eq. 1.24, it suffices
to consider .n = 1. Take any .v, w, u ∈ F2 .
If .v = w,
Definition 1.6.4 A binary code of length n, size M, and distance d is called a binary (n, M, d)-code.
with a 1-bit flip from any codeword, we cannot get another codeword. But with 2-bit flips, we can change 1101 to 1000. Thus C is exactly 1-error detecting.
Theorem 1.6.1 A binary (n, M, d)-code C is k-error detecting if and only if d ≥ k + 1, i.e., C is an exactly (d − 1)-error detecting code.
Proof ⇐= If d ≥ k + 1, take c ∈ C and x ∈ F_2^n such that 1 ≤ dis(c, x) ≤ k. Then x ∉ C, and C is k-error detecting.
codeword .c(u) to Bob. Due to transmission noise, Bob might receive a word .x ∈ Fn2
not equal to .c(u). Thus we need to define a decoding rule for Bob that allows him
to find .u given .x.
We are interested in a minimum distance decoding rule, which specifies that after receiving x, Bob computes a codeword c_x ∈ C such that

dis(c_x, x) = min_{c∈C} dis(c, x).

If more than one codeword is identified as c_x, there are two options. An incomplete decoding rule says that Bob should request another transmission from Alice. Following a complete decoding rule, Bob would instead randomly select one of those codewords.
Example 1.6.5 Let C = { 0000, 0111, 1110, 1111 }. We use C to encode information words u ∈ F_2^2 with the encoding designed as follows:

c(00) = 0000,  c(01) = 0111,  c(10) = 1110,  c(11) = 1111.

Suppose Alice was sending the information 00 with codeword 0000 to Bob. Due to an error during the transmission, Bob received 0001. By the minimum distance decoding rule, Bob computes the distances between 0001 and the codewords in C:

dis(0001, 0000) = 1,  dis(0001, 0111) = 2,  dis(0001, 1110) = 4,  dis(0001, 1111) = 3.

Thus c_{0001} = 0000, and Bob gets the correct information 00.
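The decoding step of Example 1.6.5 can be sketched as follows; a complete rule would break ties randomly, while `min()` here simply keeps the first closest codeword it meets. The names `C`, `dis`, and `decode` are illustrative.

```python
# Minimum distance decoding for the code of Example 1.6.5: map the received
# word to the information word of a nearest codeword.

C = {"0000": "00", "0111": "01", "1110": "10", "1111": "11"}

def dis(v: str, u: str) -> int:
    """Hamming distance between two binary words of equal length."""
    return sum(1 for vi, ui in zip(v, u) if vi != ui)

def decode(x: str) -> str:
    """Return the information word of a codeword nearest to x."""
    nearest = min(C, key=lambda c: dis(c, x))
    return C[nearest]

print(decode("0001"))  # '00': 0001 is closest to the codeword 0000
```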
Definition 1.6.6 A binary code C is said to be k-error correcting if the minimum
distance decoding outputs the correct codeword when k or fewer bits are flipped. If
C is k-error correcting but not .k + 1-error correcting, then we say that C is exactly
k-error correcting.
60 1 Mathematical and Statistical Background
dis(v, c′) ≥ dis(c, c′) − dis(v, c) ≥ 2k + 1 − k = k + 1 > dis(v, c).
d bits. Define v ∈ F_2^n as

v_i = { c′_i for 0 ≤ i < k,  c_i for k ≤ i < d,  c_i = c′_i for i ≥ d }.

Then

dis(v, c′) = d − k ≤ k = dis(v, c).
Definition 1.6.1).
Example 1.6.7
• Let C = { 00, 11, 01, 10 } = F_2^2, then dim_{F_2}(C) = 2, and C is a binary [2, 2, 1]-linear code.
• Let C = 〈111〉 = { 000, 111 }, then { 111 } is a basis for C, and dim_{F_2}(C) = 1. C is a binary [3, 1, 3]-linear code.
Example 1.6.8 (Repetition code) Let
c = (u_0, u_1, . . . , u_{n−2}, c_{n−1}),  where c_{n−1} = Σ_{i=0}^{n−2} u_i.
The corresponding code C consists of codewords that have an even number of 1s.
C = { (c_0, c_1, . . . , c_{n−2}, c_{n−1}) | c_{n−1} = Σ_{i=0}^{n−2} c_i } ⊆ F_2^n.  (1.25)
v + w = (v_0 + w_0, v_1 + w_1, . . . , v_{n−1} + w_{n−1}),  v_{n−1} + w_{n−1} = Σ_{i=0}^{n−2} v_i + Σ_{i=0}^{n−2} w_i = Σ_{i=0}^{n−2} (v_i + w_i).
Two distinct codewords cannot differ at only one position in the first n − 1 bits, since their parity bits would then differ as well. Thus, the minimum distance of C is 2, and C is a binary [n, n − 1, 2]-linear code. By Theorems 1.6.1 and 1.6.2, C is exactly 1-error detecting and cannot correct errors.
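The parity-check construction of Eq. 1.25 and its minimum distance can be checked by enumeration. A small sketch for n = 4 (the helper names are mine):

```python
# Parity-check code of Eq. 1.25: append a bit making the number of 1s even.
# Every codeword then has even weight, and the minimum distance is 2.

from itertools import product

def parity_encode(u: tuple) -> tuple:
    """Append the parity bit c_{n-1} = sum of the information bits mod 2."""
    return u + (sum(u) % 2,)

# All codewords of length n = 4 (information words of length 3):
codewords = [parity_encode(u) for u in product((0, 1), repeat=3)]
assert all(sum(c) % 2 == 0 for c in codewords)   # even number of 1s

min_dist = min(sum(a != b for a, b in zip(c1, c2))
               for c1 in codewords for c2 in codewords if c1 != c2)
print(min_dist)  # 2, confirming the [n, n-1, 2] parameters
```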
Definition 1.6.9 The dual code of a binary linear code C is the orthogonal
complement of C, .C ⊥ .
By Lemma 1.3.4, .C ⊥ is a binary linear code. It is easy to see that .(C ⊥ )⊥ = C.
Example 1.6.10 Let C be a binary parity-check code of length n (see Exam-
ple 1.6.9). Then .c ∈ C ⊥ if and only if .c · v = 0 .∀v ∈ C, i.e.,
Σ_{i=0}^{n−1} c_i v_i = 0 ⇐⇒ Σ_{i=0}^{n−2} c_i v_i + c_{n−1} Σ_{i=0}^{n−2} v_i = 0 ⇐⇒ Σ_{i=0}^{n−2} (c_i + c_{n−1}) v_i = 0
for all v_i = 0, 1 (0 ≤ i ≤ n − 2), which is equivalent to c_i = c_{n−1} for all 0 ≤ i ≤ n − 2. Thus C^⊥ = { 00 . . . 00, 11 . . . 11 } is the n-repetition code (see Example 1.6.8).
Example 1.6.11 Let C = { 000, 111 } be the binary 3-repetition code; then C^⊥ = { 000, 011, 101, 110 }.
We note that when n = 1, wt(v) = 1 if v = 1 and wt(v) = 0 if v = 0. Then, for any v = (v_0, v_1, . . . , v_{n−1}) from F_2^n,

wt(v) = Σ_{i=0}^{n−1} wt(v_i).  (1.26)
wt(u + v) = wt((1, 1, 1, 0)) = 3. □
Definition 1.6.11 Let C be a binary linear code. A generator matrix for C is a matrix whose rows form a basis for C. A parity-check matrix for C is a generator matrix for C^⊥.
Example 1.6.13 Let C = { 000, 111 }, and we know that C^⊥ = { 000, 011, 101, 110 } (see Example 1.6.11). Let

G = ( 1 1 1 ),   H = ( 0 1 1
                       1 0 1 ).
uG = Σ_{i=0}^{k−1} u_i v_i ∈ C.
On the other hand, by Remark 1.3.5, any .c ∈ C has a unique representation of the
form
c = Σ_{i=0}^{k−1} u_i v_i,  where u_i ∈ F_2.
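Encoding with a generator matrix is exactly the sum uG above: XOR together the rows of G selected by the bits of u. A sketch for the 3-repetition code of Example 1.6.13, with illustrative names:

```python
# Encoding over F2 with a generator matrix: codeword = uG, the F2-linear
# combination of the rows of G with coefficients from u.

G = [(1, 1, 1)]                      # one basis row, so k = 1 (Example 1.6.13)

def encode(u: tuple) -> tuple:
    """Compute uG over F2: XOR the rows of G where u_i = 1."""
    n = len(G[0])
    c = [0] * n
    for ui, row in zip(u, G):
        if ui:
            c = [(ci + ri) % 2 for ci, ri in zip(c, row)]
    return tuple(c)

print(sorted(encode((u,)) for u in (0, 1)))  # [(0, 0, 0), (1, 1, 1)]
```

Running `encode` over all information words recovers exactly C = { 000, 111 }.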
Theorem 1.6.4 Let C be a binary linear code with at least two codewords, and
let H be a parity-check matrix for C. Then .dis (C) is given by d such that any
.d − 1 columns of H are linearly independent and H has d columns that are linearly
dependent.
Proof Take v ∈ C such that v ≠ 0. By definition,

Hv^⊤ = Σ_{i: v_i ≠ 0} v_i h_i = 0,

where h_i denotes the ith column of H. We can see that the columns h_i, where v_i ≠ 0, are linearly dependent. Note that wt(v) = |{ v_i | v_i ≠ 0 }|.
.
Thus, there exists .v ∈ C such that .wt (v) = d (i.e., .dis (C) ≤ d) if and only if
there are d columns of H that are linearly dependent.
.dis (C) ≥ d if and only if there is no .v ∈ C such that .wt (v) < d, which is
Ω = { 1, 2, 3, 4, 5, 6 }.

A = { 1, 2, 3 } ⊆ Ω is an event.
Ω = { (i, j) | 1 ≤ i, j ≤ 6 }.
1.7.1 σ -Algebras
Let .Ω be a set, and let .A denote a set of subsets of .Ω. .A is called a .σ -algebra if it
has the following properties:
• Ω ∈ A.
• If A ∈ A, then A^c ∈ A.
• A is closed under finite unions and intersections: if A_1, A_2, . . . , A_n ∈ A, then ∪_{i=1}^{n} A_i ∈ A and ∩_{i=1}^{n} A_i ∈ A.
• A is closed under countable unions and intersections: if A_1, A_2, · · · ∈ A, then ∪_{i=1}^{∞} A_i ∈ A and ∩_{i=1}^{∞} A_i ∈ A.
The pair .(Ω, A) is called a measurable space, meaning that it is a space on which
we can put a measure.
Example 1.7.2
• For any set .Ω, .A = { ∅, Ω } is a .σ -algebra.
• For any set .Ω, the power set .A = 2Ω is a .σ -algebra.
• Let us consider the random experiment to roll a die. We know .Ω =
{ 1, 2, 3, 4, 5, 6 }. Then,
A = { ∅, Ω, { 1 } , { 2, 3, 4, 5, 6 } }
.
is a .σ -algebra.
• If we toss a coin, .Ω = { H, T }. And
A = 2Ω = { ∅, Ω, { H } , { T } }
.
is a .σ -algebra.
1.7 Probability Theory 67
1.7.2 Probabilities
Let .Ω be a sample space, and let .(Ω, A) be a measurable space in this subsection.
Definition 1.7.2 A probability measure defined on a measurable space .(Ω, A) is a
function .P : A → [0, 1] such that
• .P (Ω) = 1, .P (∅) = 0.
• For any A_1, A_2, . . . ∈ A that are pairwise disjoint, i.e., A_{i_1} ∩ A_{i_2} = ∅ for i_1 ≠ i_2,

P( ∪_{i=1}^{∞} A_i ) = Σ_{i=1}^{∞} P(A_i).
Example 1.7.4 Let us consider the random experiment of tossing a coin, the sample
space .Ω = { H, T }. Let .A = 2Ω = { ∅, Ω, { H } , { T } }. Define
5 It is easy to show that the intersection of .σ -algebras is again a .σ -algebra. Since .2Ω is a .σ -algebra,
it follows that the smallest .σ -algebra containing open sets exists.
P(∅) = 0,  P(Ω) = 1,  P({ H }) = 1/2,  P({ T }) = 1/2.
P({ ω }) = 1/|Ω|,  ∀ω ∈ Ω.
We note that if P is a uniform probability measure on (Ω, A), then for any A ∈ A, P(A) = |A|/|Ω|.
P({ i }) = 1/6,  for i ∈ Ω.
Let A = { 1, 2, 3 }, B = { 2, 4 }; then

P(A) = 3/6 = 1/2,  P(B) = 2/6 = 1/3.
Take any A, B ∈ A such that P(B) > 0. We would like to compute the probability of A occurring given the knowledge that B has occurred. We do not need to consider A ∩ B^c since B has already occurred. Instead, we look at A ∩ B, which occurs when both A and B occur. This leads to the definition of the conditional probability of A given B:

P(A|B) := P(A ∩ B)/P(B),  where P(B) > 0.  (1.27)
A ∩ B = { 2 },  P(A ∩ B) = 1/6.

By Eq. 1.27,

P(A|B) = P(A ∩ B)/P(B) = (1/6)/(1/3) = 1/2.
That is, the probability of A occurring given the knowledge that B has occurred
is the same as the probability of A occurring without the knowledge that B has
occurred.
Example 1.7.8 Continuing Example 1.7.7,

P(A ∩ B) = 1/6,  P(A)P(B) = 1/2 × 1/3 = 1/6.

By Definition 1.7.4, A and B are independent. We also note that

P(A|B) = P(A) = 1/2.
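The die computations above can be reproduced by enumerating the sample space under the uniform measure; exact rationals avoid rounding. The helper `prob` is an illustrative name.

```python
# Examples 1.7.6-1.7.8 by enumeration: P(A|B) = P(A ∩ B)/P(B), and here
# A and B turn out to be independent.

from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
A, B = {1, 2, 3}, {2, 4}

def prob(event: set) -> Fraction:
    """Uniform probability measure: P(E) = |E| / |Omega|."""
    return Fraction(len(event & omega), len(omega))

p_a_given_b = prob(A & B) / prob(B)
print(p_a_given_b, prob(A))  # 1/2 1/2: P(A|B) = P(A), so A, B are independent
```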
Next, we state a very useful theorem.
Theorem 1.7.1 (Bayes’ Theorem) If P(A) > 0 and P(B) > 0, then

P(A|B) = P(B|A)P(A)/P(B).

Proof It follows directly from Eq. 1.27 that P(A|B)P(B) = P(A ∩ B) = P(B|A)P(A). □
Definition 1.7.5 A set of events .{ E1 , E2 , . . . | Ei ∈ A } is called a partition of .Ω
if:
• They are pairwise disjoint.
• .P (Ei ) > 0 for all i.
• .∪i Ei = Ω.
If the set of events is finite, it is called a finite partition of .Ω; otherwise, it is called
a countable partition of .Ω.
Example 1.7.9 Let .Ω = { 1, 2, 3, 4, 5, 6 }, .A = 2Ω , and P be the uniform
probability measure on .(Ω, A) (see Example 1.7.6). Let
E_1 = { 1, 2, 3 },  E_2 = { 4, 5 },  E_3 = { 6 }.

P(E_1) = 1/2,  P(E_2) = 1/3,  P(E_3) = 1/6.
Since E_i are pairwise disjoint, E_i ∩ A are also pairwise disjoint. We have

P(A) = P( ∪_i (A ∩ E_i) ) = Σ_i P(A ∩ E_i) = Σ_i P(A|E_i)P(E_i). □
P(A) = 1/3,  A ∩ E_1 = { 2 },  A ∩ E_2 = { 4 },  A ∩ E_3 = ∅.
By Eq. 1.27,

P(A|E_1) = (1/6)/(1/2) = 1/3,  P(A|E_2) = (1/6)/(1/3) = 1/2,  P(A|E_3) = 0.

Furthermore,

Σ_{i=1}^{3} P(A|E_i)P(E_i) = 1/3 × 1/2 + 1/2 × 1/3 = 1/3 = P(A).
P(E_m|A) = P(A|E_m)P(E_m) / Σ_i P(A|E_i)P(E_i).

Since Σ_i P(A|E_i)P(E_i) = P(A), this is

P(E_m|A) = P(A|E_m)P(E_m) / P(A).
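The law of total probability and the posterior computation can be checked numerically on the partition of Example 1.7.9; names below are illustrative.

```python
# Example 1.7.10's numbers: P(A) = sum_i P(A|E_i)P(E_i) over the partition,
# and the posterior P(E_2|A) from Bayes' formula.

from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
A = {2, 4}
partition = [{1, 2, 3}, {4, 5}, {6}]

def prob(event):
    """Uniform probability measure on the die's sample space."""
    return Fraction(len(event & omega), len(omega))

total = sum(prob(A & E) / prob(E) * prob(E) for E in partition)
posterior_e2 = (prob(A & partition[1]) / prob(partition[1])) \
               * prob(partition[1]) / total
print(total, posterior_e2)  # 1/3 1/2
```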
Example 1.7.11
• Fix A ∈ A; the indicator function for A, denoted 1_A, is defined as follows:

1_A : Ω → R,  1_A(ω) = { 1 if ω ∈ A, 0 if ω ∉ A }.
• Consider the probability space from Example 1.7.5, then any function .X : Ω →
R is a random variable. In such a case, X is called a discrete random variable.
• Let us consider the probability space discussed in Example 1.7.4. Define .X :
Ω → R such that .X(H ) = 0, X(T ) = 1. For any .B ∈ R, .X−1 (B) is always a
subset of .Ω, which is contained in .A. And X is a discrete random variable.
Let X be a random variable, and define P^X as follows:

P^X : R → [0, 1],  B ↦ P(X^{−1}(B)).
It is easy to see that .P X (R) = 1 and .P X (∅) = 0. Take any .Bi ∈ B that are
pairwise disjoint. Then .X−1 (Bi ) are also pairwise disjoint since X is a function.
The countable additivity of .P X follows from the countable additivity of P . Thus,
.P
X is a probability measure on .(R, R). We say that .P X is induced by X, and it
F : R → [0, 1],  x ↦ P(X^{−1}((−∞, x])).

For simplicity, we will write P(X ∈ B) instead of P(X^{−1}(B)) in Eq. 1.29 and P(X ≤ x) instead of P(X^{−1}((−∞, x])) in Eq. 1.30.
On the other hand, the next lemma says if we start from a function F with certain
properties, there always exists a random variable that has F as its CDF. The proof
can be found in, e.g., [Dur19, page 9].
Lemma 1.7.3 If a function F satisfies the following conditions, then it is the
distribution function of some random variable.
• F is nondecreasing.
• . lim F (x) = 1, . lim F (x) = 0.
x→∞ x→−∞
• F is right continuous, i.e., .lim F (y) = F (x).
y↓x
P(X = j) = Σ_{ω: X(ω)=j} P({ ω }).
Let T := X(Ω) be the image of Ω in R. The probability mass function (PMF) of X is defined to be the function

p_X : T → [0, 1],  x ↦ P(X = x).
We have the following relation between the PMF of X and the CDF of X:

F(a) = Σ_{x≤a, x∈T} p_X(x).
Example 1.7.12 Let us consider the probability space defined in Example 1.7.4.
We have discussed in Example 1.7.11 that
X : Ω → R,  X(H) = 0, X(T) = 1

p_X(0) = P(X = 0) = P({ H }) = 1/2,  p_X(1) = P(X = 1) = P({ T }) = 1/2.
When the distribution function F(x) = P(X ≤ x) has the form

F(x) = ∫_{−∞}^{x} f(y) dy,
we say that X has probability density function (PDF) f and X is called a continuous
random variable.
Example 1.7.13 Define f(x) = 1 for x ∈ (0, 1) and 0 otherwise. Then

F(x) = ∫_{−∞}^{x} f(y) dy

is given by

F(x) = { 0 if x ≤ 0,  x if 0 ≤ x ≤ 1,  1 if x > 1 }.
The standard normal distribution will be very useful in later parts of the book,
and we use .Ф(z) instead of .F (z) to denote its CDF. Moreover, we say that Z is
a standard normal random variable. Figure 1.1 shows that .f (z) is a bell-shaped
curve that is symmetric about 0. The symmetry is also apparent from the formula
for .f (z).
Next, we would like to define expectations and variances for random variables. The
exact formulas for discrete and continuous random variables are different, but the
information carried by those notions is the same. In particular, the expectation/mean
of a random variable X is the expected average value of X. And the variance of
X is the average squared distance from the mean. By squaring the distances, the
small deviations from the mean are reduced, and the big ones are enlarged. Thus the
variance measures how the values of X vary from the mean or how “spread out” the
values of X are.
When X is a discrete random variable .X : Ω → R with PMF .pX and .T = X(Ω)
(the image of .Ω in .R), its expectation/mean is defined as
E[X] = Σ_{x∈T} x p_X(x),  (1.31)
E[X] = 0 × p_X(0) + 1 × p_X(1) = 0 × 1/2 + 1 × 1/2 = 1/2.
When X is a continuous random variable with PDF f , its expectation/mean is
defined as
E[X] = ∫_{−∞}^{∞} x f(x) dx,  (1.32)
E[X] = ∫_{−∞}^{∞} x f(x) dx = ∫_{0}^{1} x dx = x^2/2 |_0^1 = 1/2.
Example 1.7.17 Let Z be a random variable that induces the standard normal distribution (see Example 1.7.14); by Eq. 1.32,

E[Z] = ∫_{−∞}^{∞} z f(z) dz = (1/√(2π)) ∫_{−∞}^{∞} z exp(−z^2/2) dz = 0.

As shown in Fig. 1.1, f(z) is symmetric about 0, so it is not surprising that the expected average value of Z is 0.
Let g be a function .g : R → R. Then .g(X) is also a random variable.7 It can be
proven that if X is a discrete random variable with PMF .pX , then
E[g(X)] = Σ_{x} g(x) p_X(x).
6 For example, if .Ω is finite or if .Ω is countable and the series converges absolutely, the sum exists.
7 To be more precise, g should be a measurable function for .g(X) to be a random variable. For the
definition of measurable functions, we refer the readers to [Yeh14, page 72].
g : R → R,  x ↦ x^2.
E[X + Y] = E[X] + E[Y],  E[aX + b] = aE[X] + b,  E[b] = b.  (1.33)
= E[X^2] + μ^2 − 2μ^2
= E[X^2] − μ^2.  (1.34)
Equation 1.34 provides the formula for computing the variance of a random variable X. More specifically, let X be a random variable with mean E[X] = μ. If E[X^2] < ∞, then the variance of X is given by

Var(X) = E[(X − μ)^2] = E[X^2] − μ^2.  (1.35)
Example 1.7.20 Let us consider the discrete random variable discussed in Examples 1.7.4 and 1.7.12. By Eq. 1.35 and Examples 1.7.15 and 1.7.18,

Var(X) = E[X^2] − (1/2)^2 = Σ_x x^2 p_X(x) − 1/4 = 0 × p_X(0) + 1 × p_X(1) − 1/4 = 1/2 − 1/4 = 1/4.
Example 1.7.21 Let X be a continuous random variable that induces the uniform distribution on (0, 1) (see Example 1.7.13); by Eq. 1.35 and Examples 1.7.16 and 1.7.18,

Var(X) = ∫_{−∞}^{∞} x^2 f(x) dx − E[X]^2 = ∫_{0}^{1} x^2 dx − (1/2)^2 = x^3/3 |_0^1 − 1/4 = 1/3 − 1/4 = 1/12.
Example 1.7.22 Let Z be a random variable that induces the standard normal distribution (see Example 1.7.14); by Eq. 1.35 and Examples 1.7.17 and 1.7.18,

Var(Z) = E[Z^2] − E[Z]^2 = ∫_{−∞}^{∞} z^2 f(z) dz − 0 = (1/√(2π)) ∫_{−∞}^{∞} z^2 exp(−z^2/2) dz = 1.
Var(aX + b) = E[(aX + b − E[aX + b])^2] = E[(aX + b − aE[X] − b)^2]
= a^2 E[(X − E[X])^2] = a^2 Var(X).  (1.36)

In particular, we have

Var(b) = 0,  Var(X + b) = Var(X),  Var(aX) = a^2 Var(X).
Example 1.7.23 Let Z ∼ N(0, 1) be a standard normal random variable. Take any σ, μ ∈ R with σ^2 > 0. Define Y = σZ + μ. Then by Eqs. 1.33 and 1.36,

E[Y] = μ,  Var(Y) = σ^2.
It can be shown (see, e.g., [Dur19, page 28]) that Y has PDF

f(y) = (1/(σ√(2π))) exp( −(y − μ)^2 / (2σ^2) ).  (1.37)

We say that Y induces a normal distribution with mean μ and variance σ^2, written Y ∼ N(μ, σ^2). Y is also called normal/a normal random variable. We note that f(y) is a bell-shaped curve symmetric about μ and obtains its maximum value of

1/(σ√(2π)) ≈ 0.399/σ
Z := (Y − μ)/σ

is a standard normal random variable (for a proof, see [Dur19, exercise 1.2.5]).
Next, let us look at the relations between two random variables. First, similar to
Definition 1.7.4, we give the definition of independent random variables.
Definition 1.7.7 Given two random variables .X : Ω → R, .Y : Ω → R, they are
said to be independent if for any A, B ∈ R,

P(X ∈ A, Y ∈ B) = P(X ∈ A)P(Y ∈ B).

E[XY] = E[X]E[Y]  if E[|X|] < ∞ and E[|Y|] < ∞.  (1.38)
where μ_X and μ_Y denote the expectations of X and Y, respectively. It is easy to see that

Cov(X, Y) = E[XY − μ_X Y − μ_Y X + μ_X μ_Y]
= E[XY] − μ_X μ_Y − μ_Y μ_X + μ_X μ_Y = E[XY] − E[X]E[Y].  (1.40)

Thus,
If we further assume X_i are independent with E[|X_i|] < ∞, by Remark 1.7.2,

Var( Σ_{i=1}^{n} X_i ) = Σ_{i=1}^{n} Var(X_i).  (1.41)
Recall that we have defined Borel .σ -algebra for .Rn (Definition 1.7.1). Corre-
spondingly, we can define a random vector similar to Definition 1.7.6.
Definition 1.7.9 A random vector X is a function .X : Ω → Rd , such that
Σ_{j=1}^{n} a_j X_j,  a_j ∈ R
We write .X ∼ N(μ, Q), and we say that .X is Gaussian/a Gaussian random vector.
Example 1.7.24 If X_1, . . . , X_n are pairwise independent random variables and each X_i ∼ N(μ_i, σ_i^2) is normal, then X = (X_1, X_2, . . . , X_n) induces a Gaussian distribution with mean μ = (μ_1, . . . , μ_n) and covariance matrix Q, a diagonal matrix with Q_ii = σ_i^2 (see [JP04, page 127] for a proof).
When we look at Gaussian random vectors, we have the following nice property
for the components of the random vector. The proof can be found in, e.g., [JP04,
page 128].
Theorem 1.7.3 Let .X = (X1 , X2 , . . . , Xn ) be a Gaussian random vector. Then
the components .Xi are independent if and only if the covariance matrix Q of .X is
diagonal.
A direct corollary is as follows.
with a diagonal covariance matrix (see Example 1.7.24). Again by Theorem 1.7.3, X_i and X_j are independent. □
Corollary 1.7.2 Two normal random variables X and Y are independent if and
only if they are uncorrelated.
Definition 1.7.11 Let X and Y be two random variables with finite variances. The
correlation coefficient of X and Y is given by
ρ = Cov(X, Y) / √(Var(X)Var(Y)).  (1.42)
1.8 Statistics
In this section, we will first discuss a few important distributions (Sect. 1.8.1).
Then we will introduce statistical methods for estimating the mean and variance
of a normal distribution (Sect. 1.8.2) which utilize properties of those important
distributions. Those methods will provide more insights into our analysis of device
leakages in Sect. 4.2.3. Finally, we touch on some basics of hypothesis testing
(Sect. 1.8.3) which justifies leakage assessment methods that will be introduced in
Sect. 4.2.3.
We suggest the readers come back to this part later when they reach Chap. 4.
Let Z denote a random variable that induces a standard normal distribution. We have
discussed in Example 1.7.14 that Z has the probability density function
f(z) = (1/√(2π)) exp(−z^2/2)

P(Z > z_α) = 1 − Ф(z_α) = α,  i.e., Ф(z_α) = 1 − α.  (1.43)
Those .zα values are useful for many applications, and there are tables listing
the values of .Ф(z) for small values of z (e.g., [Ros20, Table A1]). Given .α, the
approximated value of .zα can be found by examining such a table. In Table 1.4, we
list a few values of .zα with corresponding .α, which will be used later in the book.
By definition, .Ф(z) is the integral of .f (z). As shown in Fig. 1.3, .α corresponds
to the area under .f (z) for .z > zα . Furthermore, since .f (z) is symmetric about 0,
we have
Table 1.4 Values of z_α (see Eq. 1.43) with corresponding α

  α      0.1     0.05    0.01    0.005   0.001
  1 − α  0.900   0.950   0.990   0.995   0.999
  z_α    1.282   1.645   2.326   2.576   3.090
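Table 1.4 can be checked numerically using the identity Ф(z) = (1 + erf(z/√2))/2 for the standard normal CDF; the function name `phi` is illustrative.

```python
# Verify Table 1.4: for each tabulated z_alpha, Phi(z_alpha) should equal
# 1 - alpha (to the table's three-decimal precision).

from math import erf, sqrt

def phi(z: float) -> float:
    """CDF of the standard normal distribution via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

for z_alpha, alpha in [(1.282, 0.1), (1.645, 0.05), (2.326, 0.01),
                       (2.576, 0.005), (3.090, 0.001)]:
    print(f"Phi({z_alpha}) = {phi(z_alpha):.4f}  (1 - alpha = {1 - alpha})")
```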
1.8 Statistics 83
and

X = Σ_{i=1}^{n} Z_i^2.

Given α ∈ (0, 1), define χ^2_{α,n} such that

P(X ≥ χ^2_{α,n}) = α.  (1.45)
As shown in Fig. 1.4, α corresponds to the area under the PDF of X for X ≥ χ^2_{α,n}. We also have

P(χ^2_{1−α,n} < X < χ^2_{α,n}) = P(X ≥ χ^2_{1−α,n}) − P(X ≥ χ^2_{α,n}) = 1 − 2α.
Fig. 1.5 Probability density functions for .Tn ∼ tn (.n = 2, 5, 10) and for the standard normal
random variable Z
T_n := Z / √(X/n).
For example, in Fig. 1.5 we can see the PDF of .Tn for .n = 2, 5, 10 and for .Z ∼
N(0, 1).
As for z_α (Eq. 1.43) and χ^2_{α,n} (Eq. 1.45), given α ∈ (0, 1), we define t_{α,n} such that

P(T_n ≥ t_{α,n}) = α.  (1.46)
and
which gives

P(−t_{α/2,n} ≤ T_n ≤ t_{α/2,n}) = 1 − α,  (1.47)
or
The table of values of .tα,n can be found in standard books for statistics, see, e.g.,
[Ros20, Table A3].
Remark 1.8.2 For large values of n (.n ≥ 30), .tα,n can be approximated by .zα (see
Table 1.4).
refer to this set as a sample. An actual outcome for .Xi , denoted .xi , is called a
realization of .Xi .
X̄ := (1/n) Σ_{i=1}^{n} X_i.  (1.49)
Remark 1.8.3 It can be shown that the sum of independent normal random
variables induces a normal distribution with mean (respectively, variance) given
by the sum of the means (respectively, variances) of each random variable (see,
e.g., [JP04, page 120] and also Eqs. 1.33 and 1.41).
Since the X_i ∼ N(μ_x, σ_x^2) are independent, together with Eqs. 1.33 and 1.41, we have

E[X̄] = (1/n) Σ_{i=1}^{n} E[X_i] = μ_x,  Var(X̄) = (1/n^2) Σ_{i=1}^{n} Var(X_i) = σ_x^2 / n.
By Remark 1.8.3,
X̄ ∼ N(μ_x, σ_x^2 / n),  (1.50)
i.e., the sample mean is a normal random variable with mean .μx and variance .σx2 /n.
It follows from Remark 1.7.1 that

(X̄ − μ_x) / (σ_x/√n) ∼ N(0, 1).  (1.51)
Similarly, for i = 1, 2, . . . , n,

(X_i − μ_x)/σ_x ∼ N(0, 1).  (1.52)
S_x^2 := (1/(n−1)) Σ_{i=1}^{n} (X_i − X̄)^2.  (1.53)
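Eqs. 1.49 and 1.53 translate directly into code; note the n − 1 denominator of the sample variance. The sketch cross-checks against Python's standard `statistics` module, which uses the same convention.

```python
# Sample mean (Eq. 1.49) and sample variance (Eq. 1.53, denominator n - 1).

import statistics

def sample_mean(xs):
    return sum(xs) / len(xs)

def sample_variance(xs):
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(sample_mean(data), sample_variance(data))
# statistics.variance also divides by n - 1, so the two must agree:
assert abs(sample_variance(data) - statistics.variance(data)) < 1e-12
```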
We note that

Σ_{i=1}^{n} (X_i − μ_x)^2 = Σ_{i=1}^{n} ((X_i − X̄) + (X̄ − μ_x))^2
= n(X̄ − μ_x)^2 + Σ_{i=1}^{n} (X_i − X̄)^2 + 2(X̄ − μ_x) Σ_{i=1}^{n} (X_i − X̄)
= n(X̄ − μ_x)^2 + Σ_{i=1}^{n} (X_i − X̄)^2,

where

Σ_{i=1}^{n} (X_i − X̄) = −nX̄ + Σ_{i=1}^{n} X_i = 0.
Σ_{i=1}^{n} ((X_i − μ_x)/σ_x)^2 = ( √n(X̄ − μ_x)/σ_x )^2 + Σ_{i=1}^{n} (X_i − X̄)^2 / σ_x^2.  (1.54)
Since .Xi are independent normal random variables, by Eq. 1.52 and Definition 1.8.1,
the left-hand side of Eq. 1.54 induces a .χ 2 -distribution with n degrees of freedom.
By Eq. 1.51 and Definition 1.8.1, the first term of the right-hand side of Eq. 1.54
induces a .χ 2 -distribution with 1 degree of freedom. By Remark 1.8.1, it is tempting
to conclude that the two terms on the right-hand side of Eq. 1.54 are independent
and the second term induces a .χ 2 -distribution with .n − 1 degrees of freedom.
Such a result has indeed been proven. In particular, a proof of the following theorem is given in [Ros20, page 216].
Theorem 1.8.1 The sample mean .X and sample variance .Sx2 are independent
random variables. Furthermore,
(n − 1)S_x^2 / σ_x^2 ∼ χ^2_{n−1}.  (1.55)
√n (X̄ − μ_x) / S_x ∼ t_{n−1}.
(X̄ − μ_x)/(σ_x/√n) ∼ N(0, 1),  (n − 1)S_x^2/σ_x^2 ∼ χ^2_{n−1},
by Definition 1.8.2,
( √n (X̄ − μ_x)/σ_x ) / √(S_x^2/σ_x^2) = √n (X̄ − μ_x)/S_x ∼ t_{n−1}. □
Let .Θ denote the subset of .R that contains all possible values of .μx . A point
estimator of .μx is a function with domain .Rn and codomain .Θ that is used to
estimate the value of .μx . We use a point in .Θ for the estimation, hence the name
point estimator.
Remark 1.8.4 For example, we can use the sample mean as a point estimator for μ_x. Similarly, we can use the sample variance as a point estimator for σ_x^2 (see Example 4.2.1).
Example 1.8.1 (Sample correlation coefficient) Suppose U and W are two random variables. Let {(U_1, W_1), (U_2, W_2), . . . , (U_n, W_n)} be a sample for the pair of random variables (U, W). We further denote the sample mean and sample variance of {U_1, U_2, . . . , U_n} by Ū and S_u^2. Similarly, the sample mean and sample variance of {W_1, W_2, . . . , W_n} are denoted by W̄ and S_w^2. Then, following Definition 1.7.11, we can define the sample correlation coefficient, denoted by r, as follows (see Eqs. 1.39 and 1.35):

r = ( (1/n) Σ_{i=1}^{n} (U_i − Ū)(W_i − W̄) ) / √( ( (1/n) Σ_{i=1}^{n} (U_i − Ū)^2 )( (1/n) Σ_{i=1}^{n} (W_i − W̄)^2 ) )
  = Σ_{i=1}^{n} (U_i − Ū)(W_i − W̄) / ( √(Σ_{i=1}^{n} (U_i − Ū)^2) √(Σ_{i=1}^{n} (W_i − W̄)^2) ).  (1.56)
Then, the sample correlation coefficient can be used as a point estimator for
the correlation coefficient between U and W . We note that since the correlation
coefficient analyzes the relations between U and W , we collect samples in pairs
.(Ui , Wi ).
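Eq. 1.56 in its final form can be sketched directly; the name `sample_corr` is illustrative. A perfectly linear sample gives r = ±1, which is a convenient sanity check.

```python
# Sample correlation coefficient r of Eq. 1.56, a point estimator for the
# correlation coefficient of (U, W).

from math import sqrt

def sample_corr(us, ws):
    n = len(us)
    u_bar, w_bar = sum(us) / n, sum(ws) / n
    num = sum((u - u_bar) * (w - w_bar) for u, w in zip(us, ws))
    den = sqrt(sum((u - u_bar) ** 2 for u in us)) * \
          sqrt(sum((w - w_bar) ** 2 for w in ws))
    return num / den

print(sample_corr([1, 2, 3, 4], [2, 4, 6, 8]))   # approx. 1.0 (linear)
print(sample_corr([1, 2, 3, 4], [8, 6, 4, 2]))   # approx. -1.0
```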
However, we do not expect .μx to be exactly equal to the sample mean. Thus, we
would like to specify an interval for which we have a certain degree of confidence
that our parameter lies. We refer to such an estimator as an interval estimator.
For the rest of this part, let .α ∈ (0, 1) be a real number. We recall the definitions
of .zα and .tα from Eqs. 1.43 and 1.46, respectively.
Interval estimator for μ_x with known variance We first consider σ_x^2 to be known. By Eqs. 1.44 and 1.51,

P( −z_{α/2} < (X̄ − μ_x)/(σ_x/√n) < z_{α/2} ) = 1 − α,
which gives

P( X̄ − z_{α/2} σ_x/√n < μ_x < X̄ + z_{α/2} σ_x/√n ) = 1 − α.

Thus, the probability that μ_x lies between X̄ − z_{α/2} σ_x/√n and X̄ + z_{α/2} σ_x/√n is 1 − α. We say that

( x̄ − z_{α/2} σ_x/√n,  x̄ + z_{α/2} σ_x/√n )  (1.57)

is a 100(1 − α) percent confidence interval for μ_x, where x̄ is a realization of X̄.
We define the precision of our estimate, denoted by c, to be

c := z_{α/2} σ_x/√n,

which is the length of half of the confidence interval. It measures how "close" our estimate is to μ_x. Consequently, to have an estimate with precision c and 100(1 − α) confidence, the number of data in the sample should be at least (see Example 4.2.2)

n = (σ_x^2 / c^2) z_{α/2}^2.  (1.58)
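Eqs. 1.57 and 1.58 can be sketched as follows. The sample values and the value z_{α/2} ≈ 1.96 for α = 0.05 (not in Table 1.4) are my own illustrative assumptions; the function names are likewise mine.

```python
# Confidence interval (Eq. 1.57) and required sample size (Eq. 1.58) for the
# mean of a normal distribution with known variance.

from math import ceil, sqrt

def confidence_interval(x_bar, sigma, n, z_half_alpha):
    """100(1 - alpha)% confidence interval for the mean, variance known."""
    c = z_half_alpha * sigma / sqrt(n)
    return (x_bar - c, x_bar + c)

def required_n(sigma, c, z_half_alpha):
    """Smallest n giving precision c at 100(1 - alpha)% confidence."""
    return ceil((sigma ** 2) * (z_half_alpha ** 2) / c ** 2)

print(confidence_interval(10.0, 2.0, 100, 1.96))  # approx. (9.608, 10.392)
print(required_n(1.0, 0.1, 1.96))                 # 385
```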
Interval estimator for μ_x with unknown variance In case the variance is unknown, by Lemma 1.8.1 and Eq. 1.47, we have

P( −t_{α/2,n−1} ≤ √n (X̄ − μ_x)/S_x ≤ t_{α/2,n−1} ) = 1 − α,

which gives

P( X̄ − t_{α/2,n−1} S_x/√n ≤ μ_x ≤ X̄ + t_{α/2,n−1} S_x/√n ) = 1 − α.
Thus a 100(1 − α) percent confidence interval for μ_x is given by (see Example 4.2.2)

( x̄ − t_{α/2,n−1} s_x/√n,  x̄ + t_{α/2,n−1} s_x/√n ).  (1.59)
Then to have an estimate with precision c and 100(1 − α) confidence, the number of data required in the sample is given by

n = (s_x^2 / c^2) t_{α/2,n−1}^2.
By Remark 1.8.2, when n is large (≥ 30), t_{α,n} is close to z_α, and n can be estimated by (see Example 4.2.2)

n ≈ (s_x^2 / c^2) z_{α/2}^2.  (1.60)
For the rest of this part, let Y be a normal random variable with mean μ_y and variance σ_y^2 that is independent of X. Let { Y_1, Y_2, . . . , Y_m } be a sample for Y with sample mean Ȳ and sample variance S_y^2. We are interested in estimating μ_x − μ_y.
We note that since X̄ and Ȳ are point estimators for μ_x and μ_y, respectively, X̄ − Ȳ is a point estimator for μ_x − μ_y.
Interval estimator for μ_x − μ_y with known variances Suppose we know the values of σ_x^2 and σ_y^2. By Eq. 1.50,

X̄ ∼ N(μ_x, σ_x^2/n),  Ȳ ∼ N(μ_y, σ_y^2/m).

By Remark 1.8.3,

E[X̄ − Ȳ] = μ_x − μ_y,  Var(X̄ − Ȳ) = σ_x^2/n + σ_y^2/m,
and
⎛ ⎞
σx2 σy2 X − Y − (μx − μy )
.X − Y ∼ N μx − μy , + =⇒ / ∼ N(0, 1).
n m σx2 σy2
n + m
(1.61)
By Eq. 1.44, we have
$$P\left(-z_{\alpha/2} < \frac{\overline{X} - \overline{Y} - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}} < z_{\alpha/2}\right) = 1 - \alpha.$$
Thus a $100(1-\alpha)$ percent confidence interval for $\mu_x - \mu_y$ is
$$\left(\bar{x} - \bar{y} - z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}},\ \bar{x} - \bar{y} + z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}\right). \qquad (1.62)$$
The precision c is
$$c := z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}.$$
Taking $n = m$, to have an estimate with precision c and $100(1-\alpha)$ confidence, the number of data required in each sample is
$$n = \frac{z_{\alpha/2}^2(\sigma_x^2 + \sigma_y^2)}{c^2}. \qquad (1.63)$$
Interval estimator for $\mu_x - \mu_y$ with unknown equal variance Suppose $\sigma_x = \sigma_y$ is unknown. Let $\sigma = \sigma_x = \sigma_y$. By Eq. 1.55,
$$\frac{(n-1)S_x^2}{\sigma^2} \sim \chi^2_{n-1}, \qquad \frac{(m-1)S_y^2}{\sigma^2} \sim \chi^2_{m-1}.$$
Since we assume the samples are independent, those two $\chi^2$ random variables are independent. By Remark 1.8.1, we have
$$\frac{(n-1)S_x^2}{\sigma^2} + \frac{(m-1)S_y^2}{\sigma^2} \sim \chi^2_{m+n-2}. \qquad (1.64)$$
Let
$$S_p^2 := \frac{(n-1)S_x^2 + (m-1)S_y^2}{n+m-2}. \qquad (1.65)$$
By Theorem 1.8.1, $\overline{X}, S_x^2, \overline{Y}, S_y^2$ are independent. By Definition 1.8.2 and Eqs. 1.61 and 1.64,
$$\frac{\overline{X} - \overline{Y} - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma^2}{n} + \frac{\sigma^2}{m}}} \Bigg/ \sqrt{S_p^2/\sigma^2} = \frac{\overline{X} - \overline{Y} - (\mu_x - \mu_y)}{S_p\sqrt{1/n + 1/m}} \sim t_{n+m-2}. \qquad (1.66)$$
Taking $n = m$, to have an estimate with precision c and $100(1-\alpha)$ confidence, the number of data required in each sample is
$$n = \frac{2 t_{\alpha/2,2n-2}^2 s_p^2}{c^2}.$$
For large n ($n \ge 30$), we can approximate n by (see Example 4.2.3)
$$n \approx \frac{2 z_{\alpha/2}^2 s_p^2}{c^2}. \qquad (1.67)$$
distribution induced by X with sample mean $\overline{X}$ and sample variance $S_x^2$. We would like to test hypotheses about $\mu_x$ using data from this sample.
The hypothesis that we want to test is called the null hypothesis, denoted by $H_0$. For example,
$$H_0: \mu_x = 1, \qquad H_0: \mu_x \ge 0.$$
We will test the null hypothesis against an alternative hypothesis, denoted by $H_1$. For example,
$$H_1: \mu_x \ne 1, \qquad H_1: \mu_x > 1.$$
A test rejects $H_0$ when the observed sample $\{x_1, x_2, \ldots, x_n\}$ falls in a certain set C, and C is called the critical region. We also define the level of significance of the test, denoted by $\alpha$, such that when $H_0$ is true, the probability of rejecting it is not bigger than $\alpha$, namely
$$P(\{x_1, x_2, \ldots, x_n\} \in C \mid H_0 \text{ is true}) \le \alpha.$$
Thus, the main procedure in our hypothesis testing is to find the critical region C given a level of significance $\alpha$.
Two-sided hypothesis testing concerning $\mu_x$ Let $\mu_0 \in \mathbb{R}$ be a constant. We set the null hypothesis and the alternative hypothesis as follows:
$$H_0: \mu_x = \mu_0, \qquad H_1: \mu_x \ne \mu_0. \qquad (1.68)$$
Recall that the sample mean, $\overline{X}$, is a point estimator for $\mu_x$ (see Remark 1.8.4). Then it is reasonable to accept $H_0$ if $\overline{X}$ is not too far from $\mu_0$. Given $\alpha$, we choose the critical region to be
$$C = \left\{(X_1, X_2, \ldots, X_n) \mid |\overline{X} - \mu_0| > c\right\}, \qquad (1.69)$$
where c is such that
$$P(|\overline{X} - \mu_0| > c) = \alpha. \qquad (1.70)$$
Then our main task is to find c that satisfies the above equation.
Suppose the variance $\sigma_x^2$ is known. If $\mu_x = \mu_0$, then by Eq. 1.50,
$$\overline{X} \sim N\left(\mu_0, \frac{\sigma_x^2}{n}\right).$$
Define
$$Z := \frac{\overline{X} - \mu_0}{\sigma_x/\sqrt{n}}; \qquad (1.71)$$
by Remark 1.7.1, $Z \sim N(0, 1)$. According to Eq. 1.70, we can choose c such that
$$P\left(|Z| > \frac{c\sqrt{n}}{\sigma_x}\right) = \alpha \implies 2P\left(Z > \frac{c\sqrt{n}}{\sigma_x}\right) = \alpha \implies P\left(Z > \frac{c\sqrt{n}}{\sigma_x}\right) = \frac{\alpha}{2}.$$
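The two-sided test with known variance can be sketched directly from Eq. 1.71 (the function name is ours; it returns both the decision and the standardized statistic):

```python
from math import sqrt
from statistics import NormalDist, mean

def z_test_two_sided(sample, mu0, sigma, alpha=0.05):
    """Reject H0: mu_x = mu0 when |Z| > z_{alpha/2}, with Z as in Eq. 1.71.
    Assumes the population standard deviation sigma is known."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit, z
```

With 100 observations all equal to 0.5 and $\sigma_x = 1$, the statistic is $Z = 0.5/(1/\sqrt{100}) = 5$, well beyond $z_{0.025} \approx 1.96$, so $H_0: \mu_x = 0$ is rejected.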
Similarly, when the variance $\sigma_x^2$ is unknown, we work with the statistic T defined in Eq. 1.74 and choose c such that
$$P(|T| > c) = \alpha,$$
which gives
$$c = t_{\alpha/2,n-1}.$$
Hence to achieve the level of significance $\alpha$, we reject $H_0$ (Eq. 1.68) if
$$\left|\frac{\sqrt{n}(\bar{x} - \mu_0)}{s_x}\right| > t_{\alpha/2,n-1},$$
and accept $H_0$ otherwise.
H0 : μ = μ0 ,
. H1 : μ > μ0 . (1.75)
To find the value of c, we assume .H0 is true. Then by definition, c should be chosen
such that
. P (X − μ0 > c) = α.
{ }
C = (X1 , X2 , . . . , Xn ) | X > c .
. (1.78)
Then by Eq. 1.77, to test whether $\mu_x$ is different from 0 with significance level $\alpha$, the number of data required is at least (see Example 4.2.7)
$$n = \frac{\sigma_x^2}{c^2} z_\alpha^2. \qquad (1.79)$$
In case we do not know the variance $\sigma_x^2$, by the definition of T (Eq. 1.74), we have
$$P\left(T > \frac{c\sqrt{n}}{S_x}\right) = \alpha.$$
Thus we reject $H_0$ if
$$\frac{\sqrt{n}(\bar{x} - \mu_0)}{s_x} > t_{\alpha,n-1},$$
and accept $H_0$ otherwise. When n is large ($\ge 30$), we reject $H_0$ if (see Example 4.2.7)
$$\frac{\sqrt{n}(\bar{x} - \mu_0)}{s_x} > z_\alpha. \qquad (1.80)$$
Suppose we want to test if the mean $\mu_x$ is bigger than 0 with significance level $\alpha$, and we have a good estimate for c. Set $\mu_0 = 0$. The number of data required is at least
$$n = \frac{s_x^2}{c^2} t_{\alpha,n-1}^2.$$
For large n ($n \ge 30$), we have (see Example 4.2.7)
$$n = \frac{s_x^2}{c^2} z_\alpha^2. \qquad (1.81)$$
Two-sided hypothesis testing about $\mu_x$ and $\mu_y$ For the rest of this part, let Y denote a normal random variable independent of X with mean $\mu_y$ and variance $\sigma_y^2$. Furthermore, let $\{Y_1, Y_2, \ldots, Y_m\}$ denote a sample from the distribution induced by Y. We test
$$H_0: \mu_x = \mu_y, \qquad H_1: \mu_x \ne \mu_y. \qquad (1.82)$$
Since $\overline{X}$ and $\overline{Y}$ are point estimators for $\mu_x$ and $\mu_y$, respectively, $\overline{X} - \overline{Y}$ is a point estimator for $\mu_x - \mu_y$. Then it is reasonable to reject $H_0$ when $|\overline{X} - \overline{Y}|$ is far from zero. Given $\alpha$, our critical region is of the form
$$C = \left\{(X_1, X_2, \ldots, X_n, Y_1, Y_2, \ldots, Y_m) \mid |\overline{X} - \overline{Y}| > c\right\}. \qquad (1.83)$$
Suppose $\sigma_x^2$ and $\sigma_y^2$ are known. When $H_0$ is true, i.e., $\mu_x = \mu_y$, by Eq. 1.61,
$$\frac{\overline{X} - \overline{Y}}{\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}} \sim N(0, 1). \qquad (1.85)$$
By Eq. 1.44,
$$P\left(-z_{\alpha/2} < \frac{\overline{X} - \overline{Y}}{\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}} < z_{\alpha/2}\right) = 1 - \alpha \implies P\left(\frac{|\overline{X} - \overline{Y}|}{\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}} > z_{\alpha/2}\right) = \alpha.$$
Thus, we let
$$c = z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}. \qquad (1.86)$$
When $n = m$, the number of data required in each sample is given by (see Example 4.2.8)
$$n = \frac{z_{\alpha/2}^2(\sigma_x^2 + \sigma_y^2)}{c^2}. \qquad (1.87)$$
In case the variances are unknown but we know that $\sigma_x = \sigma_y$, let $\sigma = \sigma_x = \sigma_y$. By Eq. 1.66, when $\mu_x = \mu_y$,
$$\frac{\overline{X} - \overline{Y}}{\sqrt{S_p^2(1/n + 1/m)}} \sim t_{n+m-2}.$$
Thus, we let
$$c = t_{\alpha/2,n+m-2}\sqrt{S_p^2(1/n + 1/m)}.$$
We reject $H_0$ if $|\bar{x} - \bar{y}| > c$ and accept $H_0$ otherwise. Such a test is called Student's t-test.
For large n and m, we reject $H_0$ if (see Example 4.2.11)
$$|\bar{x} - \bar{y}| > z_{\alpha/2}\sqrt{s_p^2(1/n + 1/m)}, \quad \text{or equivalently,} \quad \frac{|\bar{x} - \bar{y}|}{\sqrt{s_p^2(1/n + 1/m)}} > z_{\alpha/2}. \qquad (1.88)$$
Furthermore, when $n = m$, we have (see Eq. 1.65)
$$\frac{|\bar{x} - \bar{y}|}{\sqrt{\frac{s_x^2 + s_y^2}{n}}} > z_{\alpha/2}. \qquad (1.89)$$
In this case, suppose we have a good estimate for c. To have a Student's t-test with significance level $\alpha$, the number of data we need for both samples is given by (see Examples 4.2.8 and 4.2.9)
$$n = \frac{z_{\alpha/2}^2(s_x^2 + s_y^2)}{c^2}. \qquad (1.90)$$
If we further assume that the unknown variances $\sigma_x^2$ and $\sigma_y^2$ are not equal, it can be shown that [Wel47]
$$\frac{\overline{X} - \overline{Y}}{\sqrt{\frac{S_x^2}{n} + \frac{S_y^2}{m}}} \sim t_v,$$
where v is given by the Welch–Satterthwaite approximation
$$v = \frac{\left(\frac{s_x^2}{n} + \frac{s_y^2}{m}\right)^2}{\frac{(s_x^2/n)^2}{n-1} + \frac{(s_y^2/m)^2}{m-1}}.$$
Such a test is called Welch's t-test, and we reject $H_0$ if
$$\frac{|\bar{x} - \bar{y}|}{\sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}}} > t_{\alpha/2,v}.$$
For large n and m, we reject $H_0$ if
$$\frac{|\bar{x} - \bar{y}|}{\sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}}} > z_{\alpha/2}. \qquad (1.91)$$
Remark 1.8.6 Note that when $n = m$ is large ($\ge 30$), Welch's t-test and Student's t-test have the same formula (see Eqs. 1.89 and 1.91).
Both Student's t-test and Welch's t-test will be useful for leakage assessment in Sect. 4.2.3.
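A minimal sketch of Welch's t statistic and the Welch–Satterthwaite degrees of freedom, using only Python's standard library (the function name is ours; a full test would compare the statistic against the $t_v$ quantile, or against $z_{\alpha/2}$ for large samples as in Remark 1.8.6):

```python
from math import sqrt
from statistics import mean, variance

def welch_t(xs, ys):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n, m = len(xs), len(ys)
    vx, vy = variance(xs), variance(ys)   # unbiased sample variances
    se2 = vx / n + vy / m                 # squared standard error of xbar - ybar
    t = (mean(xs) - mean(ys)) / sqrt(se2)
    v = se2 ** 2 / ((vx / n) ** 2 / (n - 1) + (vy / m) ** 2 / (m - 1))
    return t, v
```

For two equal-size samples with equal variances, the statistic coincides with Student's t, illustrating why the two tests agree in the balanced large-sample case.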
One-sided hypothesis testing about $\mu_x$ and $\mu_y$ For one-sided testing, we consider the following null and alternative hypotheses:
$$H_0: \mu_x = \mu_y, \qquad H_1: \mu_x > \mu_y.$$
We will only discuss the case when $\sigma_x^2$ and $\sigma_y^2$ are known. For unknown variances, we refer the readers to [Wel47]. By Eqs. 1.85 and 1.43,
$$P\left(\frac{\overline{X} - \overline{Y}}{\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}} > z_\alpha\right) = \alpha.$$
Thus, we choose
$$c = z_\alpha\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}. \qquad (1.93)$$
For more detailed discussions on sets, functions, number theory, and abstract algebra, we refer the readers to [Her96, Chapters 1–6] and a series of lecture notes from Frédérique Oggier [Ogg]. [LX04] provides a more in-depth study of finite fields and coding theory. For probability theory, we refer the readers to [Dur19] and [JP04] for a thorough analysis and to [Ros20] for practical examples. [Ros20] also provides more insights into the statistical methods presented in Sect. 1.8.
Chapter 2
Introduction to Cryptography
Before we dive into the modern cryptographic algorithms that are in use today (Chap. 3), we give an introduction to cryptography in general (Sect. 2.1) and discuss some classical ciphers that were designed a few centuries back (Sect. 2.2). Finally, we will discuss how cryptographic algorithms are actually used with different encryption modes (Sect. 2.3).
We start with a definition of cryptography.
Definition 2.0.1 Cryptography studies techniques that allow secure communica-
tion in the presence of adversarial behavior. These techniques are related to
information security attributes such as confidentiality, integrity, authentication, and
non-repudiation.
Below, we give more details on the information security attributes that can be
achieved by using cryptography:
1. Confidentiality aims at preventing unauthorized disclosure of information. There
are various technical, administrative, physical, and legal means to enforce
confidentiality. In the context of cryptography, we are mostly interested in
utilizing various encryption techniques to keep information private.
2. Integrity aims at preventing unauthorized alteration of data to keep them correct,
authentic, and reliable. Similarly to confidentiality, while there are many means
of ensuring data integrity, in cryptography we are looking at hash functions and
message authentication codes.
3. Authentication aims at determining whether something or someone is who they
claim they are. In communication, the entities should be able to identify each
other. Similarly, the properties of the exchanged information, such as origin,
content, and timestamp, should be authenticated. In cryptography, we are mostly
interested in two aspects: entity authentication and data origin authentication. For these purposes, signature and identification primitives are used.
4. Non-repudiation aims at assuring that the sender of the information is provided with proof of delivery, and the recipient is provided with proof of the sender's identity, so that neither party can later deny the actions taken. Similarly to authentication, signature and identification primitives are cryptographic means of supporting non-repudiation.
Note
CIA Triad is a widely utilized information security model, where the abbreviation stands for confidentiality, integrity, and availability. Therefore, a curious reader might be interested in knowing why we did not mention availability. The answer is rather simple: there are no techniques within cryptography that could contribute in one way or another to ensuring availability. The availability attribute ensures that information is consistently and readily accessible to authorized entities. One needs to look into other means of supporting this attribute.
Cryptographic primitives are the tools that can be used to achieve the goals listed
in Definition 2.0.1. The categorization of cryptographic primitives is depicted in
Fig. 2.1. We have highlighted the ones that will be discussed in more detail in this
book, especially regarding hardware attacks.
Let us briefly explain each primitive.
• Hash functions: Hash functions map data of arbitrary length to a binary array of
some fixed length. We provide more details on hash functions in Sect. 2.1.1.
• Public key ciphers: Public key (or asymmetric) ciphers use a pair of related keys. This pair consists of a private key and a public key. These keys are generated by cryptographic algorithms that are based on mathematical problems called one-way functions.¹
¹ It is worth noting that the existence of one-way functions is an open conjecture and depends on the $P \ne NP$ inequality.
create a message digest that is afterward digitally signed, rather than signing the entire message, which can be slow in case the message is large (see Sect. 3.4).
The current NIST standard for hash functions was released in 2015 and is called Secure Hash Algorithm 3 (SHA-3) [Dwo15]. It is based on the Keccak permutation [BDPA13], which uses a previously developed sponge construction [BDPVA07].
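As a quick illustration of the fixed-length property of hash functions, Python's standard `hashlib` module provides SHA-3:

```python
import hashlib

# SHA3-256 maps input of arbitrary length to a fixed 256-bit (32-byte) digest.
digest = hashlib.sha3_256(b"Cryptography and Embedded Systems Security").hexdigest()
print(digest)             # 64 hex characters
print(len(digest) // 2)   # 32 bytes, regardless of the input length
```

Any change to the input, however small, yields an unrelated-looking digest, which is what makes digests suitable stand-ins for the full message in digital signatures.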
2.1.2 Cryptosystems
We have mentioned three types of ciphers: public key ciphers, block ciphers, and stream ciphers. In this subsection, we will provide a more detailed discussion of ciphers, which are also called cryptosystems.
When we use ciphers, we normally assume insecure communication. A popular
example setting is that Alice would like to send messages to Bob, but Eve is also
listening to the communication. The goal of Alice is to make sure that even if Eve
can intercept what was sent, she will not be able to find the original message. To do
so, Alice will first encrypt the message, or the plaintext, and send the ciphertext to
Bob, instead of the original message. Bob will then decrypt the ciphertext to get the
plaintext. For this communication to work, there must be a key for encryption and
decryption. It is clear that the decryption key should be secret from Eve, and a basic
requirement is that the algorithm for encryption/decryption should be designed in
a way that Eve cannot easily brute force the plaintext with the knowledge of the
ciphertext.
Definition 2.1.1 A cryptosystem is a tuple $(P, C, K, E, D)$ with the following properties:
• P is a finite set of plaintexts, called the plaintext space.
• C is a finite set of ciphertexts, called the ciphertext space.
• K is a finite set of keys, called the key space.
• $E = \{E_e \mid e \in K\}$ is a set of encryption functions $E_e: P \to C$, and $D = \{D_d \mid d \in K\}$ is a set of decryption functions $D_d: C \to P$.
• For each $e \in K$, there exists $d \in K$ such that $D_d(E_e(p)) = p$ for all $p \in P$.
If $e = d$, the cryptosystem is called a symmetric (key) cryptosystem. Otherwise, it is called a public key/asymmetric cryptosystem.
Take any $c_1 = E_e(p_1)$, $c_2 = E_e(p_2)$ from the ciphertext space C, where $e \in K$. Let $d \in K$ be the corresponding decryption key for e. If $c_1 = c_2$, then by definition,
$$p_1 = D_d(c_1) = D_d(c_2) = p_2.$$
Thus, $E_e$ is an injective function (see Definition 1.1.2). We also note that if $P = C$, $E_e$ is a permutation of P (see Definition 1.2.3).
There are mainly two types of symmetric ciphers: block ciphers and stream
ciphers.
But, for a stream cipher, $P = C = A$ are single digits. Encryptions are computed on each digit of the plaintext. In particular, suppose we have a plaintext string $p = p_1 p_2 \ldots$ (where $p_i \in A$) and a key k. We first compute a key stream $z = z_1 z_2 \ldots$ using the key k; then the ciphertext $c = c_1 c_2 \ldots$ is obtained digit by digit as $c_i = E_{z_i}(p_i)$.
A stream cipher is said to be synchronous if the key stream only depends on the chosen key k but not on the encrypted plaintext. In this case, the sender and the receiver can both compute the keystream synchronously. In Sect. 2.2.7 we will see a classical synchronous stream cipher called the one-time pad.
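The synchronous keystream idea can be sketched with a toy generator over $Z_{26}$. This is illustrative only: `random.Random` is a seeded pseudo-random generator, not a cryptographic one, and all names below are ours.

```python
import random

def keystream(key: int, length: int):
    """Toy synchronous keystream: depends only on the key, not on the plaintext."""
    rng = random.Random(key)               # NOT cryptographically secure
    return [rng.randrange(26) for _ in range(length)]

def encrypt(plain: str, key: int) -> str:
    z = keystream(key, len(plain))
    return "".join(chr((ord(p) - 65 + zi) % 26 + 65) for p, zi in zip(plain, z))

def decrypt(cipher: str, key: int) -> str:
    z = keystream(key, len(cipher))
    return "".join(chr((ord(c) - 65 - zi) % 26 + 65) for c, zi in zip(cipher, z))
```

Because the keystream depends only on the key, sender and receiver can each compute it independently and stay synchronized.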
An important aspect to clarify is how the message that Alice intends to send is
represented as plaintext.
For classical ciphers that we will discuss in Sect. 2.2, we will only consider messages consisting of English letters (A–Z), and we map each letter to an element in $Z_{26}$. Table 2.1 lists the details of the mapping from letters to $Z_{26}$ (A ↦ 0, B ↦ 1, …, U ↦ 20, V ↦ 21, W ↦ 22, X ↦ 23, Y ↦ 24, Z ↦ 25). Thus the plaintext
2 Such an encryption mode is called an ECB mode, and more encryption modes will be introduced
in Sect. 2.3.
Table 2.2 Examples of methods for converting message symbols to bytes. The second column in
each table is the binary representation of the byte value, and the third column is the corresponding
hexadecimal representation
(a) ASCII (b) UTF-8
A 01000001 41 Á 11000001 C1
B 01000010 42 Ä 11000100 C4
a 01100001 61 Í 11001101 CD
b 01100010 62 × 11010111 D7
? 00111111 3F ÷ 11110111 F7
In this section, we will discuss some classical ciphers and analyze their security. We focus on the case when messages consist of English letters. Those letters are identified with elements in $Z_{26}$ as shown in Table 2.1. For easy reading, we will not distinguish letters and elements in $Z_{26}$. For example, when the message is A, we may say that the plaintext is A or the plaintext is 0; similarly for ciphertext.
³ In a more general sense, breaking a cipher means finding a weakness in the cipher algorithm that
For the shift cipher, let $P = C = K = Z_{26}$. For each key $k \in K$, define
$$E_k: Z_{26} \to Z_{26}, \quad p \mapsto p + k \bmod 26; \qquad D_k: Z_{26} \to Z_{26}, \quad c \mapsto c - k \bmod 26.$$
Suppose the message is A; then the corresponding plaintext is 0 (see Table 2.1). The ciphertext is given by
$$E_k(0) = 0 + k \bmod 26 = k.$$
When we decrypt the ciphertext using the same key, we get our original message:
$$D_k(k) = k - k \bmod 26 = 0.$$
We note that encrypting using a key k is the same as shifting the letters by k
positions, hence the name “shift cipher.”
Example 2.2.2 For example, when k = 5,
$$E_k(\mathtt{A}) = 0 + 5 \bmod 26 = 5 = \mathtt{F}, \qquad E_k(\mathtt{Z}) = 25 + 5 \bmod 26 = 4 = \mathtt{E}.$$
To encrypt a message, we can follow Table 2.3 and replace letters in the
first row with those in the second row. Suppose the message is I STUDY IN
BRATISLAVA. Then the corresponding ciphertext (omitting the white spaces) is
NXYZIDNSGWFYNXQFAF.
When k = 3, the cipher is called the Caesar Cipher, which was used by Julius
Caesar around 50 B.C. It is unknown how effective the Caesar cipher was at the
time. But it is likely to have been reasonably secure since most of Caesar’s enemies
Table 2.3 Shift cipher with k = 5. The second row represents the ciphertexts for the letters in the
first row
A B C D E F G H I J K L M N O P Q R S T
F G H I J K L M N O P Q R S T U V W X Y
U V W X Y Z
Z A B C D E
would have been illiterate and they might have also assumed the messages were
written in an unknown foreign language.
Now, suppose as an attacker, we know that the ciphertext is NXYZIDNSGWFYNXQFAF. By Kerckhoffs' principle (Definition 2.1.3), we can assume that we also know the communication language is English. How can we find the corresponding plaintext?
With a moment’s thought, it is easy to see that we can simply try all the
possible keys until we find a plaintext that makes sense. For example, let k =
1, then N should be decrypted to M, X to W, and so on. Eventually, we get
MWXYHCMRFVEXMWPEZE, which does not make sense. So we continue, when k = 2,
we get LVWXGBLQEUDWLVODYD. When k = 3, we have KUVWFAKPDTCVKUNCXC, and
for k = 4, we get JTUVEYJOCSBUJTMBWB. Finally, letting k = 5, we get a proper
sentence ISTUDYINBRATISLAVA. Since there are only 25 possible keys (the key is
not equal to 0), with a known ciphertext, it is easy to find the original plaintext and
the key.
Such a method of trying every possible key until the correct one is found is called
an exhaustive key search. We have demonstrated that with an exhaustive key search,
we can break the shift cipher, i.e., find the key.
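The exhaustive key search described above takes only a few lines of Python (function names ours):

```python
def shift_encrypt(plain: str, k: int) -> str:
    """Shift cipher over A-Z: each letter is shifted by k positions mod 26."""
    return "".join(chr((ord(p) - 65 + k) % 26 + 65) for p in plain)

def shift_decrypt(cipher: str, k: int) -> str:
    return shift_encrypt(cipher, -k)

def exhaustive_search(cipher: str):
    """Try all 25 non-trivial keys; a human (or a dictionary check)
    then picks the candidate plaintext that makes sense."""
    return {k: shift_decrypt(cipher, k) for k in range(1, 26)}
```

Running `exhaustive_search("NXYZIDNSGWFYNXQFAF")` lists all 25 candidates; only the entry for k = 5 reads as English (ISTUDYINBRATISLAVA).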
Recall that $Z_n^*$ is the set of elements $x \in Z_n$ such that $\gcd(x, n) = 1$ (Definition 1.4.5).
Definition 2.2.2 (Affine Cipher) Let $P = C = Z_{26}$ and $K = \{(a, b) \mid a \in Z_{26}^*,\ b \in Z_{26}\}$. For each key $(a, b)$, define
$$E_{(a,b)}: Z_{26} \to Z_{26}, \quad p \mapsto ap + b \bmod 26; \qquad D_{(a,b)}: Z_{26} \to Z_{26}, \quad c \mapsto a^{-1}(c - b) \bmod 26.$$
Next, we will verify that the affine cipher is well-defined. In particular, we will show the following:
• Decryption is always possible, i.e., given any $a \in Z_{26}^*$ and $b, y \in Z_{26}$, a solution for x such that
$$ax + b \equiv y \bmod 26$$
always exists.
• Encryption functions are injective.
Note that
$$ax + b \equiv y \bmod 26 \iff ax \equiv y - b \bmod 26.$$
When y varies over $Z_{26}$, $y - b$ also varies over $Z_{26}$. Thus we can focus on solutions for
$$ax \equiv z \bmod 26, \qquad (2.1)$$
where $z \in Z_{26}$. Since $a \in Z_{26}^*$, by Theorem 1.4.6, Eq. 2.1 has a unique solution. The existence of the solution proves that decryption is possible, and the uniqueness guarantees that encryption functions are injective.
Given a key $(a, b)$, to find $a^{-1} \bmod 26$, we can apply the extended Euclidean algorithm (Algorithm 1.2).
Example 2.2.3 Suppose the key for the affine cipher is $(3, 1)$. By the extended Euclidean algorithm, we can find $3^{-1} \bmod 26$: since $26 = 3 \times 8 + 2$ and $3 = 2 + 1$, we have $1 = 3 - 2 = 3 - (26 - 3 \times 8) = 3 \times 9 - 26$, so $3^{-1} \bmod 26 = 9$. Encrypting the message STROM letter by letter with $E_{(3,1)}(p) = 3p + 1 \bmod 26$ gives the ciphertext DGARL. We can list the correspondence between plaintext and ciphertext as follows:
S  T  R  O  M
18 19 17 14 12
3  6  0  17 11
D  G  A  R  L
So there are 12 possible values for $a \in Z_{26}^*$, and there are 26 possible values for $b \in Z_{26}$. Then the total number of possible keys $(a, b)$ is $12 \times 26 = 312$. Similarly to the shift cipher, knowing a ciphertext, we can try each of the 312 keys until we find a plaintext that makes sense. Thus we can break the affine cipher by exhaustive key search.
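A sketch of the affine cipher together with the extended Euclidean algorithm for computing $a^{-1} \bmod 26$ (function names ours):

```python
def egcd(a, b):
    """Extended Euclidean algorithm: returns (g, u, v) with u*a + v*b = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, u, v = egcd(b, a % b)
    return g, v, u - (a // b) * v

def inv_mod(a, n=26):
    g, u, _ = egcd(a % n, n)
    assert g == 1, "a must be coprime to n"
    return u % n

def affine_encrypt(plain, a, b):
    return "".join(chr((a * (ord(p) - 65) + b) % 26 + 65) for p in plain)

def affine_decrypt(cipher, a, b):
    ai = inv_mod(a)
    return "".join(chr((ai * (ord(c) - 65 - b)) % 26 + 65) for c in cipher)
```

With the key $(3, 1)$ from Example 2.2.3, `affine_encrypt("STROM", 3, 1)` reproduces the ciphertext DGARL, and decryption inverts it using $3^{-1} \bmod 26 = 9$.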
Recall that the symmetric group of degree n, denoted $S_n$, is the set of permutations of a set X with n elements (see Definition 1.2.4). We have discussed that a permutation is a bijective function, and its inverse exists with respect to the composition of functions (see Lemma 1.2.1). In particular, any permutation $\sigma \in S_{26}$ has an inverse $\sigma^{-1}$.
Definition 2.2.3 (Substitution Cipher) Let $P = C = Z_{26}$ and $K = S_{26}$. For any key $\sigma \in S_{26}$, define
$$E_\sigma: Z_{26} \to Z_{26}, \quad p \mapsto \sigma(p); \qquad D_\sigma: Z_{26} \to Z_{26}, \quad c \mapsto \sigma^{-1}(c).$$
Table 2.4 An example key $\sigma \in S_{26}$ for the substitution cipher. The second row lists $\sigma(p)$ for the letters p in the first row
A B C D E F G H I J K L M N O P Q R S T
W X Y Z F G H I J K L M N O P Q R S T U
U V W X Y Z
V A B C D E
Table 2.5 Definition of $\sigma^{-1}$, where $\sigma \in S_{26}$ is the key for the substitution cipher shown in Table 2.4
A B C D E F G H I J K L M N O P Q R S T
V W X Y Z E F G H I J K L M N O P Q R S
U V W X Y Z
T U A B C D
We have discussed that $|S_n| = n!$ (see Example 1.2.9). So the size of the key space for the substitution cipher is $26! \approx 4 \times 10^{26}$. Modern computers run at a speed of a few GHz, which is ${\sim}10^9$ instructions per second. There are ${\sim}10^5$ seconds per day, so one computer can run ${\sim}10^{14}$ instructions per day, or ${\sim}10^{16}$ instructions per year. If we would like to exhaust every key for the substitution cipher, we will need ${\sim}10^{10}$ years. Compared to the age of the universe, which is 13.8 billion, i.e., $1.38 \times 10^{10}$, years, exhaustive key search is impossible with current computation power. However, we will show in Sect. 2.2.6 that other methods can be used to break the substitution cipher.
For the substitution cipher, one letter is mapped to a unique letter; hence such a cipher is also called a monoalphabetic cipher. The Vigenère cipher, named after the French cryptographer Blaise de Vigenère, is a polyalphabetic cipher, where one letter can be encrypted to different letters depending on the key.
Let m be a positive integer, and let $Z_{26}^m$ be the set of matrices with coefficients in $Z_{26}$ of size $1 \times m$. In other words, $Z_{26}^m$ is the set of $1 \times m$ row vectors with coefficients in $Z_{26}$ (see Definition 1.3.1). As discussed in Eq. 1.4, for any $x = (x_0, x_1, \ldots, x_{m-1})$, $y = (y_0, y_1, \ldots, y_{m-1})$ in $Z_{26}^m$, the addition $x + y$ is computed componentwise:
$$x + y = (x_0 + y_0, x_1 + y_1, \ldots, x_{m-1} + y_{m-1}),$$
where $x_i + y_i$ is computed with addition modulo 26. Recall that the additive inverse of an element a in $Z_{26}$ is given by $-a$ (see Remark 1.4.2). $x - y$ is then computed componentwise using the additive inverses of the $y_i$'s.
Definition 2.2.4 (Vigenère Cipher) Let $P = C = K = Z_{26}^m$. For each key $k \in K$, define
$$E_k: Z_{26}^m \to Z_{26}^m, \quad p \mapsto p + k; \qquad D_k: Z_{26}^m \to Z_{26}^m, \quad c \mapsto c - k.$$
To encrypt AN EXAMPLE, we write the plaintext in groups of six letters and add the
keyword to each group letter by letter, modulo 26.
A N E X A M P L E
0 13 4 23 0 12 15 11 4
18 4 2 17 4 19 18 4 2
18 17 6 14 4 5 7 15 6
S R G O E F H P G
To decrypt ZSLWCAZHPR, we write the ciphertext in groups of five letters and subtract the keyword from each group letter by letter modulo 26. We get the plaintext HILLCIPHER.
Z S L W C A Z H P R
25 18 11 22 2 0 25 7 15 17
18 10 0 11 0 18 10 0 11 0
7 8 11 11 2 8 15 7 4 17
H I L L C I P H E R
The size of the key space for the Vigenère cipher is given by $26^m$. If m = 6, it is about $3.1 \times 10^8 \approx 2^{28.2}$, which makes it possible to search every key using a computer. However, for larger m, it becomes much more difficult. If m = 25, $26^{25} \approx 2^{117}$, which is not feasible with current computational power.
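A minimal sketch of Vigenère encryption and decryption (function name ours; the keywords SECRET and SKALA below are the ones implied by the numeric key rows in the two examples above):

```python
def vigenere(text: str, keyword: str, decrypt: bool = False) -> str:
    """Vigenere cipher over A-Z: add (or subtract) the repeated keyword mod 26."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text):
        k = ord(keyword[i % len(keyword)]) - 65
        out.append(chr((ord(ch) - 65 + sign * k) % 26 + 65))
    return "".join(out)
```

For instance, `vigenere("ANEXAMPLE", "SECRET")` yields SRGOEFHPG, and `vigenere("ZSLWCAZHPR", "SKALA", decrypt=True)` recovers HILLCIPHER, matching the tables above.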
For a key A, an invertible $m \times m$ matrix over $Z_{26}$, the Hill cipher encrypts and decrypts by
$$E_A: Z_{26}^m \to Z_{26}^m, \quad p \mapsto pA; \qquad D_A: Z_{26}^m \to Z_{26}^m, \quad c \mapsto cA^{-1}.$$
Let
$$A = \begin{pmatrix} 2 & 1 & 2 \\ 3 & 12 & 4 \\ 0 & 5 & 1 \end{pmatrix}$$
be a matrix in $M_{3\times3}(Z_{26})$. We denote by $A_{ij}$ the matrix obtained from A by deleting the ith row and the jth column. Then
$$A_{00} = \begin{pmatrix} 12 & 4 \\ 5 & 1 \end{pmatrix}, \quad A_{01} = \begin{pmatrix} 3 & 4 \\ 0 & 1 \end{pmatrix}, \quad A_{02} = \begin{pmatrix} 3 & 12 \\ 0 & 5 \end{pmatrix}.$$
Let $a_{ij}$ denote the entry of A at the ith row and jth column; then by Eq. 1.6,
$$\det(A) = \sum_{j=0}^{2} (-1)^j a_{0j} \det(A_{0j}) \bmod 26 = 2 \times (-8) - 1 \times 3 + 2 \times 15 \bmod 26 = 11.$$
By the extended Euclidean algorithm, $26 = 11 \times 2 + 4$, $11 = 4 \times 2 + 3$, $4 = 3 + 1$, and
$$1 = 4 - 3 = 4 - (11 - 4 \times 2) = 4 \times 3 - 11 = (26 - 11 \times 2) \times 3 - 11 = 26 \times 3 - 11 \times 7 \implies 11^{-1} \bmod 26 = -7 \bmod 26 = 19.$$
By Theorem 1.3.2,
$$A^{-1} = -7 \begin{pmatrix} -8 & 9 & -20 \\ -3 & 2 & -2 \\ 15 & -10 & 21 \end{pmatrix} \bmod 26 = \begin{pmatrix} 56 & -63 & 140 \\ 21 & -14 & 14 \\ -105 & 70 & -147 \end{pmatrix} \bmod 26 = \begin{pmatrix} 4 & 15 & 10 \\ 21 & 12 & 14 \\ 25 & 18 & 9 \end{pmatrix}.$$
The inverse of a $2 \times 2$ matrix can be computed using Eq. 1.7, where the computations should be mod 26. We have
$$\begin{pmatrix} 0 & 19 \\ 19 & 0 \end{pmatrix}^{-1} = 3^{-1}\begin{pmatrix} 0 & 7 \\ 7 & 0 \end{pmatrix} \bmod 26 = \begin{pmatrix} 0 & 11 \\ 11 & 0 \end{pmatrix}.$$
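The $2 \times 2$ case of Eq. 1.7 can be sketched as follows (function names ours; `pow(a, -1, n)` computes a modular inverse in Python 3.8+):

```python
def inv_mod(a, n=26):
    return pow(a, -1, n)   # raises ValueError if gcd(a, n) != 1

def inv2x2_mod26(m):
    """Inverse of a 2x2 matrix over Z26 via Eq. 1.7: det^{-1} * [[d, -b], [-c, a]]."""
    (a, b), (c, d) = m
    det = (a * d - b * c) % 26
    di = inv_mod(det)
    return [[(di * d) % 26, (di * -b) % 26],
            [(di * -c) % 26, (di * a) % 26]]
```

For the matrix above, the determinant is $-361 \equiv 3 \pmod{26}$ with $3^{-1} = 9$, reproducing the inverse computed by hand.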
We have seen that an exhaustive key search can be used to break the affine cipher, where the attacker can find both the plaintext and the key. But this does not apply to the substitution cipher or the Vigenère cipher. Next, we will discuss other cryptanalysis methods that can be used to break those ciphers.
that the most common two consecutive letters are TH, HE, IN, …, and the most common three consecutive letters are THE, ING, AND, ….
Given a ciphertext that is encrypted using a monoalphabetic cipher (i.e., one letter is mapped to a unique letter), we expect a permutation of the letters in the ciphertext to have similar frequencies as in Table 2.6.
Example 2.2.10 Suppose the cipher used is an affine cipher, and we have the
following ciphertext:
VCVIRSKPOFPNZOTHOVMLVYSATISKVNVLIVSZVR.
We can calculate the frequencies of each letter that appear in the text:
V S I O R K P N Z T L C F H M Y A
8 4 3 3 2 2 2 2 2 2 2 1 1 1 1 1 1
The most frequent letter is V, and the second most frequent one is S. Thus, it makes sense to assume V is the ciphertext corresponding to E, and S to T. Let the key be $(a, b)$; by Table 2.1 and Definition 2.2.2, we have the following equations:
$$4a + b \equiv 21 \bmod 26, \qquad 19a + b \equiv 18 \bmod 26,$$
which gives
$$15a \equiv -3 \equiv 23 \bmod 26.$$
By the extended Euclidean algorithm,
$$26 = 15 \times 1 + 11, \quad 15 = 11 \times 1 + 4, \quad 11 = 4 \times 2 + 3, \quad 4 = 3 + 1,$$
and
$$1 = 4 - 3 = 4 - (11 - 4 \times 2) = 4 \times 3 - 11 = (15 - 11) \times 3 - 11 = 15 \times 3 - 11 \times 4 = 15 \times 3 - (26 - 15) \times 4 = 15 \times 7 - 26 \times 4.$$
Thus $15^{-1} \bmod 26 = 7$, and
$$a = 7 \times 23 \bmod 26 = 161 \bmod 26 = 5.$$
Furthermore, we get
$$b = 21 - 4a \bmod 26 = 21 - 4 \times 5 \bmod 26 = 1.$$
Applying the decryption key .(21, 1) to the ciphertext, we get the following plaintext:
We note that the same technique works for the substitution cipher since it is also monoalphabetic. But a longer ciphertext might be needed since we do not have equations to solve for the key. Instead, we must guess the mapping from each distinct letter in the ciphertext to the 26 letters (see [Sti05], Section 1.2.2).
Remark 2.2.2 Suppose the length of the keyword m is determined for Vigenère
cipher. We take every mth letter from the ciphertext and obtain m ciphertexts. Then
each of them can be considered as the ciphertext of the shift cipher with a key given
by the corresponding letter in the keyword.
Example 2.2.11 Suppose we have the following ciphertext generated with
Vigenère cipher (Definition 2.2.4), and we know that the keyword length .m = 3.
SJRRIBSWRKRAOFCDACORRGSYZTCKVYXGCCSDDLCCEKOAMBHGCEKEPRS
.
TJOSDWXFOGMBVCCTMXHGXKNKVRCMLDLCMMNRIPDIVDAGVPZOXFOWYWI.
Take every third letter, and we have the following three ciphertexts:
SRSKODOGZKXCDCOBCESOWOBCXXKCDMRDDVOOW,
. JIWRFARSTVGSLEAHEPTSXGVTHKVMLMIIAPXWI,
RBRACCRYCYCDCKMGKRJDFMCMGNRLCNPVGZFY.
We note that each of them can be considered as the ciphertext of a shift cipher,
where the keys correspond to each letter of the keyword for the Vigenère cipher (as
mentioned in Remark 2.2.2). The frequencies of each letter in the first ciphertext are
as follows:
O D C S K X R B W G Z E M V
7 5 5 3 3 3 2 2 2 1 1 1 1 1
The most frequent letter is O, and we assume O (14) is the ciphertext corresponding to E (4). This gives us the first letter of the keyword:
$$14 - 4 \bmod 26 = 10 = \mathtt{K}.$$
I A S T V W R G L E H P X M J F K
4 3 3 3 3 2 2 2 2 2 2 2 2 2 1 1 1
Similarly, we assume E (4) is encrypted as I (8). And the second letter of the keyword is
$$8 - 4 \bmod 26 = 4 = \mathtt{E}.$$
The frequencies of each letter in the third ciphertext are as follows:
C R Y M G D K F N B A J L P V Z
7 5 5 3 3 3 2 2 2 1 1 1 1 1 1 1
The most frequent letter is C, and we assume E (4) is encrypted as C (2). The third letter of the keyword is
$$2 - 4 \bmod 26 = 24 = \mathtt{Y}.$$
Thus we have recovered the keyword KEY. Computing decryption with the
keyword, we get the following plaintext:
Next, we will discuss two methods to determine the length m of the keyword for a Vigenère cipher.
We observe that if the distance between two appearances of the same sequence of letters in the plaintext is a multiple of m, the corresponding parts in the ciphertext will be the same. The Kasiski test looks for identical parts of the ciphertext and records the distance between those parts. Then we know that m is a divisor of all the distance values.
Example 2.2.12 Suppose the plaintext is
THE MEETING WILL BE IN THE CAFE AND THE STARTING TIME IS TEN
and the keyword is KEY. Aligning plaintext, keyword, and ciphertext gives
THE MEETING WILL BE IN THE CAFE AND THE STARTING TIME IS TEN
KEY KEYKEYK EYKE YK EY KEY KEYK EYK EYK EYKEYKEY KEYK EY KEY
DLC WICDMLQ AGVP ZO ML DLC MEDO ELN XFO WRKVRSRE DMKO MQ DIL
The first two appearances of THE have distance 15, which is a multiple of 3,
and hence the corresponding parts in the ciphertext are the same DLC. But the
third appearance of THE has distance 7 from the second appearance, and the
corresponding parts in the ciphertext are different.
On the other hand, if we have only the ciphertext, we can observe the two
identical parts DLC with distance 15, and then we can conclude that very likely
m is a divisor of 15, i.e., .m = 1, 3, 5, 15. To decide the exact value of m, a longer
ciphertext is needed, or frequency analysis (see Example 2.2.11) can be applied
assuming different values of m until a meaningful plaintext is found.
Example 2.2.14 Let x be a long English text. If we randomly choose a letter from x, we expect that the probabilities for each letter to be chosen are similar to the values listed in Table 2.6. If we randomly choose two letters from x, the probability for both letters to be A is then given by $0.082^2$, the probability for both to be B is $0.015^2$, etc. Thus, the index of coincidence for x can be approximated as
$$I_c(x) \approx \sum_{i=0}^{25} p_i^2 = 0.065.$$
Example 2.2.15 Let x be the ciphertext from Example 2.2.11. The total number of letters is 110, and the frequencies of each letter are
C  R  O  D  S  K  G  M  V  X  I  W  A  B  F  Y  T  L  E  P  J  Z  H  N
12 9  7  7  7  6  6  6  5  5  4  4  4  3  3  3  3  3  3  3  2  2  2  2
Thus
$$I_c(x) = \frac{1}{110 \times 109}(12 \times 11 + 9 \times 8 + \cdots + 2 \times 1) \approx 0.045.$$
Write the ciphertext as $c = c_1 c_2 \ldots$ and, for a candidate keyword length m, form the m interleaved substrings
$$\mathbf{c}_1 = c_1 c_{m+1} c_{2m+1} \ldots, \quad \mathbf{c}_2 = c_2 c_{m+2} c_{2m+2} \ldots, \quad \ldots, \quad \mathbf{c}_m = c_m c_{2m} c_{3m} \ldots$$
If m is the keyword length, we expect $I_c(\mathbf{c}_i)$ to be close to 0.065 (see Remark 2.2.3). Otherwise, $\mathbf{c}_i$ will be more random and $I_c(\mathbf{c}_i)$ will be closer to 0.038 (see Example 2.2.13).
Example 2.2.16 Suppose we have the same ciphertext as in Example 2.2.11, and we do not know the value of m.
Assume m = 1; we have calculated
$$I_c(\mathbf{c}) \approx 0.045$$
in Example 2.2.15.
Assume m = 2; we have
c1 = SRISRROCAORSZCVXCSDCEOMHCKPSJSWFGBCTXGKKRMDCMRPIDGPOFWW
c2 = JRBWKAFDCRGYTKYGCDLCKABGEERTODXOMVCMHXNVCLLMNIDVAVZXOYI
and
$$I_c(\mathbf{c}_1) = 0.05253, \qquad I_c(\mathbf{c}_2) = 0.03636.$$
Assume m = 3; then
c1 = SRSKODOGZKXCDCOBCESOWOBCXXKCDMRDDVOOW
c2 = JIWRFARSTVGSLEAHEPTSXGVTHKVMLMIIAPXWI
c3 = RBRACCRYCYCDCKMGKRJDFMCMGNRLCNPVGZFY
and
$$I_c(\mathbf{c}_1) = 0.07958, \qquad I_c(\mathbf{c}_2) = 0.04054, \qquad I_c(\mathbf{c}_3) = 0.06984.$$
Thus it is more likely that m = 3. The exact value can be verified by frequency
analysis as shown in Example 2.2.11 to see if the recovered plaintext is meaningful.
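The index of coincidence and the keyword-length guessing procedure can be sketched as below (function names ours):

```python
from collections import Counter

def index_of_coincidence(text: str) -> float:
    """Ic(x) = sum_i f_i (f_i - 1) / (n (n - 1)), for letter frequencies f_i."""
    n = len(text)
    counts = Counter(text)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))

def guess_keyword_length(cipher: str, max_m: int = 10):
    """For each candidate m, average Ic over the m interleaved substrings.
    Values near 0.065 suggest the true keyword length; near 0.038, a wrong m."""
    scores = {}
    for m in range(1, max_m + 1):
        subs = [cipher[i::m] for i in range(m)]
        scores[m] = sum(index_of_coincidence(s) for s in subs) / m
    return scores
```

For a sanity check, a text of four letters with three A's and one B has $I_c = (3 \cdot 2)/(4 \cdot 3) = 0.5$, while a text of all-distinct letters has $I_c = 0$.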
Example 2.2.17 (An Example of a Cipher that Is Not Perfectly Secure) Let
$$P = \{0, 1\}, \quad K = \{x, y\}, \quad C = \{\alpha, \beta\},$$
with encryption functions
$$E_x(0) = E_y(1) = \alpha, \qquad E_x(1) = E_y(0) = \beta.$$
Suppose
$$P(0) = \frac{1}{3}, \quad P(1) = \frac{2}{3}, \quad P(x) = \frac{1}{5}, \quad P(y) = \frac{4}{5}.$$
Then
$$P(\alpha) = P(x \cap 0) + P(y \cap 1) = P(x)P(0) + P(y)P(1) = \frac{3}{5},$$
and
$$P(0|\alpha) = \frac{P(0)P(\alpha|0)}{P(\alpha)} = \frac{P(0)P(x)}{P(\alpha)} = \frac{1}{9}.$$
We have
$$P(1|\alpha) = 1 - P(0|\alpha) = \frac{8}{9},$$
and
$$P(\beta) = 1 - P(\alpha) = \frac{2}{5}.$$
Similarly, we get
$$P(0|\beta) = \frac{2}{3}, \qquad P(1|\beta) = \frac{1}{3}.$$
Thus $P(p|c) \ne P(p)$ for all $p \in P$, $c \in C$, and the cipher is not perfectly secure. In particular, if the attacker knows the ciphertext is $\alpha$, they can conclude that it is more likely that the plaintext is 1 rather than 0, and if the ciphertext is $\beta$, they can conclude that it is more likely for the plaintext to be 0.
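The arithmetic in Example 2.2.17 is easy to check mechanically; the following sketch redoes it with exact fractions (variable names ours):

```python
from fractions import Fraction as F

# Priors from Example 2.2.17
P0, P1 = F(1, 3), F(2, 3)      # plaintext probabilities
Px, Py = F(1, 5), F(4, 5)      # key probabilities

# Ciphertext alpha arises from (key x, plaintext 0) or (key y, plaintext 1).
P_alpha = Px * P0 + Py * P1
P0_given_alpha = Px * P0 / P_alpha   # Bayes' rule
P_beta = 1 - P_alpha
P0_given_beta = Py * P0 / P_beta
```

Since $P(0 \mid \alpha) = 1/9 \ne 1/3 = P(0)$, observing the ciphertext changes the attacker's belief about the plaintext, which is exactly the failure of perfect secrecy.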
We recall uniform probability measures from Definition 1.7.3.
Theorem 2.2.1 One-time pad is perfectly secure if and only if the probability measure on the key space is uniform.
Proof Fix a positive integer n, and let $P = C = K = \mathbb{F}_2^n$. For any $p \in P$ and $c \in C$, if c is the ciphertext corresponding to p, then we know the key used is $k_{p,c} := p \oplus c$. Thus
$$P(c|p) = P(k_{p,c}).$$
$(\Longrightarrow)$ Suppose the one-time pad is perfectly secure, i.e., $P(p|c) = P(p)$, and hence $P(p \cap c) = P(p)P(c)$, for all $p \in P$, $c \in C$. Fix $c \in C$. Then
$$P(k_{p,c}) = P(c|p) = \frac{P(p \cap c)}{P(p)} = \frac{P(p)P(c)}{P(p)} = P(c),$$
which shows that the probability of $k_{p,c}$ does not depend on p, and the probabilities of all the $k_{p,c}$'s are the same for this fixed c. When p takes all possible values in P, $k_{p,c}$ takes all possible values in K. Thus we can conclude that $P(k)$ is the same for all $k \in K$.
$(\Longleftarrow)$ Suppose
$$P(k) = \frac{1}{|K|}, \quad \forall k \in K.$$
Since $\{q \mid q \in P\}$ is a finite partition of $\Omega$, by Theorem 1.7.2, for any $c \in C$ and any $p \in P$,
$$P(c) = \sum_{q \in P} P(q)P(c|q) = \sum_{q \in P} P(q)P(k_{q,c}) = \frac{1}{|K|}\sum_{q \in P} P(q).$$
Also, $\sum_{q \in P} P(q) = 1$. We have
$$P(c) = \frac{1}{|K|} = P(k_{p,c}) = P(c|p),$$
and hence $P(p|c) = P(p)$, i.e., the one-time pad is perfectly secure. $\square$
We note that brute force of the key does not work for the one-time pad: by brute force, the attacker can obtain every plaintext of the same length as the original plaintext.
However, key management is the bottleneck of the one-time pad. With a plaintext of length n, we also need a key of length n. Furthermore, as we have mentioned earlier, each key can only be used once. Thus it is necessary to share a key of the same length as the message before each communication. This makes the one-time pad impractical to use.
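The brute-force remark can be illustrated directly: decrypting a fixed ciphertext under every possible key enumerates every plaintext of that length. A minimal sketch for 3-bit blocks (the ciphertext value is arbitrary):

```python
from itertools import product

# One-time pad decryption is XOR with the key.
def xor(a, b):
    return tuple(x ^ y for x, y in zip(a, b))

ciphertext = (1, 0, 1)  # an arbitrary intercepted ciphertext

# Trying all 2^3 keys yields every possible 3-bit plaintext exactly once, so
# the ciphertext alone gives the attacker no information about the plaintext.
candidates = {xor(ciphertext, key) for key in product((0, 1), repeat=3)}
assert candidates == set(product((0, 1), repeat=3))
```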
We have seen a few examples of classical block ciphers. For messages that are
longer than the block length, the way we encrypted them (e.g., see Examples 2.2.8
and 2.2.5) can be described by Fig. 2.2. Similarly, the decryption method we have
applied (e.g., see Examples 2.2.8 and 2.2.6) corresponds to Fig. 2.3.
In general, when we use a symmetric block cipher of block length n to encrypt a
long message, we first divide this long message into blocks of plaintexts of length
n. Then we apply a certain encryption mode to encrypt the plaintext blocks. If the last
block has a length of less than n, padding might be required. Different methods exist
for padding, e.g., using a constant or using a random number.
The simplest encryption mode is the mode we have been using so far, which is
called electronic codebook (ECB) mode. ECB mode is easy to use, but the main
drawback is that the encryption of identical plaintext blocks produces identical
ciphertext blocks. In an extreme case, if the plaintext is either all 0s or all 1s, it would be easy for the attacker to deduce the message given a collection of plaintext and ciphertext pairs. Due to this property, it is also easy to recognize patterns of the plaintext in the ciphertext, which makes statistical attacks easier (e.g., frequency analysis of the affine cipher described in Example 2.2.10). For example, Fig. 2.4b gives an example of encryption using ECB mode. Compared to the original image in Fig. 2.4a, we can see a clear pattern of the plaintext in the ciphertext.

Fig. 2.4 Original picture and encrypted picture with ECB and CBC modes
To avoid such problems, we can use the cipher block chaining (CBC) mode. The
encryption and decryption are shown in Figs. 2.5 and 2.6, respectively, where IV
stands for initialization vector. An IV has the same length as the plaintext block and
is public. We can see that with CBC, the same plaintext is encrypted differently with
different IVs. Figure 2.4a encrypted with CBC mode is shown in Fig. 2.4c, where
no clear pattern can be seen.
Furthermore, if a plaintext block is changed, the corresponding ciphertext block
will also be changed, affecting all the subsequent ciphertext blocks. Hence CBC
mode can also be useful for authentication.
However, with CBC mode, the receiver needs to wait for the previous ciphertext
block to arrive to decrypt the next ciphertext block. In real-time applications, output
feedback (OFB) mode can be used to make communication more efficient. As
shown in Figs. 2.7 and 2.8, the encryption function is not used for encrypting the
plaintext blocks, rather it is used for generating a key sequence. Ciphertext blocks
are computed by XORing the plaintext blocks and the key sequence. Such a design
allows the receiver and the sender to generate the key sequence simultaneously
before the ciphertext is sent.
In a way, OFB mode can be considered as a synchronous stream cipher (see
Sect. 2.1.2). Another advantage of OFB mode is that padding is not needed.
However, the encryption of a plaintext block does not depend on the previous blocks,
which makes it easier for the attacker to modify the ciphertext blocks.
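The contrast between ECB and CBC can be demonstrated with a toy block cipher; enc_block below is an arbitrary invertible placeholder, not a real cipher, and the IV is fixed only for reproducibility.

```python
BLOCK = 8  # toy block length in bytes

# A toy invertible "block cipher" standing in for E_k: XOR with the key, then
# a one-byte rotation. It is NOT secure; it only serves to illustrate modes.
def enc_block(key: bytes, block: bytes) -> bytes:
    x = bytes(a ^ b for a, b in zip(block, key))
    return x[1:] + x[:1]

def dec_block(key: bytes, block: bytes) -> bytes:
    x = block[-1:] + block[:-1]
    return bytes(a ^ b for a, b in zip(x, key))

def ecb_encrypt(key, blocks):
    return [enc_block(key, p) for p in blocks]

def cbc_encrypt(key, iv, blocks):
    out, prev = [], iv
    for p in blocks:
        prev = enc_block(key, bytes(a ^ b for a, b in zip(p, prev)))
        out.append(prev)
    return out

def cbc_decrypt(key, iv, blocks):
    out, prev = [], iv
    for c in blocks:
        out.append(bytes(a ^ b for a, b in zip(dec_block(key, c), prev)))
        prev = c
    return out

key = bytes(range(BLOCK))
iv = bytes([1]) * BLOCK
blocks = [bytes(BLOCK), bytes(BLOCK)]  # two identical all-zero plaintext blocks

ecb = ecb_encrypt(key, blocks)
cbc = cbc_encrypt(key, iv, blocks)
assert ecb[0] == ecb[1]                     # ECB leaks equality of blocks
assert cbc[0] != cbc[1]                     # CBC hides it via chaining
assert cbc_decrypt(key, iv, cbc) == blocks  # CBC decryption round-trips
```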
2.4 Further Reading
We refer the readers to [Sti05, Chapter 1] for more discussions on classical ciphers
and to [MVOV18] for a detailed presentation on different cryptographic primitives.
As for encryption modes and padding schemes, we refer the readers to [PP09,
Chapter 5].
In Sect. 2.2.7 we introduced a classical stream cipher—one-time pad. The area of
stream ciphers, albeit less discussed in cryptography books than its block cipher
counterpart, encompasses many modern algorithm designs. We do not go into detail
in this book; interested readers will find more information in [KPP+ 22].
The physical attacks we will present in Chaps. 4 and 5 are for symmetric
block ciphers, one particular public key cipher (RSA), and RSA signatures. There
is also plenty of research on physical attacks on other cryptographic primitives,
e.g., hash functions [HH11, HLMS14, KMBM17], post-quantum public key
algorithms [MWK+ 22, PSKH18, XIU+ 21, PPM17], or stream ciphers [BMV07,
BT12, KDB+ 22].
Chapter 3
Modern Cryptographic Algorithms
and Their Implementations
For the construction of symmetric block ciphers, two important principles are
followed by modern cryptographers—confusion and diffusion. Shannon first intro-
duced them in his famous paper [Sha45].
Confusion obscures the relationship between the ciphertext and the key. To
achieve this, each part of the ciphertext should depend on several parts of the key.
For example, in the Vigenère cipher, each letter of the plaintext and each letter of the key
influence exactly one letter of the ciphertext. Consequently, we can use the Kasiski
test (Sect. 2.2.6.2) or index of coincidence (Sect. 2.2.6.3) to attack the Vigenère
cipher. Diffusion obscures the statistical relationship between the plaintext and the
ciphertext. Each change in the plaintext is spread over the ciphertext, with the
redundancies being dissipated. For example, monoalphabetic ciphers (Sect. 2.2.4)
have very low diffusion—the distributions of letters in plaintext correspond directly
to those in the ciphertext. That is also why frequency analysis (Sect. 2.2.6.1) can be
applied to break those ciphers.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 131
X. Hou, J. Breier, Cryptography and Embedded Systems Security,
https://doi.org/10.1007/978-3-031-62205-2_3
S_0 = p,
S_1 = F(S_0, K_1),
S_2 = F(S_1, K_2),
    ⋮
S_{Nr} = F(S_{Nr−1}, K_{Nr}),
c = S_{Nr}.
To perform decryption, we require that for any given round key K_i, F(·, K_i) has an inverse, i.e.,

F^{−1}(F(x, K_i), K_i) = x,  ∀x ∈ F_2^n.
The decryption then proceeds as

S_{Nr} = c,
S_{Nr−1} = F^{−1}(S_{Nr}, K_{Nr}),
    ⋮
S_1 = F^{−1}(S_2, K_2),
S_0 = F^{−1}(S_1, K_1),
p = S_0.

¹ The round function for the last round might be a bit different, as is the case for AES (see Sect. 3.1.2).
We recall for a vector space over .F2 , vector addition is given by bitwise XOR,
denoted .⊕ (Definition 1.3.6). XOR with the round key is a common operation in
round functions of symmetric block ciphers.
Another common function is a substitution function called Sbox, denoted SB,

SB: F_2^{ω_1} → F_2^{ω_2}.

Normally ω_1 and/or ω_2 is a divisor of the block length n and a few Sboxes are applied in one round function. When ω_1 = ω_2, SB is a permutation on F_2^{ω_1} and we say that the Sbox is an ω_1-bit Sbox.
There are mainly two types of symmetric block ciphers—Feistel cipher and
Substitution–permutation network (SPN) cipher.
For a Feistel cipher, the cipher state at the beginning/end of each round is divided
into two halves of equal length. The cipher state at the end of round i is denoted as
.Li and .Ri , where L stands for left and R stands for right. The round function F is
defined as

L_i = R_{i−1},  R_i = L_{i−1} ⊕ f(R_{i−1}, K_i).  (3.1)

We note that f is a function that does not need to have an inverse, since the function F defined as in Eq. 3.1 is always invertible:

L_{i−1} = R_i ⊕ f(L_i, K_i),  R_{i−1} = L_i.
Furthermore, the ciphertext is normally given by .RNr ||LNr (i.e., swapping the left
and right side of the cipher state at the end of the last round). In this case, if we let
.Ri and .Li denote the right and left part of the cipher state at the end of round i in
the decryption, then the decryption computation is the same as in Eq. 3.1 except that
the round keys are in reverse order as that for encryption. An illustration of Feistel
cipher can be seen in Fig. 3.1.
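The invertibility of the Feistel construction can be checked with a toy round; the function f below is an arbitrary, non-invertible placeholder (not part of any real cipher), yet the round itself inverts exactly as in Eq. 3.1.

```python
MASK = 0xFFFFFFFF  # 32-bit halves

def f(half: int, round_key: int) -> int:
    # Arbitrary non-invertible mixing function, for illustration only.
    return ((half * 0x9E3779B9) ^ round_key) & MASK

def feistel_round(L: int, R: int, K: int):
    # L_i = R_{i-1},  R_i = L_{i-1} xor f(R_{i-1}, K_i)
    return R, (L ^ f(R, K)) & MASK

def feistel_round_inv(L: int, R: int, K: int):
    # L_{i-1} = R_i xor f(L_i, K_i),  R_{i-1} = L_i
    return (R ^ f(L, K)) & MASK, L

L0, R0, K1 = 0x01234567, 0x89ABCDEF, 0xDEADBEEF
L1, R1 = feistel_round(L0, R0, K1)
assert feistel_round_inv(L1, R1, K1) == (L0, R0)  # invertible although f is not
```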
Let .ω be a divisor of n, the block length, and let .𝓁 = n/ω. The design of an SPN
cipher encryption is shown in Fig. 3.2, where SB is an .ω-bit Sbox. In most cases,
.ω = 4, 8.
Each round of an SPN cipher normally consists of bitwise XOR with the round
key, application of .𝓁 parallel .ω-bit Sboxes, and a permutation on .Fn2 . The encryption
starts with XOR with a round key and also ends with XOR with a round key before
outputting the ciphertext. Otherwise, the cipher states in the second (or the last)
round are all known to the attacker. Those two operations are called whitening. For
decryption, the inverse of Sbox and permutation are computed, and round keys are
XOR-ed with the cipher state in reverse order compared to that for encryption.
3.1.1 DES
Let us first look at one Feistel cipher—Data Encryption Standard (DES). DES was
developed at IBM by a team led by Horst Feistel and the design was based on Lucifer
cipher [Sor84]. It was used as the NIST standard from 1977 to 2005. Furthermore,
it has a significant influence on the development of cipher design.
The block length of DES is n = 64, i.e., P = C = F_2^64. Hence L_i, R_i ∈ F_2^32. The master key length is 56, i.e., K = F_2^56. The round key length is 48. The total number of rounds is Nr = 16. An illustration of DES encryption is shown in Fig. 3.3. Each DES round function follows the structure described in Eq. 3.1.
Before the first round function, the encryption starts with an initial permutation
(IP). The inverse of IP, called the final permutation .(I P −1 ) is applied to the
cipher state after the last round before outputting the ciphertext. Initial and final
permutations are included for the ease of loading plaintext/ciphertext. Initial and
final permutations are shown in Table 3.1. For example, in IP, the 1st bit of the
output is from the 58th bit of the input. The 2nd bit of the output is from the 50th
bit of the input.
Table 3.1 Initial permutation (IP) and final permutation (IP⁻¹) in DES algorithm

(a) IP
58 50 42 34 26 18 10 2
60 52 44 36 28 20 12 4
62 54 46 38 30 22 14 6
64 56 48 40 32 24 16 8
57 49 41 33 25 17 9 1
59 51 43 35 27 19 11 3
61 53 45 37 29 21 13 5
63 55 47 39 31 23 15 7

(b) IP⁻¹
40 8 48 16 56 24 64 32
39 7 47 15 55 23 63 31
38 6 46 14 54 22 62 30
37 5 45 13 53 21 61 29
36 4 44 12 52 20 60 28
35 3 43 11 51 19 59 27
34 2 42 10 50 18 58 26
33 1 41 9 49 17 57 25
Note For the DES specification, we consider the 1st bit of a value to be the leftmost bit in its binary representation. For example, the 1st bit of 3 = 011₂ is 0, the 2nd bit is 1, and the last bit is 1.
At the ith round, the function f in the round function of DES takes input R_{i−1} ∈ F_2^32 and round key K_i ∈ F_2^48, then outputs a 32-bit intermediate value as follows. First, R_{i−1} is expanded to 48 bits by the expansion function E_DES: F_2^32 → F_2^48. E_DES(R_{i−1}) is XOR-ed with the round key K_i, producing a 48-bit intermediate value. This 48-bit value is divided into eight 6-bit subblocks. Eight distinct Sboxes, SB^j_DES: F_2^6 → F_2^4 (1 ≤ j ≤ 8), are applied, one to each 6-bit subblock. Finally, the resulting 32-bit intermediate value goes through a permutation function P_DES: F_2^32 → F_2^32. An illustration of f is shown in Fig. 3.4.
Details of the expansion function E_DES are given in Table 3.2. 16 bits of the input are repeated, so each of them affects two bits of the output and hence influences two Sboxes. Such a design makes the dependency of the output bits on the input bits spread faster and achieves higher diffusion.
Table 3.2 Expansion function E_DES: F_2^32 → F_2^48 in DES round function. The 1st bit of the output is given by the 32nd bit of the input. The 2nd bit of the output is given by the 1st bit of the input

32 1 2 3 4 5
4 5 6 7 8 9
8 9 10 11 12 13
12 13 14 15 16 17
16 17 18 19 20 21
20 21 22 23 24 25
24 25 26 27 28 29
28 29 30 31 32 1
Table 3.4 Permutation function P_DES: F_2^32 → F_2^32 in DES round function. The 1st bit of the output is given by the 16th bit of the input. The 2nd bit of the output comes from the 7th bit of the input

16 7 20 21 29 12 28 17
1 15 23 26 5 18 31 10
2 8 24 14 32 27 3 9
19 13 30 6 22 11 4 25
The design of the first Sbox is shown in Table 3.3, and the rest of the Sboxes
are detailed in Appendix C. To use those tables, take an input of one Sbox, say
.b1 b2 b3 b4 b5 b6 , the output corresponds to row .b1 b6 and column .b2 b3 b4 b5 . We note
that each row of each of the Sbox tables is a permutation of integers .0, 1, . . . , 15.
Example 3.1.1 Suppose the input of SB^1_DES is

b_1b_2b_3b_4b_5b_6 = 100110.

According to Table 3.3, the row number is given by b_1b_6 = 10₂ = 2. The column number is given by b_2b_3b_4b_5 = 0011₂ = 3. Hence the output is 8 = 1000₂. The outputs of the other Sboxes are obtained similarly (see Table C.1 (b)).
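Example 3.1.1 can be reproduced in code. The values below are the standard DES S1 table (assumed to match Table 3.3, which is not repeated here):

```python
# Row/column indexing into a DES Sbox: row = b1 b6, column = b2 b3 b4 b5.
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]

def sbox_lookup(table, six_bits: int) -> int:
    row = ((six_bits >> 5) << 1) | (six_bits & 1)   # bits b1 and b6
    col = (six_bits >> 1) & 0xF                     # bits b2 b3 b4 b5
    return table[row][col]

assert sbox_lookup(S1, 0b100110) == 8  # matches Example 3.1.1
```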
The details of the permutation function .PDES are given in Table 3.4.
The key schedule of DES takes a 64-bit master key as input and outputs round
keys of length 48. An illustration of the key schedule is in Fig. 3.5, where PC stands
for permuted choice.
Every 8th bit of the master key is a parity-check bit of the previous 7 bits, i.e., the XORed value of those 7 bits. PC1 reduces the 64-bit input to 56 bits by ignoring those parity-check bits and outputs a permutation of the remaining 56 bits. Then the output
is divided into two 28-bit halves (see Table 3.5). Each half rotates left by one or two
bits, depending on the round (see Table 3.6). Finally, PC2 selects 48 bits out of 56
bits, permutes them, and outputs the round key (see Table 3.7).
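A small consistency check on the rotation amounts in Table 3.6: they sum to 28, so each 28-bit half has been rotated by a full period after the 16 rounds and returns to its initial value. A sketch with an arbitrary half value:

```python
# Per-round left rotations in the DES key schedule (Table 3.6).
ROTATIONS = [1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1]
assert sum(ROTATIONS) == 28

def rotl28(x: int, r: int) -> int:
    # Rotate a 28-bit value left by r bits.
    return ((x << r) | (x >> (28 - r))) & 0xFFFFFFF

half = 0x1234567  # an arbitrary 28-bit half
state = half
for r in ROTATIONS:
    state = rotl28(state, r)
assert state == half  # back to the starting value after all 16 rounds
```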
Table 3.5 Left and right part of the intermediate values in DES key schedule after PC1. The 1st
bit of the left part comes from the 57th bit of the master key (input to PC1)
Left Right
57 49 41 33 25 17 9 63 55 47 39 31 23 15
1 58 50 42 34 26 18 7 62 54 46 38 30 22
10 2 59 51 43 35 27 14 6 61 53 45 37 29
19 11 3 60 52 44 36 21 13 5 28 20 12 4
Table 3.6 Number of key bits rotated per round in DES key schedule
Round 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Rotation 1 1 2 2 2 2 2 2 1 2 2 2 2 2 2 1
For some master keys, the key schedule outputs the same round keys for more
than one round. Those master keys are called weak keys. Weak keys should not be
used. It can be shown that there are in total four of them:
• 01010101 01010101,
• FEFEFEFE FEFEFEFE,
• E0E0E0E0 F1F1F1F1,
• 1F1F1F1F 0E0E0E0E.
Remark 3.1.1 From the design of the DES key schedule, we can see that with the
knowledge of any round key, the attacker can recover 48 bits of the master key.
The remaining 8 bits can be found by brute force. Alternatively, with the knowledge of
another round key, the master key can be recovered.
3.1.2 AES
Fig. 3.6 AES round function for round i, .1 ≤ i ≤Nr.−1. SB, SR, MC, and AK stand for SubBytes,
ShiftRows, MixColumns, and AddRoundKey respectively
Recall that one byte is a vector in F_2^8 and can be represented as a hexadecimal number between 00 and FF (see Definition 1.3.7 and Remark 1.3.3). As discussed in Sect. 1.5.1, a byte can also be identified with an element in F_2[x]/(f(x)), where

f(x) = x^8 + x^4 + x^3 + x + 1 ∈ F_2[x]

is an irreducible polynomial over F_2.
Remark 3.1.2 We refer to (s_{i0}, s_{i1}, s_{i2}, s_{i3}) as the (i + 1)th row of the cipher state, and

(s_{0j}, s_{1j}, s_{2j}, s_{3j})^T

as the (j + 1)th column.
Let

$$
A = \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0\\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0\\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0\\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1\\
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1\\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1\\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1\\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1
\end{pmatrix},
\qquad
a = \begin{pmatrix}0\\1\\1\\0\\0\\0\\1\\1\end{pmatrix},
$$

then

$$
SB_{AES}(z) =
\begin{cases}
A z^{-1} + a & z \neq 0,\\
a & z = 0,
\end{cases}
\tag{3.3}
$$
where .z−1 is the inverse of z as an element in .F2 [x]/(f (x)) (see Sect. 1.5.1).
Example 3.1.2 SB_AES(00) = a = 01100011₂ = 63.
Example 3.1.3 Suppose the input of the AES Sbox is 03 = 00000011₂, which corresponds to x + 1 ∈ F_2[x]/(f(x)). We have shown in Example 1.5.21 that 03⁻¹ = 11110110₂. Then
$$
A\begin{pmatrix}1\\1\\1\\1\\0\\1\\1\\0\end{pmatrix} + a
= \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0\\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0\\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0\\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1\\
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1\\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1\\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1\\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}1\\1\\1\\1\\0\\1\\1\\0\end{pmatrix}
+ \begin{pmatrix}0\\1\\1\\0\\0\\0\\1\\1\end{pmatrix}
= \begin{pmatrix}0\\0\\0\\1\\1\\0\\0\\0\end{pmatrix}
+ \begin{pmatrix}0\\1\\1\\0\\0\\0\\1\\1\end{pmatrix}
= \begin{pmatrix}0\\1\\1\\1\\1\\0\\1\\1\end{pmatrix},
$$

i.e., SB_AES(03) = 01111011₂ = 7B.
Define

g(z) = Az + a,

so that SB_AES(z) = g(z⁻¹) for z ≠ 0. The inverse of g is again an affine map. For example, for input z = 01100011₂ = 63,

$$
g^{-1}(z) = \begin{pmatrix}
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0\\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 1\\
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0\\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0\\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1\\
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0\\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1\\
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0
\end{pmatrix}
\begin{pmatrix}0\\1\\1\\0\\0\\0\\1\\1\end{pmatrix}
+ \begin{pmatrix}0\\0\\0\\0\\0\\1\\0\\1\end{pmatrix}
= \begin{pmatrix}0\\0\\0\\0\\0\\1\\0\\1\end{pmatrix}
+ \begin{pmatrix}0\\0\\0\\0\\0\\1\\0\\1\end{pmatrix}
= \begin{pmatrix}0\\0\\0\\0\\0\\0\\0\\0\end{pmatrix},
$$

i.e., g⁻¹(63) = 00; indeed, SB_AES(00) = 63.
Similarly, for input 8C = 10001100₂, we get g⁻¹(8C) = 01011011₂, which corresponds to

x^6 + x^4 + x^3 + x + 1 ∈ F_2[x]/(f(x)).
$$
\begin{aligned}
f(x) &= (x^2 + 1)(x^6 + x^4 + x^3 + x + 1) + (x^5 + x^3 + x^2),\\
x^6 + x^4 + x^3 + x + 1 &= x(x^5 + x^3 + x^2) + (x + 1),\\
x^5 + x^3 + x^2 &= (x^4 + x^3 + x + 1)(x + 1) + 1.
\end{aligned}
$$

Back-substituting,

$$
\begin{aligned}
1 &= (x^5 + x^3 + x^2) + (x^4 + x^3 + x + 1)(x + 1)\\
&= (x^5 + x^3 + x^2) + (x^4 + x^3 + x + 1)\big((x^6 + x^4 + x^3 + x + 1) + x(x^5 + x^3 + x^2)\big)\\
&= (x^4 + x^3 + x + 1)(x^6 + x^4 + x^3 + x + 1) + (x^5 + x^4 + x^2 + x + 1)(x^5 + x^3 + x^2)\\
&= (x^4 + x^3 + x + 1)(x^6 + x^4 + x^3 + x + 1)\\
&\qquad + (x^5 + x^4 + x^2 + x + 1)\big(f(x) + (x^2 + 1)(x^6 + x^4 + x^3 + x + 1)\big)\\
&= (x^5 + x^4 + x^2 + x + 1)f(x) + (x^7 + x^6 + x^5 + x^4)(x^6 + x^4 + x^3 + x + 1).
\end{aligned}
$$
And we have

(x^6 + x^4 + x^3 + x + 1)⁻¹ = x^7 + x^6 + x^5 + x^4 = 11110000₂ = F0

in F_2[x]/(f(x)).
The first row does not change. The second row rotates left by one byte. The third
row rotates left by two bytes. Finally, the last row rotates left by three bytes.
In another representation, let us denote the input of ShiftRows using cipher state
representation in Eq. 3.2. Let the output of ShiftRows be a matrix B with entries .bij
(.0 ≤ i, j ≤ 3). Then
$$
\begin{pmatrix}b_{0j}\\b_{1j}\\b_{2j}\\b_{3j}\end{pmatrix}
= \begin{pmatrix}s_{0j}\\s_{1(j+1 \bmod 4)}\\s_{2(j+2 \bmod 4)}\\s_{3(j+3 \bmod 4)}\end{pmatrix},
\qquad 0 \le j < 4.
\tag{3.4}
$$
MixColumns multiplies s_{3j}x^3 + s_{2j}x^2 + s_{1j}x + s_{0j} by a fixed polynomial with coefficients in F_2[x]/(f(x)):

d(x) = (s_{3j}x^3 + s_{2j}x^2 + s_{1j}x + s_{0j})(03x^3 + 01x^2 + 01x + 02) mod (x^4 + 1).  (3.5)

Writing d(x) = d_3x^3 + d_2x^2 + d_1x + d_0, this is equivalent to

$$
\begin{pmatrix}d_0\\d_1\\d_2\\d_3\end{pmatrix}
= \begin{pmatrix}02&03&01&01\\01&02&03&01\\01&01&02&03\\03&01&01&02\end{pmatrix}
\begin{pmatrix}s_{0j}\\s_{1j}\\s_{2j}\\s_{3j}\end{pmatrix}.
\tag{3.6}
$$
Then

$$
\begin{pmatrix}02&03&01&01\\01&02&03&01\\01&01&02&03\\03&01&01&02\end{pmatrix}
\begin{pmatrix}\mathrm{D4}\\\mathrm{BF}\\\mathrm{5D}\\\mathrm{30}\end{pmatrix}
= \begin{pmatrix}\mathrm{04}\\\mathrm{66}\\\mathrm{81}\\\mathrm{E5}\end{pmatrix}.
$$

For the first entry, since

02 × D4 = 10110011₂,  03 × BF = 11011010₂,

we have

d_0 = 02 × D4 + 03 × BF + 01 × 5D + 01 × 30,

where the addition is computed modulo f(x). As discussed in Remark 1.5.2, this addition is equivalent to XOR. Consequently, we have

d_0 = 10110011₂ ⊕ 11011010₂ ⊕ 01011101₂ ⊕ 00110000₂ = 00000100₂ = 04.
x^4 + 1 = (x + 1)^4

as a polynomial over F_2[x]/(f(x)). Since 1 is not a root of g(x), x + 1 does not divide g(x), which gives

gcd(g(x), x^4 + 1) = 1.
We have shown that .F2 [x]/(f (x)) is a field in Sect. 1.5.1. .g(x)−1 mod x 4 + 1 can
be computed using the extended Euclidean algorithm, similarly to Example 1.5.10.
We have
$$
\begin{pmatrix}
\mathrm{0E} & \mathrm{0B} & \mathrm{0D} & \mathrm{09}\\
\mathrm{09} & \mathrm{0E} & \mathrm{0B} & \mathrm{0D}\\
\mathrm{0D} & \mathrm{09} & \mathrm{0E} & \mathrm{0B}\\
\mathrm{0B} & \mathrm{0D} & \mathrm{09} & \mathrm{0E}
\end{pmatrix}.
\tag{3.7}
$$
We will discuss the AES key schedule for key length 128, which corresponds to
Nr .= 10. The algorithms for other key lengths are defined similarly (see [DR02]
for more details). The key schedule algorithm is named KeyExpansion, shown in
Algorithm 3.1. The master key k is written as a four-by-four array of bytes, denoted
by .K[4][4] in the algorithm. KeyExpansion expands .K[4][4] to a .4 × 44 array of
bytes, denoted by .W [4][44]. Since Nr .= 10, in total we need 11 round keys. The ith
round key is given by the columns 4i to .4(i + 1) − 1 of W . Note that the 0th round
key, i.e., the round key for whitening at the beginning of the encryption, is given
by the first 4 columns of W, which are equal to the master key (lines 1–3). The round constants, denoted Rcon (line 6), form an array of ten bytes, computed as follows:

Rcon[1] = x^0 = 01, and Rcon[j] = x · Rcon[j − 1] = x^{j−1} for j > 1,

where the multiplication is in F_2[x]/(f(x)). We have

Rcon = {01, 02, 04, 08, 10, 20, 40, 80, 1B, 36}.
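The Rcon values can be generated by repeated multiplication by x followed by reduction modulo f(x) (often called xtime); a short sketch:

```python
# Multiplication by x in F_2[x]/(f(x)), f(x) = x^8 + x^4 + x^3 + x + 1.
def xtime(b: int) -> int:
    b <<= 1
    if b & 0x100:        # reduce modulo f(x) when the x^8 term appears
        b ^= 0x11B       # 0x11B encodes x^8 + x^4 + x^3 + x + 1
    return b & 0xFF

rcon = [0x01]
for _ in range(9):
    rcon.append(xtime(rcon[-1]))

assert rcon == [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]
```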
9 else
10 for i = 0, i < 4, i + + do
11 W [i][j ] = W [i][j − 4] ⊕ W [i][j − 1]
12 return W
The key schedule is also depicted in Fig. 3.7, where the round keys are represented as four-by-four grids and each box corresponds to one byte. The rotation ⪡ rotates the right-most column by one byte:

$$
\begin{pmatrix}y_0\\y_1\\y_2\\y_3\end{pmatrix} \mapsto \begin{pmatrix}y_1\\y_2\\y_3\\y_0\end{pmatrix}.
$$
Remark 3.1.4 We note that with the knowledge of any round key for AES-128
encryption, the attacker can recover the master key using the inverse of the key
schedule.
3.1.3 PRESENT
PRESENT was proposed in 2007 [BKL+ 07] as a symmetric block cipher optimized
for hardware implementation. It has block length .n = 64, number of rounds Nr
.= 31, and a key length of either 80 or 128. The Sbox for PRESENT is a 4-bit Sbox.
Let us denote the cipher state by

b_63 b_62 … b_0

and the round key for round i by

K_i = κ^i_63 … κ^i_0,  1 ≤ i ≤ 32.

addRoundKey XORs the round key bitwise with the cipher state:

b_j = b_j ⊕ κ^i_j,  0 ≤ j ≤ 63.
sBoxLayer applies sixteen 4-bit Sboxes to each nibble of the current cipher state.
The 4-bit Sbox is given by Table 3.11. For example, if the input is 0, the output is C.
pLayer permutes the 64 bits of the cipher state using the following formula:

pLayer(j) = ⌊j/4⌋ + (j mod 4) × 16,

where j denotes the bit position. For example, the 0th bit of the input stays as the 0th bit of the output, and the 1st bit of the input goes to the 16th bit of the output. It can also be described using Table 3.12.
Figure 3.9 shows two rounds of PRESENT.
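The pLayer formula can be implemented and sanity-checked in a few lines:

```python
# PRESENT bit permutation: pLayer(j) = floor(j/4) + (j mod 4) * 16.
def p_layer_index(j: int) -> int:
    return j // 4 + (j % 4) * 16

def p_layer(state: int) -> int:
    # Apply the permutation to a 64-bit cipher state.
    out = 0
    for j in range(64):
        out |= ((state >> j) & 1) << p_layer_index(j)
    return out

assert p_layer_index(0) == 0
assert p_layer_index(1) == 16
# pLayer is a permutation: every output position is hit exactly once.
assert sorted(p_layer_index(j) for j in range(64)) == list(range(64))
```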
Here we detail the key schedule for PRESENT-80. We refer the readers to
[BKL+ 07] for the key schedule for the 128-bit master key. Let us denote the variable
storing the key by k_79 k_78 … k_0. At round i, the round key is given by

K_i = κ^i_63 κ^i_62 … κ^i_0 = k_79 k_78 … k_16.
After extracting the round key, the variable k_79 k_78 … k_0 is updated using the following steps:
1. Left rotate by 61 bits: k_79 k_78 … k_1 k_0 = k_18 k_17 … k_20 k_19;
2. k_79 k_78 k_77 k_76 = SB_PRESENT(k_79 k_78 k_77 k_76);
3. k_19 k_18 k_17 k_16 k_15 = k_19 k_18 k_17 k_16 k_15 ⊕ round_counter;
where SB_PRESENT stands for the PRESENT Sbox (Table 3.11) and round_counter = 1, 2, …, 31. A graphical illustration is shown in Fig. 3.10.
Remark 3.1.5 With the knowledge of any round key for PRESENT-80, the attacker
can recover 64 bits of the master key. The remaining 16 bits can be recovered by
brute force. Alternatively, with the knowledge of another round key, the master key
can also be revealed.
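The three update steps can be sketched directly on an 80-bit integer. For the all-zero master key, the first round key is 0, and after one update the Sbox maps the top nibble of the (still zero) rotated register to C, so the second round key is C000000000000000:

```python
# Sketch of the PRESENT-80 key schedule update; Sbox values from Table 3.11.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

MASK80 = (1 << 80) - 1

def round_key(k: int) -> int:
    return k >> 16                                # K_i = k_79 ... k_16

def update(k: int, round_counter: int) -> int:
    k = ((k << 61) | (k >> 19)) & MASK80          # 1. rotate left by 61 bits
    top = SBOX[(k >> 76) & 0xF]                   # 2. Sbox on k_79 ... k_76
    k = (k & ~(0xF << 76)) | (top << 76)
    k ^= round_counter << 15                      # 3. XOR counter into k_19 ... k_15
    return k

k = 0  # all-zero master key
assert round_key(k) == 0
k = update(k, 1)
assert round_key(k) == 0xC000000000000000  # counter bit lands below bit 16
```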
In Sect. 3.1, we saw that there are mainly three building blocks for a symmetric
block cipher: bitwise XOR with round key, Sbox, and permutation. In this section,
we will discuss how to implement each of them. While we mainly focus on the
152 3 Modern Cryptographic Algorithms and Their Implementations
software implementations of PRESENT and AES, the main ideas apply in general
to other ciphers with similar constructions.
It is easy to implement bitwise XOR with a round key in both hardware and software: hardware provides XOR gates, and almost every processor has a dedicated XOR instruction.
In software, a naïve way to implement an Sbox is to use a lookup table. The table is stored as an array in random access memory or flash memory. The storage space required for an Sbox SB: F_2^{ω_1} → F_2^{ω_2} is ω_2 × 2^{ω_1} bits. For example, PRESENT has a 4-bit Sbox (Table 3.11) and the storage required is 2^4 × 4 = 64 bits, or 8 bytes. A lookup table implementation of the PRESENT Sbox in pseudocode is shown in Algorithm 3.2. As current computer architectures normally use word sizes of at least one byte (generally multiple bytes), it is not efficient to implement the Sbox nibble-wise. To optimize the execution time, we can merge two PRESENT Sbox table lookups (Algorithm 3.3). However, even though we utilize the space more efficiently, the additional operations take extra computing time. To avoid the bit shifts and Boolean operations, it is better to combine two 4 × 4 Sbox tables into one bigger 8 × 8 table (Algorithm 3.4).
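The merged-table idea can be sketched as follows: precomputing a 256-entry table makes the per-byte Sbox application a single indexing operation, with no shifts at lookup time.

```python
# PRESENT Sbox (Table 3.11).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

# Precomputed once: apply the 4-bit Sbox to both nibbles of every byte value.
SBOX8 = [(SBOX[b >> 4] << 4) | SBOX[b & 0xF] for b in range(256)]

def sbox_byte_naive(b: int) -> int:
    # Two nibble-wise lookups with shifts and masks (the Algorithm 3.3 style).
    return (SBOX[b >> 4] << 4) | SBOX[b & 0xF]

# The single-lookup table agrees with the nibble-wise computation everywhere,
# and its first entries are CC, C5, C6, ...
assert all(SBOX8[b] == sbox_byte_naive(b) for b in range(256))
assert (SBOX8[0x00], SBOX8[0x01], SBOX8[0x02]) == (0xCC, 0xC5, 0xC6)
```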
In this part, we will discuss two methods for implementing PRESENT pLayer by combining it with sBoxLayer.
The first method is straightforward. We construct sixteen 4 × 64 lookup tables, TB1, TB2, …, TB16. The input of TBi is the ith nibble of the cipher state at the input of sBoxLayer. The outputs are 64-bit values that are all 0s except for the 4 bits that are related to this ith input nibble through sBoxLayer and pLayer.
Let us consider TB1, whose input is the first nibble of the cipher state at the input
of sBoxLayer. By Table 3.12, the Sbox output corresponding to this nibble should
go to bits .0, 16, 32 and 48 of the output of pLayer. Thus, each entry of TB1 is a
64-bit value with bits in positions .0, 16, 32 and 48 given by the Sbox output, and
the other bits are all 0.
Example 3.2.1 For example, if the input is A, the Sbox output should be F = 1111₂ and

TB1[A] = 0…010…010…010…01₂,

where the 0th, 16th, 32nd, and 48th bits are 1. Similarly, the PRESENT Sbox output for input B is 1000₂, and

TB1[B] = 0…010…0₂,

where only the 48th bit is 1, while

TB2[B] = 0…010…0₂,

where only the 49th bit is 1.
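The construction of the tables TBi can be sketched as follows; make_tb builds the table for the ith nibble by scattering the four Sbox output bits to their pLayer destinations:

```python
# PRESENT Sbox (Table 3.11) and bit permutation.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def p_layer_index(j: int) -> int:
    return j // 4 + (j % 4) * 16

def make_tb(i: int):
    # Combined sBoxLayer + pLayer table for the ith nibble (i = 0 is TB1).
    table = []
    for nibble in range(16):
        s = SBOX[nibble]
        v = 0
        for bit in range(4):
            v |= ((s >> bit) & 1) << p_layer_index(4 * i + bit)
        table.append(v)
    return table

TB1 = make_tb(0)
assert TB1[0xA] == 0x0001000100010001  # Sbox(A) = F: bits 0, 16, 32, 48 set
assert TB1[0xB] == 1 << 48             # Sbox(B) = 8: only bit 48 set
```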
• Table two takes the 1st byte (bits .8−15) of sBoxLayer input, the corresponding
output will be the 2nd and 3rd bits for bytes at positions .0, 1, 3, 5 (bits
.2, 3, 18, 19, 34, 35, 50, 51) in the output of pLayer;
• Table three takes the 2nd byte (bits .16−23) of sBoxLayer input, the correspond-
ing output will be the 4th and 5th bits for bytes at positions .0, 1, 3, 5 (bits
.4, 5, 20, 21, 36, 37, 52, 53) in the output of pLayer;
• Table four takes the 3rd byte (bits .24−31) of sBoxLayer input, the correspond-
ing output will be the 6th and 7th bits for bytes at positions .0, 1, 3, 5 (bits
.6, 7, 22, 23, 38, 39, 54, 55) in the output of pLayer.
The same tables can also be used for the remaining four bytes of the cipher
state:
• Table one takes the 4th byte (bits .32−39) of sBoxLayer input, the correspond-
ing output will be the 0th and 1st bits for bytes at positions .2, 4, 6, 7 (bits
.8, 9, 24, 25, 40, 41, 56, 57) in the output of pLayer;
• Table two takes the 5th byte (bits .40−47) of sBoxLayer input, the corresponding
output will be the 2nd and 3rd bits for bytes at positions .2, 4, 6, 7 (bits
.10, 11, 26, 27, 42, 43, 58, 59) in the output of pLayer;
• Table three takes the 6th byte (bits .48−55) of sBoxLayer input, the correspond-
ing output will be the 4th and 5th bits for bytes at positions .2, 4, 6, 7 (bits
.12, 13, 28, 29, 44, 45, 60, 61) in the output of pLayer;
• Table four takes the 7th byte (bits .56−63) of sBoxLayer input, the correspond-
ing output will be the 6th and 7th bits for bytes at positions .2, 4, 6, 7 (bits
.14, 15, 30, 31, 46, 47, 62, 63) in the output of pLayer.
Since the input for each table is one byte, we will be computing two Sboxes in
parallel. In Algorithm 3.4 we have seen the algorithm for such a computation. To
see how the four tables are computed, we will detail the first three entries of each
table. The other entries are calculated with similar methods.
First, we note that to combine two Sboxes, the lookup table starts with
CC C5 C6 ...
As mentioned above, one type of input intended for Table one is bits at
positions .0−7 of sBoxLayer input, those bits correspond to bits at positions .0−7
at sBoxLayer output. The corresponding output of Table one are bits at positions
.0, 1, 16, 17, 32, 33, 48, 49 of pLayer output. According to pLayer (Table 3.12)
If we consider the other set of inputs intended for Table one, which are bits at
positions .32−39, they should be first permuted to .32, 36, 33, 37, 34, 38, 35, 39 so
that the output will be bits at positions .8, 9, 24, 25, 40, 41, 56, 57. Then we arrive
at the same values as in Eq. 3.8.
For Table two, the output will later be positioned at the 2nd and 3rd positions in
the eight bytes of the pLayer output. A natural choice is to design it so that the output
can be combined with the outputs of other tables with a binary operation, e.g., .∨. In
particular, since the output of Table one starts with bits from positions .0, 1 and .8, 9,
the output of Table two will put bits from positions .2, 3 and .10, 11 in the 2nd and
3rd positions. Thus, Table two permutes bits .8−15 to .11, 15, 8, 12, 9, 13, 10, 14,
which then will give bits at .50, 51, 2, 3, 18, 19, 34, 35 for pLayer output. Similarly,
bits .40−47 will be permuted to .43, 47, 40, 44, 41, 45, 42, 46 and give bits at
.58, 59, 10, 11, 26, 27, 42, 43 for pLayer output. The first few entries of Table two
are as follows:
3C 6C 2D ...
Table three first permutes bits from 16−23 (resp. 48−55) to 18, 22, 19, 23, 16, 20, 17, 21 (resp. 50, 54, 51, 55, 48, 52, 49, 53), which then give bits 36, 37, 52, 53, 4, 5, 20, 21 (resp. 44, 45, 60, 61, 12, 13, 28, 29) of pLayer output. The table starts with
0F 1B 4B ...
Table four first permutes bits from .24−31 (resp. .56−63) to .25, 29, 26, 30, 27, 31,
24, 28 (resp. .57, 61, 58, 62, 59, 63, 56, 60), which then give bits .22, 23, 38, 39, 54,
55, 6, 7 (resp. .30, 31, 46, 47, 62, 63, 14, 15) of pLayer output. The table starts with
C3 C6 D2 ...
Recall that the cipher state of AES can be represented by a four-by-four matrix
of bytes (Eq. 3.2). Let us denote the input of SubBytes by a matrix S. The outputs
of SubBytes, ShiftRows, and MixColumns are represented by matrices .A, B, and
D respectively. By definition, .aij = SB(sij ), 0 ≤ i, j < 4. By Eqs. 3.4 and 3.5,
$$
\begin{pmatrix}b_{0j}\\b_{1j}\\b_{2j}\\b_{3j}\end{pmatrix}
= \begin{pmatrix}a_{0j}\\a_{1(j+1 \bmod 4)}\\a_{2(j+2 \bmod 4)}\\a_{3(j+3 \bmod 4)}\end{pmatrix},
\qquad
\begin{pmatrix}d_{0j}\\d_{1j}\\d_{2j}\\d_{3j}\end{pmatrix}
= \begin{pmatrix}02&03&01&01\\01&02&03&01\\01&01&02&03\\03&01&01&02\end{pmatrix}
\begin{pmatrix}b_{0j}\\b_{1j}\\b_{2j}\\b_{3j}\end{pmatrix},
\qquad j = 0, 1, 2, 3.
$$
We have
$$
\begin{aligned}
\begin{pmatrix}d_{0j}\\d_{1j}\\d_{2j}\\d_{3j}\end{pmatrix}
&= \begin{pmatrix}02&03&01&01\\01&02&03&01\\01&01&02&03\\03&01&01&02\end{pmatrix}
\begin{pmatrix}SB(s_{0j})\\SB(s_{1(j+1 \bmod 4)})\\SB(s_{2(j+2 \bmod 4)})\\SB(s_{3(j+3 \bmod 4)})\end{pmatrix}\\
&= \begin{pmatrix}02\\01\\01\\03\end{pmatrix} SB(s_{0j})
\oplus \begin{pmatrix}03\\02\\01\\01\end{pmatrix} SB(s_{1(j+1 \bmod 4)})
\oplus \begin{pmatrix}01\\03\\02\\01\end{pmatrix} SB(s_{2(j+2 \bmod 4)})
\oplus \begin{pmatrix}01\\01\\03\\02\end{pmatrix} SB(s_{3(j+3 \bmod 4)}).
\end{aligned}
$$
For 0 ≤ i ≤ 3, define the table T_i: F_2^8 → F_2^32 mapping a byte z to the corresponding 32-bit column above; for example, T_0(z) = (02 · SB(z), 01 · SB(z), 01 · SB(z), 03 · SB(z))^T. Then

$$
\begin{pmatrix}d_{0j}\\d_{1j}\\d_{2j}\\d_{3j}\end{pmatrix}
= T_0(s_{0j}) \oplus T_1(s_{1(j+1 \bmod 4)}) \oplus T_2(s_{2(j+2 \bmod 4)}) \oplus T_3(s_{3(j+3 \bmod 4)}).
$$
Thus the four tables T_0, T_1, T_2, T_3 of size 8 × 32 can be used to implement SubBytes, ShiftRows, and MixColumns. These four tables are called the T-tables for AES. We note that storing and using the T-tables requires processors with a word size of 32 or above. They cannot be used for the last round of AES, as it has no MixColumns operation.
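The T-table construction applies to any byte-valued Sbox. In the sketch below, SB is a placeholder (the identity), NOT the AES Sbox; only the column structure is demonstrated.

```python
# Multiplication by 02 (xtime) and 03 in F_2[x]/(x^8 + x^4 + x^3 + x + 1).
def xtime(b: int) -> int:
    b <<= 1
    return (b ^ 0x11B) & 0xFF if b & 0x100 else b

def gmul3(b: int) -> int:
    return xtime(b) ^ b  # 03 = 02 xor 01

def SB(z: int) -> int:
    return z  # placeholder Sbox for illustration only

# Each T_i packs the 4 column bytes into one 32-bit word (most significant first).
COLUMNS = [(2, 1, 1, 3), (3, 2, 1, 1), (1, 3, 2, 1), (1, 1, 3, 2)]
MUL = {1: lambda b: b, 2: xtime, 3: gmul3}

T = [[int.from_bytes(bytes(MUL[c](SB(z)) for c in col), "big")
      for z in range(256)] for col in COLUMNS]

# Structural check: T_0(01) has column bytes (02, 01, 01, 03).
assert T[0][0x01] == 0x02010103
```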
Example 3.2.3 Consider the Boolean function

φ: F_2^3 → F_2,  x_2x_1x_0 ↦ x_0 + x_1 + x_2.

Its truth table is given by:

x_2    0 0 0 0 1 1 1 1
x_1    0 0 1 1 0 0 1 1
x_0    0 1 0 1 0 1 0 1
φ(x)   0 1 1 0 1 0 0 1
Example 3.2.4 Now let us consider the Boolean function defined as follows:

φ_0: F_2^4 → F_2,  x ↦ SB_PRESENT(x)_0,

where SB_PRESENT(x)_0 is the 0th bit of SB_PRESENT(x), the PRESENT Sbox output corresponding to x. The truth table of φ_0 is given by the first five and the second
Table 3.13 The Boolean function φ_0 takes input x and outputs the 0th bit of SB_PRESENT(x). The second to last row lists the output of φ_0 for different input values. The last row lists the coefficients (Eq. 3.10) for the algebraic normal form of φ_0

x         0 1 2 3 4 5 6 7 8 9 A B C D E F
x_3       0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
x_2       0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
x_1       0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1
x_0       0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
SB(x)     C 5 6 B 9 0 A D 3 E F 8 4 7 1 2
φ_0(x)    0 1 0 1 1 0 0 1 1 0 1 0 0 1 1 0
λ_x       0 1 0 0 1 0 1 0 1 0 0 0 0 0 0 0
last (the row for φ_0(x)) rows in Table 3.13. For example, if the input is 0, the Sbox output is C = 1100₂. Then φ_0(x) = 0.
For v ∈ F_2^n, define the function

1_v: F_2^n → F_2,  x ↦ ∏_{i: v_i = 1} x_i ∏_{i: v_i = 0} (1 − x_i).

With this definition, for any φ: F_2^n → F_2, we can express φ in the following polynomial form:

φ(x) = Σ_{v ∈ F_2^n} φ(v) 1_v(x),

which is called the algebraic normal form representation of the Boolean function φ.
Example 3.2.5 Continuing Example 3.2.3, we can find the algebraic normal form of ϕ as follows:

ϕ(x) = ∑_{v ∈ F_2^3} ϕ(v) 1_v(x) = 1_{001}(x) + 1_{010}(x) + 1_{100}(x) + 1_{111}(x).
It can be proven that the algebraic normal form of a Boolean function is unique.2
Theorem 3.2.1 Every Boolean function ϕ : F_2^n → F_2 has a unique algebraic normal form representation

ϕ(x) = ∑_{v ∈ F_2^n} λv ∏_{i=0}^{n−1} xi^{vi}. (3.9)
By Eq. 3.9,

ϕ(x) = ∑_{v ∈ F_2^3} λv ∏_{i=0}^{n−1} xi^{vi} = λ_{001} x0 + λ_{010} x1 + λ_{100} x2 = x0 + x1 + x2.
By Eq. 3.9,
2 For the proof, see, e.g., [MS77, page 372] and [O’D14, page 149].
ϕ0(x) = ∑_{v ∈ F_2^4} λv ∏_{i=0}^{n−1} xi^{vi} = λ_{0001} x0 + λ_{0100} x2 + λ_{0110} x1x2 + λ_{1000} x3
      = x0 + x2 + x1x2 + x3. (3.11)
For example, if the input is 0 = 0000, the PRESENT Sbox output is C = 1100, then the output of ϕ0 is 0 and

x0 + x2 + x1x2 + x3 = 0 + 0 + 0 + 0 = 0.

If the input is 7 = 0111, the PRESENT Sbox output is D = 1101, then the output of ϕ0 is 1 and

x0 + x2 + x1x2 + x3 = 1 + 1 + 1 + 0 = 1.
We can compute the algebraic normal form for each of the remaining ϕi in a similar way (see Appendix D). They are given by:
ϕ1(x) = x1 + x3 + x1x3 + x2x3 + x0x1x2 + x0x1x3 + x0x2x3, (3.12)
ϕ2(x) = 1 + x2 + x3 + x0x1 + x0x3 + x1x3 + x0x1x3 + x0x2x3, (3.13)
ϕ3(x) = 1 + x0 + x1 + x3 + x1x2 + x0x1x2 + x0x1x3 + x0x2x3. (3.14)
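The coefficients λv can be computed from a truth table with the fast Möbius transform. The sketch below uses our own function name `anf_coefficients`; applied to the truth table of ϕ0 it recovers exactly the nonzero coefficients of Eq. 3.11.

```python
def anf_coefficients(tt):
    """Fast Moebius transform: from a truth table of length 2^n to the ANF
    coefficients lambda_v (lambda_v = XOR of f(u) over all u with u subset of v)."""
    lam = list(tt)
    n = len(tt).bit_length() - 1
    for i in range(n):
        bit = 1 << i
        for v in range(len(lam)):
            if v & bit:
                lam[v] ^= lam[v ^ bit]
    return lam

# Truth table of phi_0 (Table 3.13); index v is read as the bits x3x2x1x0.
tt = [0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
lam = anf_coefficients(tt)
print([v for v in range(16) if lam[v]])  # [1, 4, 6, 8]: x0, x2, x1*x2, x3
```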
In this part, we will use PRESENT as a running example to show how the bitsliced
implementation of a symmetric block cipher is designed.
First, we discuss how to transform the plaintext blocks into bitsliced format. As a simple example, let us consider block length 3 and a 4-bit architecture, which allows us to encrypt 4 blocks of plaintext simultaneously. We take 4 plaintext blocks, say p1 = 010, p2 = 110, p3 = 001, p4 = 100.
The bitsliced format of .pj s is given by a .3 × 4 array, denoted S, where each column
is given by one block of plaintext:
S = ⎛0 0 1 0⎞
    ⎜1 1 0 0⎟ .
    ⎝0 1 0 1⎠
In particular, if we let S[x] denote the xth row of S, then S[0] corresponds to the 0th bits of the pj, S[1] to their 1st bits, and S[2] to their 2nd bits.
Next, we will show how to encrypt 8 plaintext blocks in parallel with PRESENT
assuming an 8-bit architecture. Let .p1 , p2 , . . . p8 be 8 plaintext blocks, each of
length 64. We convert them into bitsliced format as described above and store them
in a .64 × 8 array .S0 , where .S0 [y] contains the yth bits of each plaintext block.
Furthermore, for each round key .Ki , we construct a .64 × 8 array Keyi whose
columns are given by .Ki , i.e.,
Keyi[y][z] = Ki[y]  ∀ 0 ≤ z < 8. (3.15)
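The conversion to bitsliced format is a bit-matrix transposition. A minimal sketch (the function name `to_bitsliced` is ours; the four 3-bit blocks are read off the columns of S above):

```python
def to_bitsliced(blocks, block_len):
    """Transpose blocks into bitsliced format: row y of the result S holds
    the yth bit of every block."""
    return [[(p >> y) & 1 for p in blocks] for y in range(block_len)]

# Toy example from the text: block length 3, four blocks per 4-bit register.
S = to_bitsliced([0b010, 0b110, 0b001, 0b100], 3)
print(S)  # [[0, 0, 1, 0], [1, 1, 0, 0], [0, 1, 0, 1]]
```

For PRESENT on an 8-bit architecture the same transposition is applied with `block_len = 64` and eight blocks.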
3.3 RSA
In Sect. 2.1.2 we mentioned that there are symmetric key and asymmetric key cryptosystems. Up to now, we have only seen symmetric cryptosystems, both classical and modern designs. For a symmetric key cipher, a prior communication of the master key (key exchange) is required before any ciphertext is transmitted. With only a symmetric key cipher, the key exchange may be difficult to achieve due to, e.g., long distances or a large number of parties being involved. In practice, this is where asymmetric key cryptosystems come into use.
For example, Alice would like to communicate with Bob using AES. To
exchange the master key, k, for AES, she will encrypt k by a public key cryptosystem
using Bob’s public key e. Let .c = Ee (k). The resulting ciphertext c will be sent to
Bob, and Bob can decrypt it with his secret private key d, .k = Dd (c). Then Alice
and Bob can communicate with key k using AES.
Clearly, we require that it is computationally infeasible to find the private
key d given the public key e. In practice, this is guaranteed by some intractable
problem.3 However, the cipher might not be secure in the future. For example, if a quantum computer with enough qubits is manufactured, it can break many public key
cryptosystems [EJ96]. Furthermore, we note that a public key cipher is not perfectly
secure (see Sect. 2.2.7) as the attacker can brute force the key.
In this section, we will be discussing one public key cryptosystem—RSA. It
was published in 1977 and named after its inventors Ron Rivest, Adi Shamir, and
3A problem is intractable if there does not exist an efficient algorithm to solve it.
Leonard Adleman. RSA is the first public key cryptosystem and is still in use today. Its security relies on the difficulty of factoring a composite positive integer.
Definition 3.3.1 (RSA) Let n = pq, where p, q are distinct prime numbers. Let P = C = Z_n and K = Z*_{ϕ(n)} − {1}. For any e ∈ K, define the encryption

Ee : Z_n → Z_n,  m ↦ m^e mod n,

and the decryption

Dd : Z_n → Z_n,  c ↦ c^d mod n,

where d = e^{−1} mod ϕ(n). The public key consists of e and n; the private key is d.
Example 3.3.1 Let p = 3 and q = 5, so that n = 15 and

ϕ(n) = (3 − 1) × (5 − 1) = 2 × 4 = 8.

From Z*_8 = {1, 3, 5, 7}, Bob chooses e = 3. By the extended Euclidean algorithm, he computes

8 = 3 × 2 + 2,  3 = 2 × 1 + 1  ⇒  1 = 3 − 2 × 1 = 3 − (8 − 3 × 2) = −8 + 3 × 3,

hence d = 3^{−1} mod 8 = 3. After receiving the ciphertext c from Alice, Bob computes the plaintext using his private key, m = c^d mod n.
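The toy example can be reproduced with a few lines of Python. This is textbook RSA without padding, for illustration only; `pow(e, -1, phi)` (Python 3.8+) computes the modular inverse that the text obtains via the extended Euclidean algorithm, and the function names are our own.

```python
# Textbook RSA with toy parameters (no padding; insecure, for illustration only).
def rsa_keygen(p, q, e):
    n, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)            # e^{-1} mod phi(n)
    return (e, n), (d, n)          # public key, private key

def encrypt(m, pub):
    e, n = pub
    return pow(m, e, n)

def decrypt(c, priv):
    d, n = priv
    return pow(c, d, n)

pub, priv = rsa_keygen(3, 5, 3)    # Example 3.3.1: n = 15, d = 3
print(encrypt(2, pub), decrypt(8, priv))   # 8 2
```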
Example 3.3.2 Now we will look at slightly larger values for p and q. Let p = 29, q = 41, then n = 1189 and ϕ(n) = 28 × 40 = 1120. It is easy to verify that 3 ∤ ϕ(n). Let us choose e = 3. By the extended Euclidean algorithm, Bob computes

d = 3^{−1} mod 1120 = 747.

Suppose Bob receives the ciphertext c = 8. Since

747 = 512 + 128 + 64 + 32 + 8 + 2 + 1,

we compute

8^4 mod 1189 = 4096 mod 1189 = 529,     8^8 mod 1189 = 529^2 mod 1189 = 426,
8^16 mod 1189 = 426^2 mod 1189 = 748,   8^32 mod 1189 = 748^2 mod 1189 = 674,
8^64 mod 1189 = 674^2 mod 1189 = 78,    8^128 mod 1189 = 78^2 mod 1189 = 139,
8^256 mod 1189 = 139^2 mod 1189 = 297,  8^512 mod 1189 = 297^2 mod 1189 = 223.

And we have

8^747 mod 1189 = 8^512 × 8^128 × 8^64 × 8^32 × 8^8 × 8^2 × 8 mod 1189
              = 223 × 139 × 78 × 674 × 426 × 64 × 8 mod 1189 = 2,

recovering the plaintext m = 2.
By Corollary 1.4.3,

c^d ≡ m mod p,  c^d ≡ m mod q.

Since p and q are distinct prime numbers and n = pq, by the Chinese Remainder Theorem (see Theorem 1.4.7 and Example 1.4.19),

c^d ≡ m mod n.
We note that, if p or q is known to the attacker, they can factorize n and compute ϕ(n). Then, with e, the private key d can be computed using the extended Euclidean algorithm. Hence p, q, and ϕ(n) must be kept secret.
3.4 RSA Signatures

In this section, we discuss how RSA can be used for digital signatures.
As mentioned in Sect. 2.1, digital signatures provide a means for an entity to
bind its identity to a message stored in electronic form. This normally means that
the sender uses their private key to sign the (hashed) message. Whoever has access
to the public key can then verify the origin of the message. For example, the message
can be electronic contracts or electronic bank transactions.
In more detail, suppose Alice signs a message m with a private key d and
generates signature s. The receiver Bob receives the message and the signature, he
can then verify s with public key e and a verification algorithm. Given m and s, the
verification algorithm returns true to indicate a valid signature and false otherwise.
To use RSA for digital signatures, we again let p and q be two distinct primes. Let n = pq. We choose e ∈ Z*_{ϕ(n)} and compute d = e^{−1} mod ϕ(n). Same as for RSA, the public key consists of e and n, and d is the private key. p, q, and ϕ(n) should be kept secret.
To sign a message m, Alice computes the signature

s = m^d mod n.

Then Alice sends both m and s to Bob. To verify the signature, Bob computes

s^e mod n.

If

s^e ≡ m mod n,

the signature is accepted as valid.
Example 3.4.1 Let p = 5 and q = 7, so that n = 35 and ϕ(n) = 24. Alice chooses e = 5. By the extended Euclidean algorithm,

24 = 5 × 4 + 4,  5 = 4 + 1  ⇒  1 = 5 − (24 − 5 × 4) = 24 × (−1) + 5 × 5,

hence her private key is d = 5^{−1} mod 24 = 5. To sign the message m = 10, Alice computes s = 10^5 mod 35 = 5.
Alice sends both the message .m = 10 and signature .s = 5 to Bob. Bob verifies the
signature
s^e mod n = 5^5 mod 35 = 10 = m.
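A minimal sketch of textbook RSA signing and verification in Python (no hashing yet; the function names `sign` and `verify` are ours):

```python
# Textbook RSA signatures without hashing (for illustration only).
def sign(m, d, n):
    return pow(m, d, n)

def verify(m, s, e, n):
    return pow(s, e, n) == m % n

n, e, d = 35, 5, 5                  # parameters of Example 3.4.1
s = sign(10, d, n)
print(s, verify(10, s, e, n))       # 5 True
```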
The most common attack for a digital signature is to create a valid signature for a
message without knowing the secret key. Such an attack is called forgery. If the goal
is to create a valid signature given a message that was not signed by Alice before, it
is called selective forgery. If the goal is to create a valid signature for any message
not signed by Alice before, then the attack is called existential forgery.
There are normally three attacker assumptions. Key-only attack assumes the
attacker only has knowledge of e. Known message attack considers an attacker who
has a list of messages previously signed by Alice. In a chosen message attack, the
attacker can request Alice’s signature on a list of messages.
Next, we discuss the security of RSA signatures with respect to forgery attacks.
First, we consider a known message existential forgery attack. Suppose the attacker, Eve, knows messages m1, m2 and their corresponding signatures s1 and s2. Eve computes

s = s1 s2 mod n,  m = m1 m2 mod n.

Since

s1 = m1^d mod n  and  s2 = m2^d mod n,

we have

s = s1 s2 ≡ (m1 m2)^d ≡ m^d mod n,

so s is a valid signature for the message m.

Next, we consider a chosen message selective forgery attack. Suppose Eve wants to forge Alice's signature on a message m. She chooses m1 ∈ Z*_n and computes

m2 = m m1^{−1} mod n.

She then requests Alice's signatures s1 on m1 and s2 on m2. As above, a valid signature for m is given by

s = s1 s2 mod n.
To prevent such attacks, in practice a cryptographic hash function h is applied to the message before signing: to sign m, Alice computes

s = h(m)^d mod n.

Then she sends both m and s to Bob. Bob computes s^e mod n and h(m). If

s^e mod n = h(m),

the signature is accepted as valid.
With a hash function, the two attacks discussed above will not work. Suppose
Eve knows messages .m1 , m2 and their corresponding signatures .s1 and .s2 . She can
compute .h(m1 ) and .h(m2 ) as h is public. However, to repeat the known message
existential forgery attack, she needs to find m such that .h(m) = h(m1 )h(m2 ), which
is computationally infeasible according to property .(c) of hash functions listed in
Sect. 2.1.1.
Suppose Eve chooses a message m, and computes .h(m). To repeat the chosen
message selective forgery attack, she needs to find .m1 such that .h(m1 ) = y for
some .y ∈ Z∗n . For the same reason as above, this is computationally infeasible.
3.5 Implementations of RSA Cipher and RSA Signatures

In this section, we discuss several methods for implementing RSA and RSA signature computations. Section 3.5.1 presents three methods for implementing modular
exponentiation. As we will see, those methods will require the computations of other
modular operations. Then in Sect. 3.5.2, we discuss how to efficiently implement
modular multiplication.
Both RSA and RSA signatures require computing a modular exponentiation

a^d mod n

for large d. The naïve method computes a^d by d − 1 successive modular multiplications by a. In practice, the bit length of d ranges in the thousands, thus making the calculation infeasible by this naïve method. We will discuss three methods to make modular exponentiation computations faster.
Suppose we would like to compute

a^d mod n

for a ∈ Z_n. By Theorem 1.1.1, we can write d in the following form:
d = ∑_{i=0}^{ℓd−1} di 2^i,

where ℓd is the bit length of d, i.e., d has binary representation

d = dℓd−1 . . . d2 d1 d0.
Then

a^d = a^{∑_{i=0}^{ℓd−1} di 2^i} = ∏_{i=0}^{ℓd−1} (a^{2^i})^{di} = ∏_{0≤i<ℓd, di=1} a^{2^i}.

Thus, to compute a^d mod n, we can first compute a^{2^i} mod n for 0 ≤ i < ℓd. Then a^d is the product of those a^{2^i} for which di = 1. One can see that compared to the naïve calculation, requiring d − 1 multiplications, this method only needs ≈ log2 d multiplications.
This observation leads us to the square and multiply algorithm listed in Algorithm 3.7. Line 5 computes a^{2^{i+1}} in loop i. We check each bit of d (line 3); if the ith bit of d is 1, then a^{2^i} is multiplied into the result (line 4). As this algorithm starts from the least significant bit of d, i.e., d0, it is also called the right-to-left square and multiply algorithm. Accordingly, the left-to-right square and multiply algorithm is listed in Algorithm 3.8. We can see that compared to Algorithm 3.7, Algorithm 3.8 requires one less variable and hence less storage.
Example 3.5.1 Let n = 15, d = 3 = 11_2, and a = 2. Computing a^d mod n using Algorithm 3.7, we get the values of the variables in each loop as follows:
i  di  t  result
0  1   4  2
1  1   1  8
The returned value is 8. Similarly, using Algorithm 3.8, the intermediate values are:

i  di  t
1  1   2
0  1   8

In the last loop, line 3 computes t = 4 and line 5 calculates t = 8 mod 15 = 8.
Example 3.5.2 Let n = 23, d = 4 = 100_2, and a = 5. Computing a^d mod n using Algorithm 3.7, we get the values of the variables in each loop as follows:
i  di  t   result
0  0   2   1
1  0   4   1
2  1   16  4
The final result is 4. Using Algorithm 3.8, in the first loop (i = 2), line 3 computes t = 1 mod 23 and line 5 calculates t = 1 × 5 mod 23 = 5 mod 23. The intermediate values are:
i  di  t
2  1   5
1  0   2
0  0   4
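Both variants can be sketched in Python; the function names are ours, and the structure mirrors the description of Algorithms 3.7 and 3.8 above:

```python
def square_multiply_rtl(a, d, n):
    """Right-to-left square and multiply (Algorithm 3.7): scan d from the least
    significant bit; multiply the running power into result when d_i = 1."""
    result, t = 1, a % n
    while d > 0:
        if d & 1:
            result = result * t % n
        t = t * t % n
        d >>= 1
    return result

def square_multiply_ltr(a, d, n):
    """Left-to-right square and multiply (Algorithm 3.8): one variable less."""
    t = 1
    for bit in bin(d)[2:]:
        t = t * t % n
        if bit == '1':
            t = t * a % n
    return t

print(square_multiply_rtl(2, 3, 15), square_multiply_ltr(5, 4, 23))  # 8 4
```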
Recall that d has the binary expansion

d = ∑_{i=0}^{ℓd−1} di 2^i.
For 0 ≤ j ≤ ℓd − 1, define

Lj := ∑_{i=j}^{ℓd−1} di 2^{i−j},  Hj := Lj + 1.
Then

2Lj+1 = 2 ∑_{i=j+1}^{ℓd−1} di 2^{i−(j+1)} = ∑_{i=j+1}^{ℓd−1} di 2^{i−j} = −dj + ∑_{i=j}^{ℓd−1} di 2^{i−j} = −dj + Lj.
We have

Lj = 2Lj+1 + dj = Lj+1 + Hj+1 + dj − 1 = 2Hj+1 + dj − 2,

and

Lj = 2Lj+1 if dj = 0,          Hj = Lj+1 + Hj+1 if dj = 0,
Lj = Lj+1 + Hj+1 if dj = 1;    Hj = 2Hj+1 if dj = 1.
Since

L0 = ∑_{i=0}^{ℓd−1} di 2^i = d,

we have a^d = a^{L0}. Moreover,

Lℓd−1 = dℓd−1,  Hℓd−1 = dℓd−1 + 1,
and

a^{Lℓd−1} = 1 if dℓd−1 = 0, a if dℓd−1 = 1;    a^{Hℓd−1} = a if dℓd−1 = 0, a^2 if dℓd−1 = 1. (3.17)
Example 3.5.3 Here we repeat the computation in Example 3.5.1 with Algorithm 3.9: n = 15, d = 3 = 11_2, a = 2. Lines 1 and 2 give R0 = 1 and R1 = 2. The intermediate values are:

j = 1, d1 = 1,  R0 = R0R1 mod n = 2,  R1 = R1^2 = 2^2 mod 15 = 4
j = 0, d0 = 1,  R0 = R0R1 mod n = 2 × 4 mod 15 = 8
Algorithm 3.9: Montgomery powering ladder
Input: a ∈ Zn, d = dℓd−1 . . . d1d0, n
Output: a^d mod n
1  R0 = 1
2  R1 = a
3  for j = ℓd − 1, ℓd − 2, . . . , 0 do
4    if dj = 0 then
5      R1 = R0R1 mod n
6      R0 = R0^2 mod n
7    else
8      R0 = R0R1 mod n
9      R1 = R1^2 mod n
10 return R0
Example 3.5.4 Here we repeat the computation in Example 3.5.2. Let .n = 23, d =
4 = 1002 , a = 5. We know that .a d mod n = 4. With Algorithm 3.9, the
intermediate values are
j = 2, d2 = 1,  R0 = R0R1 mod n = 5,
                R1 = R1^2 = 5^2 mod 23 = 25 mod 23 = 2
j = 1, d1 = 0,  R1 = R0R1 mod n = 5 × 2 mod 23 = 10,
                R0 = R0^2 = 5^2 mod 23 = 2
j = 0, d0 = 0,  R1 = R0R1 mod n = 2 × 10 mod 23 = 20,
                R0 = R0^2 = 2^2 mod 23 = 4
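A sketch of the ladder in Python (our own function name). Note that each iteration performs exactly one multiplication and one squaring regardless of the bit dj, preserving the invariant R1 = a·R0 mod n:

```python
def montgomery_ladder(a, d, n):
    """Montgomery powering ladder (Algorithm 3.9).  Both branches perform one
    multiplication and one squaring, preserving the invariant R1 = a*R0 mod n."""
    R0, R1 = 1, a % n
    for bit in bin(d)[2:]:          # scan d from the most significant bit
        if bit == '0':
            R1 = R0 * R1 % n
            R0 = R0 * R0 % n
        else:
            R0 = R0 * R1 % n
            R1 = R1 * R1 % n
    return R0

print(montgomery_ladder(5, 4, 23), montgomery_ladder(2, 3, 15))  # 4 8
```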
In this part, we focus on the case when .n = pq is the RSA modulus (p, q are
distinct odd primes) and .d ∈ Z∗ϕ(n) is the private key.
By Chinese Remainder Theorem (see Theorem 1.4.7 and Example 1.4.19),
finding the solution for
x ≡ a d mod n
.
is equivalent to solving

x ≡ a^d mod p,  x ≡ a^d mod q.

Let xp = a^d mod p and xq = a^d mod q, so that the system becomes

x ≡ xp mod p,  x ≡ xq mod q. (3.18)

Following Gauss's algorithm, let

Mp = p,  Mq = q,  yp = Mp^{−1} mod q = p^{−1} mod q,  yq = Mq^{−1} mod p = q^{−1} mod p,

and

x = xp yq q + xq yp p mod n. (3.19)
Alternatively, by Garner's algorithm,

x = xp + p ((xq − xp) yp mod q). (3.20)

We will show that Eq. 3.20 indeed gives the solution to Eq. 3.18. First, it is straightforward to see that x ≡ xp mod p. Furthermore, since p yp ≡ 1 mod q,

x ≡ xp + (xq − xp) ≡ xq mod q.

Finally, 0 ≤ x ≤ (p − 1) + p(q − 1) < pq. Thus x ∈ Zn.
Example 3.5.5 Let us consider the toy example from Example 3.3.1. We have

p = 3,  q = 5,  n = 15,  ϕ(n) = 8,  e = 3,  d = 3.

To decrypt the ciphertext c = 8, we compute mp = 8^3 mod 3 = 2 and mq = 8^3 mod 5 = 2. By the extended Euclidean algorithm,

5 = 3 × 1 + 2,  3 = 2 + 1  ⇒  1 = 3 − (5 − 3) = 3 × 2 − 5.

Thus yp = 3^{−1} mod 5 = 2 and yq = 5^{−1} mod 3 = 2. By Gauss's algorithm,

m = mp yq q + mq yp p mod n = 2 × 2 × 5 + 2 × 2 × 3 = 32 mod 15 = 2.

By Garner's algorithm,

m = mp + p ((mq − mp) yp mod q) = 2 + 3 × 0 = 2.
Example 3.5.6 We repeat the decryption of c = 8 from Example 3.3.2 using the Chinese Remainder Theorem. We have

p = 29,  q = 41,  n = 1189,  ϕ(n) = 1120,  e = 3,  d = 747.

By the extended Euclidean algorithm,

41 = 29 + 12,  29 = 12 × 2 + 5,  12 = 5 × 2 + 2,  5 = 2 × 2 + 1,

and

1 = 5 − 2 × (12 − 5 × 2) = −2 × 12 + (29 − 12 × 2) × 5 = 5 × 29 − 12 × 12
  = 5 × 29 − (41 − 29) × 12 = 17 × 29 − 12 × 41.

We have yp = 29^{−1} mod 41 = 17, yq = 41^{−1} mod 29 = 17, and mp = 8^747 mod 29 = 2, mq = 8^747 mod 41 = 2.
By Gauss’s algorithm,
By Garner’s algorithm,
Example 3.5.7 We use the same parameters,

p = 29,  q = 41,  n = 1189,  ϕ(n) = 1120,  e = 3,  d = 747.

Then we have

yp = 17,  yq = 17.
Thus
Similarly,
322 mod 41 = 40
.
By Gauss’s algorithm,
By Garner’s algorithm,
= 21 + 1 × 29 = 50.
Example 3.5.8 Let us consider Example 3.4.1 for RSA signatures computation. We have

p = 5,  q = 7,  n = 35,  ϕ(n) = 24,  e = 5,  d = 5,  m = 10.

By the extended Euclidean algorithm,

7 = 5 + 2,  5 = 2 × 2 + 1  ⇒  1 = 5 − 2 × (7 − 5) = 5 × 3 − 2 × 7.

We have yp = 5^{−1} mod 7 = 3, yq = 7^{−1} mod 5 = 3, and

sp = 10^5 mod 5 = 0,  sq = 10^5 mod 7 = 5.

By Gauss's algorithm,

s = sp yq q + sq yp p mod n = 0 + 5 × 3 × 5 mod 35 = 5.

By Garner's algorithm,

s = sp + p ((sq − sp) yp mod q) = 0 + 5 × (5 × 3 mod 7) = 5 × 1 = 5.
Compared to Gauss’s algorithm, Garner’s algorithm does not require the final
modulo n reduction.
CRT-based RSA implementation can improve the efficiency of the computation in many ways. Firstly, yp and yq can be precomputed, which saves time during communication. Secondly, the intermediate values during the computation are only half as big compared to those in the computation of a^d mod n, since they are in Zp or Zq rather than Zn. Moreover, xp = a^{d mod (p−1)} mod p and xq = a^{d mod (q−1)} mod q can be calculated by the square and multiply algorithm (Algorithms 3.7 and 3.8) or the Montgomery powering ladder (Algorithm 3.9) to further improve the efficiency. In this case, since d mod (p − 1) and d mod (q − 1) are much smaller than d, computing xp or xq requires fewer multiplications than computing a^d mod p or a^d mod q.
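The CRT-based computation with both recombination methods can be sketched as follows; `crt_rsa` is our illustrative name, and `pow(q, -1, p)` (Python 3.8+) stands in for the extended Euclidean algorithm:

```python
def crt_rsa(a, d, p, q):
    """Compute a^d mod (p*q) via CRT with reduced exponents, recombining with
    both Gauss's (Eq. 3.19) and Garner's (Eq. 3.20) methods."""
    xp = pow(a % p, d % (p - 1), p)
    xq = pow(a % q, d % (q - 1), q)
    yq, yp = pow(q, -1, p), pow(p, -1, q)   # q^{-1} mod p, p^{-1} mod q
    gauss = (xp * yq * q + xq * yp * p) % (p * q)
    garner = xp + p * ((xq - xp) * yp % q)  # no final reduction mod n needed
    assert gauss == garner
    return garner

print(crt_rsa(8, 747, 29, 41))  # 2, i.e., pow(8, 747, 1189)
```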
From the previous subsection, we see that to have more efficient modular exponen-
tiation implementations, we need to compute modular addition, subtraction, inverse,
and multiplications. For modular addition and subtraction, we can just compute the
corresponding integer operations and then perform a single reduction modulo the
modulus. For inverse modulo an integer, as has been mentioned a few times, we can
utilize the extended Euclidean algorithm. Next, we will discuss two methods for
implementing modular multiplication.
Throughout this subsection, let n be an integer of bit length ℓn, in particular,

2^{ℓn−1} ≤ n < 2^{ℓn}. (3.21)

Given a, b ∈ Zn, we would like to compute

R := ab mod n.
Let us assume the computer’s word size (see Sect. 2.1.2) is .ω. Define
⎾ ⏋
𝓁n
.κ := , i.e., (κ − 1)ω < 𝓁n ≤ κω.
ω
We can write

a = aκ−1 || aκ−2 || · · · || a1 || a0,  b = bκ−1 || bκ−2 || · · · || b1 || b0,

where each ai, bj is a word of ω bits and || indicates concatenation. Note that some ai or bj might be 0 if the bit length of a or b is less than ℓn. Furthermore, we have
a = ∑_{i=0}^{κ−1} ai (2^ω)^i,  b = ∑_{j=0}^{κ−1} bj (2^ω)^j. (3.22)
Then the product can be written as

t = ab = ∑_{x=0}^{2κ−1} tx (2^ω)^x,  where  tx = ∑_{i,j : i+j=x} ai bj,  0 ≤ x ≤ 2κ − 1.
One drawback of Algorithm 3.10 is that a variable with double word size is being processed in line 5: the maximum value of the right-hand side in line 5 is

(2^ω − 1) + (2^ω − 1)^2 + (2^ω − 1) = 2^{2ω} − 1,

which has bit length 2ω.
Example 3.5.9 Let the word size ω = 2, a = 13 = 1101_2, and b = 5 = 0101_2, so that κ = 2. The product t = t3||t2||t1||t0 has bit length at most 8. The values for each variable in Algorithm 3.10 in each loop are listed below:
j  i  ai  bj  T1  T0  t3  t2  t1  t0
0  0  01  01  00  01  00  00  00  01
0  1  11  01  00  11  00  00  11  01
1  0  01  01  01  00  00  00  00  01
1  1  11  01  01  00  01  00  00  01
As expected, we get

t = 01000001_2 = 65 = 13 × 5.

Furthermore, if we would like to continue the computation and find ab mod 15, we divide 65 by 15 and calculate the remainder, which is 5.
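The word-wise multiplication can be sketched as follows; the exact line numbering of Algorithm 3.10 is not reproduced, but the double-word accumulation corresponds to its line 5, and the function name is ours:

```python
def schoolbook_mul(a_words, b_words, omega):
    """Word-wise schoolbook multiplication (cf. Algorithm 3.10): each step
    accumulates a_i*b_j into t_{i+j} and propagates the double-word carry."""
    mask = (1 << omega) - 1
    k = len(a_words)
    t = [0] * (2 * k)
    for j in range(k):
        carry = 0
        for i in range(k):
            acc = t[i + j] + a_words[i] * b_words[j] + carry  # 2*omega bits
            carry, t[i + j] = acc >> omega, acc & mask
        t[j + k] = carry
    return t            # least significant word first

# Example 3.5.9: omega = 2, a = 13 (words a0=01, a1=11), b = 5 (b0=01, b1=01).
print(schoolbook_mul([0b01, 0b11], [0b01, 0b01], 2))  # [1, 0, 0, 1] -> 65
```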
First proposed in 1983 [Bla83], Blakely’s method for computing modular multipli-
cation interleaves the multiplication steps with the reduction steps. The product ab
is computed as follows
t = ab = (∑_{i=0}^{κ−1} ai (2^ω)^i) b = ∑_{i=0}^{κ−1} (2^ω)^i ai b,
where the ai are given in Eq. 3.22. Algorithm 3.11 lists the steps for computing

R = t mod n = ab mod n.
Thus, line 4 can be replaced by comparing R with n at most 2^{ω+1} − 2 times and subtracting n from R whenever R ≥ n:
1 for j = 0, 1, 2 . . . , 2ω+1 − 2 do
2 if R ≥ n then
3 R =R−n
4 else break
Example 3.5.10 Same as in Example 3.5.9, let the word size ω = 2, and

a = 13 = 1101_2,  b = 5,  n = 15,  ℓn = 4,  κ = 2.
We have, for i = 1, R = a1 b = 3 × 5 = 15, which is reduced to R = 0. And for i = 0, R = 2^ω R + a0 b = 0 + 1 × 5 = 5, giving ab mod n = 5.
Example 3.5.11 Let

a = 55 = 110111_2,  b = 46,  n = 69,  ω = 2.
Computing ab mod n with Algorithm 3.11 gives us the following intermediate values:

i = 3  line 3, R = 0,
       line 4, R = 0,
i = 2  line 3, R = 3 × 46 = 138,
       line 4, R = 138 mod 69 = 0,
i = 1  line 3, R = 1 × 46 = 46,
       line 4, R = 46 mod 69 = 46,
i = 0  line 3, R = 2^2 × 46 + 3 × 46 = 322,
       line 4, R = 322 mod 69 = 46.
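A sketch of Blakely's method in Python; for brevity the reduction in each iteration is written with `%`, which in hardware would be realized by the repeated conditional subtractions shown above:

```python
def blakely_mul(a, b, n, omega):
    """Blakely's method (cf. Algorithm 3.11): interleave the words of a (most
    significant first) with reduction mod n."""
    ln = n.bit_length()
    kappa = -(-ln // omega)          # ceil(ln / omega)
    mask = (1 << omega) - 1
    R = 0
    for i in reversed(range(kappa)):
        ai = (a >> (i * omega)) & mask
        R = ((R << omega) + ai * b) % n
    return R

print(blakely_mul(55, 46, 69, 2), blakely_mul(13, 5, 15, 2))  # 46 5
```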
Blakely's method can be combined with the square and multiply algorithms. Write

result = ∑_{j=0}^{κ−1} hj (2^ω)^j,  t = ∑_{j=0}^{κ−1} tj (2^ω)^j,  a = ∑_{j=0}^{κ−1} aj (2^ω)^j.
Then, in Algorithm 3.13, lines 5–9 implement .result = result ∗ t mod n (line 4 of
Algorithm 3.7) and lines 10–14 implement .t = t ∗t mod n (line 5 of Algorithm 3.7).
Similarly, in Algorithm 3.14, lines 3–7 implement .t = t ∗ t mod n (line 3 of
Algorithm 3.8) and lines 9–13 implement .t = a ∗ t mod n (line 5 of Algorithm 3.8).
Example 3.5.12 Let us repeat the computation in Example 3.5.2 with Blakely's method. We will calculate

5^4 mod 23.

Suppose the computer word size ω = 2. n = 23 = 10111_2 has ℓn = 5 bits, then κ = ⌈5/2⌉ = 3. Lines 1 and 2 in Algorithm 3.13 give

result = 1,  h0 = 01, h1 = 00, h2 = 00,  t = 5 = 0101_2,  t0 = 01, t1 = 01, t2 = 00.
i = 0  d0 = 0
  loop line 11  j = 2  R = 0
                j = 1  R = 2^ω R + t1 t mod n = 5 mod 23
                j = 0  R = 2^ω R + t0 t mod n = 2^2 × 5 + 1 × 5 mod 23 = 2 mod 23
  line 14  t = 2  t0 = 10, t1 = 00, t2 = 00
i = 1  d1 = 0
  loop line 11  j = 2  R = 0
                j = 1  R = 0
                j = 0  R = t0 t mod n = 2 × 2 mod 23 = 4 mod 23
  line 14  t = 4  t0 = 00, t1 = 01, t2 = 00
i = 2  d2 = 1
  loop line 6   j = 2  R = 0
                j = 1  R = 0
                j = 0  R = h0 t mod n = 4 mod 23
  line 9  result = 4
Next we use Algorithm 3.14 for the same computation. Line 1 gives

t = 1,  t0 = 01, t1 = 00, t2 = 00.

We also have a = 5 = 0101_2, a0 = 01, a1 = 01, a2 = 00.
i = 2  d2 = 1
  loop line 4   j = 2  R = 0
                j = 1  R = 0
                j = 0  R = t0 t mod n = 1 mod 23
  line 7  t = 1  t0 = 01, t1 = 00, t2 = 00
  loop line 10  j = 2  R = 0
                j = 1  R = 2^ω R + a1 t = 1 mod 23
                j = 0  R = 2^ω R + a0 t = 2^2 + 1 = 5 mod 23
  line 13  t = 5  t0 = 01, t1 = 01, t2 = 00
i = 1  d1 = 0
  loop line 4   j = 2  R = 0
                j = 1  R = t1 t mod n = 5 mod 23
                j = 0  R = 2^ω R + t0 t mod n = 2^2 × 5 + 5 mod 23 = 25 mod 23 = 2 mod 23
  line 7  t = 2  t0 = 10, t1 = 00, t2 = 00
i = 0  d0 = 0
  loop line 4   j = 2  R = 0
                j = 1  R = 0
                j = 0  R = t0 t mod n = 2 × 2 mod 23 = 4 mod 23
  line 7  t = 4
Similarly, Blakely's method can be embedded into the Montgomery powering ladder, giving Algorithm 3.15. Write

R0 = ∑_{i=0}^{κ−1} R0i (2^ω)^i,  R1 = ∑_{i=0}^{κ−1} R1i (2^ω)^i.
Then lines 5–9 implement .R1 = R0 R1 mod n (line 5 of Algorithm 3.9). Lines 10–14
implement .R0 = R02 mod n (line 6 of Algorithm 3.9). Lines 16–20 implement .R0 =
R0 R1 mod n (line 8 of Algorithm 3.9). Lines 21–25 implement .R1 = R12 mod n
(line 9 of Algorithm 3.9).
Example 3.5.13 Here we repeat the computation in Example 3.5.4 with Algorithm 3.15. Let

n = 23,  d = 4 = 100_2,  a = 5.

Lines 1 and 2 give

R0 = 1,  R00 = 01, R01 = 00, R02 = 00,  R1 = 5,  R10 = 01, R11 = 01, R12 = 00.
j = 2  d2 = 1
  loop line 17  i = 2  R = 0
                i = 1  R = 0
                i = 0  R = R00 R1 mod n = 5 mod 23
  line 20  R0 = 5  R00 = 01, R01 = 01, R02 = 00
  loop line 22  i = 2  R = 0
                i = 1  R = 2^ω R + R11 R1 mod n = 5 mod 23
                i = 0  R = 2^ω R + R10 R1 mod n = 2^2 × 5 + 5 mod 23 = 2
  line 25  R1 = 2  R10 = 10, R11 = 00, R12 = 00
j = 1  d1 = 0
  loop line 6   i = 2  R = 0
                i = 1  R = R01 R1 mod n = 2 mod 23
                i = 0  R = 2^ω R + R00 R1 mod n = 2^2 × 2 + 2 mod 23 = 10
  line 9  R1 = 10  R10 = 10, R11 = 10, R12 = 00
  loop line 11  i = 2  R = 0
                i = 1  R = 2^ω R + R01 R0 mod n = 5 mod 23
                i = 0  R = 2^ω R + R00 R0 mod n = 2^2 × 5 + 5 mod 23 = 2
  line 14  R0 = 2  R00 = 10, R01 = 00, R02 = 00
j = 0  d0 = 0
  loop line 6   i = 2  R = 0
                i = 1  R = 0
                i = 0  R = R00 R1 mod n = 2 × 10 mod 23 = 20
  line 9  R1 = 20
  loop line 11  i = 2  R = 0
                i = 1  R = 0
                i = 0  R = R00 R0 mod n = 2 × 2 mod 23 = 4
  line 14  R0 = 4
Montgomery's method works with the radix r = 2^{ℓn}. Since n is odd, gcd(r, n) = 1, and there exist integers r^{−1} and n̂ such that

rr^{−1} − nn̂ = 1,

i.e., rr^{−1} ≡ 1 mod n and nn̂ ≡ −1 mod r. We have discussed that such a pair of integers r^{−1} and n̂ can be found with the extended Euclidean algorithm.
Remark 3.5.1 We note that for any positive integer t,

(r^{−1} + tn) r − (n̂ + tr) n = rr^{−1} − nn̂ = 1, (3.24)

so r^{−1} + tn and n̂ + tr are also valid choices; in particular, t can be chosen to make both values positive.
Example 3.5.14 Let n = 15. Then ℓn = 4 and r = 2^4 = 16. By the extended Euclidean algorithm,

16 = 15 + 1  ⇒  1 = 16 − 15,

hence r^{−1} = 1 and n̂ = 1.
Example 3.5.15 Let n = 23. Then ℓn = 5 and r = 2^5 = 32. By the extended Euclidean algorithm,

32 = 23 + 9,  23 = 9 × 2 + 5,  9 = 5 + 4,  5 = 4 + 1,

and

1 = 5 − 4 = 5 − (9 − 5) = 2 × 5 − 9 = 2 × (23 − 9 × 2) − 9 = 2 × 23 − 5 × 9
  = 2 × 23 − 5 × (32 − 23) = 7 × 23 − 5 × 32.

Hence r^{−1} = −5 and n̂ = −7. To make n̂ positive, we can take t = 1 as in Eq. 3.24, and we have
r^{−1} = −5 + n = −5 + 23 = 18,  n̂ = −7 + r = −7 + 32 = 25.

Indeed,

18r − 25n = 18 × 32 − 25 × 23 = 576 − 575 = 1.
Example 3.5.16 Let n = 57. Then ℓn = 6 and r = 2^6 = 64. By the extended Euclidean algorithm,

64 = 57 + 7,  57 = 7 × 8 + 1  ⇒  1 = 57 − 7 × 8 = 57 − (64 − 57) × 8 = 9 × 57 − 8 × 64,

and we have r^{−1} = −8 and n̂ = −9. To get a positive n̂, we choose (see Remark 3.5.1)

r^{−1} = −8 + n = −8 + 57 = 49,  n̂ = −9 + r = −9 + 64 = 55.

Indeed,

49r − 55n = 49 × 64 − 55 × 57 = 3136 − 3135 = 1.
Example 3.5.17 Let n = 1189. Then ℓn = 11 and r = 2^11 = 2048. By the extended Euclidean algorithm,

2048 = 1189 + 859,  1189 = 859 + 330,  859 = 330 × 2 + 199,  330 = 199 + 131,
199 = 131 + 68,  131 = 68 + 63,  68 = 63 + 5,  63 = 5 × 12 + 3,
5 = 3 + 2,  3 = 2 + 1,

and

1 = 3 − 2 = 3 − (5 − 3) = 2 × 3 − 5 = 2 × (63 − 5 × 12) − 5 = 2 × 63 − 25 × 5
  = 2 × 63 − 25 × (68 − 63) = 27 × 63 − 25 × 68 = (131 − 68) × 27 − 68 × 25
  = 131 × 27 − (199 − 131) × 52 = (330 − 199) × 79 − 199 × 52
  = 330 × 79 − (859 − 330 × 2) × 131
  = (1189 − 859) × 341 − 859 × 131 = 1189 × 341 − (2048 − 1189) × 472
  = 2048 × (−472) − 1189 × (−813).

Hence r^{−1} = −472 and n̂ = −813; by Remark 3.5.1 with t = 1, we can take r^{−1} = 717 and n̂ = 1235.
Since rr^{−1} − nn̂ = 1, we have

1 + nn̂ ≡ 0 mod r.

In line 2 of Algorithm 3.16, m = t n̂ mod r, hence

t + mn ≡ t + t n̂n = t(1 + n̂n) ≡ 0 mod r,

so t + mn is divisible by r and the output u is an integer. By our choice of r = 2^{ℓn} and Eq. 3.21,

t = ab < rn.
Hence

u = (t + mn)/r < (rn + rn)/r = 2n,
which shows that lines 4–5 calculate u mod n. Furthermore,

u = (ab + mn)/r ≡ abr^{−1} + mnr^{−1} ≡ abr^{−1} mod n.
Let x be a non-negative integer of bit length ℓx, i.e.,

x = ∑_{i=0}^{ℓx−1} xi 2^i.

Then

x mod r = ∑_{i=0}^{min{ℓx−1, ℓn−1}} xi 2^i.
In other words, to compute x mod r, we just keep the least significant ℓn bits of x. Note that the integer r − 1 has binary representation given by a binary string of ℓn 1s. We have

x mod r = x & (r − 1),

where & denotes bitwise AND.
We know that a, b ≥ 0. Since we also choose n̂ > 0, line 2 can be replaced by

m = t n̂ & (r − 1).
Similarly, if r | x, then

x/r = ∑_{i=ℓn}^{ℓx−1} xi 2^{i−ℓn}. (3.25)
For any positive integer s ≤ ℓx, we define the right shift of x by s bits to be the integer with binary representation xℓx−1 xℓx−2 . . . xs.4 We write

x ≫ s := xℓx−1 xℓx−2 . . . xs. (3.26)

Compared with Eq. 3.25, division by r is equivalent to right shift by ℓn bits. We have shown that t + mn in line 3 is a multiple of r. Then line 3 can be replaced by

u = (t + mn) ≫ ℓn.
Example 3.5.18 Let n = 15, so that ℓn = 4, r = 16, and r − 1 = 15 = 1111_2. We have

53 mod r = 5,  53 & 15 = 110101_2 & 1111_2 = 101_2 = 5.
Furthermore,

240/r = 240/16 = 15,  240 ≫ 4 = 11110000_2 ≫ 4 = 1111_2 = 15.
Example 3.5.19 Let .n = 23. Then .𝓁n = 5 and .r = 25 = 32. In Example 3.5.15
we have discussed that .r −1 = 18 and .n̂ = 25. We will compute a few modular
multiplications which will be useful for Example 3.5.27.
Let a = 22, b = 22. Following Algorithm 3.16, we have

t = ab = 22 × 22 = 484,
m = t n̂ mod r = 484 × 25 mod 32 = 4,
u = (t + mn)/r = (484 + 4 × 23)/32 = 576/32 = 18,

and the output is 18. We can verify that abr^{−1} mod n = 22 × 22 × 18 mod 23 = 18.
Let a = 18, b = 18. We have

t = ab = 18 × 18 = 324,
m = t n̂ mod r = 324 × 25 mod 32 = 4,
u = (t + mn)/r = (324 + 4 × 23)/32 = 416/32 = 13,

and the output is 13. We can verify that abr^{−1} mod n = 18 × 18 × 18 mod 23 = 13.
Let a = 9, b = 13. We have

t = ab = 9 × 13 = 117,
m = t n̂ mod r = 117 × 25 mod 32 = 13,
u = (t + mn)/r = (117 + 13 × 23)/32 = 416/32 = 13,

and the output is 13. We can verify that abr^{−1} mod n = 9 × 13 × 18 mod 23 = 13.
Let a = 13, b = 13. We have

t = ab = 169,
m = t n̂ mod r = 169 × 25 mod 32 = 1,
u = (t + mn)/r = (169 + 1 × 23)/32 = 6,

and the output is 6. We can verify that abr^{−1} mod n = 13 × 13 × 18 mod 23 = 6.

Let a = 13, b = 1. We have

t = ab = 13,
m = t n̂ mod r = 13 × 25 mod 32 = 5,
u = (t + mn)/r = (13 + 5 × 23)/32 = 4,

and the output is 4. We can verify that abr^{−1} mod n = 13 × 1 × 18 mod 23 = 4.
Let a = 9, b = 9. We have

t = ab = 81,
m = t n̂ mod r = 81 × 25 mod 32 = 9,
u = (t + mn)/r = (81 + 9 × 23)/32 = 9,

and the output is 9. We can verify that abr^{−1} mod n = 9 × 9 × 18 mod 23 = 9.
Let a = 9, b = 22. We have

t = ab = 198,
m = t n̂ mod r = 198 × 25 mod 32 = 22,
u = (t + mn)/r = (198 + 22 × 23)/32 = 22,

and the output is 22. We can verify that abr^{−1} mod n = 9 × 22 × 18 mod 23 = 22.
Example 3.5.20 Let n = 15, a = 3, b = 5. We have discussed in Example 3.5.14 that r = 2^4 = 16, r^{−1} = 1, and n̂ = 1. Following Algorithm 3.16, we have

t = ab = 3 × 5 = 15,
m = t n̂ mod r = 15,
u = (t + mn)/r = (15 + 15 × 15)/16 = 240/16 = 15.

Since u = 15 ≥ n, the output is u − n = 0. Indeed, abr^{−1} mod n = 3 × 5 × 1 mod 15 = 0.
t = ab = 3 × 5 = 15.
Example 3.5.22 Let n = 57, a = 21, b = 5. We know from Example 3.5.16 that

r = 64,  r^{−1} = 49,  n̂ = 55.

Following Algorithm 3.16, we have

t = ab = 21 × 5 = 105,
m = t n̂ mod r = 105 × 55 mod 64 = 15,
u = (t + mn)/r = (105 + 15 × 57)/64 = 960/64 = 15,

and the output is 15. We can verify that abr^{−1} mod n = 21 × 5 × 49 mod 57 = 15.
For a ∈ Zn, define its Montgomery representation

a_r := ar mod n.

MonPro is more useful when multiple multiplications are computed; we now discuss a more efficient way of using it.
By Corollary 1.4.2, the set

Z_n^r := { ar mod n : a ∈ Zn }

contains the same elements as Zn. We define an addition +Mon and a multiplication ×Mon on Z_n^r by

a_r +Mon b_r := a_r + b_r mod n,  a_r ×Mon b_r := a_r b_r r^{−1} mod n.

The identity element for +Mon is 0_r = 0 since

a_r +Mon 0_r = a_r mod n + 0 mod n = a_r.
The inverse of a_r with respect to +Mon is (−a)_r, where −a is the inverse of a in Zn with respect to addition modulo n:

a_r +Mon (−a)_r = ar + (−a)r mod n = 0 = 0_r.

Hence (Z_n^r, +Mon) is a group. Moreover,

a_r ×Mon (b_r +Mon c_r) = a_r (b_r + c_r) r^{−1} = (a_r ×Mon b_r) +Mon (a_r ×Mon c_r) mod n,

so the distributive law holds for ×Mon and +Mon. The identity element for ×Mon is 1_r = r mod n since

1_r ×Mon a_r = r a_r r^{−1} mod n = a_r.

Furthermore,

a_r ×Mon b_r = a_r b_r r^{−1} mod n = (ab) r mod n = (ab)_r = MonPro(a_r, b_r).

Thus, MonPro(a_r, b_r) implements the multiplication in the ring (Z_n^r, +Mon, ×Mon).
Now we can apply the Montgomery product algorithm MonPro (Algorithm 3.16
or 3.17) for computing multiplications in the right-to-left (Algorithm 3.7) and the
left-to-right (Algorithm 3.8) square and multiply algorithms. The details are listed
in Algorithms 3.19 and 3.20.
By Lemma 3.5.1 and Remark 3.5.2, lines 5 and 6 in Algorithm 3.19 compute

result_r = result_r ×Mon t_r  and  t_r = t_r ×Mon t_r,

respectively. It follows from Algorithm 3.7 that lines 1–6 in Algorithm 3.19 calculate

(a^d)_r mod n.

Then line 7 removes r from (a^d)_r and outputs the final result.
Similarly, lines 3 and 5 in Algorithm 3.20 compute

t_r = t_r ×Mon t_r  and  t_r = t_r ×Mon a_r,

respectively. It follows from Algorithm 3.8 that lines 1–6 in Algorithm 3.20 calculate

(a^d)_r mod n.

Then line 7 removes r from (a^d)_r mod n and outputs the final result.
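Putting the pieces together, a sketch of the right-to-left square and multiply with MonPro (cf. Algorithm 3.19); `pow(n, -1, r)` (Python 3.8+) replaces the extended Euclidean computation of n̂ in the examples above, and the function names are ours:

```python
def mont_exp(a, d, n):
    """Right-to-left square and multiply in the Montgomery domain
    (cf. Algorithm 3.19).  n must be odd."""
    ln = n.bit_length()
    r = 1 << ln
    n_hat = (-pow(n, -1, r)) % r     # n * n_hat = -1 mod r
    mask = r - 1

    def monpro(x, y):                # x*y*r^{-1} mod n
        t = x * y
        m = (t * n_hat) & mask
        u = (t + m * n) >> ln
        return u - n if u >= n else u

    result_r, t_r = r % n, a * r % n # enter the Montgomery domain
    while d > 0:
        if d & 1:
            result_r = monpro(result_r, t_r)
        t_r = monpro(t_r, t_r)
        d >>= 1
    return monpro(result_r, 1)       # leave the Montgomery domain

print(mont_exp(5, 4, 23))  # 4
```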
Example 3.5.27 Let

n = 23,  d = 4 = 100_2,  a = 5.

We would like to compute a^d mod n with the square and multiply algorithm. In Example 3.5.12 we showed the steps when modular multiplications in the square and multiply algorithm are done with Blakely's method. Now we calculate the same modular exponentiation with the square and multiply algorithm and Montgomery's method for modular multiplication.
According to Example 3.5.15,

r = 32,  r^{−1} = 18,  n̂ = 25.
For the detailed computations with MonPro below, we refer to Example 3.5.19. Following Algorithm 3.19, lines 1 and 2 give

result_r = 32 mod 23 = 9,  t_r = 5 × 32 mod 23 = 22.
For i = 0 and i = 1 (d0 = d1 = 0), line 6 computes t_r = MonPro(22, 22) = 18 and then t_r = MonPro(18, 18) = 13. For i = 2 (d2 = 1), line 5 computes result_r = MonPro(9, 13) = 13. Then line 6 computes (note that this computation does not affect the final output)

t_r = MonPro(13, 13) = 6.

Finally, line 7 outputs MonPro(13, 1) = 4 = 5^4 mod 23.

Following Algorithm 3.20 instead, lines 1 and 2 give

t_r = 32 mod 23 = 9,  a_r = 5 × 32 mod 23 = 22.

For i = 2 (d2 = 1), line 3 computes t_r = MonPro(9, 9) = 9 and line 5 computes t_r = MonPro(9, 22) = 22. For i = 1 and i = 0 (d1 = d0 = 0), line 3 computes t_r = MonPro(22, 22) = 18 and then t_r = MonPro(18, 18) = 13. Finally, line 7 outputs MonPro(13, 1) = 4.
We can also apply the Montgomery product algorithm (Algorithm 3.16 or 3.17) to the Montgomery powering ladder (Algorithm 3.9) for computing modular exponentiation. We have Algorithm 3.21.
By Lemma 3.5.1 and Remark 3.5.2, lines 5 and 6 in Algorithm 3.21 compute

R1 = R0 ×Mon R1,  R0 = R0 ×Mon R0,

and lines 8 and 9 compute

R0 = R0 ×Mon R1,  R1 = R1 ×Mon R1.
It follows from Algorithm 3.9 that lines 1–9 in Algorithm 3.21 calculate (a^d)_r mod n. Then line 10 removes r from (a^d)_r mod n and outputs the final result.
Example 3.5.28 We repeat the computation in Example 3.5.27 with Algorithm 3.21. We have

n = 23,  d = 4 = 100_2,  a = 5,  r = 32,  r^{−1} = 18,  n̂ = 25.

For the detailed computations with MonPro below, we refer to Example 3.5.19. Lines 1 and 2 in Algorithm 3.21 give

R0 = 32 mod 23 = 9,  R1 = 5 × 32 mod 23 = 22.

For j = 2 (d2 = 1), line 8 calculates R0 = MonPro(9, 22) = 22 and line 9 calculates R1 = MonPro(22, 22) = 18. Since d1 = d0 = 0, for the rest of the computations, only R0 is relevant for the result. For j = 1, line 6 calculates R0 = MonPro(22, 22) = 18; for j = 0, line 6 calculates R0 = MonPro(18, 18) = 13. Finally, line 10 outputs MonPro(13, 1) = 4.
Figures We note that figures in this chapter are adjusted versions of drawings from [Jea16]. Jean [Jea16] includes plenty of source files for various cryptography-related illustrations.
Implementation of symmetric block ciphers For more discussions on implemen-
tations of symmetric block ciphers, we refer the readers to [Osw]. For a detailed
analysis of algebraic normal form and Boolean functions, we refer the readers
to [O’D14].
Bitsliced implementations of DES can be found in, e.g., [MPC00, Kwa00]. For AES, [KS09] discusses a bitsliced implementation for a 64-bit architecture and [SS16] presents a design for a 32-bit architecture. More efficient bitsliced implementations of PRESENT can be found in [BGLP13] for 64-bit architectures and in [RAL17] for 32-bit architectures.
A related novel way of implementing symmetric block ciphers, called fixslicing, was introduced in 2020 [ANP20, AP20] to achieve efficient constant-time software implementations. The main idea is to use an alternative representation of several rounds of the cipher, fixing certain bits within a register so that they never move.
RSA security Currently, quantum computers with a few hundred qubits (the quantum counterpart of the classical bit) are available [Cho22]. To break RSA, thousands of qubits are required [GE21]. Nevertheless, post-quantum public key cryptosystems are being proposed (see, e.g., [HPS98, BS08]) to protect communications after a sufficiently large quantum computer is built.
Implementations of RSA For more discussions on different methods for imple-
menting RSA, we refer the readers to [Koç94]. Koç [Koç94] also discusses
how Garner’s algorithm (Eq. 3.20) can be designed for solving simultaneous linear
congruences in general. For a more efficient way to implement the extended
Euclidean algorithm, see [Sti05, Algorithm 5.3].
Digital Signatures There are other digital signatures based on different public key
cryptosystems. For more discussions, we refer the readers to [Buc04, Chapter 12].
Secret key In Sect. 2.2.6 we have seen that exhaustive key search can be used to break the shift cipher and the affine cipher. The lesson is that the key space should be big enough that the attacker cannot brute force the secret key. The required size is determined by the current computation power. For example, the 56-bit secret key of DES was successfully brute forced in 1998 [Fou98]. The U.S. National Institute of Standards and Technology (NIST) issues recommendations on key sizes for government institutions in the USA. According to those, 80-bit keys were "retired" in 2010 [BBB+07], and keys shorter than 112 bits were considered insufficient from 2015 onward [BD16]. The National Security Agency (NSA) has required AES-256 for the Top Secret classification since 2015 due to the emergence of quantum computing [Age15].
Chapter 4
Side-Channel Analysis Attacks
and Countermeasures
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 205
X. Hou, J. Breier, Cryptography and Embedded Systems Security,
https://doi.org/10.1007/978-3-031-62205-2_4
• Non-profiled SCA. If the attacker does not have access to a similar device, just
the target device or just the measurements coming from the target device, we
talk about a non-profiled SCA. In a general scenario, this attack utilizes a set
of measurements where a fixed secret key is used to encrypt multiple (random)
plaintexts.
• Profiled SCA. If we assume the attacker has access to a clone device of the
target device, then they can carry out a profiled SCA. This attack operates in two
phases. In the profiling phase, the attacker acquires side-channel measurements
for known plaintext/ciphertext and known key pairs. This set of data is used
to characterize or model the device. Then in the attack phase, the attacker
acquires a few measurements from the target device, which is usually identical
to the clone device, with known plaintext/ciphertext and an unknown key. These
measurements from the target device are then tested against the characterized
model from the clone device.
Source Code
The source code and measurement data for this chapter can be found in the
following link:
https://github.com/XIAOLUHOU/SCA-measurements-and-analysis----Experimental-results-for-textbook
Power analysis measures the power consumption of the DUT in the form of a voltage
change. The most convenient device to capture the voltage change over time is a
digital sampling oscilloscope—a device that takes samples of the measured voltage
signal over time. We refer to each sample point as a time sample. More information
on measurement setups is provided in Sect. 6.1.
To be able to target the correct time slot, in our experiments, a trigger signal is
raised to high (5V) during the computation that we want to capture and lowered
afterward. One measurement consists of the voltage values for each time sample in
this duration. It can be stored in an array of length equal to the total number of time
samples in the measured time interval. It can also be drawn in a graph where the x-
axis corresponds to time samples and the y-axis records the voltage values.1 Thus,
we refer to the result of one measurement as a (power) trace.
1 Note that, in the case of ChipWhisperer, which will be used for our experiments and analysis, the
y-axis does not show the actual voltage value but a 10-bit value proportional to the current going
through the shunt resistor.
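In code, a trace is simply a one-dimensional array with one entry per time sample, and a measurement campaign can be stored as a two-dimensional array of shape (number of traces) × (number of time samples). The sketch below uses synthetic numbers rather than real oscilloscope output; the array sizes are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

n_traces, n_samples = 100, 3600   # illustrative campaign size
# Synthetic stand-in for oscilloscope output: a baseline level plus noise.
traces = 0.04 + 0.005 * rng.standard_normal((n_traces, n_samples))

single_trace = traces[0]             # one measurement = one (power) trace
time_samples = np.arange(n_samples)  # x-axis values for plotting

print(single_trace.shape, traces.shape)  # → (3600,) (100, 3600)
```

Plotting `single_trace` against `time_samples` (e.g., with matplotlib) reproduces the trace view described above.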
Fig. 4.1 Side-channel measurement setup used for the experiments: a laptop, the ChipWhisperer-
Lite measurement board (black), and the CW308 UFO board (red) with the mounted ARM Cortex-
M4 target board (blue). Note that the benchtop oscilloscope in the back was only used for the initial
analysis—all the measurements were done by the ChipWhisperer
Device under test and oscilloscope For the experiments in this chapter, we used
a ready-to-use measurement platform NewAE ChipWhisperer-Lite. The program
code was running on a 32-bit ARM Cortex-M4 microcontroller (STM32F3) with
a clock speed of ≈7.4 MHz. The ADC was set to capture the samples at 4× that
speed, i.e., ≈29.6 MHz, with a 10-bit resolution. However, for plotting purposes, we
normally reduced the number of time samples. The measurement setup is depicted
in Fig. 4.1. The ChipWhisperer-Lite board is in the middle of the picture in black
color, handling the communication with the DUT and the acquisition. The red PCB
on the right is the CW308 UFO board, a breakout board with the DUT, the ARM
Cortex-M4 (blue board), mounted on top. The controlling and data processing were
done from a laptop, from the Jupyter environment available for the ChipWhisperer
platform. In the back, there is a Teledyne T3DSO3504 benchtop oscilloscope that
was used mainly for convenience purposes—to precisely locate the time intervals in
the initial analysis stage.
Figure 4.2 shows one power trace for the first five rounds of PRESENT
encryption. In order to see the trace more clearly, we have added a sequence of nop
instructions before and after the five rounds of cipher computation. This trace has in
total 18,500 time samples. Certain patterns can be seen from the trace, and we can
deduce the corresponding operations in each time interval. For example, from time
sample 0–1434 and from time sample 17,514–18,500, we have nop instructions. We
can also see the five repeated patterns in the figure and deduce the duration of each
round, as indicated in the figure by red dotted lines. In terms of time samples, one
round takes on average 3216 time samples. In this particular case, we reduced the
number of samples by a factor of 3 (simply by taking every third sample) so that
the patterns would still be visible to the reader. That means, with the ADC speed of
≈29.6 MHz, one round takes (3216 × 3)/29.6 ≈ 325.9 μs. It is important to note
Fig. 4.2 Power trace of the first five rounds of PRESENT encryption. A sequence of nop
instructions was executed before and after the cipher computation to clearly distinguish the
operations
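The round-duration arithmetic quoted above can be checked directly; the numbers below are the ones stated in the text:

```python
# ADC sampling rate (MHz) and observed round length in (downsampled) time samples
adc_mhz = 29.6
samples_per_round = 3216   # average, after keeping every 3rd sample
downsample = 3

# Duration of one round in microseconds: samples divided by samples-per-microsecond
round_us = samples_per_round * downsample / adc_mhz
print(round(round_us, 1))  # → 325.9
```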
FEDCBA0123456789 (4.1)
There are two main classical power analysis attack methods, simple power analysis
(SPA) and differential power analysis (DPA). SPA assumes the attacker has access
to only one or a few measurements corresponding to some fixed inputs. In DPA, we
assume the attacker can take measurements for a potentially unlimited number of
different inputs. We will present several DPA attacks on symmetric block ciphers
(Sects. 4.3.1 and 4.3.2), as well as DPA (Sect. 4.4.1) and SPA (Sect. 4.4.2) attacks on RSA.
We will also discuss a newly proposed side-channel assisted differential plaintext
attack (SCADPA) on SPN ciphers (Sect. 4.3.3). Similar to SPA, this attack does
not require statistical analysis of the traces; visual inspection is enough. The
number of traces needed lies between that for SPA and DPA, depending mostly on
the measurement equipment.
In the later parts of the chapter, we will see that by analyzing the power consumption,
we can deduce the secret key. Consequently, we also refer to the power
consumption as the leakage of the device. We consider the leakage to consist of two
parts: signal and noise. Signal refers to the part of the leakage containing useful
information for our attack; the rest is noise. For example, suppose we would like to
recover the Hamming weight of an intermediate value. In that case, the part of the
leakage correlated to the Hamming weight of that intermediate value is our signal.
Before we see how leakage can be defined and modeled, we show that it is
dependent on the operations being executed and the data being processed.
We first take the Fixed dataset A described in Sect. 4.1. The average of those
5000 traces is shown in Fig. 4.3. As mentioned in Sect. 4.1, each trace in this dataset
corresponds to one round of PRESENT computation surrounded by nop operations.
By visual inspection, we can deduce that the beginning (time samples 0–209)
and the ending (time samples 3381–3600) parts that consist of relatively uniform
patterns correspond to nop instructions. Other than that, we can see three distinct
patterns between them. Since one round of PRESENT consists of addRoundKey,
sBoxLayer, and pLayer (see Fig. 3.8), we can roughly identify each of these three
operations in the trace—they correspond to the blue (time samples 210–382),
pink (time samples 383–567), and green (time samples 568–3380) parts of the
trace, respectively. In this case, one round computation corresponds to 3170 time
samples, which is fewer than that in Fig. 4.2. Such a difference can be caused
by round counter and loop operations, register updates of round keys, etc., which
are additionally computed in the five-round PRESENT implementation. These
observations demonstrate that the leakage is dependent on the operations being
executed in the DUT.
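The averaging step itself is a one-liner over the trace matrix. The sketch below uses a synthetic signal-plus-noise model (the pattern levels, segment lengths, and noise level are invented) to show why averaging many traces exposes the operation-dependent signal:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic operation-dependent signal over 3600 time samples:
signal = np.concatenate([
    np.full(210, 0.02),    # nop-like segment
    np.full(173, 0.05),    # addRoundKey-like pattern
    np.full(185, 0.06),    # sBoxLayer-like pattern
    np.full(2813, 0.04),   # pLayer-like pattern
    np.full(219, 0.02),    # nop-like segment
])
# 2000 synthetic traces: the same signal plus independent Gaussian noise
traces = signal + 0.01 * rng.standard_normal((2000, signal.size))

# Averaging over traces suppresses the noise (its std shrinks by sqrt(2000))
avg_trace = traces.mean(axis=0)
print(float(np.max(np.abs(avg_trace - signal))) < 0.0015)
```

The averaged trace follows the underlying signal closely, which is exactly why the operation patterns become visible in Fig. 4.3.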
Fig. 4.3 The averaged trace for 5000 traces from the Fixed dataset A (see Sect. 4.1). The blue,
pink, and green parts of the trace correspond to addRoundKey, sBoxLayer, and pLayer, respectively
Fig. 4.4 The averaged trace for 1000 plaintexts with the 0th bit equal to 0. The computation
corresponds to one round of PRESENT with a fixed round key
For another experiment, with the experimental setup described in Sect. 4.1, we
have conducted measurements for one round of PRESENT with a fixed round key.
A total of 1000 traces were collected, each for a random plaintext with the 0th
bit equal to 0. The averaged trace is shown in Fig. 4.4. With the same key, we
collected traces for 1000 plaintexts with the 0th bit equal to 1. And the averaged
trace is shown in Fig. 4.5. We can see that those two averaged traces are very similar.
Unsurprisingly, they also look similar to the trace in Fig. 4.3. Thus, the time interval
for each operation in the first round of PRESENT corresponds to that in Fig. 4.3 as
well.
We can gain more information when we take the difference between traces in
Figs. 4.4 and 4.5. The difference trace is shown in Fig. 4.6. There are a few peaks in
this difference trace, and apart from those peaks, most of the points are close to zero.
Those peaks indicate that the 0th bit of the plaintext is related to the computations
at the corresponding time samples. Compared with Fig. 4.3, we can see that the first
and second peaks correspond to addRoundKey and pLayer operations. In particular,
Fig. 4.5 The averaged trace for 1000 plaintexts with the 0th bit equal to 1. The computation
corresponds to one round of PRESENT with a fixed round key
Fig. 4.6 The difference between traces from Figs. 4.4 and 4.5
these observations show that the leakage is dependent on the data being processed
in the DUT.
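The difference-of-averages procedure above can be sketched as follows. The leaking time sample, leakage magnitude, and noise level below are invented for illustration; with real traces, the two groups would be selected by the 0th bit of the plaintext:

```python
import numpy as np

rng = np.random.default_rng(2)
n, length = 1000, 500

def measure(bit_value):
    """Simulated group of traces whose time sample 120 leaks one plaintext bit."""
    traces = 0.04 + 0.003 * rng.standard_normal((n, length))
    traces[:, 120] += 0.002 * bit_value   # data-dependent contribution
    return traces

# Average trace per group, then take the difference (as in Fig. 4.6)
diff = measure(0).mean(axis=0) - measure(1).mean(axis=0)

# The peak of |diff| sits exactly where the bit-dependent leakage happens
print(int(np.argmax(np.abs(diff))))  # → 120
```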
Note
In the SCA attacks we will see in this book, we will only be interested in
operation- and/or data-dependent leakages.
SPA typically exploits the relationship between the executed operations
and the leakage (power consumption). DPA and SCADPA focus on the
relationship between the processed data and the leakage (power consumption).
To analyze the leakage better, we model the leakage, signal, and noise at a given
point in time as random variables. In particular, for a fixed time sample t, let Lt, Xt,
and Nt denote the random variables corresponding to the leakage, signal, and noise,
respectively. As we consider the leakage to consist of signal and noise, we can write

Lt = Xt + Nt. (4.2)

Since Xt contains the part of the leakage that is useful to us and the rest is noise, we
make the "independent noise assumption" (see, e.g., [Pro13]) and assume Nt and
Xt are independent random variables. When Xt is a constant, according to Eqs. 1.33 and 1.36,

Var(Lt) = Var(Nt),  Xt = E[Lt] − E[Nt]. (4.3)
But how do we decide when Xt is constant? That depends on the information
we would like to obtain from the traces. Let us consider one round of PRESENT
computation. Suppose we are interested in the 0th Sbox output of the sBoxLayer
(the right-most Sbox in Fig. 3.9), denoted by v. If we want information revealing the
exact value of v, then for any given time sample t, the signal Xt is considered to be
constant across the following dataset: measurements for computations of one round
of PRESENT with a fixed master key and plaintexts with a fixed 0th nibble. The
identical 0th nibble in plaintexts and the fixed key guarantee the same 0th Sbox output.
We can also use random master keys that result in the first round key having the
same 0th nibble. If we want information revealing the Hamming weight of v, then
measurements with master keys and plaintexts that result in a fixed wt(v) would
correspond to a constant Xt.
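A quick numerical check of this decomposition confirms Eq. 4.3. The constant signal value and noise level below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)

# Model of Eq. (4.2): L = X + N with constant signal X and zero-mean Gaussian noise N
X = 0.05                                  # constant signal at this time sample
N = 0.003 * rng.standard_normal(200_000)  # noise samples
L = X + N

# Eq. (4.3): leakage variance equals noise variance (adding a constant
# does not change variance), and the signal is the difference of the means.
print(abs(L.var() - N.var()) < 1e-12)         # → True
print(abs((L.mean() - N.mean()) - X) < 1e-9)  # → True
```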
Since we are only interested in either data- or operation-related leakages, for a given
point in time t, if we fix the operation and the data, we get a constant signal, i.e.,
Xt is a constant. In the following, we will show that, in this case, the experimental
2 Roughly speaking, the central limit theorem says that if we combine different independent random variables, their normalized sum approximately follows a normal distribution.
Fig. 4.7 Part of five random traces from the Fixed dataset A (see Sect. 4.1)
Fig. 4.8 Histogram of leakages at time sample .t = 3520 across 5000 traces from the Fixed
dataset A
measurements exhibit only minor differences. As the signal is the same, the minor differences are caused by the
noise. We will further characterize the noise using histograms.
Recall that the averaged trace of those 5000 traces in Fixed dataset A is shown in
Fig. 4.3. Take t = 3520. As we have mentioned in the discussion regarding Fig. 4.3,
this time sample corresponds to nop operations. If we plot the histogram of leakages
L3520 across those 5000 traces, we get Fig. 4.8. Most leakages are around 0.0435,
Fig. 4.9 Histogram of leakages at time sample .t = 2368 across 5000 traces from the Fixed
dataset A
a fixed time sample, resulting in a constant Xt. Thus, the variations in the leakage are
caused by the noise.
Leakage Models One important concept for a power analysis attack is the leakage
model, namely a model that estimates how the leakage is related to the data
being processed. A good leakage model can make the attack more efficient (see
Sect. 4.3.2).
Three commonly used leakage models are the identity leakage model, the Hamming
distance leakage model, and the Hamming weight leakage model. Assume a value v
is being processed in the DUT, and right before it, another value u was used by
the DUT. Then, according to the identity leakage model, the leakage is correlated
(see Definition 1.7.8) to v. The Hamming distance leakage model assumes that the
leakage is correlated to dis(v, u), the Hamming distance between v and u (see
Eq. 1.24). Following the Hamming weight leakage model, the leakage will then be
correlated to wt(v), the Hamming weight of v (see Definition 1.6.10). We refer the
readers to Sect. 6.1.1 for more explanations of why there are side-channel leakages
when the data in the DUT is changed.
In particular, let noise ∼ N(0, σ²) be a normal random variable with mean 0 and
variance σ². For the identity leakage model, the modeled leakage is given by

L(v) = v + noise.

Even though the actual leakage may not be exactly equal to the modeled leakage
L(v), those leakage models can be used to approximate the behavior of the actual
leakages or for statistical analysis (see Sect. 4.3.1). For example, our previous
experiments have demonstrated that the identity leakage model is realistic since
when the data is fixed, the distribution of leakages is close to a normal distribution.
It can be shown that the other two leakage models are also realistic (see [MOP08,
Section 4.3]).
In this book, we will focus on two leakage models: the identity leakage model
and the Hamming weight leakage model.
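The three models can be made concrete in a few lines. The intermediate values v and u and the noise standard deviation below are arbitrary:

```python
import numpy as np

def hw(x: int) -> int:
    """Hamming weight: the number of 1 bits in x."""
    return bin(x).count("1")

def hd(x: int, y: int) -> int:
    """Hamming distance: the number of bit positions where x and y differ."""
    return hw(x ^ y)

rng = np.random.default_rng(4)

v, u = 0xB, 0x6   # current and previous 4-bit intermediate values

# Modeled leakages (up to scaling) under the three models:
L_identity = v + rng.normal(0.0, 0.1)         # leakage ~ the value itself
L_hw = hw(v) + rng.normal(0.0, 0.1)           # leakage ~ wt(v)
L_hd = hd(v, u) + rng.normal(0.0, 0.1)        # leakage ~ dis(v, u)

print(hw(0xB), hd(0xB, 0x6))  # → 3 3
```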
l2368 ≈ 0.2132.

Then the sample mean 0.2132 is an estimate for μ2368, and the sample variance
8.5196 × 10−6 is an estimate for σ²2368.
Example 4.2.2 (Example of interval estimator for the mean) Since we do not
know the variance of L2368, by Eq. 1.59, a 100(1 − α) percent confidence interval
for μ2368 is given by

( l2368 − tα/2,M−1 s2368/√M , l2368 + tα/2,M−1 s2368/√M ).

Take α = 0.01. Then according to Remark 1.8.2 and Table 1.4, we get t0.005,M−1 ≈ z0.005 = 2.576.
By Eq. 4.5,

s2368 = √(8.5196 × 10−6) ≈ 2.9188 × 10−3.
Assume we know the variance of L2368 is actually given by σ²2368 = 8.5196 × 10−6.
Suppose we want to find an estimate for μt with precision c = 0.001 and 99%
confidence. By Eq. 1.58, the number of traces we need to collect is given by

(σ²2368/c²) z²α/2 = (8.5196 × 10−6/0.001²) × 2.576² ≈ 57, (4.7)

where 1 − α = 0.99 gives α = 0.01, and as mentioned above, z0.005 = 2.576.
Thus, we should collect at least 57 traces to get a 99% confidence interval
for μ2368.

Since the number of traces to be collected is more than 30, according to Eq. 1.60,
if we do not know the variance of Lt, we can use the sample variance s²2368 to
compute the number of traces required. In this case, we will get the same result as
in Eq. 4.7 since we have assumed the variance to be equal to this sample variance.
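The computations of Examples 4.2.1 and 4.2.2 can be reproduced with the numbers quoted in the text:

```python
import math

# Sample statistics quoted for time sample t = 2368 (Fixed dataset A)
M = 5000
mean_l = 0.2132          # sample mean
var_l = 8.5196e-6        # sample variance
s = math.sqrt(var_l)     # sample standard deviation ≈ 2.9188e-3

# 99% confidence interval for mu_2368 (t_{0.005,4999} ≈ z_{0.005} = 2.576)
z = 2.576
half = z * s / math.sqrt(M)
ci = (mean_l - half, mean_l + half)
print(ci)

# Number of traces for precision c = 0.001 at 99% confidence (Eq. 4.7)
c = 0.001
n_traces = math.ceil(var_l / c**2 * z**2)
print(n_traces)  # → 57
```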
Now we take the Fixed dataset B described in Sect. 4.1. We again look at
the time sample t = 2368. Let L'2368 denote the random variable corresponding
to the leakage at time sample 2368 for one round encryption of the plaintext
84216BA484216BA4 with round key FEDCBA0123456789. Let μ'2368 and σ'²2368 denote
the mean and variance of L'2368, respectively. Then the Fixed dataset B provides a
sample for L'2368. Similarly to Example 4.2.1, we can compute the sample mean and
sample variance for L'2368 with this sample, and we have
l'2368 ≈ 0.2133,  s'²2368 ≈ 8.6198 × 10−6. (4.8)
Example 4.2.3 (Example of interval estimator for the mean) Let us assume
L2368 and L'2368 are independent. We further assume that we know that the actual
variances for L2368 and L'2368 are equal to the sample variances we have computed.
Suppose we want to find an estimation for μ2368 − μ'2368. By Eq. 1.62, a 99%
confidence interval estimate for μ2368 − μ'2368 is given by

( l2368 − l'2368 − z0.005 √(σ²2368/M + σ'²2368/M) , l2368 − l'2368 + z0.005 √(σ²2368/M + σ'²2368/M) )
= ( 0.2132 − 0.2133 − 2.576 √((8.5196 × 10−6 + 8.6198 × 10−6)/5000) ,
0.2132 − 0.2133 + 2.576 √((8.5196 × 10−6 + 8.6198 × 10−6)/5000) )
= ( −2.5082 × 10−4 , 5.0820 × 10−5 ).
On the other hand, by Eq. 1.63, to achieve an estimation with precision, say c =
0.001, and 100(1 − α) confidence, the number of traces required to collect is given by

z²α/2 (σ²2368 + σ'²2368) / c².

Take α = 0.01, then z0.005 = 2.576, and we have

z²0.005 (σ²2368 + σ'²2368) / c² = 2.576² × (8.5196 × 10−6 + 8.6198 × 10−6) / 0.001² ≈ 114.

If we assume we do not know the variances, but we know that σ2368 = σ'2368, by
Eq. 1.67, the number of traces to collect is given by

2 z²α/2 s²p / c² = 2 × 2.576² × 8.5697 × 10−6 / 0.001² ≈ 114,
where s²p = (s²2368 + s'²2368)/2 ≈ 8.5697 × 10−6 is the pooled sample variance.

Remark 4.2.1 We note that the sample variances of L2368 and L'2368 are very close.
This is expected as it has been shown in Eq. 4.3 that the variances σ²2368 and σ'²2368
are both equal to the variance of the noise at time sample 2368.
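Example 4.2.3's interval and trace-count computations, with the quoted sample statistics, look like this in code:

```python
import math

M = 5000
z = 2.576                      # z_{0.005}
m_a, v_a = 0.2132, 8.5196e-6   # Fixed dataset A at t = 2368
m_b, v_b = 0.2133, 8.6198e-6   # Fixed dataset B at t = 2368

# 99% confidence interval for mu - mu' (Eq. 1.62)
half = z * math.sqrt(v_a / M + v_b / M)
ci = (m_a - m_b - half, m_a - m_b + half)
print(ci)

# Traces needed for precision c = 0.001 with known variances
c = 0.001
n_known = math.ceil(z**2 * (v_a + v_b) / c**2)
print(n_known)  # → 114

# Same count via the pooled sample variance (equal-variance case, Eq. 1.67)
s_p2 = (v_a + v_b) / 2         # pooled variance ≈ 8.5697e-6
n_pooled = math.ceil(2 * z**2 * s_p2 / c**2)
print(n_pooled)  # → 114
```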
So far, we have seen how to analyze the leakage at one particular time sample
by approximating its distribution with a normal distribution. Similarly, we can
also approximate the distribution of leakages across different time samples. In
this case, we consider a random vector (see Definition 1.7.9) instead of a random
variable. Thus, we would approximate the distributions induced by the leakages
as multivariate normal distributions (Gaussian distributions). It can be seen from
Definition 1.7.10 that to find a good Gaussian distribution for approximating the
noise/leakage, we just need to approximate the mean vector and covariance matrix.
We will see in Sect. 4.3.2.3 that the profiling phase of the template attack is exactly
to calculate estimations for the mean vector and the covariance matrix. In reality,
leakages at different time samples are correlated (see Definition 1.7.8). However,
the effort to calculate the covariance matrix grows quadratically with the number of
considered time samples. Thus, in practice, only a small part of the traces would be
profiled with a non-diagonal covariance matrix (see Example 1.7.24).
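Estimating the mean vector and covariance matrix, the core of the profiling phase mentioned above, is straightforward with NumPy. The profiling set below is synthetic; the points of interest and noise level are invented:

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic profiling set: 2000 traces restricted to 5 points of interest.
# Real profiling would use measurements from the clone device.
true_mean = np.array([0.21, 0.05, -0.03, 0.10, 0.00])
traces = true_mean + 0.01 * rng.standard_normal((2000, 5))

mu_hat = traces.mean(axis=0)             # estimated mean vector
cov_hat = np.cov(traces, rowvar=False)   # estimated 5x5 covariance matrix

print(mu_hat.shape, cov_hat.shape)  # → (5,) (5, 5)
```

Note that the covariance matrix has 5 × 5 entries for 5 points of interest, which is the quadratic growth mentioned above.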
In the rest of this chapter, we will see various attacks on cryptographic implementations.
As a developer, one might want to evaluate the implementation and
determine whether it is vulnerable to SCA. On the other hand, new attacks are constantly
being developed, and it is impractical to verify the security of our implementation
against all of them. Leakage assessment aims to solve this problem by analyzing the
power trace and answering whether any data-dependent information can be detected
in the traces of the DUT.
Note
We note that the leakage assessment methods do not provide any conclusions
in cases where data-dependent leakage is not detected. Therefore, the absence
of data-dependent leakage indicated by a particular method does not prove
that the implementation is leakage-free.
We take the signal as the part of the leakage related to v. Let Lt and L't denote
the leakages at time sample t corresponding to two encryptions with different fixed
values of v.
Example 4.2.4 (Example of Lt and L't) If we take v to be the plaintext (following
the convention, we require the key to be the same), then Lt and L't would correspond
to encryptions of two different fixed plaintexts with the same key. For example, we
can take Fixed dataset A and Fixed dataset B as samples of Lt and L't for the first
round of the encryption.
If we take v to be the 0th Sbox output in the first round of PRESENT, then Lt and
L't would correspond to encryptions that result in two different 0th Sbox outputs. For
such measurements, we can write

Lt = Xt + Nt,  L't = X't + N't,

with

Lt ∼ N(μt, σ²t),  L't ∼ N(μ't, σ'²t).
Since Lt = Xt + Nt
and the signal Xt is a constant, the variance of Lt is given by the variance of Nt,
and the mean of Lt is given by the sum of the constant Xt and the mean of Nt, as
shown in Eq. 4.3. In other words,

μt = Xt + E[Nt],  σ²t = Var(Nt). (4.9)

Similarly, we have

μ't = X't + E[N't],  σ'²t = Var(N't).

It follows that

μt − Xt = μ't − X't,  σ²t = σ'²t. (4.10)
Before going into details about the TVLA methodology, we recall hypothesis
testing techniques from Sect. 1.8.3. We can use those techniques to test hypotheses
about μt and μ't.
Example 4.2.6 (Example of two-sided hypothesis testing concerning μx) With the
sample statistics computed in Example 4.2.1, we would like to test whether μ2368 = 0.
We set the following hypotheses:

H0 : μ2368 = 0,  H1 : μ2368 ≠ 0.

Suppose we know the variance is equal to the sample variance we have computed
in Example 4.2.1, namely we assume

σ²2368 = 8.5196 × 10−6, which gives σ2368 ≈ 2.9188 × 10−3.

There are in total 5000 traces in Fixed dataset A. For a test with significance level
α = 0.01, the critical region is given by Eq. 1.69, with (see Eq. 1.72)

c = z0.005 σ2368/√5000 = 2.576 × (2.9188 × 10−3)/√5000 ≈ 1.0634 × 10−4.

Since our sample mean l2368 ≈ 0.2132 > c,
we reject the null hypothesis and conclude that μ2368 ≠ 0. The probability that our
decision is wrong is given by α = 0.01.
Example 4.2.7 (Example of one-sided hypothesis testing concerning μx) With
the same notation as in Example 4.2.6, suppose we know that the mean of L2368,
μ2368, is at least 0; we would like to know if it is bigger than 0. We set μ0 = 0 in
Eq. 1.75 and get the following null hypothesis and the alternative hypothesis:

H0 : μ2368 = 0,  H1 : μ2368 > 0.
First, let us assume we know the variance is equal to the sample variance we
computed in Example 4.2.1. There are in total 5000 traces in Fixed dataset A, and
for a test with significance level α = 0.01, the critical region is given by Eq. 1.76,
with (see Eq. 1.77)

c = z0.01 σ2368/√5000 = 2.326 × (2.9188 × 10−3)/√5000 ≈ 9.602 × 10−5,

where z0.01 = 2.326 (see Table 1.4). Since our sample mean l2368 ≈ 0.2132 > c,
we reject the null hypothesis and conclude that μ2368 > 0. The probability that our
decision is wrong is given by α = 0.01.
Furthermore, we also would like to check how many traces are required for a
test with significance level α = 0.01. For this, we need to choose a value of c.
Considering the value of the sample mean and sample variance, let us choose c =
0.001 in Eq. 1.78. According to Eq. 1.79, the number of traces to collect is then

(σ²2368/c²) z²α = (8.5196 × 10−6/0.001²) × 2.326² ≈ 46. (4.11)
Now, suppose we do not know the variance σ²2368. Since the number of traces is
big, according to Eq. 1.80, we compute

√5000 × l2368/s2368 = √5000 × 0.2132/(2.9188 × 10−3) ≈ 5165,
which is bigger than z0.01 = 2.326. Thus, we can reject the null hypothesis and
conclude that μ2368 > 0. The probability of a wrong decision is given by α = 0.01.
As for the number of traces needed, by Eq. 1.81, we will use the sample variance
and reach the same result as in Eq. 4.11.
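The decision rules of Examples 4.2.6 and 4.2.7 amount to comparing the sample mean against a critical value:

```python
import math

M = 5000
mean_l = 0.2132
sigma = 2.9188e-3              # assumed known std at t = 2368

# Two-sided test of H0: mu = 0 at significance level alpha = 0.01
z_half = 2.576                 # z_{0.005}
c_two = z_half * sigma / math.sqrt(M)
reject_two = abs(mean_l) > c_two

# One-sided test of H0: mu = 0 versus H1: mu > 0
z_one = 2.326                  # z_{0.01}
c_one = z_one * sigma / math.sqrt(M)
reject_one = mean_l > c_one

print(reject_two, reject_one)  # → True True
```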
Example 4.2.8 (Example of two-sided hypothesis testing about μx and μy) The
same as in Example 4.2.3, we take the leakages at t = 2368 from the Fixed dataset
B as a sample for L'2368. We have computed the sample mean and sample variance
for this random variable, given in Eq. 4.8. We would like to know if the mean of
L2368 (μ2368) and the mean of L'2368 (μ'2368) are the same. We set the following
hypotheses (see Eq. 1.82):

H0 : μ'2368 = μ2368,  H1 : μ'2368 ≠ μ2368.
Assume we know the variances for both random variables are equal to the sample
variances that we have computed (see Eqs. 4.5 and 4.8). There are in total 5000
traces in both Fixed dataset A and Fixed dataset B, and for a test with significance
level α = 0.01, the critical region is given by Eq. 1.83, with (see Eq. 1.86)

c = z0.005 √(σ²2368/5000 + σ'²2368/5000) = 2.576 × √((8.5196 × 10−6 + 8.6198 × 10−6)/5000) ≈ 0.00015,
where z0.005 = 2.576 (see Table 1.4). Since our sample means satisfy

l'2368 − l2368 ≈ 0.0001 < c,

we accept the null hypothesis and conclude that μ'2368 = μ2368. The probability that
our decision is wrong is given by α = 0.01.
Moreover, to check how many traces are needed for a test with significance level
α = 0.01, we choose c = 0.001 in Eq. 1.86. According to Eq. 1.87, the number of
traces to collect is z²0.005(σ²2368 + σ'²2368)/c² ≈ 114.

If we do not know the variances, following the student's t-test, we compute (see Eq. 1.89)

|l'2368 − l2368| / √(s²2368/5000 + s'²2368/5000) = 0.0001/√((8.5196 × 10−6 + 8.6198 × 10−6)/5000) ≈ 1.7 < z0.005.
We accept the null hypothesis and conclude that μ'2368 = μ2368. The probability
that our decision is wrong is given by α = 0.01.
Set c = 0.001, then the number of traces needed for a student's t-test with
significance level α = 0.01 is given by (see Eq. 1.90)

z²0.005 (s²2368 + s'²2368)/c² = 2.576² × (8.5196 × 10−6 + 8.6198 × 10−6)/0.001² ≈ 114.
Example 4.2.9 (Another example of two-sided hypothesis testing about μx and
μy) Similar to Example 4.2.8, let us now look at a different time sample t = 392.
We can compute the sample mean and sample variance of L392 with Fixed dataset
A. They are given by

l392 ≈ −0.0525,  s²392 ≈ 1.5141 × 10−6.
With Fixed dataset B, we get the sample mean and sample variance of L'392 as
follows:

l'392 ≈ −0.0501,  s'²392 ≈ 1.4801 × 10−6.

Similar to Example 4.2.8, we set the following hypotheses (see Eq. 1.82):

H0 : μ'392 = μ392,  H1 : μ'392 ≠ μ392.
Let α = 0.01. Then according to student's t-test with significance level α, we
compute (see Eq. 1.89)

|l'392 − l392| / √(s²392/5000 + s'²392/5000) = 0.0024/√((1.5141 × 10−6 + 1.4801 × 10−6)/5000) ≈ 98.1 > z0.005 (z0.005 = 2.576).

We reject the null hypothesis and conclude that μ'392 ≠ μ392. The probability that
our decision is wrong is given by α = 0.01.
Set c = 0.001, then the number of traces needed for a student's t-test with
significance level α = 0.01 is given by (see Eq. 1.90)

z²0.005 (s²392 + s'²392)/c² = 2.576² × (1.5141 × 10−6 + 1.4801 × 10−6)/0.001² ≈ 20.
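The two-sample statistics of Examples 4.2.8 and 4.2.9 can be computed side by side, showing why t = 2368 shows no detectable difference while t = 392 does:

```python
import math

M = 5000
z_half = 2.576   # z_{0.005}

def t_statistic(m1, v1, m2, v2, n):
    """Equal-sample-size two-sample statistic (Eq. 1.89 in the text)."""
    return abs(m1 - m2) / math.sqrt(v1 / n + v2 / n)

# t = 2368: Fixed dataset B vs Fixed dataset A (values quoted in the text)
t_2368 = t_statistic(0.2133, 8.6198e-6, 0.2132, 8.5196e-6, M)
# t = 392: the same two datasets
t_392 = t_statistic(-0.0501, 1.4801e-6, -0.0525, 1.5141e-6, M)

print(round(t_2368, 1), round(t_392, 1))  # → 1.7 98.1
print(t_2368 > z_half, t_392 > z_half)    # → False True
```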
Example 4.2.10 (Example of one-sided hypothesis testing about μx and μy)
With the same notations as in Example 4.2.9, suppose we know that

μ'392 ≥ μ392.

We would like to know if μ'392 > μ392. Then we have the following hypotheses:

H0 : μ'392 = μ392,  H1 : μ'392 > μ392.
Firstly, suppose we know the variances for both random variables are equal to
the sample variances that we have computed. There are 5000 traces in both Fixed
dataset A and Fixed dataset B. For a test with significance level α = 0.01, the value
of c in the critical region given by Eq. 1.92 is (see Eq. 1.93)

c = zα √((σ²392 + σ'²392)/5000) = 2.326 × √((1.5141 × 10−6 + 1.4801 × 10−6)/5000) ≈ 5.692 × 10−5.

Since

l'392 − l392 = 0.0024 > c,
we reject the null hypothesis and conclude that μ'392 > μ392. The probability of this
choice being wrong is given by α = 0.01.
Set c = 0.001. Then, the number of traces to collect for a hypothesis test with a
level of significance α = 0.01 (zα = 2.326) is given by (see Eq. 1.94)

z²α (σ²392 + σ'²392)/c² = 2.326² × (1.5141 × 10−6 + 1.4801 × 10−6)/0.001² ≈ 17.
Example 4.2.8 then concludes that when we take the signal to be the part of the
leakage related to the plaintext value, the signals at time sample 2368 for one
round encryption of plaintexts ABCDEF1234567890 and 84216BA484216BA4 with
the same round key FEDCBA0123456789 are very likely to be equal, according
to our measurements Fixed dataset A and Fixed dataset B. The probability of this
conclusion being wrong is 0.01. On the other hand, Example 4.2.9 concludes that
the signals at time sample 392 are likely to be different (with a probability of 0.01
of being wrong).

Furthermore, we see that to decide if the signals are different at a particular time
sample with c = 0.001 and significance level 0.01 (i.e., the probability of making a
wrong conclusion), we do not need that many traces.
Next, let us consider v being the 0th Sbox output in the first round of PRESENT.
In this case, we can take Lt to be the leakages for a fixed value of v at time sample
t and L't to be the leakages for another fixed value of v at t.
Example 4.2.11 When we consider v to be the 0th Sbox output in the first round
of PRESENT, there are 16 different values of v that we can consider. Let Lt and L't
denote the random variables for leakages corresponding to v = 0 and v = F at time
sample t. We would like to know if the signals at time sample t = 392 are the same
for those two values of v.

Take the Random dataset. As mentioned in Example 4.2.4, we have 634 traces
for v = 0 and 651 traces for v = F. We take those 634 (respectively, 651) traces
as a sample for Lt (respectively, L't). The same as in Examples 4.2.8 and 4.2.9, we
make the following hypotheses:

H0 : μ'392 = μ392,  H1 : μ'392 ≠ μ392.
Firstly, we compute the sample means for L392 and L'392, namely l392 ≈ −0.0425
(with s²392 ≈ 2.2962 × 10−6) and l'392 ≈ −0.0539, together with the pooled sample
variance s²p ≈ 2.5199 × 10−6.
Let α = 0.01. Then following student's t-test with significance level α, we compute
(see Eq. 1.65)

|l392 − l'392| / √(s²p (1/634 + 1/651)) = |−0.0425 + 0.0539| / √(2.5199 × 10−6 × 3.1134 × 10−3) ≈ 128.7 > z0.005.

We reject the null hypothesis and conclude that μ'392 ≠ μ392. The probability that
our decision is wrong is given by α = 0.01.
Remark 4.2.3 Note that in Examples 4.2.8 and 4.2.9, the sample sizes (number
of traces) are the same (both are 5000), but in Example 4.2.11, the sample sizes
are different for .Lt and .L't . Thus, instead of using Eq. 1.89 as in Examples 4.2.8
and 4.2.9, we applied Eq. 1.88. But those two equations are the same when the
sample sizes are equal.
We have seen before that the leakage Lt is dependent on the data being processed
in the device. In fact, as mentioned at the beginning of Sect. 4.2, some SCA attacks
(see Sects. 4.3.1, 4.3.3, and 4.4.2) exploit the dependency of the leakage on certain
intermediate values. If the leakage is not exploitable, we would expect, at least, that
the signals at time sample t should be the same when the only difference is the
values of the data being processed. With our notations above, this means that we
would like to test if Xt = X't, or equivalently μt = μ't (see Remark 4.2.2), for two
different fixed values of a certain intermediate value v.
Example 4.2.12 Continuing Remark 4.2.2 and the above discussion, we can
conclude that the leakage at time sample 392 for our implementation of PRESENT
on our DUT is very likely to be exploitable by SCA attacks.
Another approach to analyzing whether the leakage is exploitable is to consider
the signal for a fixed value of v versus that for random values of v. Let Lrt denote
the random variable corresponding to the leakage at time sample t for encryptions
corresponding to random values of v. Let Xrt and Nrt be the random variables for
the corresponding signal and noise. We have

Lrt = Xrt + Nrt.
With our assumptions and modeling, the signal Xt is a constant for a fixed value
of v at time t. When the value of v is random, Xrt is itself a random variable that
varies depending on v. It is not easy to approximate the distribution induced by Lrt
Fig. 4.10 Histogram of leakages at time sample .t = 392 across 5000 traces from the Random
plaintext dataset
Fig. 4.11 Histogram of leakages at time sample .t = 392 across 10,000 traces from the Random
dataset
in this case. However, following the convention, we still use a normal distribution
for the approximation.
To see that this makes sense, let us take the Random plaintext dataset and plot
the histogram of leakages at t = 392 across 5000 traces from this dataset. We
get Fig. 4.10. This corresponds to random values of v when v is taken to be the
plaintext. As another example, the histogram for leakages at t = 392 across the
10,000 traces from the Random dataset is shown in Fig. 4.11. In this case, we can
consider that the leakage corresponds to random values of v when v is taken to be the
0th Sbox output. Those two figures demonstrate that it is reasonable to approximate
the distribution induced by the leakage Lrt with a normal distribution.
Suppose $L_t^r$ can be modeled by a normal random variable with mean $\mu_t^r$ and
variance $\sigma_t^{r2}$. Since the noise is independent of the signal, we have $N_t = N_t^r$. By
Eqs. 1.33, 1.36, and 4.9,
$$\mu_t^r = E\left[X_t^r\right] + E\left[N_t^r\right], \qquad \sigma_t^{r2} = \mathrm{Var}\left(X_t^r\right) + \mathrm{Var}\left(N_t^r\right) = \mathrm{Var}\left(X_t^r\right) + \sigma_t^2.$$
We have
$$\mu_t - X_t = \mu_t^r - E\left[X_t^r\right], \qquad \sigma_t^{r2} \neq \sigma_t^2. \tag{4.14}$$
Same as before, in case the leakage is not exploitable at time sample t, we expect
the signal to be a constant at t, namely
$$X_t = X_t^r, \quad \text{and equivalently} \quad \mu_t^r = \mu_t. \tag{4.15}$$
$$l_{392} \approx -0.0525, \qquad s_{392}^2 \approx 1.5141\times 10^{-6},$$
$$l_{392}^r \approx -0.0488, \qquad s_{392}^{r2} \approx 1.1700\times 10^{-5}.$$
We would like to test if $\mu_t = \mu_t^r$. Thus, we set the following hypotheses:
$$H_0: \mu_{392}^r = \mu_{392}, \qquad H_1: \mu_{392}^r \neq \mu_{392}.$$
Let $\alpha = 0.01$. Then following Welch's t-test with significance level $\alpha$, we compute
(see Eq. 1.91)
$$\frac{\left|l_{392} - l_{392}^r\right|}{\sqrt{\frac{s_{392}^2}{5000} + \frac{s_{392}^{r2}}{5000}}} = \frac{\left|-0.0525 + 0.0488\right|}{\sqrt{\frac{1.5141\times 10^{-6}}{5000} + \frac{1.1700\times 10^{-5}}{5000}}} \approx 72.0 > z_{0.005}.$$
We reject the null hypothesis and conclude that $\mu_{392}^r \neq \mu_{392}$. The probability that
we incorrectly reject the null hypothesis when it is in fact true is $\alpha = 0.01$.
Example 4.2.14 Now we consider the signal to be given by the 0th Sbox output.
For the fixed signal we choose $v = 0$. Take the Random dataset. Let $L_t$ and $L_t^r$
denote the random variables corresponding to leakages for $v = 0$ and random values
of $v$ at time sample t, respectively.
We know that there are 634 traces for $v = 0$. Fix $t = 392$. In Example 4.2.11 we
have computed
$$l_{392} \approx -0.0425, \qquad s_{392}^2 \approx 2.2962\times 10^{-6}.$$
For the random values of $v$, we can take the whole dataset, which contains 10,000
traces, as a sample for $L_t^r$. We have
$$l_{392}^r \approx -0.0487, \qquad s_{392}^{r2} \approx 1.1624\times 10^{-5}.$$
Let $\alpha = 0.01$. Then according to Welch's t-test with significance level $\alpha$, we
compute (see Eq. 1.91)
$$\frac{\left|l_{392} - l_{392}^r\right|}{\sqrt{\frac{s_{392}^2}{634} + \frac{s_{392}^{r2}}{10{,}000}}} = \frac{\left|-0.0425 + 0.0487\right|}{\sqrt{\frac{2.2962\times 10^{-6}}{634} + \frac{1.1624\times 10^{-5}}{10{,}000}}} \approx 89.6 > z_{0.005}.$$
We reject the null hypothesis and conclude that $\mu_{392}^r \neq \mu_{392}$. The probability that
we incorrectly reject the null hypothesis when it is in fact true is $\alpha = 0.01$.
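Both t-statistics above can be reproduced directly from the reported summary statistics. A minimal Python sketch (the helper name `welch_t` is ours, not from the text):

```python
import math

def welch_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Welch's t-statistic (cf. Eq. 1.91) from sample means, variances, and sizes."""
    return (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)

# Example 4.2.13: fixed plaintext vs. random plaintext at t = 392
t1 = welch_t(-0.0525, 1.5141e-6, 5000, -0.0488, 1.1700e-5, 5000)
# Example 4.2.14: 0th Sbox output v = 0 (634 traces) vs. random v (10,000 traces)
t2 = welch_t(-0.0425, 2.2962e-6, 634, -0.0487, 1.1624e-5, 10000)
print(round(abs(t1), 1), round(abs(t2), 1))  # 72.0 89.6
```

With raw traces, the means and variances would of course first be estimated from the samples.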
The rationale of the TVLA methodology is that if the leakage is not exploitable,
the encryptions corresponding to two different intermediate values (or the encryption
corresponding to one fixed intermediate value and that corresponding to a random
intermediate value) should exhibit identical signals. Then according to Eq. 4.13 (or
Eq. 4.15), the corresponding leakages will have the same means. With the help of
Student's t-test (or Welch's t-test), we make hypotheses about the means of the
leakages and test if they are equal.
Recall that for Student's t-test and Welch's t-test (when the sample size is big), we
need to choose a significance level $\alpha$ and compare values computed from our samples
with a threshold $z_{\alpha/2}$ (see Eqs. 1.88 and 1.91). For TVLA, following the convention,
we set $z_{\alpha/2} = 4.5$. By Eq. 1.43, this threshold corresponds to
$$\frac{\alpha}{2} = 1 - \Phi(z_{\alpha/2}) = 1 - \Phi(4.5) = 1 - 0.9999966023268753 \approx 3.4\times 10^{-6}.$$
The significance level is given by
$$\alpha \approx 6.8\times 10^{-6}.$$
This means that there is a $6.8\times 10^{-4}$ percent chance that we would reject the null
hypothesis (i.e., conclude that the means are different) in case it is true (i.e., the
means are in fact the same).
The steps for TVLA are as follows:
TVLA Step 1 Identify the cryptographic implementation for analysis. In principle,
TVLA can be used for analyzing leakages of implementations
of any type of algorithm. In practice, it is mostly used for the
analysis of symmetric block cipher implementations.
TVLA Step 2 Choose the intermediate value $v$. The choice of $v$ determines how
we measure our traces. TVLA tests if different values of $v$ result in
different signals.
TVLA Step 3 Set up the experiment and measure leakages. As for actual attacks,
the experimental setup is a crucial factor for success. For leakage
assessment, it is best to carry out measurements with equipment
comparable to what the attackers we would like to protect against
are expected to use.
We will prepare two datasets, denoted by $T_1$ and $T_2$. To get the
first dataset $T_1$, we choose a fixed value for $v$. Then we randomly
take $M_1$ inputs for the cryptographic implementation such that the
value of $v$ is equal to this fixed value. One trace is taken for each
input.
For the second dataset $T_2$, there are two options.
(a) Fixed versus fixed. Choose a different fixed value for $v$. Then
randomly take $M_2$ inputs for the cryptographic implementation
such that the value of $v$ is equal to this fixed value. One trace is
collected for each input.
(b) Fixed versus random. Randomly take $M_2$ inputs for the cryptographic
implementation so that the value of $v$ is random. One
trace is collected for each input.
Let us represent those two sets of traces as follows:
$$T_1 = \text{Fixed dataset A}, \qquad T_2 = \text{Fixed dataset B}$$
for the fixed versus fixed setting, and
$$T_1 = \text{Fixed dataset A}, \qquad T_2 = \text{Random plaintext dataset}$$
for the fixed versus random setting. For both cases, we will demonstrate
the results for $M_1 = M_2 = 5000$ and $M_1 = M_2 = 50$.
When $v$ is given by the 0th Sbox output, we take the traces for
$v = 0$ as $T_1$, choose the whole Random dataset as $T_2$, and set
$M_1 = 634$, $M_2 = 10{,}000$. For all our traces, $q = 3600$.
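Before committing to hardware measurements, the dataset construction in this step can be prototyped on simulated traces. A hedged sketch: the leakage position, scaling, and noise level below are purely illustrative choices of ours, not properties of our DUT.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_trace(v, q=100, leak_at=40, sigma=0.05):
    """One simulated power trace: Gaussian noise plus Hamming-weight
    leakage of the 4-bit intermediate value v at a single time sample."""
    trace = rng.normal(0.0, sigma, q)
    trace[leak_at] += 0.01 * bin(v).count("1")
    return trace

M1 = M2 = 1000
T1 = np.array([simulate_trace(0xF) for _ in range(M1)])                    # fixed v
T2 = np.array([simulate_trace(int(rng.integers(16))) for _ in range(M2)])  # random v
```

The arrays `T1` and `T2` can then be fed to the t-tests of TVLA Step 4; with $v$ fixed to F, the mean leakage at the leaky sample differs from its mean under random $v$.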
TVLA Step 4 t-Test for one time sample. Fix a time sample t. Let $L_t^{(1)}$ and $L_t^{(2)}$
denote the random variables corresponding to leakages at time sample
t for computations resulting in datasets $T_1$ and $T_2$, respectively.
Suppose both can be modeled by normal random variables, with
means $\mu_t^{(1)}$ and $\mu_t^{(2)}$.
By definition (see Eqs. 1.49 and 1.53), we compute the sample mean
and sample variance for $L_t^{(1)}$ (respectively, $L_t^{(2)}$), denoted by $l_t^{(1)}$ and
$s_t^{(1)2}$ (respectively, $l_t^{(2)}$ and $s_t^{(2)2}$):
$$l_t^{(1)} = \frac{1}{M_1}\sum_{j=1}^{M_1} l_{jt}^{(1)}, \qquad l_t^{(2)} = \frac{1}{M_2}\sum_{j=1}^{M_2} l_{jt}^{(2)},$$
and
$$s_t^{(1)2} = \frac{1}{M_1 - 1}\sum_{j=1}^{M_1}\left(l_{jt}^{(1)} - l_t^{(1)}\right)^2, \qquad s_t^{(2)2} = \frac{1}{M_2 - 1}\sum_{j=1}^{M_2}\left(l_{jt}^{(2)} - l_t^{(2)}\right)^2.$$
$$H_0: \mu_t^{(1)} = \mu_t^{(2)}, \qquad H_1: \mu_t^{(1)} \neq \mu_t^{(2)}. \tag{4.16}$$
$$\sigma_t^{(1)2} = \sigma_t^{(2)2},$$
hence the usage of Student's t-test. In the fixed versus random setting,
the noises are different, and we have (see Eq. 4.14)
$$\sigma_t^{(1)2} \neq \sigma_t^{(2)2},$$
$$t\text{-value}_t := \frac{l_t^{(1)} - l_t^{(2)}}{\sqrt{s_p^2\,(1/M_1 + 1/M_2)}}. \tag{4.17}$$
(b) Welch’s t-test for the fixed versus random setting. When the
second dataset .T2 is measured according to the fixed versus
random setting, following Welch’s t-test, we compute
(1) (2)
l − lt
t − valuet := / t
. . (4.18)
(1)2 (2)2
st st
M1 + M2
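The per-time-sample computation of TVLA Step 4 vectorizes naturally over a matrix of traces. A minimal sketch of the Welch variant (Eq. 4.18); the function names are ours:

```python
import numpy as np

def tvla_t_values(T1, T2):
    """Welch's t-value (Eq. 4.18) at every time sample.
    T1, T2: arrays of shape (M1, q) and (M2, q), one trace per row."""
    m1, m2 = T1.mean(axis=0), T2.mean(axis=0)
    v1, v2 = T1.var(axis=0, ddof=1), T2.var(axis=0, ddof=1)  # sample variances
    return (m1 - m2) / np.sqrt(v1 / len(T1) + v2 / len(T2))

def leaky_samples(T1, T2, threshold=4.5):
    """0-based indices of time samples whose |t| crosses the TVLA threshold."""
    return np.nonzero(np.abs(tvla_t_values(T1, T2)) > threshold)[0]
```

In the fixed versus fixed setting with $M_1 = M_2$, Eq. 4.17 with the pooled variance yields exactly the same values, so the sketch covers both cases there.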
Fig. 4.12 t-Values (Eq. 4.17) for all time samples .1, 2, . . . , 3600 computed with Fixed dataset A
and Fixed dataset B. The signal is given by the plaintext value, and the fixed versus fixed setting is
chosen. Blue dashed lines correspond to the threshold .4.5 and .−4.5
Fig. 4.13 t-Values (Eq. 4.17) for all time samples .1, 2, . . . , 3600 computed with 50 traces from
Fixed dataset A and 50 traces from Fixed dataset B. The signal is given by the plaintext value, and
the fixed versus fixed setting is chosen. Blue dashed lines correspond to the threshold .4.5 and .−4.5
of the threshold. This is not surprising as the implementation does not have any
countermeasures. In Sect. 4.3.1 we will see that using this implementation, with just
a few traces, we can recover the first round key. If we reduce the number of traces for
computing the t-values, we will get different results. For example, when we take 50
traces, i.e., .M1 = M2 = 50, we have Fig. 4.13. Compared to Fig. 4.12, the absolute
values of t-values are much smaller. This shows that when the sample size is bigger,
it is more likely for us to capture information about the inputs from the leakages.
For the fixed versus random setting, t-values with Welch’s t-test (Eq. 4.18)
are computed with Fixed dataset A and Random plaintext dataset. The results
are shown in Figs. 4.14 and 4.15 for .M1 = M2 = 5000 and .M1 = M2 =
50, respectively. Similarly, we also observe higher .|t|-values with more traces.
Furthermore, compared to Figs. 4.12 and 4.13, the .|t|-values are much lower. This
shows that it is more likely for us to distinguish between leakages corresponding
Fig. 4.14 t-Values (Eq. 4.18) for all time samples .1, 2, . . . , 3600 computed with Fixed dataset
A and Random plaintext dataset. The signal is given by the plaintext value, and the fixed versus
random setting is chosen. Blue dashed lines correspond to the threshold .4.5 and .−4.5
Fig. 4.15 t-Values (Eq. 4.18) for all time samples .1, 2, . . . , 3600 computed with 50 traces from
Fixed dataset A and 50 traces from Random plaintext dataset. The signal is given by the plaintext
value, and the fixed versus random setting is chosen. Blue dashed lines correspond to the threshold
.4.5 and .−4.5
to two fixed plaintexts rather than between leakages for a fixed plaintext and for
random plaintexts.
Next, we take .v to be the 0th Sbox output. We use Random dataset as samples for
our random variables corresponding to leakages. For the fixed versus fixed setting,
we take the .M1 = 634 traces for .v = 0 as .T1 and .M2 = 651 traces for .v = F as .T2 .
The t-values with the student’s t-test (Eq. 4.17) are shown in Fig. 4.16. For the fixed
versus random setting, we take the .M1 = 634 traces for .v = 0 as .T1 and the whole
dataset as .T2 (.M2 = 10,000). Following Welch’s t-test, the t-values (Eq. 4.18) are
shown in Fig. 4.17. Again, we also show the results when fewer traces are used for
the computations. The t-values can be found in Figs. 4.18 and 4.19.
In summary, we have the following observations:
• When more traces are used (i.e., when the sample size is bigger), it is more likely
for us to capture information about the intermediate values from the leakages.
Fig. 4.16 t-Values (Eq. 4.17) for all time samples .1, 2, . . . , 3600 computed with traces from
Random dataset. .T1 contains .M1 = 634 traces and .T2 contains .M2 = 651 traces. The signal
is given by the 0th Sbox output, and the fixed versus fixed setting is chosen. Blue dashed lines
correspond to the threshold .4.5 and .−4.5
Fig. 4.17 t-Values (Eq. 4.18) for all time samples .1, 2, . . . , 3600 computed with traces from
Random dataset. .T1 contains .M1 = 634 traces and .T2 contains .M2 = 10,000 traces. The signal
is given by the 0th Sbox output, and the fixed versus random setting is chosen. Blue dashed lines
correspond to the threshold .4.5 and .−4.5
We will see in Sect. 4.3.2.4 that more traces indeed indicate higher chances for
the attacks to be successful.
• When .v is given by the 0th Sbox output, the highest .|t|-value is obtained at 392
for all cases we have analyzed. We will see that this is the point of interest (POI)
for our attack (Sect. 4.3.2).
• Compared to .v being the plaintext, the .|t|-values are in general smaller with much
fewer time samples crossing the threshold when .v is given by the 0th Sbox output.
This is unsurprising as we would expect more computations to be correlated with
the plaintext rather than a single Sbox output.
Fig. 4.18 t-Values (Eq. 4.17) for all time samples .1, 2, . . . , 3600 computed with traces from
Random dataset. Both .T1 and .T2 contain 50 traces (i.e., .M1 = M2 = 50). The signal is given
by the 0th Sbox output, and the fixed versus fixed setting is chosen. Blue dashed lines correspond
to the threshold .4.5 and .−4.5
Fig. 4.19 t-Values (Eq. 4.18) for all time samples $1, 2, \ldots, 3600$ computed with traces from
Random dataset. Both $T_1$ and $T_2$ contain 50 traces (i.e., $M_1 = M_2 = 50$). The signal is given by
the 0th Sbox output, and the fixed versus random setting is chosen. Blue dashed lines correspond
to the thresholds 4.5 and $-4.5$
$$\mathrm{SNR} = \frac{\mathrm{Var}(\text{signal})}{\mathrm{Var}(\text{noise})},$$
where $\mathrm{Var}$ refers to the variance of a random variable (see Eq. 1.35).
In our case, for a fixed time sample t, $X_t$ represents the signal, which is the part
of the leakage relevant to our attack. The SNR at time t is given by
$$\mathrm{SNR}_t = \frac{\mathrm{Var}(X_t)}{\mathrm{Var}(N_t)}. \tag{4.19}$$
$\mathrm{Var}(X_t)$ measures how much the leakage varies at time sample t due to the signal,
and $\mathrm{Var}(N_t)$ measures how much it varies due to the noise. Thus, the SNR
quantifies how much information is leaked at time sample t by the measurements:
the higher the SNR, the more the signal stands out from the noise.
Example 4.2.15 Suppose we are interested in the Hamming weight of an 8-bit
intermediate value at time sample t. In particular, the intermediate value we would
like to analyze is from $\mathbb{F}_2^8$. We further assume that the leakage $L_t$ is equal to the
modeled leakage following the Hamming weight leakage model (Eq. 4.4). Thus
$X_t = \mathrm{wt}(v)$ for $v \in \mathbb{F}_2^8$. Then the variance of the signal is given by $\mathrm{Var}(\mathrm{wt}(v))$
for $v \in \mathbb{F}_2^8$. By definition (Eq. 1.31),
$$E[\mathrm{wt}(v)] = \frac{1}{|\mathbb{F}_2^8|}\sum_{v\in\mathbb{F}_2^8}\mathrm{wt}(v) = \frac{1}{2^8}\sum_{i=1}^{8}\binom{8}{i} i = \frac{1}{2^8}\sum_{i=1}^{8}\frac{8!}{(i-1)!\,(8-i)!}$$
$$= \frac{8}{2^8}\sum_{i=1}^{8}\frac{7!}{(i-1)!\,(7-(i-1))!} = \frac{8}{2^8}\sum_{j=0}^{7}\binom{7}{j} = \frac{8\times 2^7}{2^8} = 4.$$
And
$$E\left[\mathrm{wt}(v)^2\right] = \frac{1}{|\mathbb{F}_2^8|}\sum_{v\in\mathbb{F}_2^8}\mathrm{wt}(v)^2 = \frac{1}{2^8}\sum_{i=1}^{8}\binom{8}{i} i^2 = \frac{1}{2^8}\sum_{i=1}^{8} i\,\frac{8!}{(i-1)!\,(8-i)!}$$
$$= \frac{8}{2^8}\left(\sum_{i=1}^{8}(i-1)\frac{7!}{(i-1)!\,(8-i)!} + \sum_{i=1}^{8}\frac{7!}{(i-1)!\,(7-(i-1))!}\right)$$
$$= \frac{1}{2^5}\left(7\sum_{i=2}^{8}\frac{6!}{(i-2)!\,(6-(i-2))!} + \sum_{j=0}^{7}\binom{7}{j}\right)$$
$$= \frac{1}{2^5}\left(2^7 + 7\sum_{j=0}^{6}\binom{6}{j}\right) = \frac{1}{2^5}\left(2^7 + 7\times 2^6\right) = 2^2 + 7\times 2 = 18.$$
By Eq. 1.35,
$$\mathrm{Var}(\mathrm{wt}(v)) = E\left[\mathrm{wt}(v)^2\right] - E[\mathrm{wt}(v)]^2 = 18 - 4^2 = 2.$$
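Since the intermediate value ranges over only 256 bytes, these moments are easy to confirm by exhaustive enumeration:

```python
# Exhaustive check of Example 4.2.15: E[wt(v)] = 4, E[wt(v)^2] = 18, Var = 2
hw = [bin(v).count("1") for v in range(256)]
mean = sum(hw) / 256
mean_sq = sum(h * h for h in hw) / 256
variance = mean_sq - mean ** 2
print(mean, mean_sq, variance)  # 4.0 18.0 2.0
```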
Example 4.2.16 In this example, let .Lt denote the random variable corresponding
to the leakage of one round of PRESENT encryption at time t. We take the Random
dataset (see Sect. 4.1) as a sample for .Lt . Suppose we are interested in the exact
value of the 0th Sbox output in the first round of PRESENT. Let us denote this
intermediate value by .v.
Fix a time sample t. .Xt is given by the part of the leakage related to the value
of .v. To compute .Var(Xt ), we first divide the traces in Random dataset into 16 sets
according to the value of .v. Let us denote those 16 sets of traces by .A1 , A2 , . . . , A16 ,
where .As contains traces corresponding to .v = s − 1.
As discussed in Sect. 4.2.1, for a fixed value of .v, .Xt is a constant, and the leakage
and the noise can be modeled by normal random variables. Let .Lt,s and .Nt,s denote
the random variables corresponding to leakage and noise at time sample t for .v =
s − 1. Let .Xt,s denote the constant leakage in this case.
Similar to Example 4.2.1, we can approximate the mean of $L_{t,s}$ using the sample
mean computed with set $A_s$. For example, take $t = 600$, and we have
$$l_{600,1} \approx 0.08212, \quad l_{600,2} \approx 0.08221, \quad l_{600,3} \approx 0.08209, \quad \ldots$$
$$\mathrm{Var}(X_t) = \mathrm{Var}\left(E\left[L_{t,s}\right]\right),$$
which can be estimated with the sample variance of $E\left[L_{t,s}\right]$. For $t = 600$, we have
$$s_{X_{600}}^2 \approx 1.0088\times 10^{-8}.$$
By Eq. 4.2,
$$\mathrm{Var}(N_t) = \mathrm{Var}(L_t - X_t).$$
On the other hand, since $E\left[N_{t,s}\right]$ is a constant for different values of s, by Eqs. 1.36
and 4.20,
$$\mathrm{Var}\left(L_t - E\left[L_{t,s}\right]\right) = \mathrm{Var}\left(L_t - X_{t,s} - E\left[N_{t,s}\right]\right) = \mathrm{Var}\left(L_t - X_{t,s}\right) = \mathrm{Var}(L_t - X_t).$$
Thus $\mathrm{Var}(N_t)$ can be approximated by the sample variance of $L_t - E\left[L_{t,s}\right]$. For
$t = 600$, we have
$$s_{N_{600}}^2 \approx 6.4184\times 10^{-6}.$$
$$\mathrm{SNR}_{600} = \frac{\mathrm{Var}(X_{600})}{\mathrm{Var}(N_{600})} \approx \frac{s_{X_{600}}^2}{s_{N_{600}}^2} = \frac{1.0088\times 10^{-8}}{6.4184\times 10^{-6}} \approx 0.00157.$$
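The estimation procedure of this example (group the traces by the value of $v$, take the sample variance of the group means as the signal variance, and the variance of the residuals as the noise variance) can be sketched as follows; the function name and array layout are our own choices:

```python
import numpy as np

def snr_per_sample(traces, labels, n_classes):
    """Estimate SNR_t (Eq. 4.19) at every time sample.
    traces: (M, q) array; labels[j]: value of v for trace j (0..n_classes-1)."""
    # Sample mean of each class approximates E[L_{t,s}]
    means = np.array([traces[labels == s].mean(axis=0) for s in range(n_classes)])
    signal_var = means.var(axis=0, ddof=1)   # sample variance of E[L_{t,s}]
    residuals = traces - means[labels]       # approximates L_t - E[L_{t,s}]
    noise_var = residuals.var(axis=0, ddof=1)
    return signal_var / noise_var
```

The same function covers Examples 4.2.18 and 4.2.20 by relabeling the traces with the Hamming weight (5 classes) or a single bit (2 classes) of $v$.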
Example 4.2.17 So far, we have discussed the SNR for a single point in time. With
the same method as in Example 4.2.16, we can compute the sample variances
estimating $\mathrm{Var}(X_t)$ and $\mathrm{Var}(N_t)$, as well as the SNR values, for all time samples.
They are shown in Figs. 4.20, 4.21, and 4.22, respectively.
We can see that the shape of the variance of the noise resembles the leakage of
one round of PRESENT computation (e.g., Fig. 4.3). This is reasonable since most
of the leakage is not related to $v$.
Furthermore, the peaks for the variance of signal and SNR correspond to each
other. The first two peaks are likely related to AddRoundKey and sBoxLayer. The
peaks after 1000 are probably caused by the permutation of 4 bits of .v (the 0th
Sbox output). These observations can be confirmed by comparing them to Fig. 4.3.
In particular, we can deduce that the peak at .t = 392 is related to the 0th Sbox
computation—as observed in Fig. 4.3, sBoxLayer starts from around time sample
382.
Fig. 4.20 Sample variance of the signal for each time sample, computed using Random dataset.
The signal is given by the exact value of the 0th Sbox output
Fig. 4.21 Sample variance of the noise for each time sample, computed using Random dataset.
The signal is given by the exact value of the 0th Sbox output
Fig. 4.22 SNR for each time sample, computed using Random dataset. The signal is given by the
exact value of the 0th Sbox output
Example 4.2.18 We again look at the Random dataset. Instead of the exact values
of .v as in Example 4.2.16, we focus on the Hamming weight of the 0th Sbox output,
i.e., .wt (v). Then, in this case, for a fixed time sample t, we divide the traces into
five sets according to the value of .wt (v). Let us denote those five sets of traces
by .A1 , A2 , . . . , A5 , where .As contains traces corresponding to .wt (v) = s − 1.
Following similar computations as in Example 4.2.16, for .t = 600, we have
$$l_{600,1} \approx 0.08212, \quad l_{600,2} \approx 0.08206, \quad l_{600,3} \approx 0.08214, \quad l_{600,4} \approx 0.08211, \quad l_{600,5} \approx 0.08206.$$
And
$$s_{X_{600}}^2 \approx 1.1043\times 10^{-9}, \qquad s_{N_{600}}^2 \approx 6.4271\times 10^{-6}, \qquad \mathrm{SNR}_{600} \approx 0.0001718.$$
The results for all time samples are shown in Figs. 4.23, 4.24, and 4.25.
Fig. 4.23 Sample variance of the signal for each time sample, computed using Random dataset.
The signal is given by the Hamming weight of the 0th Sbox output
Fig. 4.24 SNR for each time sample, computed using Random dataset. The signal is given by the
Hamming weight of the 0th Sbox output
Fig. 4.25 Sample variance of the noise for each time sample, computed using Random dataset.
The signal is given by the Hamming weight of the 0th Sbox output
The sample variance of the noise is very similar to Fig. 4.21 and also resembles
the leakage of PRESENT computation since most of the leakage is not related to
.wt (v). The peaks in the variance of signal and SNR also correspond to each other.
Compared to Fig. 4.22, the locations of the peaks are similar. It is worth noting that
the highest peak in both Figs. 4.22 and 4.24 is at time sample 392. As mentioned in
Example 4.2.17, this time sample corresponds to the computation of the 0th Sbox in
sBoxLayer. We also note that Fig. 4.24 has a higher SNR value than Fig. 4.22 at this
point. This suggests that the Hamming weight leakage model is closer to our DUT
leakage than the identity leakage model.
Normally in DPA attacks, we would like to focus on time samples where the
corresponding SNRs are high. We refer to those time samples as points of interest
(POIs).
Example 4.2.19 Continuing Example 4.2.17, the time sample with the highest
SNR is $t = 392$. We can then take this point as our POI. Alternatively, we can
take a few time samples that achieve the highest SNRs. For example, the top
three SNRs are obtained at $t = 392, 218, 1328$.
Similarly, suppose we focus on the Hamming weight of the 0th Sbox output.
Following the results from Example 4.2.18, in case we take just one POI, we have
$t = 392$; for three POIs, we have $t = 392, 1309, 1304$.
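Once an SNR curve is available, picking POIs is a one-line sort; note that array indices below are 0-based, while the text numbers time samples from 1:

```python
import numpy as np

def top_pois(snr_values, k=3):
    """Return the k time-sample indices with the largest SNR, best first."""
    return np.argsort(snr_values)[::-1][:k]

print(top_pois(np.array([0.1, 5.0, 0.2, 3.0]), k=2))  # [1 3]
```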
Those POIs will be further used for our attacks in Sect. 4.3.2.
Example 4.2.20 As another example, suppose that instead of the exact value or the
Hamming weight of the 0th Sbox output $v$, we are interested in the 0th bit of $v$.
With the same Random dataset, we divide the traces into two sets $A_1, A_2$,
corresponding to the 0th bit of $v$ equal to 0 and 1, respectively. Following similar
computations as in Example 4.2.16, for $t = 600$, we have
$$l_{600,1} \approx 0.08206, \qquad l_{600,2} \approx 0.08216.$$
And
$$s_{X_{600}}^2 \approx 2.6879\times 10^{-9}, \qquad s_{N_{600}}^2 \approx 6.4256\times 10^{-6}, \qquad \mathrm{SNR}_{600} \approx 0.0004183.$$
The results for all time samples are shown in Figs. 4.26, 4.27, and 4.28.
We can see that Fig. 4.28 is similar to Figs. 4.21 and 4.25. Compared to Figs. 4.22
and 4.24, there are fewer peaks in Fig. 4.27. Furthermore, the highest peak is not
around the sBoxLayer, but during pLayer computation. This is expected since now
we only consider 1 bit instead of 4 bits of .v.
Fig. 4.26 Sample variance of the signal for each time sample, computed using Random dataset.
The signal is given by the 0th bit of the 0th Sbox output
Fig. 4.27 SNR for each time sample, computed using Random dataset. The signal is given by the
0th bit of the 0th Sbox output
Fig. 4.28 Sample variance of the noise for each time sample, computed using Random dataset.
The signal is given by the 0th bit of the 0th Sbox output
4.3 Side-Channel Analysis Attacks on Symmetric Block Ciphers
In this section, we will discuss two types of attacks on symmetric block cipher
implementations: differential power analysis (DPA) in Sects. 4.3.1 and 4.3.2 and
side-channel assisted differential plaintext attack (SCADPA) in Sect. 4.3.3.
DPA Step 3 Choose the part of the key to recover. The DPA attack is normally
carried out in a divide-and-conquer manner. In particular, we focus on a
small part (e.g., a nibble or a byte) of a round key in each attack, and
each part of the round key can be recovered independently. With the
inverse key schedule, one (e.g., for AES) or two (e.g., for PRESENT,
DES) round keys will reveal the master key (see Remarks 3.1.1, 3.1.4,
and 3.1.5). Let k denote the target part of the key, and let $M_k$ denote
the number of possible values of k. For our attacks, we will focus on
the 0th nibble of the first round key for PRESENT, so $M_k = 16$.
DPA Step 4 Choose the target intermediate value. To recover the part of the key
chosen in the last step, we exploit relationships between leakages and
a certain intermediate value being processed in the DUT. The goal
is to gain information about this intermediate value, which reveals
information about our chosen part of the key. Let $v$ denote the target
intermediate value. We require that there is a function $\varphi$ such that
$$v = \varphi(k, p),$$
where p denotes (part of) the plaintext. For our attack, to recover the
0th nibble of the first round key of PRESENT, we will target the 0th
Sbox output of the first round. Then we have
$$v = \mathrm{SB}_{\mathrm{PRESENT}}(k \oplus p),$$
where k and p denote the 0th nibble of the first round key and that of
the plaintext, respectively.
DPA Step 5 Compute hypothetical target intermediate values. By our choice of
the target intermediate value, a small part of the key is related to it.
Thus, when we make a guess of this part of the key, with knowledge
of the plaintext we can obtain a hypothetical value for our target
intermediate value. In particular, for each key hypothesis $\hat{k}_i$ of k and
each (part of the) plaintext $p_j$, we can compute a hypothesis for $v$,
denoted $\hat{v}_{ij}$, as follows:
$$\hat{v}_{ij} = \varphi(\hat{k}_i, p_j), \qquad i = 1, 2, \ldots, M_k,\ j = 1, 2, \ldots, M_p.$$
For our illustration, with each key hypothesis of the 0th nibble of the
first round key and each plaintext, we have a hypothetical value for the
0th Sbox output:
$$\hat{v}_{ij} = \mathrm{SB}_{\mathrm{PRESENT}}(\hat{k}_i \oplus p_j),$$
where $p_j$ is the 0th nibble of the plaintext corresponding to the attack
trace $\ell_j$. Furthermore, we set
$$\hat{k}_i = i - 1, \qquad i = 1, 2, \ldots, 16.$$
DPA Step 6 Choose the leakage model. For each hypothetical target intermediate
value, we can compute the hypothetical signal depending on our
leakage model, where we subtract the noise component from the
leakage model. For example, if we choose the Hamming weight leakage
model, according to Eq. 4.4, we have
$$H_{ij} = \mathrm{wt}\left(\hat{v}_{ij}\right), \qquad i = 1, 2, \ldots, M_k,\ j = 1, 2, \ldots, M_p.$$
$$\left\{\left(H_{ij},\, l_t^j\right) \;\middle|\; j = 1, 2, \ldots, M_p\right\}.$$
We would like to know how good the modeled signals are compared
to the actual leakages for each key hypothesis. For the correct key
hypothesis and the time samples corresponding to POIs, we expect
$$i = 1, 2, \ldots, M_k, \qquad t = 1, 2, \ldots, q.$$
In our case,
$$r_{i,t} = \frac{\sum_{j=1}^{5000}\left(H_{ij} - \overline{H_i}\right)\left(l_t^j - \overline{l_t}\right)}{\sqrt{\sum_{j=1}^{5000}\left(H_{ij} - \overline{H_i}\right)^2}\,\sqrt{\sum_{j=1}^{5000}\left(l_t^j - \overline{l_t}\right)^2}},$$
Since the target intermediate value $v$ we have chosen will be processed in our
DUT at certain points in time, we expect the leakages at those corresponding time
samples to be correlated with $v$. Those time samples are our POIs. If a good leakage
model (i.e., a model that is close to the actual leakage of the DUT) is chosen, we
expect $H_i$ and $L_t$ to be correlated for the correct key hypothesis $\hat{k}_i$ and POIs t. Thus,
the key hypothesis that achieves the highest absolute value of $r_{i,t}$ is expected to be
the correct key. Furthermore, the time samples that achieve higher absolute values
of $r_{i,t}$ will be our POIs in the attack.
In practice, if all $r_{i,t}$ values are low, we will need more traces for the attack.
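DPA Steps 5 and 6, together with the correlation computation, combine into a short attack sketch. We assume the Hamming weight leakage model and the PRESENT Sbox; the function names and the array layout are ours, and the loop index `key` plays the role of the key hypotheses $\hat{k}_i = i - 1$:

```python
import numpy as np

# PRESENT Sbox and a 4-bit Hamming weight lookup table
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
HW = [bin(x).count("1") for x in range(16)]

def dpa_correlations(traces, pt_nibbles):
    """Sample correlation r_{i,t} for all 16 key hypotheses and time samples.
    traces: (M, q) array; pt_nibbles: 0th plaintext nibble of each trace."""
    M, q = traces.shape
    lc = traces - traces.mean(axis=0)               # centered leakages
    r = np.empty((16, q))
    for key in range(16):
        h = np.array([HW[SBOX[key ^ int(p)]] for p in pt_nibbles], dtype=float)
        hc = h - h.mean()                           # centered hypothetical signals
        r[key] = (hc @ lc) / np.sqrt((hc @ hc) * (lc * lc).sum(axis=0))
    return r

def best_key(r):
    """Key hypothesis with the highest |r| over all time samples."""
    return int(np.unravel_index(np.abs(r).argmax(), r.shape)[0])
```

On simulated traces that leak the Hamming weight of the 0th Sbox output, `best_key` recovers the injected key nibble.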
Note: According to Eq. 4.1, the correct value of the 0th nibble of the first round key
is given by 9.
$$\{(1, 11),\ (0, 9),\ (1, 12),\ (1, 14),\ (0, 9)\}$$
for a pair of random variables $(U, W)$. Then the sample mean for U is given by
$$\overline{u} = \frac{1+0+1+1+0}{5} = \frac{3}{5}.$$
And the sample mean for W is given by
$$\overline{w} = \frac{11+9+12+14+9}{5} = \frac{55}{5} = 11.$$
The sample correlation coefficient for U and W is given by
$$r = \frac{\sum_{i=1}^{5}\left(u_i - \overline{u}\right)\left(w_i - \overline{w}\right)}{\sqrt{\sum_{i=1}^{5}\left(u_i - \overline{u}\right)^2}\,\sqrt{\sum_{i=1}^{5}\left(w_i - \overline{w}\right)^2}}.$$
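Evaluating the quotient numerically for this sample (a quick check of ours; the value $r \approx 0.861$ follows directly from the data above):

```python
import math

u = [1, 0, 1, 1, 0]
w = [11, 9, 12, 14, 9]
ubar, wbar = sum(u) / 5, sum(w) / 5                                 # 0.6 and 11.0
num = sum((ui - ubar) * (wi - wbar) for ui, wi in zip(u, w))        # 4.0
den = math.sqrt(sum((ui - ubar) ** 2 for ui in u)
                * sum((wi - wbar) ** 2 for wi in w))                # sqrt(1.2 * 18)
r = num / den
print(round(r, 4))  # 0.8607
```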
Let us first consider the identity leakage model. Then in DPA Step 6, we have
$$H_{ij} = \hat{v}_{ij}, \qquad i = 1, 2, \ldots, 16,\ j = 1, 2, \ldots, 5000.$$
$$p_1 = \mathtt{9}, \qquad p_2 = \mathtt{C}.$$
Fig. 4.29 Sample correlation coefficients .ri,t (.i = 1, 2, . . . , 16) for all time samples .t =
1, 2, . . . , 3600. Computed following Eq. 4.21 with the identity leakage model and the Random
plaintext dataset. The blue line corresponds to the correct key hypothesis .k̂10 = 9
Sample correlation coefficient
Fig. 4.30 Sample correlation coefficients .r10,t (corresponds to the correct key hypothesis 9) for
all time samples .t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with the identity leakage model
and the Random plaintext dataset
the other peak clusters are related to permutations of bits of .v in pLayer. Those
observations agree with the duration of each PRESENT round operation in Fig. 4.3.
We also notice that the biggest peak in Fig. 4.30 is obtained at .t = 392, which
corresponds to the point with the highest SNR from Fig. 4.22 (Example 4.2.17).
For further illustration, the plots of .ri,t (.t = 1, 2, . . . , 3600) for .i = 1, 5, 14
(corresponding to key hypotheses 0, 4, D) are shown in Figs. 4.31, 4.32, and 4.33,
respectively. Comparing those figures with Fig. 4.30, we can see some peaks appear
at similar time samples in all figures. This is due to the fact that the $H_i$'s are not
independent random variables, and at those time samples t, the $H_i$'s are also correlated
with $l_t$ for $i \neq 10$.
Remark 4.3.1 The correlation between the $H_i$'s also influences the magnitude of the
correlation coefficients for the wrong key hypotheses. If the correlation between the $H_i$'s
is higher, we will also see higher peaks for some wrong key hypotheses. For AES,
the correlations between the first AddRoundKey outputs are higher than the correlations
Fig. 4.31 Sample correlation coefficients .r1,t (corresponds to a wrong key hypothesis 0) for all
time samples .t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with the identity leakage model
and the Random plaintext dataset
Sample correlation coefficient
Fig. 4.32 Sample correlation coefficients .r5,t (corresponds to a wrong key hypothesis 4) for all
time samples .t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with the identity leakage model
and the Random plaintext dataset
Sample correlation coefficient
Fig. 4.33 Sample correlation coefficients .r14,t (corresponds to a wrong key hypothesis D) for all
time samples .t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with the identity leakage model
and the Random plaintext dataset
between the first SubBytes operation outputs, which is why in DPA Step 4 we chose
the target intermediate value to be an Sbox output.
In this part, let us consider the Hamming weight leakage model. In DPA Step 6, we
have
$$H_{ij} = \mathrm{wt}\left(\hat{v}_{ij}\right), \qquad i = 1, 2, \ldots, 16,\ j = 1, 2, \ldots, 5000.$$
Fig. 4.34 Sample correlation coefficients .ri,t (.i = 1, 2, . . . , 16) for all time samples .t =
1, 2, . . . , 3600. Computed following Eq. 4.21 with the Hamming leakage model and the Random
plaintext dataset. The blue line corresponds to the correct key hypothesis .k̂10 = 9
Fig. 4.35 Sample correlation coefficients .r10,t (corresponds to the correct key hypothesis 9) for all
time samples .t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with the Hamming leakage model
and the Random plaintext dataset
Sample correlation coefficient
Fig. 4.36 Sample correlation coefficients .r1,t (corresponds to a wrong key hypothesis 0) for all
time samples .t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with the Hamming leakage model
and the Random plaintext dataset
Fig. 4.37 Sample correlation coefficients .r5,t (corresponds to a wrong key hypothesis 4) for all
time samples .t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with the Hamming leakage model
and the Random plaintext dataset
Sample correlation coefficient
Fig. 4.38 Sample correlation coefficients .r14,t (corresponds to a wrong key hypothesis D) for all
time samples .t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with the Hamming leakage model
and the Random plaintext dataset
different from the non-profiled setting where only certain basic knowledge of the
implementation is required.
For our illustrations, we suppose the Random dataset is obtained from a clone
device, and the Random plaintext dataset is from the target device. Then before the
attack, we can analyze the Random dataset to obtain more information about the
leakage behavior of the DUT in the profiling phase.
The first major step in the profiling phase is to find the POIs, namely, time sam-
ples that will give us more information or with better signal. After identifying the
POIs, in the attack phase, instead of computing the sample correlation coefficients
for all time samples, we can just focus on the POIs.
P-DPA Step 3 Choose the part of the key to recover. This step is the same
as DPA Step 3 in Sect. 4.3.1.1. Let k denote the target part of the
key, and let .Mk denote the number of possible values of k. For our
attacks, the same as in Sect. 4.3.1.1, we will focus on the 0th nibble
of the first round key for PRESENT and .Mk = 16.
P-DPA Step 4 Choose the target intermediate value. This step is the same
as DPA Step 4 in Sect. 4.3.1.1. Let .v denote the target intermediate
value. We require that there is a function $\varphi$ such that
$$v = \varphi(k, p),$$
where p denotes (part of) the plaintext. For our attack, to recover
the 0th nibble of the first round key of PRESENT, we will target
the 0th Sbox output of the first round. Then we have
$$v = \mathrm{SB}_{\mathrm{PRESENT}}(k \oplus p),$$
where k and p denote the 0th nibble of the first round key and that
of the plaintext, respectively.
P-DPA Step 5 Decide on the target signal. Before we do further analysis of the
profiling traces, we need to choose what information related to the
target intermediate value .v we are looking for, for example, the
Hamming weight of .v or the 0th bit of .v. In our illustrations, we
will look at two types of target signals, one given by the exact value
of .v and the other one given by .wt (v), the Hamming weight of .v.
P-DPA Step 6 Group the profiling traces. We take our set of profiling traces and
divide them into $M_{\mathrm{signal}}$ sets according to the target signal from P-DPA Step 5. Let us denote those sets by $A_1, A_2, \ldots, A_{M_{\mathrm{signal}}}$.
For our illustrations, when the target signal is given by $v$,
the exact value of the output of the 0th Sbox in PRESENT, we have
$M_{\mathrm{signal}} = 16$; when it is given by $wt(v)$, we have $M_{\mathrm{signal}} = 5$.
254 4 Side-Channel Analysis Attacks and Countermeasures
P-DPA Step 7 Modeling leakage, signal, and noise. Let us fix a time sample $t$
($1 \le t \le q$), and let $L_t$, $X_t$, and $N_t$ denote the random variables
corresponding to leakage, signal, and noise at $t$, respectively. When
we fix the signal, as discussed in Sect. 4.2.1, the leakage $L_t$ and the
noise $N_t$ can be modeled by normal random variables. When we
focus on one particular target signal, i.e., when we only consider
computations that result in traces belonging to a particular set
$A_s$, let $L_{t,s}$ and $N_{t,s}$ denote the random variables corresponding to
the leakage and the noise at $t$ for those computations. Hence
$$\mathrm{Var}(X_t) = \mathrm{Var}(X_{t,s}) = \mathrm{Var}\left(\mathrm{E}\left[L_{t,s}\right] - \mathrm{E}\left[N_{t,s}\right]\right),$$
and the SNR at $t$ can be estimated as
$$\mathrm{SNR}_t = \frac{\mathrm{Var}(X_t)}{\mathrm{Var}(N_t)} \approx \frac{\text{sample variance of } \mathrm{E}\left[L_{t,s}\right]}{\text{sample variance of } L_t - \mathrm{E}\left[L_{t,s}\right]}.$$
$$\hat{v}_{ij} = \varphi(\hat{k}_i, p_j), \quad i = 1, 2, \ldots, M_k,\ j = 1, 2, \ldots, M_p.$$
For our attacks, with each key hypothesis of the 0th nibble of the
first round key and each known plaintext, we have a hypothetical
value for the 0th Sbox output
$$\hat{v}_{ij} = \mathrm{SB}_{\mathrm{PRESENT}}(\hat{k}_i \oplus p_j), \quad \text{where } \hat{k}_i = i - 1,\ i = 1, 2, \ldots, 16.$$
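The hypothetical intermediate values can be computed mechanically; the PRESENT Sbox below is Table 3.11, while the function name `hypothetical_values` is our own:

```python
# PRESENT Sbox (Table 3.11), as a lookup table
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def hypothetical_values(plaintext_nibbles):
    """v_hat[i-1][j-1] = SB_PRESENT(k_hat_i XOR p_j) for the 16 key
    hypotheses k_hat_i = i - 1 = 0, 1, ..., F."""
    return [[SBOX[k ^ p] for p in plaintext_nibbles] for k in range(16)]
```

For a plaintext nibble 4, the hypothesis $\hat{k} = 7$ yields $\mathrm{SB}(3) = \mathtt{B}$, matching Example 4.3.4.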
P-DPA Step 12 Identify the leakage model and compute the hypothetical signals. By our choice of the target signal from P-DPA Step 5, we have
a corresponding leakage model. For example, if the target signal is
the exact value of $v$, a natural choice of leakage model will be the
identity leakage model.
For each hypothetical target intermediate value, we can compute
the hypothetical signal $H_{ij}$ according to our leakage model. Together with the attack traces, this gives us the sample $\{(H_{ij}, l^j_{\mathrm{POI}})\}_{j=1}^{\hat{M}_p}$,
where $l^j_{\mathrm{POI}}$ is the POI-th entry of the attack trace $\ell_j$ obtained in P-DPA Step 10 ($j = 1, 2, \ldots, M_p$) and $2 \le \hat{M}_p \le M_p$.⁴ With this
sample, we can compute the sample correlation coefficient between
$H_i$ and $L_{\mathrm{POI}}$ for each key hypothesis $\hat{k}_i$ ($i = 1, 2, \ldots, M_k$):
$$r^{\hat{M}_p}_{i,\mathrm{POI}} := \frac{\sum_{j=1}^{\hat{M}_p} (H_{ij} - \bar{H}_i)(l^j_{\mathrm{POI}} - \bar{l}_{\mathrm{POI}})}{\sqrt{\sum_{j=1}^{\hat{M}_p} (H_{ij} - \bar{H}_i)^2} \sqrt{\sum_{j=1}^{\hat{M}_p} (l^j_{\mathrm{POI}} - \bar{l}_{\mathrm{POI}})^2}}. \tag{4.23}$$
Figure 4.39 presents the values of $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \ldots, 16$) for $\mathrm{POI} = 392$
computed with the identity leakage model. The x-axis indicates the number of
⁴ When $\hat{M}_p = 1$, the denominator in Eq. 4.23 is equal to 0.
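Eq. 4.23 is a plain Pearson correlation computed per key hypothesis; a minimal numpy sketch (the names `cpa_scores`, `H`, etc. are our own):

```python
import numpy as np

def cpa_scores(hyp_signals, poi_leakages):
    """Eq. 4.23: sample correlation between the hypothetical signals H_i
    and the leakages at the POI, one coefficient per key hypothesis."""
    H = np.asarray(hyp_signals, dtype=float)    # shape (Mk, M_hat_p)
    l = np.asarray(poi_leakages, dtype=float)   # shape (M_hat_p,)
    Hc = H - H.mean(axis=1, keepdims=True)      # center the signals
    lc = l - l.mean()                           # center the leakages
    return (Hc @ lc) / np.sqrt((Hc ** 2).sum(axis=1) * (lc ** 2).sum())
```

The key guess with the largest $|r|$ is the attack's answer; with $\hat{M}_p = 1$ the denominator vanishes, matching the footnote.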
Fig. 4.39 Sample correlation coefficients $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \ldots, 16$) for $\mathrm{POI} = 392$. Computed following Eq. 4.23 with the identity leakage model and the Random plaintext dataset. The blue line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$
Fig. 4.40 Sample correlation coefficients $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \ldots, 16$) for $\mathrm{POI} = 392$. Computed following Eq. 4.23 with the Hamming weight leakage model and the Random plaintext dataset. The blue line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$
traces $\hat{M}_p$ used. The figure shows that with just roughly 20 traces, we can clearly
distinguish the correct key hypothesis from the wrong ones.
Similarly, the results for the Hamming weight model are shown in Fig. 4.40. In
this case, we need fewer than five traces to identify the correct key. This indicates that
the Hamming weight leakage model is closer to our DUT leakage compared to the
identity leakage model.
Note
A good leakage model is beneficial to our attack.
Remark 4.3.3 Besides computing the SNR in P-DPA Step 8 to identify the POIs,
other methods, e.g., the t-test (Sect. 4.2.3) with a properly chosen intermediate value,
can also be used for this purpose.
To fully utilize the cloned device in the profiled setting, we can further characterize
the leakages instead of just identifying the POI. In this part, we will study a
leakage model that assumes each bit of the target intermediate value (P-DPA
Step 4 from Sect. 4.3.2.1) results in a different signal. In particular, suppose the
target intermediate value $v = v_{m_v-1} v_{m_v-2} \ldots v_1 v_0$ has bit length at most $m_v$,⁵ the
stochastic leakage model assumes that
$$L(v) = \sum_{s=0}^{m_v-1} \alpha_s v_s + \mathit{noise}, \tag{4.24}$$
where $\mathit{noise} \sim \mathcal{N}(0, \sigma^2)$ denotes the noise with mean 0 and variance $\sigma^2$, and the $\alpha_s$ ($s = 0, 1, \ldots, m_v - 1$) are real numbers. We refer to the $\alpha_s$ as the coefficients of the stochastic leakage model.
The attack with the stochastic leakage model follows the same steps as described in
Sect. 4.3.2.1. The only difference is in P-DPA Step 12, where we need extra effort to
find our leakage model by profiling. We note that since the stochastic leakage model
assumes each value of $v$ has a different signal, to identify the POI, we will choose
the target signal to be the exact value of $v$ in P-DPA Step 5. Then, using the leakages
at the POI, we will find estimates for the $\alpha_s$ values. Those estimates, together
with Eq. 4.24, provide us with hypothetical signals in P-DPA Step 12.
To estimate the $\alpha_s$, we adopt the ordinary least squares method from linear regression
[DPRS11]. Let
$$\ell_j^{\mathrm{pf}} = \left(l_1^{j,\mathrm{pf}}, l_2^{j,\mathrm{pf}}, \ldots, l_q^{j,\mathrm{pf}}\right)$$
denote the $j$th profiling trace, where $j = 1, 2, \ldots, M_{\mathrm{pf}}$. The steps for computing
estimates of the coefficients $\alpha_s$ for the stochastic leakage model are as follows:
SLM Step a Compute the vector of leakages. We only focus on the leakage at the
POI from each profiling trace. Let
$$\ell_{\mathrm{pf}} := \left(l_{\mathrm{POI}}^{1,\mathrm{pf}}, l_{\mathrm{POI}}^{2,\mathrm{pf}}, \ldots, l_{\mathrm{POI}}^{M_{\mathrm{pf}},\mathrm{pf}}\right)$$
⁵ When the bit length of $v$ is less than $m_v$, some leading bits $v_{m_v-1}, \ldots$ are zero.
SLM Step b Compute the binary representations of the target intermediate values. For the $j$th profiling trace, write the corresponding target intermediate value in binary,
$$v_j^{\mathrm{pf}} = v_{j(m_v-1)}^{\mathrm{pf}} \ldots v_{j1}^{\mathrm{pf}} v_{j0}^{\mathrm{pf}}, \quad j = 1, 2, \ldots, M_{\mathrm{pf}},$$
and let $M_v$ denote the $M_{\mathrm{pf}} \times m_v$ matrix whose $j$th row is $\left(v_{j0}^{\mathrm{pf}}, v_{j1}^{\mathrm{pf}}, \ldots, v_{j(m_v-1)}^{\mathrm{pf}}\right)$. We do not require every possible value of $v$ to appear.
SLM Step c Compute estimated values of the coefficients $\alpha_s$. The estimated values
$\hat{\alpha}_s$ for $\alpha_s$ are given by
$$\left(\hat{\alpha}_0\ \hat{\alpha}_1\ \ldots\ \hat{\alpha}_{m_v-1}\right)^T = \left(M_v^T M_v\right)^{-1} M_v^T \ell_{\mathrm{pf}}^T. \tag{4.26}$$
For each actual leakage $l_t^{j,\mathrm{pf}}$, define
$$\hat{l}_t^{j,\mathrm{pf}} = \sum_{s=0}^{m_v-1} \hat{\alpha}_s v_{js}^{\mathrm{pf}},$$
and let $\hat{\ell}_{\mathrm{pf}} := \left(\hat{l}_{\mathrm{POI}}^{1,\mathrm{pf}}, \hat{l}_{\mathrm{POI}}^{2,\mathrm{pf}}, \ldots, \hat{l}_{\mathrm{POI}}^{M_{\mathrm{pf}},\mathrm{pf}}\right)$.
Then, by the ordinary least squares method from linear regression, the $\hat{\alpha}_s$ values
computed with Eq. 4.26 minimize the Euclidean distance (Definition A.2.1) between
$\hat{\ell}_{\mathrm{pf}}$ and $\ell_{\mathrm{pf}}$ (see, e.g., [Ros20, Section 9.8]).
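SLM Steps a to c amount to one least squares fit. A sketch in numpy (`np.linalg.lstsq` solves Eq. 4.26 without forming the matrix inverse explicitly; the helper names are our own):

```python
import numpy as np

def fit_stochastic_model(values, poi_leakages, m_v=4):
    """SLM Steps b-c: ordinary least squares estimates of the bit
    coefficients alpha_s in Eq. 4.24 / Eq. 4.26."""
    # M_v: one row per profiling trace, column s holds bit s of v_j
    Mv = np.array([[(v >> s) & 1 for s in range(m_v)] for v in values],
                  dtype=float)
    alpha, *_ = np.linalg.lstsq(Mv, np.asarray(poi_leakages, dtype=float),
                                rcond=None)
    return alpha

def stochastic_signal(v, alpha):
    """Hypothetical signal of v under the fitted model (P-DPA Step 12)."""
    return sum(a * ((v >> s) & 1) for s, a in enumerate(alpha))
```

Fitting leakages that follow the model exactly recovers the coefficients; in practice the fit averages out the noise term of Eq. 4.24.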
Example 4.3.4 The first trace in the Random dataset corresponds to the plaintext with
the 0th nibble $= 4$ and the key with the 0th nibble $= 7$. Then in SLM Step b we have
(see Table 3.11 for the PRESENT Sbox)
$$v_1^{\mathrm{pf}} = \mathrm{SB}_{\mathrm{PRESENT}}(4 \oplus 7) = \mathrm{SB}_{\mathrm{PRESENT}}(3) = \mathtt{B} = 1011_2.$$
With $\mathrm{POI} = 392$ and the Random dataset, we get the following estimated values
for the coefficients $\alpha_s$:
$$\hat{\alpha}_0 \approx -0.02019, \quad \hat{\alpha}_1 \approx -0.02027, \quad \hat{\alpha}_2 \approx -0.01920, \quad \hat{\alpha}_3 \approx -0.02039.$$
According to the stochastic leakage model in Eq. 4.24, the estimated signal of
$v = v_3 v_2 v_1 v_0$ is given by $\hat{\alpha}_3 v_3 + \hat{\alpha}_2 v_2 + \hat{\alpha}_1 v_1 + \hat{\alpha}_0 v_0$. For example, the estimated signal of $\mathtt{E} = 1110_2$ according to the stochastic leakage model is
given by $-0.052$. Similarly, we can compute the estimated signals for all 16 possible
values of the target intermediate value $0, 1, \ldots, \mathtt{F}$.
$$p_1 = 9, \quad p_2 = \mathtt{C}.$$
Then, following the computations from Example 4.3.2 and the estimated signals given
in Eq. 4.27, with the profiled stochastic leakage model, in P-DPA Step 12, we have the corresponding hypothetical signals.
Fig. 4.41 Sample correlation coefficients $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \ldots, 16$) for $\mathrm{POI} = 392$. Computed following Eq. 4.23 with the stochastic leakage model and the Random plaintext dataset. The blue line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$
Following P-DPA Step 13 from Sect. 4.3.2.1, the attack results are shown in
Fig. 4.41. Compared to Figs. 4.39 and 4.40, the attack results based on the stochastic
leakage model are similar to those based on the Hamming weight leakage model
and better than the results based on the identity leakage model. This shows that both the
stochastic and the Hamming weight leakage models are better approximations of
the DUT leakage than the identity leakage model.
We have seen how to characterize the leakage, assuming each bit of the target
intermediate value leaks differently, focusing on one POI. We can also
characterize/profile the leakages of each possible value of the target intermediate
value (P-DPA Step 4 from Sect. 4.3.2.1) at several POIs. The result of this profiling
process is a set of templates. Then, during the attack phase, instead of computing
correlation coefficients, we use those templates to see which of them fits the
measured power trace better and deduce a probability for each key hypothesis.
As discussed in Sect. 4.2, for a computation with constant signal, the distribution
of leakages at a single time sample can be modeled with a normal distribution. And
leakages at a few time samples can be considered as a Gaussian random vector.
The goal of profiling in template-based DPA is to estimate the mean and variance
(respectively, mean vector and covariance matrix) of the normal random variable
(respectively, Gaussian random vector). The resulting estimations are our templates.
The steps for template-based DPA are similar to those in Sect. 4.3.2.1, except
for P-DPA Step 9, P-DPA Step 12, and P-DPA Step 13. P-DPA Step 9 will be
replaced by two steps (Template Step a and Template Step b below), P-DPA Step
12 will be removed, and P-DPA Step 13 will be replaced by the following Template
Step c:
Template Step a Identify point(s) of interest. The same as in P-DPA Step 9, POIs
are given by time samples that achieve the highest SNR values.
The difference is that we can choose more than one POI. With
more POIs, the effort for building the templates will increase, but
the attack results will be better. Normally the attacker decides on
the number of POIs based on experience.
Let $q_{\mathrm{POI}}$ denote the total number of chosen POIs, and let
$t_1, t_2, \ldots, t_{q_{\mathrm{POI}}}$ denote the time samples that have been identified as POIs. For our illustrations, we choose $q_{\mathrm{POI}} = 3$:
$$t_1 = 392, \quad t_2 = 218, \quad t_3 = 1328.$$
Template Step b Build the templates. Let us fix a particular target signal value $s$
and only consider inputs to the cryptographic algorithm that result
in traces belonging to the corresponding set $A_s$ (see P-DPA
Step 6 from Sect. 4.3.2.1). Let $L_{t,s}$ denote the random variable
representing the leakage of such encryption computations at time
sample $t$. Then the random vector
$$\boldsymbol{L}_s := \left(L_{t_1,s}, L_{t_2,s}, \ldots, L_{t_{q_{\mathrm{POI}}},s}\right) \tag{4.28}$$
can be modeled as a Gaussian random vector. The template for the signal value $s$ consists of the estimated mean vector $\boldsymbol{\mu}_s$ and covariance matrix $Q_s$ of $\boldsymbol{L}_s$.
For our illustrations, when the target signal is $v$, we will have
16 templates, and when the target signal is $wt(v)$, we will have
five templates.
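Template Step b reduces to estimating, for each set $A_s$, the mean vector and covariance matrix of the leakages at the POIs. A numpy sketch (the function name is ours; at least two traces per set are assumed so that the sample covariance exists):

```python
import numpy as np

def build_templates(traces, signals, pois):
    """For each signal value s, estimate the mean vector mu_s and the
    covariance matrix Q_s of the leakages at the chosen POIs."""
    traces = np.asarray(traces, dtype=float)
    signals = np.asarray(signals)
    templates = {}
    for s in np.unique(signals):
        X = traces[signals == s][:, pois]       # leakages at POIs for set A_s
        templates[int(s)] = (X.mean(axis=0),
                             np.atleast_2d(np.cov(X, rowvar=False)))
    return templates
```

With the target signal $v$ this yields 16 `(mu, Q)` pairs, and with $wt(v)$ five, as in the illustration above.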
Template Step c Statistical analysis. In this step, we would like to compute a
probability for each key hypothesis given the attack traces. For
a fixed key hypothesis $\hat{k}_i$, we divide the $M_p$ attack traces from P-DPA Step 10 into $M_{\mathrm{signal}}$ sets, $A_1, A_2, \ldots, A_{M_{\mathrm{signal}}}$, depending
on the hypothetical target intermediate value $\hat{v}_{ij}$ obtained in P-DPA Step 11. In particular, for an attack trace $\ell_j$, let $s_{ij}$ denote the
index of the set that it belongs to; namely,
$$\ell_j \in A_{s_{ij}} \quad \text{given key hypothesis } \hat{k}_i.$$
We are only interested in the leakages at the POIs for each attack
trace $\ell_j = \left(l_1^j, l_2^j, \ldots, l_q^j\right)$. Define
$$\ell_{j,\mathrm{POI}} := \left(l_{t_1}^j, l_{t_2}^j, \ldots, l_{t_{q_{\mathrm{POI}}}}^j\right). \tag{4.29}$$
With the mean vector $\boldsymbol{\mu}_{s_{ij}}$ and the covariance matrix $Q_{s_{ij}}$
obtained in Template Step b, we can compute the probability of
$\ell_j$ given $\hat{k}_i$ using the PDF of the Gaussian random vector:
$$P(\ell_j | \hat{k}_i) = P(\boldsymbol{L}_{s_{ij}} = \ell_{j,\mathrm{POI}}) = \frac{1}{(2\pi)^{\frac{q_{\mathrm{POI}}}{2}} \sqrt{\det Q_{s_{ij}}}} \exp\left\{ -\frac{1}{2} \left(\ell_{j,\mathrm{POI}} - \boldsymbol{\mu}_{s_{ij}}\right)^T Q_{s_{ij}}^{-1} \left(\ell_{j,\mathrm{POI}} - \boldsymbol{\mu}_{s_{ij}}\right) \right\}. \tag{4.30}$$
Assuming the attack traces are independent, the probability of observing all $\hat{M}_p$ attack traces given $\hat{k}_i$ is
$$P\left(\{\ell_j\}_{j=1}^{\hat{M}_p} \,\Big|\, \hat{k}_i\right) = \prod_{j=1}^{\hat{M}_p} P(\ell_j | \hat{k}_i). \tag{4.31}$$
By Bayes' theorem,
$$P\left(\hat{k}_i \,\Big|\, \{\ell_j\}_{j=1}^{\hat{M}_p}\right) = \frac{P\left(\{\ell_j\}_{j=1}^{\hat{M}_p} \,\Big|\, \hat{k}_i\right) P(\hat{k}_i)}{\sum_{m=1}^{M_k} P\left(\{\ell_j\}_{j=1}^{\hat{M}_p} \,\Big|\, \hat{k}_m\right) P(\hat{k}_m)}. \tag{4.32}$$
We assume a uniform prior on the key hypotheses, i.e.,
$$P(\hat{k}_m) = P(\hat{k}_i) \quad \text{for all } m, i.$$
For the attack, we expect the correct key hypothesis to have the
highest probability. In other words, we are mainly interested in the
ordering of the values $P\left(\hat{k}_i \,\big|\, \{\ell_j\}_{j=1}^{\hat{M}_p}\right)$. Since the denominators
are the same for all key hypotheses in Eq. 4.32, we can ignore
them. Then Eq. 4.32 is reduced to Eq. 4.31, which can be further
simplified by leaving out the common term (see Eq. 4.30)
$$\frac{1}{(2\pi)^{\frac{q_{\mathrm{POI}}}{2}}}.$$
And we get
$$\prod_{j=1}^{\hat{M}_p} \frac{1}{\sqrt{\det Q_{s_{ij}}}} \exp\left\{ -\frac{1}{2} \left(\ell_{j,\mathrm{POI}} - \boldsymbol{\mu}_{s_{ij}}\right)^T Q_{s_{ij}}^{-1} \left(\ell_{j,\mathrm{POI}} - \boldsymbol{\mu}_{s_{ij}}\right) \right\}.$$
By taking the natural logarithm, the ordering does not change, and
we have
$$-\frac{1}{2} \sum_{j=1}^{\hat{M}_p} \left( \ln \det Q_{s_{ij}} + \left(\ell_{j,\mathrm{POI}} - \boldsymbol{\mu}_{s_{ij}}\right)^T Q_{s_{ij}}^{-1} \left(\ell_{j,\mathrm{POI}} - \boldsymbol{\mu}_{s_{ij}}\right) \right).$$
Dropping the constant factor $\frac{1}{2}$, which does not affect the ordering either, we define the probability score of $\hat{k}_i$ as
$$P_{\hat{k}_i} = -\sum_{j=1}^{\hat{M}_p} \left( \ln \det Q_{s_{ij}} + \left(\ell_{j,\mathrm{POI}} - \boldsymbol{\mu}_{s_{ij}}\right)^T Q_{s_{ij}}^{-1} \left(\ell_{j,\mathrm{POI}} - \boldsymbol{\mu}_{s_{ij}}\right) \right). \tag{4.33}$$
The higher the score, the more likely the hypothesis is the correct key.
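Eq. 4.33 can be evaluated per key hypothesis with a few lines of numpy; `np.linalg.slogdet` and `solve` keep the computation numerically stable (the function name is ours, and the templates are `(mu, Q)` pairs as in Template Step b):

```python
import numpy as np

def template_score(attack_pois, hyp_signals, templates):
    """Probability score P_{k_hat_i} of one key hypothesis (Eq. 4.33):
    attack_pois[j] holds the POI leakages of trace j, hyp_signals[j] the
    hypothetical signal s_ij under this hypothesis."""
    score = 0.0
    for l_poi, s in zip(attack_pois, hyp_signals):
        mu, Q = templates[s]
        d = np.asarray(l_poi, dtype=float) - mu
        _, logdet = np.linalg.slogdet(Q)
        score -= logdet + d @ np.linalg.solve(Q, d)   # ln det Q + Mahalanobis term
    return score
```

A hypothesis whose hypothetical signals match the templates that actually produced the traces receives a higher score than a mismatched one.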
Remark 4.3.4 Since the computation of covariances grows quadratically with the
number of chosen POIs, in practice, it is also common to assume leakages at
different time samples are independent. In this case, the covariance matrix .Qs
in Template Step b becomes a diagonal matrix.
First, let us choose the target signal to be the exact value of $v$. We have built
16 templates. Three POIs (time samples 392, 218, 1328) were chosen as described
in Template Step a. Thus, for each template, the mean vector has length 3, and
the covariance matrix has dimension $3 \times 3$. For example, the template for $\boldsymbol{L}_1$,
corresponding to the intermediate value $v = 0$, is given by
$$Q_1 = \begin{pmatrix} 1.6110 \times 10^{-6} & -6.2968 \times 10^{-9} & -1.0592 \times 10^{-7} \\ -6.2968 \times 10^{-9} & 2.2925 \times 10^{-6} & 3.7191 \times 10^{-7} \\ -1.0592 \times 10^{-7} & 3.7191 \times 10^{-7} & 2.2567 \times 10^{-6} \end{pmatrix}.$$
As another example, the template for $\boldsymbol{L}_{12}$, corresponding to the intermediate value
$v = \mathtt{B}$, is given by
$$Q_{12} = \begin{pmatrix} 1.6390 \times 10^{-6} & 1.6328 \times 10^{-7} & 6.3454 \times 10^{-8} \\ 1.6328 \times 10^{-7} & 2.0256 \times 10^{-6} & 1.7985 \times 10^{-7} \\ 6.3454 \times 10^{-8} & 1.7985 \times 10^{-7} & 2.1778 \times 10^{-6} \end{pmatrix}.$$
The probability scores for each key hypothesis are shown in Fig. 4.42, where the
blue line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$. We can see that with
just a few traces, the correct key hypothesis can be distinguished from the other key
hypotheses.
Next, we take the target signal to be the Hamming weight of $v$, $wt(v)$. Then
we have five templates. The POIs were chosen as described in Template Step a:
$392, 1309, 1304$. The template for $\boldsymbol{L}_1$, corresponding to $wt(v) = 0$, is given by
$$Q_1 = \begin{pmatrix} 2.2925 \times 10^{-6} & -8.7422 \times 10^{-8} & 1.9156 \times 10^{-7} \\ -8.7422 \times 10^{-8} & 1.4864 \times 10^{-6} & -4.9987 \times 10^{-8} \\ 1.9156 \times 10^{-7} & -4.9987 \times 10^{-8} & \cdots \end{pmatrix}.$$
Fig. 4.42 Probability scores (Eq. 4.33) for each key hypothesis computed with different numbers of traces from the Random plaintext dataset. The target signal is given by the exact value of $v$, the 0th Sbox output. Three POIs (time samples 392, 218, 1328) were chosen. The blue line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$
Fig. 4.43 Probability scores (Eq. 4.33) for each key hypothesis computed with different numbers of traces from the Random plaintext dataset. The target signal is given by $wt(v)$, the Hamming weight of the 0th Sbox output. Three POIs (time samples 392, 1309, 1304) were chosen. The blue line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$
The probability scores for each key hypothesis are shown in Fig. 4.43. Similar to
Fig. 4.42, with just 2 or 3 traces we can distinguish the correct key hypothesis from
the rest.
Attack results on other nibbles So far, we have seen practical demonstrations
of how the 0th nibble of the PRESENT first round key can be recovered. As we have
mentioned, DPA attacks work in a divide-and-conquer manner, recovering parts of
the key in parallel using the same set of traces. As an example, we will detail the
attack that recovers the first nibble of the first round key for PRESENT.
In P-DPA Step 1, our target cryptographic implementation is the same as before.
The profiling traces from P-DPA Step 2 will still be the Random dataset. The chosen
part of the key, k, in P-DPA Step 3 is now the first nibble of the first round key.
Consequently, the target intermediate value, .v, in P-DPA Step 4 will be the first
Sbox output. We have the same relation between k, p, and .v:
Fig. 4.44 SNR for each time sample, computed using Random dataset. The signal is given by the
exact value of the 1st Sbox output
$$v = \mathrm{SB}_{\mathrm{PRESENT}}(k \oplus p),$$
where $k$ and $p$ denote the 1st nibble of the first round key and that of the plaintext, respectively.
For the target signal in P-DPA Step 5, let us choose the exact value of $v$. Following P-DPA Step 6 to P-DPA Step 8, the SNR values are shown in Fig. 4.44. We will choose
one POI in Template Step a, given by the time sample corresponding to the
highest point in the figure: 404.
Following Template Step b, 16 templates were computed. For example, the
template corresponding to the 1st Sbox output $v = 0$ is given by
$$\mu_1 = -0.039027, \quad \sigma_1^2 = 2.1679112 \times 10^{-6}.$$
As for attack traces in P-DPA Step 10, we use the same traces—Random plaintext
dataset. Then according to P-DPA Step 11 and Template Step c, the probability
scores for each key hypothesis are shown in Fig. 4.45. By Eq. 4.1, the correct value
of the 1st nibble of the first round key is given by 8. We can see that similar to the
template-based DPA attacks on the 0th key nibble (see Figs. 4.42 and 4.43), with
just a few traces, we can recover the correct key nibble value.
As another example, the attack results for attacking the 6th nibble of the first
round key are shown in Fig. 4.46, where by profiling, we have identified .POI = 464.
By Eq. 4.1, the correct value of the 6th nibble of the first round key is given by 3.
Comparing Figs. 4.42 and 4.43 to Figs. 4.39 and 4.40, we cannot draw a clear
conclusion about which attack method is better. In fact, a different ordering of the
traces in Random plaintext dataset may affect our attack results. For example, by
Fig. 4.45 Probability scores (Eq. 4.33) for each key hypothesis computed with different numbers of traces from the Random plaintext dataset. The target signal is given by the exact value of the 1st Sbox output. One POI (time sample 404) was chosen. The blue line corresponds to the correct key hypothesis 8
Fig. 4.46 Probability scores (Eq. 4.33) for each key hypothesis computed with different numbers of traces from the Random plaintext dataset. The target signal is given by the exact value of the 6th Sbox output. One POI (time sample 464) was chosen. The blue line corresponds to the correct key hypothesis 3
arranging the traces in reverse order, we get Figs. 4.47 and 4.48 instead of Figs. 4.39
and 4.40.
To have a fair comparison between different attack methods (e.g., different
choices of leakage models, POIs, etc.), we introduce the notions of success rate and
guessing entropy [SMY09].
Note
In this part, our aim is to evaluate the DUT and our implementation against
DPA attacks with different settings. Thus, we assume knowledge of the key
for the evaluation after the attack.
Fig. 4.47 Sample correlation coefficients $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \ldots, 16$) for $\mathrm{POI} = 392$. Computed following Eq. 4.23 with the identity leakage model and the Random plaintext dataset arranged in reverse order. The blue line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$
Fig. 4.48 Sample correlation coefficients $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \ldots, 16$) for $\mathrm{POI} = 392$. Computed following Eq. 4.23 with the Hamming weight leakage model and the Random plaintext dataset arranged in reverse order. The blue line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$
Fix a number of attack traces $\hat{M}_p$. For each profiled DPA attack that we
have discussed, we can assign a score to each key hypothesis after the attack: for
leakage model-based DPA attacks, the score of a key hypothesis $\hat{k}_i$ is given by
the absolute value of the corresponding sample correlation coefficient (Eq. 4.23),
and for template-based DPA attacks, the score of a key hypothesis is given by its
corresponding probability score (Eq. 4.33). Let $\mathrm{sc}_i^{\hat{M}_p}$ denote the score for the key
hypothesis $\hat{k}_i$. We have
$$\mathrm{sc}_i^{\hat{M}_p} = \begin{cases} \left| r_{i,\mathrm{POI}}^{\hat{M}_p} \right| & \text{leakage model-based DPA attack, where } r_{i,\mathrm{POI}}^{\hat{M}_p} \text{ is computed following Eq. 4.23,} \\ P_{\hat{k}_i} & \text{template-based DPA attack, where } P_{\hat{k}_i} \text{ is computed following Eq. 4.33.} \end{cases} \tag{4.34}$$
We further define $\mathrm{score}^{\hat{M}_p}$ to be the vector consisting of the scores obtained for each
key hypothesis with our DPA attack, sorted in descending order:
$$\mathrm{score}^{\hat{M}_p} = \left( \mathrm{sc}_{i_1}^{\hat{M}_p}, \mathrm{sc}_{i_2}^{\hat{M}_p}, \ldots, \mathrm{sc}_{i_{M_k}}^{\hat{M}_p} \right), \quad \text{where } \mathrm{sc}_{i_j}^{\hat{M}_p} \ge \mathrm{sc}_{i_{j+1}}^{\hat{M}_p} \text{ for } j = 1, 2, \ldots, M_k - 1.$$
The key rank of a key hypothesis $\hat{k}_i$, denoted $\mathrm{rank}_{\hat{k}_i}^{\hat{M}_p}$, is given by the index of $\mathrm{sc}_i^{\hat{M}_p}$
in $\mathrm{score}^{\hat{M}_p}$. In particular, let $\hat{k}_c$ denote the correct key hypothesis. We have
$$\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} = \text{index of } \mathrm{sc}_c^{\hat{M}_p} \text{ in } \mathrm{score}^{\hat{M}_p}. \tag{4.35}$$
With the same number of traces, we may also get different key ranks for the correct
key hypothesis due to the different plaintexts/measurements. We consider $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p}$
as a random variable whose randomness comes from the different plaintexts and
measurements.
The ultimate goal of the attack is to achieve $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} = 1$⁶ so that we can retrieve
the correct key hypothesis. Thus, we say that an attack is successful with $\hat{M}_p$ traces
if $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} = 1$. Then the success rate of an attack method with $\hat{M}_p$ traces, denoted
$\mathrm{SR}^{\hat{M}_p}$, is defined to be the probability that $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} = 1$:
$$\mathrm{SR}^{\hat{M}_p} = P\left( \mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} = 1 \right). \tag{4.36}$$
The guessing entropy of an attack method with $\hat{M}_p$ traces, denoted $\mathrm{GE}^{\hat{M}_p}$, is defined to be the expectation $\mathrm{E}\left[\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p}\right]$.
⁶ We note that if the key rank is low enough, it is possible to use key enumeration algorithms [VCGRS13] that enable the key recovery even in the case when $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} > 1$.
With the terminologies from Sect. 1.8.2, we can approximate $\mathrm{GE}^{\hat{M}_p}$ with a point
estimator (see Remark 1.8.4) given by the sample mean of $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p}$.
Furthermore, when we vary the number of traces $\hat{M}_p$ used for computing
$\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p}$, we will get different key ranks for the correct key hypothesis. Thus the
probability that $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} = 1$ and its expectation will also vary.
To analyze how $\mathrm{SR}^{\hat{M}_p}$ and $\mathrm{GE}^{\hat{M}_p}$ change with increasing values of $\hat{M}_p$, we compute
estimations for $\mathrm{SR}^{\hat{M}_p}$ and $\mathrm{GE}^{\hat{M}_p}$ according to Algorithm 4.1.
Algorithm 4.1 takes two user-specified input values, max_trace and
no_of_attack. max_trace is the maximum number of traces (i.e., the biggest value
of $\hat{M}_p$) we would like to use for estimating $\mathrm{SR}^{\hat{M}_p}$ and $\mathrm{GE}^{\hat{M}_p}$. In line 3, the sizes
of $S_{\mathrm{sr}}$ and $S_{\mathrm{ge}}$ are set to max_trace$+1$ so that the $\hat{M}_p$th entry of each array
corresponds to the estimation of $\mathrm{SR}^{\hat{M}_p}$ and $\mathrm{GE}^{\hat{M}_p}$, respectively. For a fixed value
of $\hat{M}_p$, no_of_attack is the number of attacks to simulate, or equivalently, the
number of elements in the sample of $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p}$ used for computing the sample mean (i.e.,
the estimation of $\mathrm{GE}^{\hat{M}_p}$) and the frequency of $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} = 1$ (i.e., the estimation of $\mathrm{SR}^{\hat{M}_p}$). The
set of attack traces from P-DPA Step 10 is denoted by dataset (line 2). For each
value of $\hat{M}_p$ between 2 and max_trace (line 4), we simulate no_of_attack attacks
(line 6). Thus we randomly select $\hat{M}_p \times$ no_of_attack traces from dataset. Those
traces are stored in an array $A$ (line 5). Each simulated attack takes $\hat{M}_p$ traces from
the array $A$ without repetition (line 7). The key rank of the correct key hypothesis
is computed following Eq. 4.35 and the attack steps described in the earlier parts of
the section. $S_{\mathrm{ge}}[\hat{M}_p]$ stores the sum of the key ranks of the correct key hypothesis
over all attacks (line 10); then the averaged value is computed as an estimate of the
guessing entropy $\mathrm{GE}^{\hat{M}_p}$ (line 14). When the key rank of the correct key hypothesis
is 1, $S_{\mathrm{sr}}[\hat{M}_p]$ is increased by 1 (line 12). At the end, $S_{\mathrm{sr}}[\hat{M}_p]$ divided by the total
number of simulated attacks gives the frequency of successful attacks (line 13).
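The procedure just described can be sketched as follows in Python; `rank_of_correct_key` stands in for one complete attack returning the key rank of Eq. 4.35, and `dataset` is assumed to hold at least max_trace $\times$ no_of_attack traces:

```python
import random

def estimate_sr_ge(dataset, rank_of_correct_key,
                   max_trace=50, no_of_attack=100, seed=0):
    """Sketch of Algorithm 4.1: estimate SR and GE for every number of
    attack traces M_hat_p = 2, ..., max_trace."""
    rng = random.Random(seed)
    S_sr = [0.0] * (max_trace + 1)
    S_ge = [0.0] * (max_trace + 1)
    for m in range(2, max_trace + 1):
        # draw m * no_of_attack traces without repetition
        A = rng.sample(dataset, m * no_of_attack)
        for a in range(no_of_attack):
            rank = rank_of_correct_key(A[a * m:(a + 1) * m])
            S_ge[m] += rank
            S_sr[m] += (rank == 1)
        S_sr[m] /= no_of_attack   # frequency of rank-1 attacks  -> SR estimate
        S_ge[m] /= no_of_attack   # mean rank of correct key     -> GE estimate
    return S_sr, S_ge
```

Passing the same seed to all attack methods reuses the same traces for each value of $\hat{M}_p$, which is the fair-comparison setting used below.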
As discussed in Sect. 4.2.3, by comparing Figs. 4.12 and 4.13 (or Figs. 4.14
and 4.15), we notice that with more traces, it is more likely for us to capture
information about the inputs (or intermediate values) from the side-channel leakages.
Naturally, we expect the value of $\mathrm{SR}^{\hat{M}_p}$ to be higher and the value of $\mathrm{GE}^{\hat{M}_p}$ to
be lower when $\hat{M}_p$ is bigger. And the attack method that achieves $\mathrm{SR}^{\hat{M}_p} = 1$ or
$\mathrm{GE}^{\hat{M}_p} = 1$ with a smaller $\hat{M}_p$ is considered to be a better attack.
Now we are ready to compare our attack methods with attack traces from the
Random plaintext dataset. We discussed in Example 4.2.19 that by analyzing
the Random dataset, we identified one POI for both the identity leakage model and
the Hamming weight leakage model: 392. For comparison, we also consider the
attack with a different POI: 1328 for the identity leakage model and 1304 for the
Hamming weight leakage model.
As for template-based DPA, we consider two target signals: $v$ and $wt(v)$. For
each target signal, we look at two choices of POIs: one POI (392) and three POIs
(392, 218, 1328 for $v$; 392, 1309, 1304 for $wt(v)$). When three POIs are chosen,
we also analyze the case when leakages at those POIs are assumed to be independent
(see Remark 4.3.4).
We note that when just one POI is considered, $\boldsymbol{L}_s$ from Eq. 4.28 becomes
a normal random variable $L_s$, and $\ell_{j,\mathrm{POI}}$ (Eq. 4.29) becomes a single point $l_{\mathrm{POI}}^j$.
According to the PDF of a normal random variable (Eq. 1.37), Eq. 4.30 becomes
Fig. 4.49 Estimations of success rate computed following Algorithm 4.1 for profiled DPA attacks
based on the stochastic leakage model, the identity leakage model, and the Hamming weight
leakage model using the Random plaintext dataset as attack traces
$$P(\ell_j | \hat{k}_i) = P(L_{s_{ij}} = l_{\mathrm{POI}}^j) = \frac{1}{\sqrt{2\pi \sigma_{s_{ij}}^2}} \exp\left\{ -\frac{(l_{\mathrm{POI}}^j - \mu_{s_{ij}})^2}{2\sigma_{s_{ij}}^2} \right\},$$
where $\mu_{s_{ij}}$ and $\sigma_{s_{ij}}^2$ are the estimations (template) of the mean and variance of $L_{s_{ij}}$.
Consequently, the score of $\hat{k}_i$ in Eq. 4.33 is given by
$$P_{\hat{k}_i} = -\sum_{j=1}^{\hat{M}_p} \left( \ln(\sigma_{s_{ij}}^2) + \frac{(l_{\mathrm{POI}}^j - \mu_{s_{ij}})^2}{\sigma_{s_{ij}}^2} \right).$$
Following Algorithm 4.1, we can compute estimations of $\mathrm{SR}^{\hat{M}_p}$ and $\mathrm{GE}^{\hat{M}_p}$ for
our profiled DPA attacks with different settings. We have chosen
$$\text{no\_of\_attack} = 100, \qquad \text{max\_trace} = 50.$$
For a fair comparison, for a given value of .M̂p , the same traces are used for all
attacks.
The results for leakage model-based profiled DPA are shown in Figs. 4.49
and 4.50. We have seen in Figs. 4.39, 4.40, and 4.41 that with the Hamming weight
or the stochastic leakage models, we can distinguish the correct key using fewer
traces as compared to using the identity leakage model. As expected, we can see
from Fig. 4.49 that fewer traces are needed for SR to reach 1 with the Hamming
weight or the stochastic leakage models. Furthermore, we can also see that attack
results for the Hamming weight or the stochastic leakage models are similar, with
the stochastic leakage model giving slightly better performance. Similarly, Fig. 4.50
shows that fewer traces are needed for GE to reach 1 using the Hamming weight or
the stochastic leakage models as compared to the identity leakage model. Moreover,
the results also demonstrate that the choice of POI is important for the attack. When
the chosen POI has a lower SNR, the attack will need many more traces.
Fig. 4.50 Estimations of guessing entropy computed following Algorithm 4.1 for profiled DPA
attacks based on the stochastic leakage model, the identity leakage model, and the Hamming weight
leakage model using the Random plaintext dataset as attack traces
Fig. 4.51 Estimations of success rate computed following Algorithm 4.1 for template-based DPA
attacks using the Random plaintext dataset as attack traces and the Random dataset as profiling
traces
The results for template-based DPA are shown in Figs. 4.51 and 4.52. Note that
in this case the results are shown for up to 20 traces instead of 50 as for the leakage
model-based DPA attacks, since far fewer traces are needed for a successful attack. We
have the following observations:
• When the target signal is given by .v, the attack requires fewer traces as compared
to the case when the target signal is given by .wt (v). This is expected as for the
former case we have 16 templates while for the latter we have 5. Of course, the
attack results demonstrated that we had enough traces for profiling to get good
templates. Without enough profiling traces, different attack results might appear.
• Assuming independence between the leakages at different POIs does not affect
the attack results significantly. Especially for the case when the target signal is
given by .v with three POIs, those two lines are overlapping.
• Using three POIs gives better results than just one POI.
• Compared to Figs. 4.49 and 4.50, template-based DPA, in general, performs
better than leakage model-based DPA. This is not surprising as more information
is retrieved from the profiling traces using template-based attacks.
Fig. 4.52 Estimations of guessing entropy computed following Algorithm 4.1 for template-based
DPA attacks using the Random plaintext dataset as attack traces and the Random dataset as
profiling traces
Fig. 4.53 Estimations of success rate computed following Algorithm 4.1 for leakage model-based
and template-based DPA attacks with the Random plaintext dataset as attack traces
For easy comparison, we have also plotted the results for template-based DPA
with one POI and leakage model-based DPA in Figs. 4.53 and 4.54.
Fig. 4.54 Estimations of guessing entropy computed following Algorithm 4.1 for leakage model-based and template-based DPA attacks with the Random plaintext dataset as attack traces
Definition 4.3.1 For an Sbox $\mathrm{SB}: \mathbb{F}_2^{\omega_1} \to \mathbb{F}_2^{\omega_2}$, the (extended) difference distribution table (DDT)⁷ of SB is a two-dimensional table $T$ of size $(2^{\omega_1} - 1) \times 2^{\omega_2}$ such
that for any $0 < \delta < 2^{\omega_1}$ and $0 \le \Delta < 2^{\omega_2}$, the entry of $T$ at the $\Delta$th row and $\delta$th
column is given by
$$T[\Delta, \delta] = \left\{ a \mid a \in \mathbb{F}_2^{\omega_1},\ \mathrm{SB}(a \oplus \delta) \oplus \mathrm{SB}(a) = \Delta \right\}.$$
Remark 4.3.5 Suppose we know the input difference and output difference for a
particular Sbox input. Then, with the DDT, we can deduce the possible values of the
input. For example, suppose we know that a PRESENT Sbox input $a$ with input difference
$\mathtt{A}$ gives output difference 2. Then, by Table 4.1, $a = 5$ or $\mathtt{F}$. We will utilize such
observations for SCADPA attacks and for certain fault attacks in Sect. 5.1.
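The extended DDT of Definition 4.3.1 is straightforward to compute; the sketch below builds it for the PRESENT Sbox and reproduces the lookup from Remark 4.3.5 (the function name is our own):

```python
from collections import defaultdict

# PRESENT Sbox (Table 3.11)
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def extended_ddt(sbox):
    """T[(Delta, delta)] = {a : SB(a ^ delta) ^ SB(a) = Delta}."""
    T = defaultdict(set)
    for delta in range(1, len(sbox)):          # input difference delta > 0
        for a in range(len(sbox)):
            T[(sbox[a ^ delta] ^ sbox[a], delta)].add(a)
    return T
```

`extended_ddt(SBOX)[(0x2, 0xA)]` returns `{0x5, 0xF}`, matching Table 4.1; no key with $\Delta = 0$ ever appears because the Sbox is bijective.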
Attack assumptions of SCADPA For SCADPA, we make the following assumptions
about the attacker's knowledge and ability:
• The attacker does not have knowledge of the exact details of the implementation.
However, the attacker knows certain basic parameters of the implemented
⁷ In the original definition of DDT [BS12], the entries are $|T[\Delta, \delta]|$, i.e., the cardinalities of $T[\Delta, \delta]$.
Table 4.1 Difference distribution table for PRESENT Sbox (Table 3.11). The columns correspond to input difference .δ, and the rows correspond to output
difference .Δ. The row for .Δ = 0 is omitted since it is empty
Δ \ δ  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
1 9A 36 078F 5E 1C 24BD
2 8E 34 09 5F 1D 67AB 2C
3 CDEF 46 12 3B 0A 58 79
4 47 8D 35AC 0B 2F 169E
5 CDEF 0145 2389 67AB
6 9B CDEF 37 06 25 18 4A
7 67AB 03 8C 5D 2E 49 1F
8 17 AD 6F 4E 2389 0C 5B
9 0145 9D BE 2A 7C 3F 68
A 02 56 BF 9C 7D 1A 48 3E
B 8B 27 35AC 169E 4F 0D
C 8A 26 0145 9F BC 7E 3D
D 2389 57 AF 4C 1B 6D 0E
E 13 AE 24BD 6C 59 078F
F 24BD 169E 078F 35AC
Fig. 4.55 A possible sequence of XOR differences between the cipher states of two encryptions,
where colored squares correspond to active bytes. AK, SB, SR, and MC stand for AddRoundKey,
SubBytes, ShiftRows, and MixColumns, respectively
Fig. 4.56 An example of how the XOR differences between the cipher states can change after each
round operation of PRESENT. The output differences of the four active Sboxes in round 1 are 1.
The output difference of the single active Sbox in round 2 is also 1
bits of the sBoxLayer output is 1111. Thus, after the first round, $S_1 \oplus S_1'$ has 4 active
bits, which correspond to the 0th Sbox input of round 2. Then we again get an output
difference 1 for this Sbox, giving us just 1 active bit in $S_2 \oplus S_2'$. Consequently, we
have one active nibble after the sBoxLayer operation in round 3.
A cipher state can be written as the concatenation of several small parts of the same bit length ω. In particular, let 𝓁 = n/ω, where n is the block length of the SPN cipher. We have

S_i = s_{i(𝓁−1)} s_{i(𝓁−2)} … s_{i1} s_{i0},   S_i' = s'_{i(𝓁−1)} s'_{i(𝓁−2)} … s'_{i1} s'_{i0},

where each s_{ij} and s'_{ij} is a binary string of length ω. A differential characteristic for round i, denoted ΔS_i, is a binary string of length 𝓁:

ΔS_i = Δs_{i(𝓁−1)} Δs_{i(𝓁−2)} … Δs_{i1} Δs_{i0},   Δs_{ij} ∈ {0, 1}.

We say that the intermediate values of two encryptions S_i and S_i' achieve the differential characteristic ΔS_i if

s_{ij} ⊕ s'_{ij} = 0 if Δs_{ij} = 0,   s_{ij} ⊕ s'_{ij} ≠ 0 if Δs_{ij} = 1,   for all j = 0, 1, …, 𝓁 − 1.

A sequence of ΔS_i s

ΔS_0, ΔS_1, …, ΔS_r

is called a differential pattern. If wt(ΔS_r) = 1, we say that the differential pattern converges in round r. A plaintext pair is said to achieve a differential pattern if the corresponding intermediate values achieve each of the differential characteristics in this differential pattern.
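The definition can be transcribed directly into code. A small Python sketch (ours; index 0 is taken as the rightmost ω-bit part, matching PRESENT's bit numbering):

```python
def achieves(S, Sp, delta, omega):
    """True iff states S, Sp (as ints) achieve the characteristic delta.

    delta[j] is the bit Δs_{ij}; index 0 is the rightmost ω-bit part.
    """
    mask = (1 << omega) - 1
    for j, d in enumerate(delta):
        part_diff = ((S >> (omega * j)) ^ (Sp >> (omega * j))) & mask
        if (part_diff != 0) != (d == 1):
            return False
    return True

# The PRESENT round-1 states of Example 4.3.10 achieve
# ΔS1 = 000000000000000F (ω = 1: bits 0-3 active)
print(achieves(0x0A93D18CAF9C888B, 0x0A93D18CAF9C8884,
               [1, 1, 1, 1] + [0] * 60, 1))  # True
```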
Example 4.3.9 [Differential pattern—AES] Continuing Example 4.3.7, we choose ω = 8, and then 𝓁 = 128/8 = 16. Figure 4.55 corresponds to the following differential pattern:

ΔS0 = 1000010000100001,   ΔS1 = 1000000000000000. (4.39)

We take the plaintext pair and the master key

S0 = 4C3C3F54C7AAD34E607110C753C5E990,
S0' = 033C3F54C725D34E607131C753C5E90F,
K = 34463146344638383341464542413731. (4.40)

Then

S0 ⊕ S0' = 4F0000008F000000002100000000009F

achieves the differential characteristic ΔS0 from Eq. 4.39. After one round of AES, we have

S1 = 1F1DABAE4071BDD502563FBF63841BAE,
S1' = C81DABAE4071BDD502563FBF63841BAE,

and

S1 ⊕ S1' = D7000000000000000000000000000000.

Hence S1 and S1' achieve the differential characteristic ΔS1 from Eq. 4.39. The differential value for the active byte in S1 ⊕ S1' is D7. We can conclude that the pair of plaintexts S0 and S0' achieves the differential pattern given in Eq. 4.39.
Remark 4.3.6
• Following the convention for AES intermediate value representations (see [NIS01]), the string of hexadecimal values is transferred to the 4 × 4 matrix of bytes (see Eq. 3.2) column by column. For example, S0 = 4C3C3F54C7AAD34E607110C753C5E990 in the matrix format is as follows:

⎛ 4C C7 60 53 ⎞
⎜ 3C AA 71 C5 ⎟
⎜ 3F D3 10 E9 ⎟
⎝ 54 4E C7 90 ⎠
Example 4.3.10 [Differential pattern—PRESENT] Continuing Example 4.3.8, we choose ω = 1, and then 𝓁 = 64/1 = 64. Figure 4.56 corresponds to the following differential pattern:

ΔS0 = 000000000000FFFF,   ΔS1 = 000000000000000F,   ΔS2 = 0000000000000001. (4.41)

We take the plaintext pair and the master key

S0 = DCFC2D56F32EC070,
S0' = DCFC2D56F32E3F8F,
K = 00001234567812345678. (4.42)

Then

S0 ⊕ S0' = 000000000000FFFF,

which achieves the differential characteristic ΔS0 as given in Eq. 4.41. After the first round, we get

S1 = 0A93D18CAF9C888B,
S1' = 0A93D18CAF9C8884,

which achieves the differential characteristic ΔS1 from Eq. 4.41 since 4 ⊕ B = F. In other words, the differential value for the active nibble in S1 ⊕ S1' is F. Finally, after the second round, we get

S2 = C09B5DFC8AF48EF3,
S2' = C09B5DFC8AF48EF2,

which achieves the differential characteristic ΔS2 from Eq. 4.41 since 3 ⊕ 2 = 1. The pair of plaintexts S0 and S0' thus achieves the differential pattern given in Eq. 4.41, which converges in round 2.

Suppose the probability for a plaintext pair that achieves ΔS0 to result in a differential pattern converging in round r is 2^−pr. Since 2^Mp chosen plaintexts give about 2^(2Mp−1) such pairs, the attack requires 2^Mp chosen plaintexts, where

Mp = (pr + 1)/2. (4.43)
Example 4.3.11 [Probability of convergence—AES] Let us consider AES and the
differential characteristic .ΔS0 given by
ΔS0 = 1000010000100001. (4.44)

We would like to compute the probability that ΔS0 results in a differential pattern that converges in round 1, namely

P(wt(ΔS1) = 1 | ΔS0 = 1000010000100001).

If we take any plaintext pair that achieves differential characteristic ΔS0, after
AddRoundKey and SubBytes operations, those 4 active bytes in the main diagonal
will remain active. ShiftRows changes their positions to be all in the first column.
Then after MixColumns and AddRoundKey, any byte in the first column can
be active. Thus, all the possible differential characteristics .ΔS1 following the
differential characteristic ΔS0 are of the form

S1 ⊕ S1' = a0 a1 a2 a3 000000000000000000000000, (4.45)

where a_i ∈ F_2^8 for i = 0, 1, 2, 3 and a_{i0} ≠ 0 for some i0 ∈ {0, 1, 2, 3}. Then there are in total

(2^8)^4 − 1 = 2^32 − 1

possible values for S1 ⊕ S1', among which

4 × (2^8 − 1) ≈ 2^10

satisfy wt(ΔS1) = 1.
Fig. 4.57 Illustration of how active bytes change for all four differential patterns that start with
.ΔS0 = 1000010000100001 and converge in round 1. Blue squares correspond to active bytes. AK,
SB, SR, and MC stand for AddRoundKey, SubBytes, ShiftRows, and MixColumns, respectively
There are in total .232 − 1 possible differential values for the 4 active bytes before
MixColumns operation. According to Remark 3.1.3, any value of .S1 ⊕ S1' comes
from exactly one differential value for those 4 active bytes. Suppose differential
values of those 4 active bytes follow a uniform distribution on F_2^32. Then the probability of any value of S1 ⊕ S1' to occur is ≈ 2^−32. Consequently, we have

P(wt(ΔS1) = 1 | ΔS0 = 1000010000100001) ≈ 2^10 / 2^32 = 2^−22.

In this case, pr = 22. By Eq. 4.43,

Mp = (22 + 1)/2 = 11.5.

Thus, we need 2^11.5 chosen plaintexts to get a differential pattern that starts with ΔS0 as given in Eq. 4.44 and converges in round 1.
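The computation above can be replayed numerically. A short sketch (the book rounds 4 × (2^8 − 1) ≈ 2^10, so the printed values are approximations):

```python
# Convergence probability and data complexity for Example 4.3.11.
import math

favourable = 4 * (2**8 - 1)   # weight-1 ΔS1: 4 byte positions, 255 nonzero values
p = favourable / 2**32        # ≈ 2**-22
pr = -math.log2(p)            # ≈ 22
Mp = (pr + 1) / 2             # ≈ 11.5, i.e. about 2**11.5 chosen plaintexts
print(f"pr ≈ {pr:.2f}, Mp ≈ {Mp:.2f}")
```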
Example 4.3.12 [Probability of convergence—PRESENT] Let us consider PRESENT and the differential characteristic ΔS0 given by

ΔS0 = 000000000000FFFF. (4.46)

Let SB denote the PRESENT Sbox. We would like to compute the probability of a differential pattern that starts with ΔS0 and converges in round 2, namely

P(wt(ΔS2) = 1 | ΔS0 = 000000000000FFFF).
Let SB^i_j denote the jth Sbox in round i. Recall that the 0th Sbox is the right-most Sbox (see Fig. 3.9). Let δSB^i_j and ΔSB^i_j denote the input and output differences of Sbox SB^i_j, respectively.
For ΔS2 to have Hamming weight 1, we need to have just one active Sbox in round 2 with output difference having Hamming weight 1. Let SB^2_{j0} be the single active Sbox in round 2.
By the design of pLayer (see Table 3.12), the output bits of the four active Sboxes in round 1,

SB^1_0, SB^1_1, SB^1_2, SB^1_3,

go to the Sboxes SB^2_0, SB^2_4, SB^2_8, and SB^2_12 in round 2. We also notice that the jth output bit of all the four Sboxes in round 1 goes to the (4·j)th Sbox in round 2.
Sbox in round 2. Since none of the output differences of those four Sboxes in round
1 is equal to 0, to have just one active Sbox in round 2, the output differences of
those four active Sboxes in round 1 should all be the same with Hamming weight
1. This implies that the input difference of the single active Sbox in round 2, SB^2_{j0}, is F. Furthermore, by Eq. 4.46, those four active Sboxes in round 1 all have input difference F.
According to Table 4.1, for input difference F, the possible output differences
with Hamming weight 1 are 1 and 4. By counting the number of elements in each
entry of column F in Table 4.1, we can get that the probability for the output
difference to be 1, given that the input difference is F, is .4/16 = 1/4. The same
result holds for output difference 4. The probability that all output differences of the
four active Sboxes in round 1 are equal to 1 is then given by
P(ΔSB^1_0 = ΔSB^1_1 = ΔSB^1_2 = ΔSB^1_3 = 1 | δSB^1_0 = δSB^1_1 = δSB^1_2 = δSB^1_3 = F) = (1/4)^4 = 2^−8.

Similarly, we have

P(ΔSB^1_0 = ΔSB^1_1 = ΔSB^1_2 = ΔSB^1_3 = 4 | δSB^1_0 = δSB^1_1 = δSB^1_2 = δSB^1_3 = F) = (1/4)^4 = 2^−8.
The probability for the single active Sbox in round 2 to have output difference with Hamming weight 1 is given by

P(wt(ΔSB^2_{j0}) = 1 | δSB^2_{j0} = F) = P(ΔSB^2_{j0} = 1 | δSB^2_{j0} = F) + P(ΔSB^2_{j0} = 4 | δSB^2_{j0} = F) = 1/4 + 1/4 = 2^−1.
When the output differences of the four active Sboxes in round 1,

SB^1_0, SB^1_1, SB^1_2, SB^1_3,

are all equal to 1 (respectively, 4), the single active Sbox in round 2 is given by SB^2_0 (respectively, SB^2_8). We have

P(wt(ΔS2) = 1 | ΔS0 = 000000000000FFFF) = 2^−8 × 2^−1 + 2^−8 × 2^−1 = 2^−9 + 2^−9 = 2^−8.

In this case, pr = 8. By Eq. 4.43,

Mp = (8 + 1)/2 = 4.5.
Thus we need 2^4.5 chosen plaintexts to get a differential pattern that starts with ΔS0 as given in Eq. 4.46 and converges in round 2.
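The same probability can be computed from the Sbox directly rather than read off Table 4.1. A short Python sketch (ours), following the decomposition above:

```python
# Convergence probability for Example 4.3.12, recomputed from the PRESENT Sbox.
import math

SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def p_out(delta_in, delta_out):
    """P(output difference = delta_out | input difference = delta_in)."""
    hits = sum(SBOX[x] ^ SBOX[x ^ delta_in] == delta_out for x in range(16))
    return hits / 16

# Four round-1 Sboxes must share one weight-1 output difference (1 or 4),
# then the single round-2 Sbox (input difference F) must output weight 1.
p_round1 = p_out(0xF, 0x1) ** 4 + p_out(0xF, 0x4) ** 4   # 2**-8 + 2**-8
p_round2 = p_out(0xF, 0x1) + p_out(0xF, 0x4)             # 2**-1
p = p_round1 * p_round2                                  # 2**-8
Mp = (-math.log2(p) + 1) / 2
print(p == 2**-8, Mp)  # True 4.5
```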
From the discussions above, we can see that there are in total four such differential patterns, corresponding to the output differences of the four active round-1 Sboxes and of the single active round-2 Sbox being

(1, 1), (1, 4), (4, 1), (4, 4).

We have seen the first one in Fig. 4.56 (see Example 4.3.8). The remaining three are shown in Figs. 4.58, 4.59, and 4.60, respectively.
In SCADPA, the attacker queries the encryption with pairs of plaintexts that
achieve a target differential characteristic .ΔS0 and potentially result in a differential
pattern that converges in round r. .ΔS0 and the round number r are chosen so that
the probability of convergence is not too small. Then by comparing side-channel
Fig. 4.58 An illustration of how the XOR differences between the cipher states can change after
each round operation for PRESENT such that the pair of plaintexts achieves a differential pattern
starting with .ΔS0 given in Eq. 4.46 and converging in round 2. The output differences of the four
active Sboxes in round 1 are 1. The output difference of the single active Sbox in round 2 is 4
Fig. 4.59 An illustration of how the XOR differences between the cipher states can change after
each round operation for PRESENT such that the pair of plaintexts achieves a differential pattern
starting with .ΔS0 given in Eq. 4.46 and converges in round 2. The output differences of the four
active Sboxes in round 1 are 4. The output difference of the single active Sbox in round 2 is 1
leakages of a middle round from both encryptions for a pair of plaintexts, the
attacker tries to confirm if the convergence is achieved and identify the differential
characteristic .ΔSr when convergence happens. Thus, we need to choose .ΔS0 and r
in a way that we can find a point for side-channel observation so that the leakages
can tell us whether the convergence has happened, and if yes, what is the value of
.ΔSr .
Fig. 4.60 An illustration of how the XOR differences between the cipher states can change after
each round operation for PRESENT such that the pair of plaintexts achieves a differential pattern
starting with .ΔS0 given in Eq. 4.46 and converges in round 2. The output differences of the four
active Sboxes in round 1 are 4. The output difference of the single active Sbox in round 2 is 4
Example 4.3.13 [Point for side-channel observation—AES] Let us consider AES encryption with ω = 8 and choose the target differential characteristic ΔS0 = 1000010000100001. Then we query the encryption with plaintext pairs that achieve this ΔS0. For each plaintext, we take, say, Np traces and use the averaged trace as the leakages for this plaintext. By averaging, the noise can be reduced. Then the difference between averaged traces of each pair of plaintexts is computed.
As discussed in Example 4.3.11, there are four differential patterns that start with
.ΔS0 and converge in round 1. They are given by the following four values of .ΔS1 :
1000000000000000, 0000100000000000, 0000000010000000, 0000000000001000,
corresponding to the single active byte at the end of round 1 being the first, second,
third, and fourth bytes in the first column. Figure 4.61 shows how the active bytes
change from round 1 to round 3 for all four differential patterns. In the second round,
SubBytes does not change the position of this single active byte. ShiftRows changes
its position to a different column unless this active byte is the first byte. Due to
the property of MixColumns operation (see Remark 3.1.3), this single active byte
will influence 4 bytes, leading to 4 active bytes in one single column of the cipher
state. Finally, AddRoundKeys in round 2 and SubBytes operation in round 3 will
not change the position or number of active bytes.
As discussed in Example 4.3.11, all possible differential characteristics .ΔS1 are
of the form as given in Eq. 4.45. In case .wt (ΔS1 ) /= 1, we will have more than 1
active byte at the end of round 1, which will be in more than one column after the
SubBytes and ShiftRows operations in round 2. Consequently, there will be at least two active columns at the end of round 2. We can then conclude that the convergence has happened if and only if there is exactly one active column during the SubBytes operation in round 3, and that the position of this active column identifies the value of ΔS1.
Fig. 4.61 Illustration of how active bytes change from round 1 to round 3 of AES computation,
for differential patterns that start with .ΔS0 = 1000010000100001
4C3C3F54C7AAD34E607110C753C5E990,
033C3F54C725D34E607131C753C5E90F; (4.47)

8 https://github.com/kokke/tiny-AES-c

3B06201F5EAA0BD6794C249610FBE927,
5F06201F5E750BD6794CB79610FBE995; (4.48)

2D2A49F26A79655214056A7B5F35A9E9,
D12A49F26ACC655214052D7B5F35A9C6; (4.49)

0EDB19A25C7EF1FDDED31178EE6E7478,
FADB19A25C06F1FDDED30E78EE6E7415. (4.50)
Np = 100 traces were collected for each plaintext. All pairs of plaintexts achieve the same differential characteristic ΔS0 = 1000010000100001. The ΔS1 values are given by

1000000000000000, 0000100000000000, 0000000010000000, 0000000000001000,
respectively. Illustrations of the active bytes change for each pair correspond to the
four rows of Fig. 4.61.
In Fig. 4.62, the difference between the averaged traces of each pair of the
plaintexts is in red, blue, green, and yellow, respectively. We have also plotted
the averaged traces for the first plaintext in Eq. 4.47 (in gray), for the purpose of
identifying the round operations. Similar to Fig. 4.3, we can find the rough time
interval for the SubBytes operation in round 3, which is colored in pink. This is the
point for our side-channel observation. After zooming in, we get Fig. 4.63. Recall
that the SubBytes operation was implemented column-wise starting from the first
column. By the choice of the plaintext pairs, the red, blue, green, and yellow traces
correspond to a single active column (see Fig. 4.61) at the first, second, third, and
Fig. 4.62 The difference between the averaged traces of plaintext pairs from Eqs. 4.47, 4.48, 4.49,
and 4.50 is in red, blue, green, and yellow, respectively. The averaged trace for the first plaintext
in Eq. 4.47 is in gray. With this gray plot, similar to Fig. 4.3 we can find the rough time interval for
the SubBytes operation in round 3, which is colored in pink
Fig. 4.63 Zoom in to the SubBytes computation (pink area) in Fig. 4.62. The difference between
the averaged traces of plaintext pair from Eqs. 4.47, 4.48, 4.49, and 4.50 is in red, blue, green,
and yellow, respectively. They correspond to a single active column at the first, second, third, and
fourth positions, respectively, during the SubBytes operation in round 3
fourth positions, respectively. This agrees with what we see in Fig. 4.63—the four
colored peaks are in sequential order.
Example 4.3.14 [Point for side-channel observation—PRESENT] In this example,
we look at PRESENT encryption and let SB denote the PRESENT Sbox. We take
.ω = 1, and we choose
ΔS0 = 000000000000FFFF.
We aim to find a pair of plaintexts S0 and S0' that achieves a differential pattern starting with ΔS0 and converging in round 2. For each plaintext, we take Np traces
and use the averaged trace as the leakages for this plaintext. Then the difference
between averaged traces of each pair of plaintext is computed. We assume that
the sBoxLayer operation is implemented nibble-wise, starting from the 0th nibble
(right-most) to the 15th nibble (left-most).
Convergence in round 2 means that there is just 1 active bit at the end of round
2. Consequently, we will have just one active Sbox before pLayer in round 3.
On the other hand, suppose there is just one active Sbox in round 3. As discussed
in Example 4.3.12, with .ΔS0 , the four active Sboxes in round 1 are
SB10 ,
. SB11 , SB12 , SB13 .
By the design of pLayer we know each of those four Sboxes from round 2 will affect four Sboxes in round 3 as shown below:

SB^2_0 → SB^3_0, SB^3_4, SB^3_8, SB^3_12;
SB^2_4 → SB^3_1, SB^3_5, SB^3_9, SB^3_13;
SB^2_8 → SB^3_2, SB^3_6, SB^3_10, SB^3_14;
SB^2_12 → SB^3_3, SB^3_7, SB^3_11, SB^3_15.

In particular, they all influence different Sboxes in round 3. Since there is just one
active Sbox in round 3, there is just one active Sbox in round 2. We also note that
different bits of the output of an Sbox in round 2 go to different Sboxes in round
3, and we can then conclude that there is just 1 active bit at the end of round 2.
Moreover, with the position of the active Sbox in round 3, we can further identify
the position of the active bit in round 2 with our knowledge of pLayer.
Thus, by observing the leakages around sBoxLayer in round 3, we will be able
to see if the convergence has happened and identify the value of .ΔS2 .
As an example, let us take the master key to be the one given by Eq. 4.42. We also take the plaintext pair from Example 4.3.10, namely

S0 = DCFC2D56F32EC070,
S0' = DCFC2D56F32E3F8F. (4.51)
The experimental setup is as described in Sect. 4.1, and measurements were done
for three rounds of PRESENT computations. .Np = 2000 traces were collected for
each plaintext. Recall that this pair of plaintext achieves the following differential
pattern:

ΔS0 = 000000000000FFFF,   ΔS1 = 000000000000000F,   ΔS2 = 0000000000000001.
In particular, there is one single active Sbox SB^3_0 before the pLayer operation of round 3.
For comparison, we also collected 2000 traces for each of the following four
plaintexts:
8F5F8BD2E7CF5989,
8F5F8BD2E7CFA676 (4.52)

and

F2DCDC8341D45F79,
F2DCDC8341D4A086, (4.53)
where the first pair of plaintext (Eq. 4.52) achieves the same differential characteristics ΔS0 and ΔS1, but at the end of round 2, the differential characteristic is given by

0000000100000000.
Fig. 4.64 The difference between the averaged traces of .S0 and .S0' from Eq. 4.51 (in red), plaintext
pair from Eq. 4.52 (in blue), and plaintext pair from Eq. 4.53 (in green). The averaged trace for .S0 is
in gray. With this gray plot, similar to Fig. 4.3 we can find the rough time interval for the sBoxLayer
operation in round 3, which is colored in pink
In this case, we have one single active Sbox SB^3_8 before the pLayer operation of round 3.
The second pair of plaintext (Eq. 4.53) also achieves the same .ΔS0 and .ΔS1 ,
while the differential characteristic at the end of round 2 is given by
0001000100010000.
Then for this pair of plaintext, there are three active Sboxes (SB^3_4, SB^3_8, SB^3_12) before the pLayer operation of round 3.
In Fig. 4.64, the difference between the averaged traces of .S0 and .S0' (Eq. 4.51),
plaintext pair from Eq. 4.52, and plaintext pair from Eq. 4.53 are in red, blue, and
green, respectively. We have also plotted the averaged traces for .S0 (in gray) for
the purpose of identifying the round operations. Similar to Fig. 4.3, we can find the
rough time interval for the sBoxLayer operation in round 3, which is colored in
pink. This time interval corresponds to our point of side-channel observation. After
zooming in, we get Fig. 4.65.
Recall that the sBoxLayer is implemented nibble-wise. From the above discussions, we know that the red, blue, and green traces correspond to the active Sboxes SB^3_0; SB^3_8; and SB^3_4, SB^3_8, SB^3_12 before the round 3 pLayer operation, respectively. This agrees with what we see in
Fig. 4.65. There is a single peak in the red line and the blue line, while the green line
has three peaks. The peak in the red line (SB^3_0) is at the beginning of the sBoxLayer. The first peak of the green line (SB^3_4) is between the peaks of the red (SB^3_0) and blue (SB^3_8) lines. The peak of the blue line coincides with the second peak of the green line (SB^3_8). The last peak of the green line (SB^3_12) is in the last quarter of the whole time interval.
Fig. 4.65 Zoom in to the sBoxLayer computation (pink area) in Fig. 4.64. The difference between
the averaged traces of .S0 and .S0' from Eq. 4.51 (in red), plaintext pair from Eq. 4.52 (in
blue), and plaintext pair from Eq. 4.53 (in green). They correspond to active Sboxes SB^3_0; SB^3_8; SB^3_4, SB^3_8, SB^3_12 before pLayer of round 3
Fig. 4.66 An illustration of differential values for the differential pattern .ΔS0 =
1000010000100001 and .ΔS1 = 1000000000000000
Our ultimate goal is to recover information about the secret keys. Thus, another
criterion for choosing .ΔS0 and r is that the possible key hypotheses can be reduced
once we find a pair of plaintexts that achieves a converging differential pattern, and
we know the value of .ΔSr .
Example 4.3.15 [Reduce key hypotheses—AES] Let us consider AES with
.ω = 8. As an attacker, we choose the target differential characteristic .ΔS0 =
1000010000100001. Then we query the encryption with plaintext pairs that achieve
this ΔS0. Suppose with the help of side-channel leakages, we have identified a pair of plaintexts S0 and S0' that gives a differential pattern converging in round 1 with ΔS1 = 1000000000000000. Let α be the differential value of the single active byte at the end of round 1. Then, using InvMixColumns (see Eq. 3.7), the differential values of the 4 active bytes right after the SubBytes operation in round 1 are given by

0E · α, 09 · α, 0D · α, 0B · α.

Let β1, β2, β3, and β4 be the differential values of the 4 active bytes in the main diagonal of the plaintexts. An illustration is shown in Fig. 4.66.
We represent the master key of AES (which is also the whitening key used at the
beginning of the encryption) as a matrix:
K = ⎛ k00 k01 k02 k03 ⎞
    ⎜ k10 k11 k12 k13 ⎟
    ⎜ k20 k21 k22 k23 ⎟
    ⎝ k30 k31 k32 k33 ⎠
We represent the plaintext .S0 as the following matrix (note that this representation
follows the same notation as in Eq. 3.2, which is different from the notations in
Eq. 4.38):
S0 = ⎛ s00 s01 s02 s03 ⎞
     ⎜ s10 s11 s12 s13 ⎟
     ⎜ s20 s21 s22 s23 ⎟
     ⎝ s30 s31 s32 s33 ⎠
Then the inputs of the four active Sboxes in round 1 are given by

s00 ⊕ k00, s11 ⊕ k11, s22 ⊕ k22, s33 ⊕ k33.

Their Sbox output differences are

0E · α, 09 · α, 0D · α, 0B · α,

and their input differences are

β1, β2, β3, β4,

respectively. Then, by using the difference distribution table for AES Sbox and with the knowledge of the plaintexts, we can reduce the key hypotheses (see Remark 4.3.5).
As an example, let us take the master key to be the one given by Eq. 4.40.
Continuing Example 4.3.13, with side-channel leakages, we have identified the
following pair of plaintexts that achieves the differential pattern mentioned above,
namely,
S0 = 4C3C3F54C7AAD34E607110C753C5E990,
S0' = 033C3F54C725D34E607131C753C5E90F.
Then

β1 = 4C ⊕ 03 = 4F,
β2 = AA ⊕ 25 = 8F,
β3 = 10 ⊕ 31 = 21,
β4 = 90 ⊕ 0F = 9F,

and

s00 = 4C, s11 = AA, s22 = 10, s33 = 90.
Thus, the Sbox inputs

4C ⊕ k00, AA ⊕ k11, 10 ⊕ k22, 90 ⊕ k33

have output differences

0E · α, 09 · α, 0D · α, 0B · α

and input differences

4F, 8F, 21, 9F,
respectively. To find the possible values of .k00 , k11 , k22 , k33 , we first find values of
α such that the following entries of the AES Sbox DDT are nonempty:
(0E · α, 4F), (09 · α, 8F), (0D · α, 21), (0B · α, 9F).
There are in total 13 of them, as shown in Table 4.2. Each of those values gives a
few hypotheses for .k00 ⊕ 4C, k11 ⊕ AA, k22 ⊕ 10, k33 ⊕ 90.
Consequently, we can find all the possible values for the 4 key bytes, as shown
in Table 4.3. The correct master key in Eq. 4.40 and the corresponding correct value
of .α are marked in blue. We note that the remaining number of key hypotheses is
given by

2^4 × 12 + 2^3 × 4 = 224,

while the number of all possible key hypotheses for those 4 bytes is

(2^8)^4 = 2^32.
We can see that the attack can significantly reduce the key hypotheses.
Example 4.3.16 [Reduce key hypotheses—PRESENT] Now we look at PRESENT
encryption. Take .ω = 1, and let
ΔS0 = 000000000000FFFF. (4.54)
We aim to find a pair of plaintexts S0 and S0' that achieves a differential pattern starting with ΔS0 and converging in round 2. Suppose by analyzing the side-channel leakages, we have identified such a pair of plaintexts S0 and S0' that gives a differential pattern converging in round 2 with

ΔS2 = 0000000000000001. (4.55)
Since there is only 1 active bit (bit 0) at the end of round 2, we know by the design of
PRESENT that this means there is only one active Sbox in round 2—Sbox .SB20 (see
Fig. 4.56). By analyzing the pLayer operation, we know that the output differences
of Sboxes .SB10 , SB11 , SB12 , SB13 are all equal to 1. By our choice of plaintexts, we
also know that the input differences of those Sboxes are all equal to F. According to
PRESENT Sbox DDT in Table 4.1, the inputs of those four Sboxes are among 2, 4,
B, and D. In other words, let

S0 = b63 b62 … b1 b0

be the binary representation of the plaintext, and let

K1 = κ^1_63 κ^1_62 … κ^1_1 κ^1_0

be the first round key. Then for j = 0, 1, 2, 3, the input nibble of Sbox SB^1_j,

b_{4j+3} b_{4j+2} b_{4j+1} b_{4j} ⊕ κ^1_{4j+3} κ^1_{4j+2} κ^1_{4j+1} κ^1_{4j},

lies in {2, 4, B, D}. The remaining number of key hypotheses for the 0th–15th bits of K1 is thus

4^4 = 2^8 = 256,

while the total number of all possible key hypotheses for those 16 bits is 2^16.
As an example, let us take the master key to be the one given by Eq. 4.42. We can
compute that the first round key is given by
K1 = 0000123456781234. (4.57)

Then

0 ⊕ κ^1_3 κ^1_2 κ^1_1 κ^1_0 ∈ {2, 4, B, D},   7 ⊕ κ^1_7 κ^1_6 κ^1_5 κ^1_4 ∈ {2, 4, B, D},
0 ⊕ κ^1_11 κ^1_10 κ^1_9 κ^1_8 ∈ {2, 4, B, D},   C ⊕ κ^1_15 κ^1_14 κ^1_13 κ^1_12 ∈ {2, 4, B, D}.

We can then reduce all the possible key hypotheses for the 0th–15th bits of K1:

κ^1_3 κ^1_2 κ^1_1 κ^1_0 ∈ {2, 4, B, D},   κ^1_7 κ^1_6 κ^1_5 κ^1_4 ∈ {5, 3, C, A},
κ^1_11 κ^1_10 κ^1_9 κ^1_8 ∈ {2, 4, B, D},   κ^1_15 κ^1_14 κ^1_13 κ^1_12 ∈ {E, 8, 7, 1},

where the correct key nibbles given by Eq. 4.57 are marked in blue.
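The nibble-by-nibble reduction above is mechanical. A short Python sketch (variable names are ours) reproduces it from the plaintext and the DDT entry T[1, F]:

```python
# Key-nibble reduction of Example 4.3.16: each active round-1 Sbox has input
# difference F and output difference 1, so its input nibble lies in the DDT
# entry T[1, F] = {2, 4, B, D}; XORing with the known plaintext nibble of S0
# yields the candidate nibbles of the round key K1.
S0 = 0xDCFC2D56F32EC070                  # plaintext from Eq. 4.42
candidate_inputs = {0x2, 0x4, 0xB, 0xD}  # T[1, F] for the PRESENT Sbox

reduced = []
for j in range(4):                       # active Sboxes SB^1_0, ..., SB^1_3
    p_nib = (S0 >> (4 * j)) & 0xF        # plaintext nibble entering SB^1_j
    reduced.append({x ^ p_nib for x in candidate_inputs})

for j, nibs in enumerate(reduced):
    print(j, sorted(format(k, "X") for k in nibs))
```

The printed sets match the reduced hypotheses in the text: {2, 4, B, D}, {3, 5, A, C}, {2, 4, B, D}, {1, 7, 8, E}.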
Up to now, we have seen how SCADPA can reduce the key hypotheses on 4 bytes
of AES master key and 4 nibbles of the first round key for PRESENT. In general,
the steps for SCADPA are as follows:
SCADPA Step 1 Choose the target cryptographic implementation. SCADPA
applies to all SPN ciphers that have been proposed so far. As
running examples, we will continue to discuss the attacks on AES-
128 and PRESENT.
SCADPA Step 2 Choose the value .ω. Based on our chosen cipher, we need
to decide the value of .ω for our attack. This value is highly
dependent on the cipher design. In general, for AES-like ciphers,
we would choose .ω to be the same as the size of the Sbox. And
for bit permutation-based ciphers (e.g., PRESENT), we choose .ω
to be 1.
SCADPA Step 3 Identify a target differential characteristic .ΔS0 , a round
number r for convergence, and a point for side-channel obser-
vation. We would like to look for plaintext pairs that achieve a
differential pattern starting with .ΔS0 and converging in round
r. We also need to decide on a point for side-channel leakage
analysis during the computation after round r. The choice of .ΔS0 ,
r, and the point for side-channel observation should satisfy the
following conditions:
• The probability of convergence is not too small. In particular,
if the probability is .2−pr , we will need .2Mp chosen plaintexts
for the attack, where .Mp = 0.5pr + 0.5.
• Using side-channel leakages at the chosen point of mea-
surement, we should be able to confirm if the convergence
has appeared for the differential pattern between a pair of
plaintexts. Furthermore, it is possible to identify the value of
.ΔSr in case the convergence appears.
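One way to realize the plaintext queries of the steps above for AES (ΔS0 = 1000010000100001): fix twelve bytes and give the four diagonal bytes a distinct value per plaintext, so that every pair of generated plaintexts achieves ΔS0. A sketch (function name and structure are illustrative, not the book's):

```python
import os

def chosen_plaintexts(count, base=None):
    """Generate plaintexts such that any pair achieves ΔS0 = 1000010000100001."""
    assert count <= 256               # one distinct diagonal value per plaintext
    base = bytearray(base if base is not None else os.urandom(16))
    plaintexts = []
    for v in range(count):
        pt = bytearray(base)
        for pos in (0, 5, 10, 15):    # diagonal byte positions in the state
            pt[pos] = v
        plaintexts.append(bytes(pt))
    return plaintexts

pts = chosen_plaintexts(8)
# any two of pts differ exactly in bytes 0, 5, 10, 15
```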
For example, for AES-128 we can choose

ω = 8, ΔS0 = 1000010000100001, r = 1,
and the point for side-channel observation being the SubBytes operation in round 3.
Then we query AES encryption with 2^11.5 (see Example 4.3.11) chosen plaintexts
such that each pair of them achieves the differential characteristic .ΔS0 . With side-
channel leakages, we can deduce if convergence has happened, and if yes, we record
the value of .ΔS1 (see Example 4.3.13). Finally with a similar computation as in
Example 4.3.15, we reduce the key hypotheses for the 4 bytes in the main diagonal
of the master key.
Similar attacks can be carried out on the other “diagonals” of the master key to
reduce the key hypotheses of the whole master key. In particular, the other values of
ΔS0 can be

0100001000011000, 0010000110000100, 0001100001000010.

The possible differential patterns for each ΔS0 are shown in Fig. 4.67, where each
figure represents four different differential patterns starting with the same .ΔS0 . The
blue-colored squares represent active bytes, and only one of those 4 colored bytes
is active in the last two cipher states (so that the differential pattern converges in
round 1).
For PRESENT, we can choose

ω = 1, ΔS0 = 000000000000FFFF, r = 2,
and point for side-channel observation being the sBoxLayer operation in round
3. Then we query PRESENT encryption with 2^4.5 (see Example 4.3.12) chosen
plaintexts such that each pair of them achieves the differential characteristic .ΔS0 .
With side-channel leakages, we can deduce if convergence has happened, and if yes,
we record the value of .ΔS2 (see Example 4.3.14). Finally with a similar computation
as in Example 4.3.16, we reduce the key hypotheses for the 0th–15th bits of the first
round key. We have also computed that the remaining number of key hypotheses
will be 2^8 instead of the original 2^16.
Similar attacks can be carried out on the other bits of the first round key to reduce
the key hypotheses of the whole round key. In particular, the other values of .ΔS0
can be
300 4 Side-Channel Analysis Attacks and Countermeasures
Fig. 4.67 The possible differential patterns for AES encryption with .ΔS0 equal to
.1000010000100001, 0100001000011000, 0010000110000100, 0001100001000010, respectively.
Each figure represents four different differential patterns starting with the same .ΔS0 . The blue-
colored squares represent active bytes and only one of those 4 colored bytes is active in the last
two cipher states
Fig. 4.68 The possible differential patterns for PRESENT encryption that start with .ΔS0 =
00000000FFFF0000 and converge in round 2. There are in total four patterns—the single active
bit at the end of round 2 can be the 4th, 6th, 32nd, or 34th bit
The possible differential patterns for each of the three values of .ΔS0 are shown in
Figs. 4.68, 4.69, and 4.70. Each figure shows four differential patterns that converge
in round 2. Each differential pattern has 1 active nibble at the end of round 1 and a
single active bit at the end of round 2.
Fig. 4.69 The possible differential patterns for PRESENT encryption that start with .ΔS0 =
0000FFFF00000000 and converge in round 2. There are in total four patterns—the single active
bit at the end of round 2 can be the 8th, 10th, 36th, or 38th bit
Fig. 4.70 The possible differential patterns for PRESENT encryption that start with .ΔS0 =
FFFF000000000000 and converge in round 2. There are in total four patterns—the single active
bit at the end of round 2 can be the 12th, 14th, 40th, or 42nd bit
4.4 Side-Channel Analysis Attacks on RSA and RSA Signatures

In this section, we will discuss one SPA and one DPA attack on implementations of RSA and RSA signatures.
Following the same notations from Sect. 3.3, let p, q be two distinct odd primes. n = pq and e ∈ Z*_ϕ(n) are the public keys. d = e^−1 mod ϕ(n) is the private key. Furthermore, let

d = d_{𝓁d−1} d_{𝓁d−2} … d_1 d_0

be the binary representation of d. We consider the computation of

a^d mod n (4.58)
for some .a ∈ Zn . For both attacks, we focus on one particular method for
implementing the modular exponentiation—the left-to-right square and multiply
algorithm (Algorithm 3.8). A similar SPA attack can also be applied to the right-to-
left square and multiply algorithm (Algorithm 3.7). We note that the attacks can be
carried out during either the decryption of RSA or the signature signing procedure
of RSA signatures.
For the experiments, we have set the values of the parameters as given in Example 3.3.2.9
We have seen that DPA exploits the relationship between leakages at specific
time samples and the data being processed in the DUT. SPA, on the other hand,
analyzes leakages along the time axis, exploiting relationships between leakages
9 Note that for easy illustration, the values we choose for p and q are much smaller than practical
values.
and operations. Similar to profiled DPA, SPA requires knowledge of the exact
implementation.
We have seen in the analysis of Fig. 4.3 that different operations can be deduced
from observing the power traces. An SPA attack on the square and multiply
algorithm works with a similar method—we examine the traces to figure out if both
square and multiplication are executed in one loop from line 5 (the corresponding
bit of d is 1) or not (the corresponding bit of d is 0). Following Kerckhoffs’ principle
(see Definition 2.1.3), we assume the attacker has the knowledge of Algorithm 4.2
except for the values of bits of d in line 2.
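The structure that the SPA exploits — a modular square in every loop iteration, a modular multiplication only when the key bit is 1 — can be sketched in Python (a minimal illustration of the left-to-right square and multiply, not Algorithm 4.2 verbatim; its line numbers differ):

```python
def square_and_multiply(a, d, n):
    """Left-to-right square and multiply: computes a**d mod n.

    One loop iteration per key bit: always a modular square (S), plus a
    modular multiplication (M) only when the bit is 1 -- the key-dependent
    pattern an SPA attacker reads off the power trace.
    """
    t = 1
    for bit in bin(d)[2:]:       # scan d from the most significant bit
        t = (t * t) % n          # S: modular square
        if bit == "1":
            t = (t * a) % n      # M: modular multiplication
    return t

print(square_and_multiply(7, 11, 33))  # 7**11 mod 33 = 7
```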
With the experimental setting as described in Sect. 4.1, we measured one power
trace for the computation of Algorithm 4.2 on our DUT. The trace is shown in
Fig. 4.71. We can see ten similar patterns. By examining Algorithm 4.2, we have
two guesses:
Guess a Each pattern corresponds to one modular operation (modular square from
line 6 or modular multiplication from line 8).
Guess b Each pattern corresponds to one loop from line 5.
Let S denote the modular square operation from line 6 and M the modular multiplication from line 8. We observe that the loop in line 5 contains either one square operation (S) or one square followed by one multiplication operation (SM). This gives the following correspondence between the operations in loop i and the ith bit of the secret key d: the loop executes S only when the bit is 0 and SM when the bit is 1.
We further notice that there are mainly two types of patterns in Fig. 4.71, one
with a single cluster of peaks and one with more than one cluster of peaks. They are
colored in green and blue in Fig. 4.72, respectively.
304 4 Side-Channel Analysis Attacks and Countermeasures
Fig. 4.71 One trace corresponding to the computation of Algorithm 4.2. We can see ten similar
patterns
Fig. 4.72 Highlighted two types of patterns from Fig. 4.71. One pattern with a single cluster of
peaks (colored in green) and one with more than one cluster of peaks (colored in blue)
Let us first assume that Guess a is correct. Based on the above observations, we
have two possibilities to consider:
• The (green colored) single peaked patterns correspond to modular square opera-
tion (S), and the (blue colored) multiple peaked patterns correspond to modular
multiplication operation (M).
• The (green colored) single peaked patterns correspond to modular multiplication
operation (M), and the (blue colored) multiple peaked patterns correspond to
modular square operation (S).
We know that .d0 = 1 (see Remark 4.4.1). Then we can deduce that the last blue-
colored pattern in Fig. 4.72 does not represent a single modular square operation
(S). On the other hand, the start of the computation will always be a modular square
operation, which then indicates that the first blue-colored pattern corresponds to S.
We have reached a contradiction, and we conclude that Guess a is not correct.
Next, we assume Guess b is correct. Similarly, we have two possibilities to
consider:
• The (green colored) single peaked patterns represent a single modular square
operation (S), i.e., the corresponding bit of d is 0, and the (blue colored) multiple
peaked patterns represent SM and the corresponding bit of d is 1.
• The green-colored patterns correspond to SM, and the blue-colored patterns
correspond to S.
As discussed above, the computation does not end with d0 = 0, and thus the blue-colored patterns represent SM, i.e., the corresponding bit of d is 1. Consequently, the green-colored patterns correspond to loops with the bit of d being 0. We can then read out the value of bits di (i = 𝓁d − 1, . . . , 1, 0) from Fig. 4.72:

1 0 1 1 1 0 1 0 1 1,

which gives

d = 1011101011₂ = 747.
One might argue that the first green pattern in Fig. 4.72 may also be a multiple
peaked blue pattern. We note that this pattern is shorter than the other blue patterns.
Hence it is more likely to correspond to one operation instead of two. Nevertheless,
in a realistic attack, one could use brute force to recover this bit.
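The reading-off step can be reproduced with a short simulation. The sketch below is a plain Python transcription (not the book's listing), using the toy values d = 747 and n = 1189 from this section; the base 5 is an arbitrary choice. It records the S/M sequence of the left-to-right square and multiply algorithm and decodes the key from it.

```python
def square_and_multiply_ltr(a, d, n, ops):
    """Left-to-right square and multiply (cf. Algorithm 4.2), recording
    the operation sequence an SPA attacker would observe."""
    t = 1
    for bit in bin(d)[2:]:          # bits of d, most significant first
        t = (t * t) % n             # S: modular square, every iteration
        ops.append("S")
        if bit == "1":
            t = (a * t) % n         # M: modular multiply, only if bit = 1
            ops.append("M")
    return t

ops = []
square_and_multiply_ltr(5, 747, 1189, ops)
seq = "".join(ops)
print(seq)                                               # SMSSMSMSMSSMSSMSM
# "SM" -> key bit 1, lone "S" -> key bit 0
print(int(seq.replace("SM", "1").replace("S", "0"), 2))  # 747
```

The printed sequence is exactly the one deduced from Fig. 4.72, and decoding it recovers d = 747.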
Remark 4.4.2 By the design of the Montgomery powering ladder (Algorithm 3.9),
there is always a multiplication followed by a square operation, making it safe
against our SPA attack presented above.
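To see why, the ladder can be sketched as follows (an illustrative transcription under the same toy parameters, not the book's listing of Algorithm 3.9): each iteration performs exactly one multiplication and one square regardless of the key bit, so the operation sequence reveals only the bit length of d.

```python
def montgomery_ladder(a, d, n):
    """Montgomery powering ladder: one multiply and one square per key
    bit, independent of the bit's value, so the S/M sequence is regular."""
    r0, r1 = 1, a                   # invariant: r1 = r0 * a mod n
    for bit in bin(d)[2:]:
        if bit == "1":
            r0 = (r0 * r1) % n      # M
            r1 = (r1 * r1) % n      # S
        else:
            r1 = (r0 * r1) % n      # M
            r0 = (r0 * r0) % n      # S
    return r0

print(montgomery_ladder(5, 747, 1189) == pow(5, 747, 1189))  # True
```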
compared to lines 6 and 8 in Algorithm 4.2. This missing mod n reduction might be the main reason why the pattern structure observed earlier is absent in Fig. 4.73.
Nevertheless, we can still gain important information from the trace. First, we
note that there are 18 similar patterns in Fig. 4.73. By examining Algorithm 4.4,
similar to Guess a and Guess b from Sect. 4.4.1, we can assume each of those 18
patterns corresponds to either one execution of MonPro or one loop from line 6.
Since there is one extra MonPro operation in line 10, we know that the last pattern
will not represent a loop. If each of the other patterns corresponds to one loop, we
will have a secret key of bit length 17, which is longer than the bit length of n (the bit length of 1189 is 11) and hence impossible. We conclude that there is a high
possibility that each pattern corresponds to one execution of MonPro. Since when di = 1 there are two executions of MonPro and when di = 0 there is one execution, the 17 patterns preceding the final one give

𝓁d + wt(d) = 17.

Fig. 4.73 One trace corresponding to the computation of Algorithm 4.4. We can see 18 similar patterns
DPA-RSA Step 5 Compute the hypothetical signal for each target intermediate
value. Our attack does not rely on finding the best key hypothesis
that achieves the highest absolute correlation coefficient as in
DPA attacks on symmetric block ciphers. The information we
exploit is that when the absolute correlation coefficient between
leakages and the target intermediate value is high, the corre-
sponding loop has secret key bit .= 1. For each of the M inputs .aj ,
we compute the target intermediate value, denoted .v j , as follows:
vj = bits 0, 1, 2, . . . , 7 of aj r mod n,  j = 1, 2, . . . , 10,000.  (4.61)
As we have seen in Sect. 4.3.2.1, the Hamming weight leakage
model (Eq. 4.4) is a good estimate for leakages of our DUT. We
compute the hypothetical signal corresponding to .aj , denoted
.Hj , as follows:
Hj = wt(vj),  j = 1, 2, . . . , 10,000.  (4.62)
a1 = 900,  a2 = 1083,  a3 = 881,  a4 = 852.
Then
v1 = FA,  v2 = F3,  v3 = 3F,  v4 = 79.
Then the hypothetical signals from DPA-RSA Step 5 are given by (Eq. 4.62)
H1 = wt(FA) = wt(11111010) = 6,
H2 = wt(F3) = wt(11110011) = 6,
H3 = wt(3F) = wt(00111111) = 6,
H4 = wt(79) = wt(01111001) = 5.
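These values can be reproduced directly. The sketch below assumes the Montgomery constant r = 2^11 = 2048, the smallest power of two exceeding n = 1189; with that assumption it matches the vj and Hj listed above.

```python
n, r = 1189, 2048                    # toy modulus and assumed Montgomery constant

def hyp_signal(a):
    """Target intermediate value (Eq. 4.61) and its Hamming weight
    (Eq. 4.62): the low byte of a*r mod n."""
    v = (a * r) % n & 0xFF
    return v, bin(v).count("1")

for a in (900, 1083, 881, 852):
    v, h = hyp_signal(a)
    print(f"a = {a}: v = {v:02X}, H = {h}")
# a = 900 gives v = FA, H = 6; a = 852 gives v = 79, H = 5
```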
The sample correlation coefficients for all time samples are shown in Fig. 4.74.
We can see a sequence of 18 patterns. To recover the secret key, we need the help of
SPA. We have discussed before that there are 18 patterns in Fig. 4.73, and each of
them most likely corresponds to one execution of MonPro.
If we put Figs. 4.73 and 4.74 together, we get Fig. 4.75. We can see that the 18
patterns corresponding to sample correlation coefficients and those corresponding
to leakages coincide. Thus, we can assume each pattern in Fig. 4.74 represents one
execution of MonPro.
Let us then take a closer look at Fig. 4.74. We can see there are mainly two types of patterns: one with a lower peak and one with a higher peak and a small high peak at the end of the pattern. They are highlighted in green and blue, respectively, in Fig. 4.76.

Fig. 4.74 Sample correlation coefficients rt (Eq. 4.63) for time samples t = 1, 2, . . . , 9500. We can see a sequence of 18 patterns

Fig. 4.75 Sample correlation coefficients from Fig. 4.74 (in red) with one power trace from Fig. 4.73 in gray. We can see that the 18 patterns corresponding to sample correlation coefficients and those corresponding to leakages coincide

Fig. 4.76 There are mainly two types of patterns in Fig. 4.74: one with a lower peak and one with a higher peak and a small high peak at the end of the pattern. In this figure, they are highlighted in green and blue, respectively
We know that the last pattern in Fig. 4.76 corresponds to line 10 in Algorithm 4.4.
Then each of the remaining 17 patterns represents the computation of either line 9
or line 7. Let S and M denote the modular square and modular multiplication
computations in lines 7 and 9, respectively. Since .ar is only used in M, we can
assume that a higher peaked (blue-colored) pattern corresponds to M. Consequently,
a lower peaked (green-colored) pattern corresponds to S. Using Fig. 4.76, we
can deduce the sequence of square and multiply operations in one execution of
Algorithm 4.4:
SMSSMSMSMSSMSSMSM.
A loop in Algorithm 4.4 contains either a single S or SM. We can then map this sequence of operations into different loops (separated by spaces):

SM S SM SM SM S SM S SM SM.

Reading off the key bits of the loops gives

1 0 1 1 1 0 1 0 1 1,

i.e.,

d = 1011101011₂ = 747.

4.5 Countermeasures Against Side-Channel Analysis Attacks

4.5.1 Hiding
In Sect. 4.3.2.2, we have discussed the stochastic leakage model. In this part, we
will show a countermeasure that is based on analyzing the stochastic leakage of the
DUT [MSB16].
Recall that with the stochastic leakage model, we can characterize the leakage at
a single time sample. For the countermeasure, we also focus on one time sample.
The coefficients (see Eq. 4.24) of the stochastic leakage model will be estimated
using the measured traces. Based on the estimated leakage model, we choose a
binary code (Definition 1.6.1) that results in a lower SNR at this particular time
sample and makes the attacks require more effort.
In Sects. 4.3.1 and 4.3.2 we have seen attacks based on the Hamming weight
leakage model on PRESENT implementations. Thus, to provide more protection,
we further require the codewords in our code to have the same Hamming weight,
as shown in [HBK23]. In this way, attacks based on the Hamming weight of the
intermediate values will not be possible.
The steps for the countermeasure are as follows:
Code-SCA Step 1 Identify the target instruction and target intermediate value.
As the stochastic leakage model is specific to one time sample,
we need to first decide what is the most vulnerable instruction
and which intermediate value needs to be protected the most.
Let v = v_{mv−1} v_{mv−2} . . . v1 v0 denote the target intermediate value of bit length at most mv. In general, it is recommended that the implementation is done in assembly to identify the most vulnerable instruction.
For our illustrations, we will choose the instruction MOV
for our microcontroller, and we focus on the PRESENT Sbox
output. Hence mv = 4. The operation we implemented is then

MOV r0, a,  (4.64)
C(nC, wH) > 2^{mv},  (4.65)
L(x) = Σ_{s=0}^{nC−1} αs xs + noise,  (4.66)
where noise ∼ N(0, σ²) denotes the noise with mean 0 and variance σ². Estimations for the coefficients αs (s = 0, 1, . . . , nC − 1) will be computed with profiling traces from T2. We can see
that it is important for traces in .T1 and .T2 to be aligned so that
the profiling of .T2 is carried out with the correct POI.
Fig. 4.77 An example of a trace from dataset T1, obtained in Code-SCA Step 3, which corresponds to MOV instruction surrounded by NOPs

Fig. 4.78 SNR values for each time sample computed with dataset T1 obtained in Code-SCA Step 3. The highest point is our POI = 430

qpf = q (P-DPA Step 7). The POI is taken to be the time sample with the highest SNR.
Code-SCA Step 5 Estimate coefficients for the stochastic leakage model. Fol-
lowing SLM Step a–SLM Step c in Sect. 4.3.2.2, we compute
the estimations for .αs in Eq. 4.66 using dataset .T2 . With the
notations from Sect. 4.3.2.2, we have .Mpf = M2 . The POI was
identified in Code-SCA Step 4. Note that the target intermediate value is not the value v from Code-SCA Step 1, but words from F_2^{nC}. Let α̂s denote the estimated value for αs (s = 0, 1, . . . , nC − 1).
Using our dataset .T2 for MOV instruction, .POI = 430, and
.nC = 8, we get the following estimations .α̂s for .αs :
SG(x) = Σ_{s=0}^{nC−1} α̂s xs.  (4.68)
Words = [3F, 5F, 6F, 77, 7B, 7D, 7E, 9F, AF, B7, BB, BD, BE, CF, D7, DB, DD, DE, E7, EB, ED, EE, F3, F5, F6, F9, FA, FC].  (4.70)
TSG[0] = SG(3F) = SG(00111111) = Σ_{i=0}^{5} α̂i ≈ −0.009795.
Algorithm 4.5: Finding the optimal code for encoding countermeasure against SCA

Input: mv, nC, wH, Words, TSG // mv is the maximum bit length of the target intermediate value identified in Code-SCA Step 1; nC is the code length and wH is the Hamming weight for each codeword chosen in Code-SCA Step 2; Words is the table of integers between 0 and 2^{nC} − 1 with Hamming weight wH as discussed in Code-SCA Step 6; TSG is the table of estimated signals for each integer from Words as specified in Eq. 4.69.
Output: An (nC, 2^{mv})-binary code with each codeword having Hamming weight wH
1 code_size = 2^{mv} // number of codewords in our code
2 total_word = C(nC, wH) // total number of words of length nC and Hamming weight wH
3 array of size total_word − code_size + 1 D
4 array of size total_word I
// C will store the codewords
5 array of size code_size C
// Tsorted[0] contains the lowest value from TSG
6 Tsorted = TSG sorted in ascending order
7 for j = 0, j < total_word, j++ do
   // I records the corresponding word in Words for each estimated signal in Tsorted
8  I[j] = Words[index of Tsorted[j] in TSG]
9 for j = 0, j ≤ total_word − code_size, j++ do
   // the jth entry of D is given by the difference between the values in Tsorted[j + code_size − 1] and Tsorted[j]
10  D[j] = Tsorted[j + code_size − 1] − Tsorted[j]
// ind is the index of the smallest value in D
11 ind = arg min_j D[j]
// the code consists of the codewords whose estimated signals lie between Tsorted[ind] and Tsorted[ind + code_size − 1]
12 for j = 0, j < code_size, j++ do
13  C[j] = I[ind + j]
14 return C
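A direct Python transcription of Algorithm 4.5 might look as follows. Here nC and wH are implicit in the Words table, and the signal values in the toy run are made up for illustration; in practice they would come from the profiling step.

```python
def find_optimal_code(m_v, words, t_sg):
    """Algorithm 4.5 sketch: among the candidate words, pick 2^m_v
    codewords whose estimated signals span the smallest range d(A(C))."""
    code_size = 2 ** m_v
    # sort the candidate words by their estimated signal
    order = sorted(range(len(words)), key=lambda j: t_sg[j])
    t_sorted = [t_sg[j] for j in order]     # sorted signals (Tsorted)
    i_sorted = [words[j] for j in order]    # words in the same order (I)
    # range of each window of code_size consecutive sorted signals (D)
    diffs = [t_sorted[j + code_size - 1] - t_sorted[j]
             for j in range(len(words) - code_size + 1)]
    ind = min(range(len(diffs)), key=diffs.__getitem__)
    return i_sorted[ind:ind + code_size]

# toy run: 4 candidate words of length 4 and Hamming weight 2, pick a
# code of 2^1 = 2 codewords whose (made-up) estimated signals are closest
print(find_optimal_code(1, [0b0011, 0b0101, 0b0110, 0b1001],
                        [0.5, 0.1, 0.2, 0.15]))  # [5, 9]
```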
For a finite set A = {a1, a2, . . . , aβ} of real numbers, define

d(A) := max{ |ai − aj| : ai, aj ∈ A }

and

Var(A) := (1/β) Σ_{i=1}^{β} (ai − ā)²,  where  ā = (1/β) Σ_{i=1}^{β} ai.

Since |ai − ā| ≤ d(A) for every i, we have

Var(A) ≤ d(A)².
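The bound can be checked numerically on an arbitrary finite set of reals (the values below are made up):

```python
A = [0.3, -1.2, 0.7, 2.5]
mean = sum(A) / len(A)
var = sum((a - mean) ** 2 for a in A) / len(A)   # Var(A)
d = max(abs(x - y) for x in A for y in A)        # d(A)
print(var <= d * d)  # True
```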
Define

A(C) := {SG(c) | c ∈ C}

to be the set of estimated signals for codewords in C. When C is used for encoding the target intermediate value, the variance of the signal at POI is then given by

Var(X_POI) = Var(A(C)).
The goal of Algorithm 4.5 is to find a C such that .d(A(C)) is the minimum among
all .(nC , 2mv )-binary codes whose codewords have Hamming weight .wH . According
to the above discussions, we can conclude that the SNR of the code found by
the algorithm is also relatively small. Even though this code may not be the one
that achieves the lowest SNR, another code with a lower SNR will have a bigger
.d(A(C)), which might be exploited to improve the attack results.
v = SB_PRESENT(p ⊕ 9).

Then we carried out the measurement for the operation described in Eq. 4.64 with v as the input a. 100,000 traces were collected for random plaintext nibbles p. Attack traces for the protected implementation were obtained in a similar manner. Instead of v, we pass the corresponding codeword from C_(8,16), C_(8,16)[v], as input a in Eq. 4.64. We have also measured 100,000 traces with random plaintext nibbles.
The attacks follow steps from Sect. 4.3.2.3, where we have set the target signal
to be the exact value of .v (or the corresponding codeword). Since we only focus on
one POI, according to Template Step b, we have computed a mean leakage for each
value of .v for unprotected implementation and for each value of codeword in .C(8,16)
for the protected implementation. The mean leakage values for different .v are given
by
It is easy to see that the differences between mean leakages in the first set are
bigger compared to those in the second set. If we compute the variance between
mean leakages in those two sets, we get
This shows that it is more difficult to distinguish between the leakages of codewords
in .C(8,16) than that of different values of .v. Since DPA attacks rely on exploiting the
difference between leakages for different data, we expect the protected implemen-
tation to be more challenging to attack with DPA.
The attack results are shown in Figs. 4.79 and 4.80. Computations of estimations
for success rates and guessing entropy followed Algorithm 4.1, where we have set
max_trace = 1000,  no_of_attack = 100.
We can see that the unprotected implementation can be broken with about 150
traces, while the protected implementation cannot be broken with even 1000 traces.
We note that the number of traces required for a successful attack on unprotected
implementation is more than what we have obtained in Sect. 4.3.2.3 (see Figs. 4.51
and 4.52). This is expected as the highest SNR we have for MOV instruction
(Fig. 4.78) is much less than that for one round of PRESENT (Fig. 4.20).
For comparison, we have also repeated the same steps for the proposed counter-
measure with different values of .wH = 2, 3, 4, 5. The template-based DPA attack
results are shown in Figs. 4.81 and 4.82. We can see that all the codes increase the number of traces needed for a successful attack, and the code with wH = 4 performs the best.
Fig. 4.79 Estimations of success rate computed following Algorithm 4.1 for template-based DPA attack on the MOV instruction taking the PRESENT Sbox output as an input. The black line corresponds to unprotected intermediate values. The blue line corresponds to encoded intermediate values with the binary code C_(8,16) (Eq. 4.71), where all codewords have Hamming weight 6
Fig. 4.80 Estimations of guessing entropy computed following Algorithm 4.1 for template-based DPA attack on the MOV instruction taking the PRESENT Sbox output as an input. The black line corresponds to unprotected intermediate values. The blue line corresponds to encoded intermediate values with the binary code C_(8,16) (Eq. 4.71), where all codewords have Hamming weight 6
Fig. 4.81 Estimations of success rate computed following Algorithm 4.1 for template-based DPA attack on the MOV instruction taking the PRESENT Sbox output as an input. The black line corresponds to unprotected intermediate values. The other lines correspond to encoded intermediate values with (8, 16)-binary codes obtained following Code-SCA Step 1–Code-SCA Step 7, where we have set wH = 2, 3, 4, 5, 6
Fig. 4.82 Estimations of guessing entropy computed following Algorithm 4.1 for template-based DPA attack on the MOV instruction taking the PRESENT Sbox output as an input. The black line corresponds to unprotected intermediate values. The other lines correspond to encoded intermediate values with (8, 16)-binary codes obtained following Code-SCA Step 1–Code-SCA Step 7, where we have set wH = 2, 3, 4, 5, 6
the whole encryption computation will be discussed in Sect. 5.2.1. Different codes
might work for different devices, but as implementers, we would have access to
the device we want to protect and can choose the best code that is suitable for the
device.10
In Sect. 4.4.1 we have seen one SPA attack on RSA implementations that exploits the fact that in the square and multiply algorithm, the multiplication is carried out only when the secret key bit is 1. A natural countermeasure is to always compute the multiplication regardless of the value of the secret key bit. Such an algorithm is called the square and multiply always algorithm [Cor99].
We keep the notations from Sect. 3.3. Let n = pq be the product of two distinct odd primes. Let d ∈ Z*_{ϕ(n)} be the secret key of RSA/RSA signatures. We would like to compute

a^d mod n

for some a ∈ Z_n.
Recall that we have presented the right-to-left (Algorithm 3.7) and left-to-right
(Algorithm 3.8) square and multiply algorithms. Correspondingly, we have the
right-to-left and left-to-right square and multiply always algorithms, detailed in
Algorithms 4.6 and 4.7, respectively. In both algorithms, the modular multiplication is always carried out, and when the secret key bit is 0, its result is discarded (line 6 in Algorithm 4.6 and line 7 in Algorithm 4.7).
As an illustration, let us consider our attack presented in Sect. 4.4.1. With the
square and multiply always countermeasure, Algorithm 4.2 becomes Algorithm 4.8.
With the same experimental setting as described in Sect. 4.1, we have measured one
trace for the computation of Algorithm 4.8 with our DUT. To make sure line 10 will
be executed, we have turned off the compiler optimization. The trace is shown in
Fig. 4.83. We can see that we still observe ten patterns, the same as in Sect. 4.4.1.
But in this case, all of them have more than one peak cluster. We know from the
discussions in Sect. 4.4.1 that this is because each of the patterns corresponds to
one loop from line 5 and each loop contains one modular square (line 6) and one
modular multiplication operation (line 8 or line 10). Thus, we cannot repeat the
same attack as presented in Sect. 4.4.1. However, we can deduce that the secret key
has bit length 10. In practical settings, the bit length will be much bigger. To the best
of our knowledge, this information alone cannot reveal the secret key.
On the other hand, we will show that the square and multiply always algorithm is
still vulnerable to the DPA attack presented in Sect. 4.4.2. With square and multiply
always countermeasure, Algorithm 4.4 becomes Algorithm 4.9.
10 Naturally, creating a different code for every device would be impractical for serial production.
Algorithm 4.6: Right-to-left square and multiply always algorithm for computing modular exponentiation. A hiding-based countermeasure against SCA attacks

Input: n, a, d // n ∈ Z, n ≥ 2; a ∈ Zn; d ∈ Zϕ(n) has bit length 𝓁d
Output: a^d mod n
1 result = 1, t = a
2 for i = 0, i < 𝓁d, i++ do
   // ith bit of d is 1
3  if di = 1 then
     // multiply by a^{2^i}
4    result = result ∗ t mod n
5  else
     // ith bit of d is 0, compute multiplication and discard the result
6    tmp = result ∗ t mod n
   // t = a^{2^{i+1}}
7  t = t ∗ t mod n
8 return result
Algorithm 4.7: Left-to-right square and multiply always algorithm for computing modular exponentiation. A hiding-based countermeasure against SCA attacks

Input: n, a, d // n ∈ Z, n ≥ 2; a ∈ Zn; d ∈ Zϕ(n)
Output: a^d mod n
1 t = 1
2 for i = 𝓁d − 1, i ≥ 0, i−− do
3  t = t ∗ t mod n
   // ith bit of d is 1
4  if di = 1 then
5    t = a ∗ t mod n
6  else
     // ith bit of d is 0, compute multiplication and discard the result
7    tmp = a ∗ t mod n
8 return t
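The left-to-right variant can be sketched in Python as follows; the dummy multiplication makes every iteration execute the sequence SM, so the trace shape no longer depends on the key bits. (In compiled code, care is needed that the dummy operation is not optimized away.)

```python
def square_and_multiply_always(a, d, n):
    """Left-to-right square and multiply always (cf. Algorithm 4.7)."""
    t = 1
    for bit in bin(d)[2:]:          # bits of d, most significant first
        t = (t * t) % n             # square, every iteration
        if bit == "1":
            t = (a * t) % n         # multiply, result kept
        else:
            tmp = (a * t) % n       # dummy multiply, result discarded
    return t

print(square_and_multiply_always(5, 747, 1189) == pow(5, 747, 1189))  # True
```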
With the same experimental setting as in Sects. 4.1 and 4.4.2, one trace for
computation of Algorithm 4.9 is shown in Fig. 4.84. We note that there are 21 similar
patterns in the figure. By examining Algorithm 4.9, we can guess that each of them
might correspond to one loop from line 6 or one execution of MonPro. If the former
is true, we will have a private key d with a bit length bigger than the bit length of n.
Thus, we can conclude that most likely each of them corresponds to one execution of MonPro. Then the last one corresponds to line 12, and the remaining 20 tell us that 𝓁d = 10.
Fig. 4.83 One trace corresponding to the computation of Algorithm 4.8. We can see ten similar
patterns
Fig. 4.84 One trace corresponding to the computation of Algorithm 4.9. We can see 21 similar
patterns. Each of them corresponds to one execution of MonPro
Fig. 4.85 Sample correlation coefficients computed following attack steps from Sect. 4.4.2 with 10,000 traces for the computation of Algorithm 4.9. The trace from Fig. 4.84 is gray in the
background. We can see that there are 21 patterns in the sample correlation coefficient plot that
coincide with those from Fig. 4.84—each corresponds to one execution of MonPro
Sample correlation coefficient
Fig. 4.86 There are mainly two types of patterns in the sample correlation coefficient plot from
Figure 4.85—one with a higher peak cluster (colored in blue) and one with a lower peak cluster
(colored in green). Among the blue-colored patterns, we further divide them into two types—one
with a high peak at the end (in lighter blue) and one without this peak (in darker blue)
1 0 1 1 1 0 1 0 1 1,

d = 1011101011₂ = 747.
v_m = v · m.
As one can imagine, the cryptographic algorithm needs to be changed a bit for us to carry out computations with the masked intermediate values and keep track of all the masks, so that at the end of the encryption we can remove the masks to output the original ciphertext. In general, a masking scheme specifies how masks are applied
to the plaintext and intermediate values, as well as how they are removed from the
ciphertext. There are a few principles we follow for a masking scheme design:
• All intermediate values should be masked during the computation. In particular,
we would apply masks to the plaintext (and the key).
• We assume the attacker does not have knowledge of the masks—otherwise, the
attacker can carry out similar DPA attacks by making hypotheses about the key
values as in Sects. 4.3.1 and 4.3.2.
• When some intermediate values are to be XOR-ed with each other (e.g., in AES
MixColumns operation), different masks should be applied to each of them.
Otherwise, the same valued masks will cancel out.
• Each encryption has a different set of randomly generated masks.
For any function f , the mask that is applied to an input of f is called the input
mask of f . The corresponding mask for the output is called the output mask of f .
Definition 4.5.1 Let f : F_2^{m1} → F_2^{m2} be a function, where m1 and m2 are positive integers. f is said to be linear if for any x, y ∈ F_2^{m1},

f(x ⊕ y) = f(x) ⊕ f(y).
Example 4.5.1
• AddRoundKey operation in AES (Sect. 3.1.2) round function is a linear function.
In fact, bitwise XOR with a round key is a linear function in general.
• DES (Sect. 3.1.1) Sboxes are nonlinear functions. Any Sbox proposed so far for
symmetric block ciphers is nonlinear.
• pLayer in PRESENT (Sect. 3.1.3) round function is linear.
• MixColumns operation in AES is linear (see Remark 3.1.3).
With Boolean masking, it is easy to keep track of the masks with linear operations. Let f be a linear function, and take any input v of f with a corresponding mask m; we have

f(v ⊕ m) = f(v) ⊕ f(m).
Thus, when the input mask is .m, the output mask is given by .f (m). One of the main
challenges in designing a masking scheme is to find ways to keep track of masks for
nonlinear operations.
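The mask-tracking rule for linear functions can be illustrated with a toy example. The 16-bit rotation below is a stand-in linear layer, not an actual cipher operation; any bit permutation is linear over F_2.

```python
def rotl16(x):
    """Toy linear layer: rotate a 16-bit word left by 4."""
    return ((x << 4) | (x >> 12)) & 0xFFFF

v, m = 0x1234, 0xA50F               # value and its Boolean mask
masked = v ^ m
# applying f to the masked value yields f(v) masked by f(m)
print(rotl16(masked) == rotl16(v) ^ rotl16(m))  # True
```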
In this part, we will discuss a masking scheme for AES-128. The scheme was first
proposed in [HOM06], see also [MOP08, Section 9.2.1].
The only nonlinear operation in AES encryption is SubBytes. Let SB denote the AES Sbox. We will consider a table lookup implementation (see Sect. 3.2.1) for the SubBytes operation. We choose an input mask m_in,SB and an output mask m_out,SB for SB. Then we generate a table that implements the masked Sbox, denoted SB_m, such that

SB_m(v ⊕ m_in,SB) = SB(v) ⊕ m_out,SB.  (4.72)

At the start of the encryption, we randomly generate six masks:

m_in,SB, m_out,SB, m0, m1, m2, m3.

m_in,SB and m_out,SB will be the input and output masks for AES Sbox computations.
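Generating the table for SB_m is a single loop over all byte values. The sketch below works for any byte-valued Sbox table; the toy Sbox used here is an arbitrary bijection, not the AES Sbox.

```python
def masked_sbox_table(SB, m_in, m_out):
    """Build SB_m satisfying SB_m(v ^ m_in) = SB(v) ^ m_out for every
    byte v (cf. Eq. 4.72)."""
    table = [0] * 256
    for v in range(256):
        table[v ^ m_in] = SB[v] ^ m_out
    return table

SB = [(x * 7 + 3) % 256 for x in range(256)]   # toy bijective "Sbox"
SB_m = masked_sbox_table(SB, 0x53, 0xCA)
# looking up a masked input returns the Sbox output under the output mask
print(all(SB_m[v ^ 0x53] == SB[v] ^ 0xCA for v in range(256)))  # True
```

Since the table depends on the masks, it must be regenerated (or one of several precomputed tables selected) whenever fresh masks are drawn.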
The masks m'_0, m'_1, m'_2, m'_3 are computed from m0, m1, m2, m3 through the MixColumns matrix:

⎛ m'_0 ⎞   ⎛ 02 03 01 01 ⎞ ⎛ m0 ⎞
⎜ m'_1 ⎟ = ⎜ 01 02 03 01 ⎟ ⎜ m1 ⎟   (4.73)
⎜ m'_2 ⎟   ⎜ 01 01 02 03 ⎟ ⎜ m2 ⎟
⎝ m'_3 ⎠   ⎝ 03 01 01 02 ⎠ ⎝ m3 ⎠
Let us keep the matrix representation of the AES cipher state as in Eq. 3.2. During the encryption, the masking scheme continues as follows. We apply masks m'_0, m'_1, m'_2, m'_3 to the plaintext such that the 4 bytes in row i + 1 are masked with m'_i. Then the cipher state before the initial AddRoundKey is of the format

⎛ s00 ⊕ m'_0  s01 ⊕ m'_0  s02 ⊕ m'_0  s03 ⊕ m'_0 ⎞
⎜ s10 ⊕ m'_1  s11 ⊕ m'_1  s12 ⊕ m'_1  s13 ⊕ m'_1 ⎟   (4.74)
⎜ s20 ⊕ m'_2  s21 ⊕ m'_2  s22 ⊕ m'_2  s23 ⊕ m'_2 ⎟
⎝ s30 ⊕ m'_3  s31 ⊕ m'_3  s32 ⊕ m'_3  s33 ⊕ m'_3 ⎠
We will not detail the masking scheme for the key schedule. For a round key K, we
use the following matrix representation:
⎛ k00 k01 k02 k03 ⎞
⎜ k10 k11 k12 k13 ⎟
⎜ k20 k21 k22 k23 ⎟
⎝ k30 k31 k32 k33 ⎠
We assume that the round keys, except for the last round key, are all masked such that the bytes in row i + 1 are masked with m'_i ⊕ m_in,SB. Then for a round key K, the representation of its masked value in the matrix format will be

⎛ k00 ⊕ m'_0 ⊕ m_in,SB  k01 ⊕ m'_0 ⊕ m_in,SB  k02 ⊕ m'_0 ⊕ m_in,SB  k03 ⊕ m'_0 ⊕ m_in,SB ⎞
⎜ k10 ⊕ m'_1 ⊕ m_in,SB  k11 ⊕ m'_1 ⊕ m_in,SB  k12 ⊕ m'_1 ⊕ m_in,SB  k13 ⊕ m'_1 ⊕ m_in,SB ⎟   (4.75)
⎜ k20 ⊕ m'_2 ⊕ m_in,SB  k21 ⊕ m'_2 ⊕ m_in,SB  k22 ⊕ m'_2 ⊕ m_in,SB  k23 ⊕ m'_2 ⊕ m_in,SB ⎟
⎝ k30 ⊕ m'_3 ⊕ m_in,SB  k31 ⊕ m'_3 ⊕ m_in,SB  k32 ⊕ m'_3 ⊕ m_in,SB  k33 ⊕ m'_3 ⊕ m_in,SB ⎠
After the initial AddRoundKey, according to Eqs. 4.74 and 4.75, the cipher state becomes

⎛ s00 ⊕ m_in,SB  s01 ⊕ m_in,SB  s02 ⊕ m_in,SB  s03 ⊕ m_in,SB ⎞
⎜ s10 ⊕ m_in,SB  s11 ⊕ m_in,SB  s12 ⊕ m_in,SB  s13 ⊕ m_in,SB ⎟   (4.76)
⎜ s20 ⊕ m_in,SB  s21 ⊕ m_in,SB  s22 ⊕ m_in,SB  s23 ⊕ m_in,SB ⎟
⎝ s30 ⊕ m_in,SB  s31 ⊕ m_in,SB  s32 ⊕ m_in,SB  s33 ⊕ m_in,SB ⎠
For round 1–round 9, the changes in cipher states after each operation of AES-
128 with masked implementation are detailed below:
• SubBytes. The SubBytes operation is performed using the table designed for SB_m. By Eq. 4.72, after the SubBytes operation, each byte of the cipher state is masked by m_out,SB:

⎛ s00 ⊕ m_out,SB  s01 ⊕ m_out,SB  s02 ⊕ m_out,SB  s03 ⊕ m_out,SB ⎞
⎜ s10 ⊕ m_out,SB  s11 ⊕ m_out,SB  s12 ⊕ m_out,SB  s13 ⊕ m_out,SB ⎟   (4.77)
⎜ s20 ⊕ m_out,SB  s21 ⊕ m_out,SB  s22 ⊕ m_out,SB  s23 ⊕ m_out,SB ⎟
⎝ s30 ⊕ m_out,SB  s31 ⊕ m_out,SB  s32 ⊕ m_out,SB  s33 ⊕ m_out,SB ⎠
• ShiftRows. ShiftRows does not change the masks; each byte of the cipher state is still masked by m_out,SB.
• MixColumns. Before MixColumns, we change the masks of the cipher state by XOR-ing the four bytes in row i + 1 with m'_i ⊕ m_out,SB. In this way, the input of MixColumns is of the format

⎛ s00 ⊕ m0  s01 ⊕ m0  s02 ⊕ m0  s03 ⊕ m0 ⎞
⎜ s10 ⊕ m1  s11 ⊕ m1  s12 ⊕ m1  s13 ⊕ m1 ⎟
⎜ s20 ⊕ m2  s21 ⊕ m2  s22 ⊕ m2  s23 ⊕ m2 ⎟
⎝ s30 ⊕ m3  s31 ⊕ m3  s32 ⊕ m3  s33 ⊕ m3 ⎠
By the choice of m'_i (see Eq. 4.73), the cipher state at the output of MixColumns is the same as in Eq. 4.74:

⎛ s00 ⊕ m'_0  s01 ⊕ m'_0  s02 ⊕ m'_0  s03 ⊕ m'_0 ⎞
⎜ s10 ⊕ m'_1  s11 ⊕ m'_1  s12 ⊕ m'_1  s13 ⊕ m'_1 ⎟
⎜ s20 ⊕ m'_2  s21 ⊕ m'_2  s22 ⊕ m'_2  s23 ⊕ m'_2 ⎟
⎝ s30 ⊕ m'_3  s31 ⊕ m'_3  s32 ⊕ m'_3  s33 ⊕ m'_3 ⎠
• AddRoundKey. After the AddRoundKey of the round, the cipher state becomes the same as the input of this round, as given in Eq. 4.76:

⎛ s00 ⊕ m_in,SB  s01 ⊕ m_in,SB  s02 ⊕ m_in,SB  s03 ⊕ m_in,SB ⎞
⎜ s10 ⊕ m_in,SB  s11 ⊕ m_in,SB  s12 ⊕ m_in,SB  s13 ⊕ m_in,SB ⎟
⎜ s20 ⊕ m_in,SB  s21 ⊕ m_in,SB  s22 ⊕ m_in,SB  s23 ⊕ m_in,SB ⎟
⎝ s30 ⊕ m_in,SB  s31 ⊕ m_in,SB  s32 ⊕ m_in,SB  s33 ⊕ m_in,SB ⎠
We can repeat the above for every round from round 1 to round 9. Finally, the input
of round 10 is in the form of Eq. 4.76. After SubBytes and ShiftRows in round 10,
the cipher state will be the same as in Eq. 4.77. Thus we require that each byte of the
last round key is masked by .mout, SB . In this way, we will get unmasked ciphertext.
Table 4.4 Relation between the output bits of Sboxes from the Quotient group Q_j^i and the input bits of Sboxes from the corresponding Remainder group R_j^{i+1}. For example, the 0th input bit of SB^{i+1}_{j+4} in R_j^{i+1} comes from the first output bit of SB^i_{4j} in Q_j^i

R_j^{i+1} \ Q_j^i    SB^i_{4j}   SB^i_{4j+1}   SB^i_{4j+2}   SB^i_{4j+3}
SB^{i+1}_j           (0, 0)      (1, 0)        (2, 0)        (3, 0)
SB^{i+1}_{j+4}       (0, 1)      (1, 1)        (2, 1)        (3, 1)
SB^{i+1}_{j+8}       (0, 2)      (1, 2)        (2, 2)        (3, 2)
SB^{i+1}_{j+12}      (0, 3)      (1, 3)        (2, 3)        (3, 3)
We will present two methods for masking PRESENT encryption. Let SB denote the PRESENT Sbox (Table 3.11) for the rest of this part.

Before we go into details of the masking scheme, we introduce the notion of Quotient group and Remainder group. We number the Sboxes in the ith round of PRESENT as SB^i_0, SB^i_1, . . . , SB^i_15, where SB^i_0 is the right-most Sbox in Fig. 3.9. Those Sboxes can be grouped in two different ways: the Quotient group and the Remainder group:
Q_j^i := {SB^i_{4j}, SB^i_{4j+1}, SB^i_{4j+2}, SB^i_{4j+3}},   R_j^i := {SB^i_j, SB^i_{j+4}, SB^i_{j+8}, SB^i_{j+12}},
where .j = 0, 1, 2, 3. Such a grouping allows us to relate the bits for each Sbox
output in round i to bits of each Sbox input in round .i + 1 in a certain way through
pLayer, as shown in Table 4.4. In particular, we observe that:
• Bits of the 0th Sbox (SB^i_{4j}) output in Quotient group Q_j^i are permuted to the 0th bits of Sbox inputs in the corresponding Remainder group R_j^{i+1};
• Bits of the first Sbox (SB^i_{4j+1}) output in Q_j^i are permuted to the first bits of Sbox inputs in R_j^{i+1};
• Bits of the second Sbox (SB^i_{4j+2}) output in Q_j^i are permuted to the second bits of Sbox inputs in R_j^{i+1};
• Bits of the third Sbox (SB^i_{4j+3}) output in Q_j^i are permuted to the third bits of Sbox inputs in R_j^{i+1}.
An illustration is shown in Fig. 4.87.
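This grouping can be checked directly from the PRESENT bit permutation. The sketch below (our own illustration, not code from the book) uses the pLayer rule that state bit i moves to position 16i mod 63, with bit 63 fixed, and verifies the mapping summarized in Table 4.4:

```python
# pLayer of PRESENT: bit i of the state moves to position P(i).
def p_layer_pos(i):
    return 63 if i == 63 else (16 * i) % 63

# Output bit b of Sbox SB_{4j+s} (in Q_j) should become
# input bit s of Sbox SB_{j+4b} (in R_j).
for j in range(4):            # Quotient/Remainder group index
    for s in range(4):        # Sbox SB_{4j+s} within Q_j
        for b in range(4):    # output bit of that Sbox
            dst = p_layer_pos(4 * (4 * j + s) + b)
            assert dst // 4 == j + 4 * b   # lands in Sbox SB_{j+4b}
            assert dst % 4 == s            # ... at input bit s
```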
Hence pLayer can be considered as four identical parallel bitwise operations,
where each is a function p : F_2^16 → F_2^16 that takes the output of one Quotient
group to the input of the corresponding Remainder group.
Fig. 4.87 An illustration of the relation of Sbox outputs in a Quotient group to Sbox inputs in the corresponding Remainder group. Sboxes in Quotient
groups Q_0^i, Q_1^i, Q_2^i, Q_3^i and their corresponding Remainder groups R_0^{i+1}, R_1^{i+1}, R_2^{i+1}, R_3^{i+1} are in orange, blue, green, and red colors, respectively
4 Side-Channel Analysis Attacks and Countermeasures
b_15, b_14, ..., b_1, b_0, (4.80)
where each .bj denotes a nibble of the cipher state. At the start of the encryption, we
mask the ith group of four nibbles of the plaintext with m_i, m_i, m_i, m_i (i = 0, 1, 2, 3). This
means the cipher state at the input of round 1 is given by
Then after the addRoundKey operation, the cipher state is of the following
format:
• pLayer. After the pLayer computation, according to our discussion above about
Quotient group, Remainder group, and Eq. 4.79, the cipher state will become (see
Fig. 4.87)
Table 4.5 An example of T2, which specifies the output mask m_{out,SB} for each input mask m_{in,SB}
of the PRESENT Sbox [SBM18] such that all possible values of m_in ⊕ m_out appear

  m_{in,SB}                   0 1 2 3 4 5 6 7 8 9 A B C D E F
  m_{out,SB} = T2[m_{in,SB}]  E 4 F 9 0 3 D 5 7 8 A 2 B 1 6 C
  m_{in,SB} ⊕ m_{out,SB}      E 5 D A 4 6 B 2 F 1 0 9 7 C 8 3
which is the same as in Eq. 4.81. Thus the above can be repeated for all 31 rounds.
We assume the last round key has the same masks as the plaintext. Then after the
final addRoundKey operation, we will get unmasked ciphertext.
The second masking scheme for PRESENT is detailed in [SBM18]. Different
from the masked AES Sbox lookup table, this time we compute a lookup table,
denoted T1, such that for any .v ∈ F42 , any input mask .min ∈ F42 , and the
corresponding output mask .mout ∈ F42 for PRESENT Sbox,
We also need another table T2 that helps us to keep track of the masks
T2[m_in] = m_out,  m_in = 0, 1, ..., F. (4.83)
In this way, we do not need to generate a masked Sbox lookup table whenever the
input mask for the Sbox changes. The size of T1 is 8 × 4, and the storage required
is 2^8 × 2^4 = 2^12 bits or 2^9 bytes. The table T2 requires 16 bits of memory. It is
suggested that T2 should be designed such that all possible values of .min ⊕ mout
appear. For example, one possible choice of T2 is given in Table 4.5, originally
presented in [SBM18].
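As a quick sanity check (our own sketch, not from [SBM18]), one can verify that the example T2 of Table 4.5 is a bijection on F_2^4 and that all 16 values of m_in ⊕ m_out indeed appear:

```python
# The example T2 from Table 4.5, indexed by the 4-bit input mask.
T2 = [0xE, 0x4, 0xF, 0x9, 0x0, 0x3, 0xD, 0x5,
      0x7, 0x8, 0xA, 0x2, 0xB, 0x1, 0x6, 0xC]

assert sorted(T2) == list(range(16))          # T2 is a permutation of F_2^4
diffs = {m ^ T2[m] for m in range(16)}
assert diffs == set(range(16))                # every m_in XOR m_out occurs
```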
In fact, in general, we have the following observations:
Remark 4.5.1 Let f be a function, and let m_{in,f} denote its input mask with
corresponding output mask m_{out,f}. For any input x of f, we have

(x ⊕ m_{in,f}) ⊕ (f(x) ⊕ m_{out,f}) = (x ⊕ f(x)) ⊕ (m_{in,f} ⊕ m_{out,f}).

Thus, when choosing the input mask m_{in,f} and its corresponding output mask
m_{out,f}, we need to ensure that all possible values of m_{in,f} ⊕ m_{out,f} appear.
Otherwise, the distribution induced by (x ⊕ f(x)) ⊕ (m_{in,f} ⊕ m_{out,f}) will not be
uniform, and the signal corresponding to the value of x ⊕ f(x) cannot be properly
concealed, making it vulnerable to DPA attacks.
Since the pLayer operation is linear, we can simply apply pLayer to the masks to
keep track of their changes. We use the same notation as in Eq. 4.80 for PRESENT
cipher state. At the beginning of one encryption, we randomly generate 16 masks,
each applied to one nibble of the plaintext. Suppose the cipher state at the input
of round i is of the following format:
b_15 ⊕ m^{i−1}_{15,in}, b_14 ⊕ m^{i−1}_{14,in}, ..., b_1 ⊕ m^{i−1}_{1,in}, b_0 ⊕ m^{i−1}_{0,in}.
• sBoxLayer. Let

m^{i−1}_{j,out} = T2[m^{i−1}_{j,in}],  j = 0, 1, ..., 15,

denote the output mask for PRESENT Sbox corresponding to the input mask
m^{i−1}_{j,in}. Then after sBoxLayer, the cipher state is of the following format:

b_15 ⊕ m^{i−1}_{15,out}, b_14 ⊕ m^{i−1}_{14,out}, ..., b_1 ⊕ m^{i−1}_{1,out}, b_0 ⊕ m^{i−1}_{0,out},
• pLayer. We apply the pLayer operation to both the cipher state and the mask for
the whole cipher state. The mask for the whole cipher state is the string obtained
by concatenating all 16 masks m^{i−1}_{j,out}:

m^{i−1}_{15,out}, m^{i−1}_{14,out}, ..., m^{i−1}_{1,out}, m^{i−1}_{0,out}.

After pLayer, masks for each nibble of the cipher state will be changed and the
cipher state will become

b_15 ⊕ m^i_{15,in}, b_14 ⊕ m^i_{14,in}, ..., b_1 ⊕ m^i_{1,in}, b_0 ⊕ m^i_{0,in},

where the new masks are obtained by applying pLayer to the concatenated mask string:

m^i_{15,in} ‖ ... ‖ m^i_{0,in} = pLayer(m^{i−1}_{15,out} ‖ ... ‖ m^{i−1}_{0,out}).

Consequently, m^i_{j,in} will be the input mask for the jth Sbox in round i + 1.
Finally, after 31 rounds, we have another addRoundKey operation, which does not
change the masks of the cipher state since the round keys are not masked. The cipher
state will be
b_15 ⊕ m^{31}_{15,in}, b_14 ⊕ m^{31}_{14,in}, ..., b_1 ⊕ m^{31}_{1,in}, b_0 ⊕ m^{31}_{0,in}.
To get the unmasked ciphertext, we remove the masks by XOR-ing the cipher state
with
m^{31}_{15,in}, m^{31}_{14,in}, ..., m^{31}_{1,in}, m^{31}_{0,in}.
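Since pLayer is a bit permutation, it is linear over F_2; applying it separately to the masked state and to the concatenated mask string therefore preserves the masking relation. A minimal sketch (our own, using an assumed 64-bit integer encoding of the cipher state):

```python
import random

# pLayer as a permutation on a 64-bit integer state.
def p_layer(x):
    y = 0
    for i in range(64):
        if (x >> i) & 1:
            y |= 1 << (63 if i == 63 else (16 * i) % 63)
    return y

random.seed(0)
state = random.getrandbits(64)   # cipher state
mask = random.getrandbits(64)    # concatenation of the 16 nibble masks
# Linearity: pLayer(state ^ mask) == pLayer(state) ^ pLayer(mask)
assert p_layer(state ^ mask) == p_layer(state) ^ p_layer(mask)
```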
Fig. 4.88 t-Values (Eq. 4.17) for all time samples .1, 2, . . . , 3600 computed with 50 traces from
Masked fixed dataset A and 50 traces from Masked fixed dataset B. The signal is given by the
plaintext value, and the fixed versus fixed setting is chosen. Blue dashed lines correspond to the
thresholds 4.5 and −4.5
• Masked random plaintext dataset: This dataset contains 20000 traces with a fixed
round key

FEDCBA0123456789. (4.84)
Fig. 4.89 SNR computed with Masked random dataset. The signal is given by the exact value of
the 0th Sbox output
output of the first round. We consider the target signal (P-DPA Step 5) to be the
exact value of v, since in Sect. 4.3.2.3 we have seen that this is a better choice than
taking wt(v) to be the target signal (see Fig. 4.52). Consequently, we group our
profiling traces Masked random dataset into 16 sets (P-DPA Step 6). Using the
methodology from P-DPA Step 7 and P-DPA Step 8, we have computed the SNR
values for all 3600 time samples using Masked random dataset. The results are
shown in Fig. 4.89.
The time sample achieving the highest SNR is t = 1929, which will be
our POI (Template Step a from Sect. 4.3.2.3). Following Template Step b from
Sect. 4.3.2.3, we have built the template for this POI, obtaining the mean values
μ_s for s = 0, 1, ..., 15 (corresponding to v = 0, 1, ..., 15).
We take the Masked random plaintext dataset as our attack traces (P-DPA Step
10). There are 16 key hypotheses k̂_i = i − 1 (i = 1, 2, ..., 16). Based on our
implementation, the hypothetical intermediate value should be given by
where .pj is the 0th nibble of the plaintext corresponding to the j th trace, .m0,j is
the input mask applied to this nibble, and .M̂p ≤ 20000 is the number of traces used
for the attack. However, as mask values are unknown to the attacker, we will only
compute the unmasked hypothetical intermediate value (P-DPA Step 11)
v̂^i_j = SB_PRESENT(k̂_i ⊕ p_j),  i = 1, 2, ..., 16,  j = 1, 2, ..., M̂_p.
Fig. 4.90 Estimations of guessing entropy computed following Algorithm 4.1 for template-based
DPA attacks on the Masked random plaintext dataset (in black) and on the Random plaintext
dataset (in red)
Fig. 4.91 Estimations of guessing entropy computed following Algorithm 4.1 for template-based
DPA attacks on the Masked random plaintext dataset (in black) and on the Random plaintext
dataset (in red)
Since we chose the signal to be the exact value of .v, our leakage model will be the
identity leakage model. The hypothetical signals are given by (P-DPA Step 12)
Following Template Step c, we can compute the probability score for each key
hypothesis. Then with Algorithm 4.1, we can calculate the estimations for guessing
entropy and success rate of the attack. We have set

max_trace = 60,  no_of_attack = 100.
In practice, more than one mask will be applied to provide better protection,
leading us to higher order masking. We will briefly introduce this notion in Sect. 4.6.
a^d mod n

for some a ∈ Z_n. Those attacks can take place during the RSA signature signing process or
RSA decryption. More attacks will be discussed in Sect. 4.6.
Given those attacks, it is recommended to blind the secret values during the
computation. It is also required that the masks and blinded values be updated
frequently, or even during the computations. In this case, it will be difficult for the
attacker to combine partial information obtained from the leakages of a
previously blinded value with the newly leaked information.
In this part, we will discuss a few methods, including exponent blinding, message
blinding, and modulus blinding. The countermeasures are mostly designed against
DPA attacks. In particular, the message blinding method will be effective against the
DPA attack we have presented in Sect. 4.4.2. The original proposals can be found
in [BCDG10, KJJR11].
Exponent blinding First, we consider how we can randomize the secret exponent
d. One method is to generate a random number λ ∈ [0, 2^𝓁 − 1]. Then instead of
computing

a^d mod n,

we compute

a^{d+λϕ(n)} mod n. (4.85)
Since a^{ϕ(n)} ≡ 1 (mod n) whenever gcd(a, n) = 1 (Euler's theorem), we have

a^{d+λϕ(n)} ≡ a^d (mod n).

For example, with n = 15, d = 3, and a = 8, a^d mod n = 8^3 mod 15 = 2.
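A numerical check of this first method (our own sketch, using the toy parameters n = 15, ϕ(n) = 8, d = 3, a = 8 from the example):

```python
n, phi, d, a = 15, 8, 3, 8
ref = pow(a, d, n)                    # unblinded result
assert ref == 2
# For any blinding factor lambda, the blinded exponentiation agrees,
# since a^phi(n) = 1 (mod n) when gcd(a, n) = 1.
for lam in range(10):
    assert pow(a, d + lam * phi, n) == ref
```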
A second method takes a random number λ and computes

a^λ × a^{d−λ} mod n.
A third method generates a random number λ such that gcd(λ, ϕ(n)) = 1 and
calculates

(a^λ)^{d(λ^{−1} mod ϕ(n))} mod n. (4.86)
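A quick check of this third method (our own sketch), assuming the toy parameters n = 15, ϕ(n) = 8, d = 3, a = 8, and λ = 3 so that gcd(λ, ϕ(n)) = 1:

```python
n, phi, d, a, lam = 15, 8, 3, 8, 3
lam_inv = pow(lam, -1, phi)           # 3^-1 mod 8 = 3 (Python 3.8+)
assert lam_inv == 3
# (a^lam)^(d * (lam^-1 mod phi)) mod n equals a^d mod n.
assert pow(pow(a, lam, n), d * lam_inv, n) == pow(a, d, n) == 2
```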
Since λ · d(λ^{−1} mod ϕ(n)) ≡ d (mod ϕ(n)), Eq. 4.86 again computes
a^d mod n = 2.
For example, take λ = 3, which is coprime with ϕ(n) = 8. From

8 = 3 × 2 + 2,  3 = 2 + 1 ⟹ 1 = 3 − 2 = 3 − (8 − 3 × 2) = 3 × 3 − 8,

we have λ^{−1} mod 8 = 3, and (8^3)^{3×3} mod 15 = 8^{27} mod 15 = 2.
To compute

a^d mod n

with the CRT, we first compute a_p = a^{d mod (p−1)} mod p and a_q = a^{d mod (q−1)} mod q
(Eq. 4.87), and then recombine them as

a_p y_q q + a_q y_p p mod n,  or equivalently  a_p + ((a_q − a_p) y_p mod q) p,

where

y_q = q^{−1} mod p,  y_p = p^{−1} mod q.
The countermeasure takes two random numbers λ_1, λ_2, and instead of computing
a_p, a_q with Eq. 4.87, we calculate

a^{d+λ_1(p−1)} mod p = a^{d mod (p−1)} mod p,  a^{d+λ_2(q−1)} mod q = a^{d mod (q−1)} mod q.
p = 3,  q = 5,  n = 15,  d = 3,  a = 8.
a_q = a^{d+λ_2(q−1)} mod q = 8^{3+3×4} mod 5 = 8^{15} mod 5 = 3^{15} mod 5 = 3 × (3^2)^7 mod 5
= 3 × 4^7 mod 5 = 3 × 4 × (4^2)^3 mod 5 = 12 mod 5 = 2.
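The blinded CRT computation can be checked numerically as well (our own sketch; taking λ_1 = λ_2 = 3 for illustration):

```python
p, q, d, a = 3, 5, 3, 8
l1 = l2 = 3
ap = pow(a, d + l1 * (p - 1), p)   # = a^(d mod (p-1)) mod p
aq = pow(a, d + l2 * (q - 1), q)   # = a^(d mod (q-1)) mod q
yp = pow(p, -1, q)                 # p^-1 mod q, for Garner's recombination
x = ap + ((aq - ap) * yp % q) * p
assert x == pow(a, d, p * q) == 2
```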
Our SCA attacks from Sect. 4.4 rely on exploiting the leakages to get the
value of each bit of the secret exponent d. We can see that for all the exponent
blinding methods above, assuming an attack on one RSA decryption (or RSA
signature signing) execution, with the same methods, we can only recover the
value of d plus a random number or d times a random number, keeping the real value of d
concealed from the attacker. On the other hand, if two computations with different
masks are attacked, the secret key can be recovered. For example, with the first
countermeasure, if we know the values of

d + λ_1 ϕ(n),  d + λ_2 ϕ(n),

then their difference gives

(λ_1 − λ_2) ϕ(n).

Since λ_1 and λ_2 have bit length 20 or 30, we can factorize (λ_1 − λ_2)ϕ(n) by trying
all possible values of λ_1 and λ_2.
Message blinding We can also mask the value a. In this way, DPA attacks (e.g.,
the attack in Sect. 4.4.2) that rely on knowing certain intermediate values related to
a cannot be carried out.
Take a random number .λ such that .gcd(λ, n) = 1, and compute
a_1 = λ^e mod n,  a_2 = λ^{−1} mod n.
To get

a^d mod n,

we calculate

(((a a_1)^d mod n) a_2) mod n.

Since

ed ≡ 1 (mod ϕ(n)),
by Corollary 1.4.5, λ^{ed} ≡ λ (mod n). Then

(((a a_1)^d mod n) a_2) mod n = (((a λ^e mod n)^d mod n)(λ^{−1} mod n)) mod n
= a^d λ^{ed} λ^{−1} mod n = a^d λ λ^{−1} mod n = a^d mod n.

The first mask a_1 randomizes the input of the computation, and the second mask a_2
corrects the output to the expected result.
Example 4.5.5 Keep the same parameters as in Example 4.5.2:
p = 3,  q = 5,  n = 15,  e = 3,  d = 3,  a = 8,  ϕ(n) = 8.

We know that a^d mod n = 2. Take λ = 4, which is coprime with n. Then with the
message blinding countermeasure above, a_1 = λ^e mod n = 4^3 mod 15 = 4. From

15 = 4 × 3 + 3,  4 = 3 + 1 ⟹ 1 = 4 − 3 = 4 − (15 − 4 × 3) = 4 × 4 − 15,

we get a_2 = λ^{−1} mod n = 4. Finally,

(((a a_1)^d mod n) a_2) mod n = ((8 × 4)^3 mod 15) × 4 mod 15 = 8 × 4 mod 15 = 2 = a^d mod n.
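The message blinding computation of Example 4.5.5 can be replayed in a few lines (our own sketch, not code from the book):

```python
n, e, d, a, lam = 15, 3, 3, 8, 4
a1 = pow(lam, e, n)        # blind the input:   4^3 mod 15 = 4
a2 = pow(lam, -1, n)       # unblinding factor: 4^-1 mod 15 = 4
blinded = pow(a * a1 % n, d, n)
assert blinded * a2 % n == pow(a, d, n) == 2
```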
Modulus blinding When the modulus is random during the computations, similar
to random values of a, DPA attacks such as the one in Sect. 4.4.2 cannot be carried
out as the attacker does not know the modulus to derive the target intermediate
values.
For blinding the modulus n, we generate a random number λ, compute a^d mod λn, and reduce the result modulo n to obtain a^d mod n.
p = 3,  q = 5,  n = 15,  e = 3,  d = 3,  a = 8,  ϕ(n) = 8.
Remark 4.5.2 Note that the message blinding and the modulus blinding methods
we have presented can also be used in a similar way to protect the computations of
a^d mod p and a^d mod q in CRT-based implementations.
Leakage model We note that the Hamming distance, Hamming weight, identity
(Sect. 4.2.1), and stochastic leakage (Sect. 4.3.2.2) models all assume there is no
difference in the leakage when the value of a bit switches from 0 to 1 or from 1 to
0. Improved models can be found in, e.g., [PSQ07, GHP04].
Leakage assessment TVLA (see Sect. 4.2.3) was first proposed in 2011
[GGJR+ 11]. More discussions on how to set the threshold 4.5 can be found
in [DZD+ 18]. Another prominent leakage assessment method is Pearson's χ 2 -
test [SM15], which is normally used as a replacement for TVLA when analyzing
multivariate and horizontal leakages.
Simple power analysis We have seen that by visual inspection of the power traces,
the attacker can gain information about the operations being executed on the device.
SPA was first introduced in [KJJ99], which is also the very first proposal of power
analysis attacks. The authors mentioned that programs involving conditional branch
operations depending on secret parameters are at risk. Later this idea was applied to
develop an SPA attack on RSA [MDS99b] (see Sect. 4.4.1).
[Nov02] (see also [KJJR11, Section 3.3]) proposes an attack that exploits
vulnerability in Garner’s algorithm for CRT-based RSA. The authors demonstrate
that with SPA, we can identify if a mod p > a mod q. Then with adaptive chosen
ciphertext and binary search, the value of p can be recovered. [FMP03] shows
that with only known messages, assuming p and q have different lengths, in case
q < p/2𝓁 , p and q can be recovered by performing 60 × 2𝓁 signatures on average.
A lower bound of 𝓁 is specified in the paper.
SPA has also been used to obtain the Hamming weight of operands [MS00] or to
attack the AES key schedule [Man03]. Similar to profiled DPA, we can carry out a
profiled SPA attack; see, e.g., [Man03, Section 5.3].
Differential power analysis A DPA attack on DES can be found in, e.g.,
[MDS99a]. For AES, detailed descriptions are given in [MOP08, Chapter 6].
For DPA attacks on RSA, [MDS99b] lists different variants of DPA on RSA,
where some can be considered as extended SPA attacks. [dBLW03] proposes a
DPA attack on CRT-based implementation using Garner’s algorithm. The target
intermediate value is the remainder after the modular reduction with one of the
primes. [AFV07] studies more attacks on other intermediate values of CRT-based
RSA. We have elaborated one of the methods in Sect. 4.4.2.
We also refer the readers to [MOP08, Sta10, KJJR11] for more discussions on
SPA and DPA.
Template attacks The idea of template attacks was first introduced in [CRR03]. In
Sect. 4.3.2.3, we discussed how templates can be used for DPA on symmetric block
ciphers. In a similar manner, template-based attacks can also be applied to SPA
on symmetric block ciphers [MOP08, Section 5.3], and SCA on RSA [VEW12,
XLZ+ 18].
We note that the template attacks we have described used normal distributions
to approximate the distributions induced by leakages. One might refer to this
as a Gaussian template attack. A more generic method, MIA, can be found
in [GBTP08], where the authors aim to approximate the mutual information
between the hypothetical leakages and the actual measured leakages without making
assumptions on the leakage distribution.
SCADPA Side-channel assisted differential plaintext attack (SCADPA) was first
proposed in [BJB18] for PRESENT and in [BJHB19] for GIFT implementations.
It was later generalized to all SPN block ciphers in [BBH+ 20]. The attack presented
in Sect. 4.3.3 is based on this generalized attack. We refer the readers to the original
paper [BBH+ 20] for attacks on more ciphers and analysis of attack complexity.
More attacks Other side-channel attack methods exist for symmetric block
ciphers. For example, collision attacks [SWP03] identify the collision of
intermediate values between two encryptions using power traces to recover
the secret key. Algebraic side-channel attacks [RS09] express both the target
algorithm and its leakages as equations to achieve successful attacks with unknown
plaintext/ciphertext. Soft analytical SCA [VCGS14] constructs a graph for the
implementation and uses the belief propagation algorithm on this graph to efficiently
combine the information of all leakage points. DCSCA (differential ciphertext
SCA) [HBB21] targets GIFT cipher. The attack analyzes the statistical distribution
of intermediate values with the help of side-channel leakages to recover the last
4.6 Further Reading 347
round key. The authors also demonstrated the extension of the attack to GIFT-based
AEAD schemes.
Preprocessing of traces During the measurements, it can happen that the traces
contain too much noise, or, if certain countermeasures are in place, that the traces
are misaligned. There are various classical methods for preprocessing
traces. For example, moving average computes the average of the leakages from
a few time samples to smooth the signal. Principal component analysis [BHvW12]
aims to reduce the noise in the traces by projecting high-dimensional data to a
lower dimensional subspace while preserving the data variance. Elastic alignment
[vWWB11] aligns the traces by focusing on the synchronization of trace shape and
generating artificial samples. It can be used to counter jitter-based countermeasures.
The method is based on the dynamic time warping algorithm designed for speech
recognition [SC78].
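As an illustration, the simplest of these preprocessing steps, a moving average, can be sketched as follows (the window length is an assumption for illustration, not a recommendation from the literature):

```python
def moving_average(trace, window=5):
    """Average each group of `window` consecutive time samples."""
    return [sum(trace[i:i + window]) / window
            for i in range(len(trace) - window + 1)]

smoothed = moving_average([1, 2, 3, 4, 5], window=3)
assert smoothed == [2.0, 3.0, 4.0]
```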
Hiding-based countermeasure A hiding-based countermeasure aims to make the
leakage random or constant independent of the operation/data.
To randomize the leakage, we can insert random delays (jitters) [CK09] or shuffle
the execution order of independent operations, for example, by shuffling the Sboxes
in AES implementations [HOM06] or by randomizing the sequence of square and multiply
operations in RSA [Wal02]. Another approach to randomizing the leakage proposes
to use residue number systems to allow randomizing the representation of finite field
elements for computing exponentiation [BILT04].
To make the leakage constant, different methodologies have been proposed on
different levels. For the cell level (or logic design level), we have, for example,
dual-rail precharge logic (DPL) [TV06] and dynamic and differential logic styles
[TAV02]. DPL has two phases: in the precharge phase, values in the wires are set to
a precharge value (either 0 or 1); then during the evaluation phase, one wire carries
the signal 0, and the other wire carries the signal 1. We note that this is equivalent
to using the binary code {01, 10} for encoding 0, 1. For the software level, we have
seen encoding-based countermeasures for symmetric block ciphers in Sect. 4.5.1.1
and square and multiply always algorithm for RSA in Sect. 4.5.1.2. The original
proposal of the square and multiply always algorithm can be found in [Cor99]. More
on encoding-based countermeasures can be found in, e.g., [CG16]. [CG16] uses
linear complementary dual code—a code C is a complementary dual code if C ∩
C ⊥ = {0} (see Definition 1.6.9). Another example of software level countermeasure
can be found in [HDD11], where the authors propose to use DPL in software for
symmetric block ciphers. See also [RGN13] for a DPL in software countermeasure
with provable security for bitsliced implementation of PRESENT.
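The dual-rail code {01, 10} mentioned above can be illustrated with a tiny sketch (our own): in every evaluation-phase codeword exactly one wire is set, so the Hamming weight of the encoded bit is constant:

```python
def dual_rail(bit):
    # Encode one bit as (complement wire, true wire): 0 -> (1, 0), 1 -> (0, 1).
    return (1 - bit, bit)

for b in (0, 1):
    w0, w1 = dual_rail(b)
    assert w0 + w1 == 1   # constant Hamming weight, independent of b
```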
Masking-based countermeasures Those countermeasures are designed to make
the leakage dependent on some random value. Masking was first proposed by
Goubin and Patarin [GP99] and Chari et al. [CJRR99] independently. It has
been proven that masking-based countermeasure is secure given that the source of
randomness is truly random [PR13]. Due to this sound mathematical basis, it has
become the most adopted countermeasure for symmetric block ciphers.
We can consider the value of m as a discrete random variable. In case the distribution
induced by this random variable is uniform on F_2^{m_v}, the distribution induced by the
value of v ⊕ m is also uniform on F_2^{m_v}, regardless of the value of v. Thus, we expect the leakage
to be independent of v when only first-order DPA is carried out. The security proof
for first-order Boolean masking against first-order DPA can be found in [BGK04].
Results for higher order Boolean masking are given in [RP10]. However, the proofs
rely on the masks to be truly random, which is not easy to achieve in practice. For
example, our masked implementation from Sect. 4.5.2.3 can still be attacked with
first-order DPA (see Figs. 4.90 and 4.91). We also note that the choice of masks
should follow certain rules so that the masking scheme is more secure (see, e.g.,
[BGN+ 15]).
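The uniformity argument behind Boolean masking can be illustrated for 4-bit values (our own sketch):

```python
from collections import Counter

# For every fixed v, v ^ m is uniform when the mask m is uniform on F_2^4.
for v in (0x0, 0x5, 0xF):
    counts = Counter(v ^ m for m in range(16))
    assert len(counts) == 16                      # all 16 values occur ...
    assert all(c == 1 for c in counts.values())   # ... exactly once
```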
Blinding Blinding was first suggested in [Koc96]. It was later formalized by
J. S. Coron [Cor99]. It is worth noting that several patents have been published
about masking [KJJ10] and blinding [KJ01].
Various attacks on blinding have also been published. For example, [FV03] proposes
an attack on the left-to-right square and multiply algorithm that recovers a
blinded secret exponent with SPA. [WvWM11] discusses a DPA attack on the
square and multiply always algorithm and message blinding. [FRVD08] exploits
the leakage during the computation of the random exponent.
More about countermeasures For SCA countermeasures, except for those intro-
duced in this chapter, there are also many other techniques. In general, we can divide
them according to the levels of protection.
Protocol level countermeasures aim to design cryptographic protocols that survive
leakage analysis, for example, by limiting the number of communications that can
be performed with any given key, so that fewer measurements can be done by the
attacker for the same key, or by rekeying [MSGR10].
Cryptographic primitive level countermeasures are proposals of new cipher
designs that are resistant to side-channel attacks.
Implementation level countermeasures were the focus of this chapter, where
we discussed some hiding and masking/blinding techniques in Sect. 4.5. There are
also other implementation-level countermeasures, for example, time randomization
[MMS01b] and encryption of the buses [BHT01].
Architecture level countermeasures refer to techniques that modify the archi-
tecture of the computation device. For example, [MMS01a] proposes to use
a nondeterministic processor to randomly change the sequence of the executed
program during each execution; [SVK+ 03] integrates secure instructions into a
nonsecure processor.
Hardware level countermeasures protect the implementations through external
means, for example, conforming glues [AK96], protective coating [TSS+ 06], and
detachable power supplies [Sha00].
Attacks on post-quantum cryptographic implementations Several papers pro-
pose SCA on post-quantum cryptosystems.
AI-based methods have been applied for side-channel analysis in the past few years.
If we look at DPA (Sects. 4.3.1, 4.3.2, and 4.4.2), the key recovery is essentially
a classification problem. In particular, in a profiled setting, the profiling phase
corresponds to the training phase of an AI-based algorithm. During the attack phase,
the analysis of the leakage traces can be seen as a classification problem where
the goal of an attacker is to classify those traces based on the related data (e.g.,
a specific Sbox output value). Various AI-based techniques have been adopted for
SCA, e.g., k-nearest neighbor algorithm [MZMM16], random forest [LBM15],
support vector machines [HZ12], multilayer perceptron (MLP) [GHO15], and
convolutional neural networks (CNNs) [ZBHV20]. It has also been shown that, with
neural networks, protected implementations can be broken. For example, [WP20]
used an autoencoder to break hiding countermeasures, while in [MPP16], the authors
successfully broke masking countermeasures with deep learning techniques.
As an example, let us consider the case of a neural network used for the
classification problem in a DPA attack on AES implementations (see Sect. 4.3.1).
The input of the network will then be (part of) the traces. The output layer will have
a softmax activation function, and each class corresponds to one possible value of
the target Sbox output, hence leading to one key byte hypothesis with the knowledge
of the plaintext. Then during the inference, for each input data, the network output
indicates the possibilities of the 256 values for the Sbox output, which gives a
possibility of each of the corresponding key byte hypotheses.
Success rate and guessing entropy Given a few, say M̂_p, data (traces), we can
compute a score for each key hypothesis by summing up the corresponding
probabilities predicted using each data point. Then we can rank the key hypotheses
according to their scores, with the one ranked first having the highest score. Let
us denote the rank of the correct key hypothesis by rk_AI^{M̂_p}. It is easy to see that we
can consider rk_AI^{M̂_p} as a random variable whose randomness comes from different
plaintexts/measurements.
Recall that in Eq. 4.36, we have defined the success rate for a DPA attack. For
AI-based SCA attacks, we have an equivalent definition of success rate, namely
the probability that rk_AI^{M̂_p} = 1. Similarly, we can also define the guessing entropy
(see Eq. 4.37) to be the expectation of the random variable rk_AI^{M̂_p}. Same as for DPA
attacks, we can estimate success rate with the frequency of successful attacks among
a number of trials and estimate guessing entropy using the sample mean of rk_AI^{M̂_p}.
In particular, for a fixed M̂_p, we randomly select M̂_p data from the test set and
carry out an attack with M̂_p traces, and then we compute rk_AI^{M̂_p}. We repeat this
procedure, e.g., 100 times, which gives us a sample of rk_AI^{M̂_p}. Its mean is then
an estimation for the guessing entropy. An estimation for the success rate is the
frequency of rk_AI^{M̂_p} = 1 among those 100 simulated attacks.
In most cases, the goal of AI-based SCA is to achieve a low guessing entropy or
a high success rate with as few traces as possible after training.
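The estimation procedure above can be sketched as follows; the scores are synthetic stand-ins for a network's predicted probabilities, and all concrete numbers are illustrative assumptions:

```python
import random

random.seed(1)
correct_key, n_hyp, n_attacks = 7, 16, 100
ranks = []
for _ in range(n_attacks):
    # Synthetic per-hypothesis scores; the correct key is biased upward
    # to mimic exploitable leakage.
    scores = [random.random() + (0.8 if k == correct_key else 0.0)
              for k in range(n_hyp)]
    order = sorted(range(n_hyp), key=lambda k: -scores[k])
    ranks.append(order.index(correct_key) + 1)    # rank 1 = top score

ge_estimate = sum(ranks) / n_attacks              # sample mean of the rank
sr_estimate = ranks.count(1) / n_attacks          # fraction of rank-1 attacks
assert 1 <= ge_estimate <= n_hyp and 0 <= sr_estimate <= 1
```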
Different research topics in AI-assisted SCA Many different aspects of AI-
assisted SCA have been analyzed by researchers.
Firstly, there are a few publications on public datasets, which are used to evaluate
novel proposals of AI-based techniques. To name a few, ASCAD dataset [BPS+ 20,
BPS+ 21] contains power traces for software implementations of AES with masking
countermeasures and artificially introduced random jitters. The AES_HD [BJP20]
dataset consists of EM traces corresponding to an unprotected AES hardware implementation on
FPGA. The AES_RD [CK09, CK10, CK18] dataset consists of power traces of software
implementations of AES with random delay.
The most studied direction is of course to achieve high success rates or low
guessing entropy. By examining the similarity of side-channel traces to time series
data (e.g., audio signals), [KPH+ 19] proposed a VGG15-like network together
with a regularization method achieved by adding noise to the traces. Zaid et
al. [ZBHV20] introduced a methodology for the design of CNNs in the SCA
context. The paper analyzed several datasets and constructed an optimal CNN
for each dataset. [WJB20] showed an improvement of [ZBHV20] using data
oversampling. [PCP20] used ensemble models to achieve good generalization from
the training set to the validation set for a given dataset. On the other hand, Won
et al. [WHJ+ 21] utilized Multi-scale Convolutional Neural Networks for SCA to
achieve the goal of integrating classical trace preprocessing techniques and attacking
several datasets without changing the network architectures.
Hyperparameter tuning, an important problem in AI algorithm development in
general, naturally attracted attention in the domain of SCA. Various methods have
been proposed, for example, Bayesian optimization and random search [WPP22],
reinforcement learning [RWPP21], and genetic algorithm for choosing architec-
tures [MPP16] or for choosing all hyperparameters [AGF21].
It has been shown that test accuracy in machine learning cannot properly assess
SCA performance [PHJ+ 19]. Because of this observation, many training strategies
are studied, for example, stopping criteria based on success rate [RZC+ 21] or based
on mutual information [PBP21].
Recently, non-profiled AI-based SCA has also gained attention in the research
community. For example, in [Tim19], the authors propose to train a neural network
for each key hypothesis. To do this, the attacker splits the traces based on the key
hypothesis, just like when carrying out DPA. The network that achieves the best
training metrics then reveals the actual key byte. This method was titled Differential
Deep Learning Analysis (DDLA).
Stream ciphers were targeted by a combination of machine learning, mixed
integer linear programming, and satisfiability modulo theory methods [KDB+ 22].
Furthermore, AI-based methods have also been adopted for the identification of
points of interest [LZC+ 21] and leakage assessment [MWM21].
Chapter 5
Fault Attacks and Countermeasures
Fault attacks are active attacks where the attacker tries to perturb the internal
computations by external means. Such attacks exploit a scenario where the attacker
has access to the device and can tamper with it.
Fault attacks can be achieved with different techniques, ranging from simple
clock/voltage glitches to sophisticated optical fault injections (see Sect. 6.2 for more
details).
The attacker’s goal is to recover the secret master key of the cryptographic
algorithm. The attack methodologies are normally developed on the algorithmic
level. But implementation-specific vulnerabilities also exist (Sect. 5.1.4).
There are different effects that a fault injection can achieve. Instruction skip and
instruction change perturb the instruction being executed by modifying the opcode
of the instruction. Bit flip flips the bits in the data. The number of bits affected is
normally limited by the register size (although, technically, it is possible to affect
a few registers at once). We use the notation m-bit flip to indicate how many bits
are flipped by the fault attack. This notion is consistent with our previous definition
of bit flip (see Definition 1.2.17). Bit set/reset fixes the bit value to be 1 (set) or 0
(reset). Random byte fault changes the byte value to a random number. Stuck-at
fault permanently changes the value of one bit to 0 (stuck-at-0) or 1 (stuck-at-1).
We refer to those different effects as fault models.
If the fault injected in an intermediate value x results in a faulty value x′, we refer
to ε := x ⊕ x′ as the fault mask, which represents the change in the faulted value.
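For example (register width and values chosen for illustration):

```python
x = 0b10110100          # intermediate value before the fault
x_faulty = 0b10010110   # value after an (assumed) fault injection
eps = x ^ x_faulty      # the fault mask
assert eps == 0b00100010
assert bin(eps).count("1") == 2   # a 2-bit flip
```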
We can divide the faults into two types depending on how long the effects
last. A permanent fault is a destructive fault that changes the value of a memory
cell permanently and hence affects data during the computations. In contrast, when a
transient fault is injected, the circuit recovers its original behavior after the fault
stimulus ceases (usually after just one instruction) or after the device resets. A transient
fault can perturb both data and instructions. In this chapter, we only consider transient
faults.
After the fault injection, there are two possible scenarios. The output (ciphertext)
is faulty, or the fault is ineffective and the ciphertext is not changed. We will see that
both scenarios can be exploited.
In the rest of this chapter, we will discuss fault attacks and countermeasures
for symmetric block ciphers (Sects. 5.1 and 5.2) and for RSA and RSA signatures
(Sects. 5.3 and 5.4).
This section presents a few fault attack methods on symmetric block ciphers. By
convention (see Kerckhoffs’ principle in Definition 2.1.3), we assume that the
specifications of round functions and key schedules are public. The master key, and
hence also the round keys, is secret. We also assume that throughout the attack,
the same master key is used, and the goal of the attacker is normally to recover
certain round key(s). The methodologies presented can be applied to an unprotected
implementation of any symmetric block cipher proposed up to now.
Fault attacks normally aim to recover the last/first round key(s) and then use the
inverse key schedule to find the master key. As mentioned in Remarks 3.1.1, 3.1.4,
and 3.1.5, for DES and PRESENT-80, the knowledge of any round key gives 48 and
64 bits of the master key, respectively, and the rest of the key bits can be brute forced,
while for AES, the value of any round key reveals the value of the full master key.
Differential Fault Analysis (DFA) was first introduced by Biham and Shamir [BS97]
in 1997. It has been studied by numerous researchers in different settings and is one of
the most popular fault attack analysis methods for symmetric block ciphers.
DFA considers a fault injection into the intermediate state of the cipher, normally
in the last few rounds. Then the difference between correct and faulty ciphertexts is
analyzed to recover the round key(s).
Before going into details of DFA, we recall the notion of differential distribution
table of an Sbox from Definition 4.3.1.
Example 5.1.1 Let us consider one of the DES Sboxes, $\mathrm{SB}^1_{\mathrm{DES}}: \mathbb{F}_2^6 \to \mathbb{F}_2^4$
(Table 3.3). We note that since the bit length of the input, 6, is longer than that of
the output, 4, the output difference can be zero in some cases. The size of the table
is $2^6 \times (2^4 - 1)$. Part of it can be found in Table 5.1. For example, for input
$001101$ and input difference $\delta = 001000$,

$$\mathrm{SB}^1_{\mathrm{DES}}(001101 \oplus 001000) \oplus \mathrm{SB}^1_{\mathrm{DES}}(001101) = \mathrm{SB}^1_{\mathrm{DES}}(000101) \oplus \mathrm{SB}^1_{\mathrm{DES}}(001101) = 7 \oplus \mathrm{D} = \mathrm{A}.$$
Table 5.1 Part of the difference distribution table for $\mathrm{SB}^1_{\mathrm{DES}}$ (Table 3.3). Rows are indexed by the output difference $\Delta$, columns by the input difference $\delta$; entries list the Sbox inputs (in hexadecimal)

| $\Delta \backslash \delta$ | 1 | 2 | … | 7 | 8 | … |
|---|---|---|---|---|---|---|
| 0 | | | … | 13,14 | | … |
| 1 | | | … | 1,6,30,37 | | … |
| 2 | | | … | 3,4,A,D,23,24,31,33,34,36 | | … |
| 3 | 1C,1D,2C,2D,3C,3D | 5,7,C,E,21,23,30,32 | … | 2,5,8,F | 11,12,13,17,19,1A,1B,1F,26,2E,37,3F | … |
| 4 | | | … | | | … |
| 5 | 6,7 | 20,22,3C,3E | … | 1B,1C,3B,3C | 7,F,23,27,2B,2F,35,3D | … |
| 6 | C,D,24,25 | 24,26,2D,2F | … | 9,E,11,12,15,16,20,27 | 4,C,10,14,18,1C,32,3A | … |
| 7 | 16,17,32,33 | 15,17,1C,1E | … | 21,26,2A,2D | 22,2A,36,3E | … |
| 8 | | | … | 10,17 | | … |
| 9 | E,F,10,11,28,29,36,37,38,39 | 10,12,2C,2E,38,3A | … | B,C,22,25 | 6,E,20,25,28,2D | … |
| A | 4,5,14,15,26,27,30,31,34,35,3A,3B | 0,2,14,16,25,27,39,3B | … | 0,7,1A,1D,28,2F,39,3E | 5,D | … |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
Example 5.1.2 Consider the logical AND operation $c = a\ \&\ b$, with the following
truth table:

| $a$ | $b$ | $c = a\ \&\ b$ |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
Suppose the output c can be observed by the attacker and .a, b are unknown.
The goal of the attacker is to recover the value of a. This can be achieved by
DFA—during the computation, the attacker injects a fault in b by flipping it. By
the knowledge of the faulty and the correct outputs, the attacker can easily recover
the value of a: If the output stays the same, then .a = 0; otherwise .a = 1.
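The attack on this AND gate can be sketched in a few lines of Python. This is a minimal illustration of the argument above; the function name is ours, not from the book.

```python
# DFA on a single AND gate: a and b are secret, the attacker flips b and
# observes whether the output c = a & b changes.

def recover_a(a: int, b: int) -> int:
    """Recover the secret bit a by flipping b and comparing outputs."""
    correct = a & b          # output of the original computation
    faulty = a & (b ^ 1)     # fault injection: bit flip in b
    # If flipping b does not change the output, a must be 0; otherwise a = 1.
    return 0 if faulty == correct else 1

for a in (0, 1):
    for b in (0, 1):
        assert recover_a(a, b) == a
```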
Next, we will detail how DFA works on an Sbox. Let $\mathrm{SB}: \mathbb{F}_2^{\omega_1} \to \mathbb{F}_2^{\omega_2}$ be an
Sbox, and let $a \in \mathbb{F}_2^{\omega_1}$, $b \in \mathbb{F}_2^{\omega_2}$ be fixed secret values. Define

$$f: \mathbb{F}_2^{\omega_1} \to \mathbb{F}_2^{\omega_2}, \qquad x \mapsto \mathrm{SB}(x \oplus a) \oplus b. \tag{5.1}$$

Suppose the attacker has the knowledge of the Sbox design, inputs and outputs of
$f$, and the fault mask $\varepsilon$. Furthermore, the attacker can repeat the computation with
the same input (not chosen by the attacker). With details of the Sbox, the attacker
can compute the DDT, denoted by $T$, of SB.
Let $\Delta$ denote the difference between the correct and the faulty output; we have

$$\Delta = f(x) \oplus f(x \oplus \varepsilon) = \mathrm{SB}(x \oplus a) \oplus \mathrm{SB}(x \oplus a \oplus \varepsilon). \tag{5.2}$$

Thus $x \oplus a$ appears in the entry of $T$ in the column for input difference $\varepsilon$ and the
row for output difference $\Delta$.
Example 5.1.3 (How DFA works on PRESENT Sbox) Let us consider the case
when the Sbox in the definition of f (Eq. 5.1) is the PRESENT Sbox (Table 3.11).
Suppose the attacker fixes the input to be .x = 0, and they know that the correct
output of f is 0.
When the attacker injects a fault in $x$ with fault mask $\varepsilon_1 = 3$, they get a faulty
output 1. By Eq. 5.2, we have

$$\Delta_1 = 0 \oplus 1 = 1.$$

By the DDT of the PRESENT Sbox (Table 4.1), this restricts the candidates for
$x \oplus a$. Repeating the attack with a second fault mask, the attacker can conclude
that

$$x \oplus a = 9.$$
One might ask how many faults are needed to recover the values of $a$ and $b$. If
we take a closer look at Table 4.1, we can see that in case the attacker can choose the
fault mask, they only need two faults. For example, fault masks 3 and 5 can uniquely
determine the Sbox input: any two distinct elements that appear in the same entry
in column $\delta = 3$ are in two different entries in column $\delta = 5$. When a random fault
mask is considered, a brute-force analysis shows that at most four different fault
masks are needed.
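The two-fault argument can be checked directly. The sketch below assumes the standard PRESENT Sbox (the book's Table 3.11); the helper name `ddt_entry` is ours.

```python
# DFA on the PRESENT Sbox: fault masks 3 and 5 uniquely determine x ⊕ a.

SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]  # PRESENT Sbox

def ddt_entry(delta_in: int, delta_out: int) -> set:
    """One DDT cell: inputs u with SBOX[u] ^ SBOX[u ^ delta_in] == delta_out."""
    return {u for u in range(16) if SBOX[u] ^ SBOX[u ^ delta_in] == delta_out}

# Claim from the text: the two output differences produced by fault masks
# 3 and 5 single out the secret Sbox input u = x ⊕ a.
for u in range(16):
    d1 = SBOX[u] ^ SBOX[u ^ 3]   # observed output difference for mask 3
    d2 = SBOX[u] ^ SBOX[u ^ 5]   # observed output difference for mask 5
    assert ddt_entry(3, d1) & ddt_entry(5, d2) == {u}

# Example 5.1.3: mask 3 with output difference 1 narrows x ⊕ a to {9, A}.
assert ddt_entry(3, 1) == {0x9, 0xA}
```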
Now, we will discuss how DFA can break implementations of DES (Sect. 3.1.1).
Recall that DES is a Feistel cipher. Its cipher state at the end of round $i$ can be
denoted as $L_i$ and $R_i$, where L stands for left and R stands for right. The DES round
function satisfies

$$L_i = R_{i-1}, \qquad R_i = L_{i-1} \oplus f(R_{i-1}, K_i). \tag{5.3}$$

Before the first round function, the encryption starts with an initial permutation (IP).
The inverse of IP, called the final permutation (IP$^{-1}$), is applied to the cipher state
after the last round before outputting the ciphertext. In our analysis, we ignore the
final permutation and consider the value before it as the ciphertext; the attacker can
easily obtain this value by applying IP to the actual ciphertext.
At the $i$th round, the function $f$ in the round function (Eq. 5.3) of DES takes
input $R_{i-1} \in \mathbb{F}_2^{32}$ and round key $K_i \in \mathbb{F}_2^{48}$ and outputs a 32-bit intermediate value
as follows: first, the expansion function $E_{\mathrm{DES}}: \mathbb{F}_2^{32} \to \mathbb{F}_2^{48}$ (Table 3.2) is applied
to $R_{i-1}$. Then the output $E_{\mathrm{DES}}(R_{i-1})$ is XOR-ed with the round key $K_i$, producing
a 48-bit intermediate value. This 48-bit value is divided into eight 6-bit subblocks.
Eight distinct Sboxes, $\mathrm{SB}^j_{\mathrm{DES}}: \mathbb{F}_2^6 \to \mathbb{F}_2^4$ ($1 \le j \le 8$), are applied to the eight
subblocks. Finally, the resulting 32-bit intermediate value goes through a permutation
function $P_{\mathrm{DES}}: \mathbb{F}_2^{32} \to \mathbb{F}_2^{32}$ (Table 3.4). In short,

$$f(R_{i-1}, K_i) = P_{\mathrm{DES}}\big(\mathrm{SB}_{\mathrm{DES}}(E_{\mathrm{DES}}(R_{i-1}) \oplus K_i)\big). \tag{5.4}$$
For $j = 1, 2, \ldots, 8$, let $E_{\mathrm{DES}}(R_i)_j$ denote the $j$th 6 bits of $E_{\mathrm{DES}}(R_i)$. For
example, $E_{\mathrm{DES}}(R_i)_1$ are the bits at positions $1, 2, 3, 4, 5, 6$ of $E_{\mathrm{DES}}(R_i)$ (see also
Note in Sect. 3.1.1). Similarly, let $K_i^j$ denote the $j$th 6 bits of $K_i$ and
$P^{-1}_{\mathrm{DES}}(R_i \oplus L_{i-1})_j$ the $j$th 4 bits of $P^{-1}_{\mathrm{DES}}(R_i \oplus L_{i-1})$. By Eqs. 5.3 and 5.4, we have
$$P^{-1}_{\mathrm{DES}}(R_i \oplus L_{i-1})_j = \mathrm{SB}^j_{\mathrm{DES}}\big(E_{\mathrm{DES}}(R_{i-1})_j \oplus K_i^j\big). \tag{5.5}$$
We consider a fault injection at the right half of the cipher state at the beginning
of the 16th round, i.e., a fault in $R_{15}$. Suppose the fault model is 1-bit flip. In other
words, the fault mask $\varepsilon \in \mathbb{F}_2^{32}$ satisfies $wt(\varepsilon) = 1$ and

$$R'_{15} = R_{15} \oplus \varepsilon.$$
We assume the attacker has the knowledge of the output of DES (correct
and faulty ciphertexts), fault model, and fault location. They can also repeat the
computation with the same plaintext, not chosen by the attacker. The attacker’s goal
is to recover .K16 , the last round key.
Let $L'_{16}$ and $R'_{16}$ denote the left and right parts of the faulty ciphertext,
respectively. By our assumption, the attacker has the knowledge of $L'_{16}$ and $L_{16}$.
Since $R_{15} = L_{16}$, we have

$$L'_{16} = R'_{15} = R_{15} \oplus \varepsilon = L_{16} \oplus \varepsilon. \tag{5.6}$$

Define

$$\Delta R_{16} := R'_{16} \oplus R_{16}. \tag{5.7}$$
By Eq. 5.5,

$$P^{-1}_{\mathrm{DES}}(R_{16} \oplus L_{15})_j = \mathrm{SB}^j_{\mathrm{DES}}\big(E_{\mathrm{DES}}(L_{16})_j \oplus K_{16}^j\big),$$
$$P^{-1}_{\mathrm{DES}}(R'_{16} \oplus L_{15})_j = \mathrm{SB}^j_{\mathrm{DES}}\big(E_{\mathrm{DES}}(L'_{16})_j \oplus K_{16}^j\big) = \mathrm{SB}^j_{\mathrm{DES}}\big(E_{\mathrm{DES}}(L_{16} \oplus \varepsilon)_j \oplus K_{16}^j\big).$$
Thus, $E_{\mathrm{DES}}(L_{16})_j \oplus K_{16}^j$ is an input for the $j$th DES Sbox such that with input
difference $E_{\mathrm{DES}}(\varepsilon)_j$, the output difference is $P^{-1}_{\mathrm{DES}}(\Delta R_{16})_j$. With the knowledge of
$\varepsilon$, $\Delta R_{16}$, and $L_{16}$, the attacker can reduce the key hypotheses for $K_{16}^j$.
We note that if $E_{\mathrm{DES}}(\varepsilon)_j = 0$, the input for the $j$th Sbox is not changed, and
the output will also not change. In this case, we say that this Sbox is inactive.
Otherwise, we say the Sbox is active. For an inactive Sbox, a different fault mask
will be needed to activate this Sbox. Since we consider a 1-bit flip, by the design of
$E_{\mathrm{DES}}$ (Table 3.2), 16 bits of the input are repeated in the output; thus only one or
two Sboxes are active for a given 1-bit fault mask.

As an illustration, let

$$L_{15} = 00000000, \quad R_{15} = 00000000, \quad K_{16} = \mathrm{14D8F55DAA7A}.$$
Suppose the fault flips the second bit of $R_{15}$, i.e., $\varepsilon = 40000000$. By Eqs. 5.3
and 5.4, the correct values are

$$L_{16} = R_{15} = 00000000, \quad R_{16} = L_{15} \oplus f(R_{15}, K_{16}) = \mathrm{832ABB8E},$$

and the faulty values are

$$L'_{16} = R'_{15} = 40000000, \quad R'_{16} = L_{15} \oplus f(R'_{15}, K_{16}) = \mathrm{83AAB98E}.$$
By Eq. 5.7,

$$\Delta R_{16} = R'_{16} \oplus R_{16} = 00800200.$$
Since the bit flip is in the second bit of the input for $E_{\mathrm{DES}}$, according to Table 3.2,
the third bit of the output of $E_{\mathrm{DES}}$ will be changed. Consequently, $\mathrm{SB}^j_{\mathrm{DES}}$ is active
for $j = 1$ and inactive otherwise. We have

$$E_{\mathrm{DES}}(\varepsilon)_1 = 8, \qquad E_{\mathrm{DES}}(\varepsilon)_j = 0 \ \text{ for } j \neq 1.$$
By Table 3.4, the first 4 bits of the output of $P_{\mathrm{DES}}$ are given by the 9th, 17th, 23rd,
and 31st bits of the input, hence

$$P^{-1}_{\mathrm{DES}}(\Delta R_{16})_1 = 1010 = \mathrm{A}.$$

Since $L_{16} = 00000000$, the input to the first Sbox is exactly $K_{16}^1$, and it
gives output difference A when the input difference is 8. By Table 5.1, $K_{16}^1$ is equal
to one of the two possible values: 5 and D, where 5 agrees with the first 6 bits of $K_{16}$.
In [BS97], the authors reported that with exhaustive search they found that,
on average, four possible 6-bit key hypotheses remain for each active Sbox. An
improved attack that considers fault injections in the earlier rounds can be found
in [Riv09].
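The candidate filtering for one DES Sbox can be reproduced in a few lines. The sketch below assumes the standard DES S1 table (the book's Table 3.3) and the usual row/column indexing convention; it recovers the two candidates for $K_{16}^1$ from the worked example.

```python
# DES DFA candidate filtering for Sbox 1: input difference E_DES(ε)_1 = 8,
# output difference P_DES^{-1}(ΔR16)_1 = A. Since L16 = 0, the Sbox input
# equals the key chunk K16^1, so the surviving inputs are the key candidates.

S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]

def sbox1(x: int) -> int:
    """Apply DES S1 to a 6-bit input b1..b6 (b1 = most significant bit)."""
    row = ((x >> 4) & 0b10) | (x & 1)   # outer bits b1, b6
    col = (x >> 1) & 0xF                # middle bits b2..b5
    return S1[row][col]

def key_candidates(delta_in: int, delta_out: int):
    """6-bit inputs compatible with the observed difference pair."""
    return sorted(x for x in range(64)
                  if sbox1(x) ^ sbox1(x ^ delta_in) == delta_out)

print([f"{x:02X}" for x in key_candidates(0x08, 0xA)])  # ['05', '0D']
```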
In this part, we discuss a DFA attack on AES-128 implementations. Recall that the
AES cipher state can be represented as a $4 \times 4$ matrix of bytes (see Eq. 3.2):

$$\begin{pmatrix} s_{00} & s_{01} & s_{02} & s_{03}\\ s_{10} & s_{11} & s_{12} & s_{13}\\ s_{20} & s_{21} & s_{22} & s_{23}\\ s_{30} & s_{31} & s_{32} & s_{33} \end{pmatrix}. \tag{5.8}$$
Let us represent those bytes by squares as in Fig. 3.6 for visual illustration. Suppose
a fault is injected at the beginning of one round (except for the last round) in byte
$s_{00}$. Then the fault propagation in this round can be represented by Fig. 5.2, where
blue squares correspond to bytes that might be affected by the fault. Since SubBytes
operates on each byte independently and ShiftRows does not move the first row,
the blue square stays in the same position in the first three states. MixColumns
takes one column as input and outputs one column. AddRoundKey does not change
the fault effects. Hence, in the last state, the whole first column can be affected by
the fault. Similarly, if the fault is injected at the beginning of one round in any
combination of the bytes $s_{00}, s_{11}, s_{22}, s_{33}$, at the end of this round the whole first
column might be affected by the fault. Some cases are shown in Fig. 5.3.
Let us refer to the bytes $s_{00}, s_{11}, s_{22}, s_{33}$ as a diagonal of the AES state. We consider
a fault attack where a random byte fault is injected in this diagonal at the end of
round 7. By the above discussion, we know that at the end of round 8, the whole
first column might be affected by the fault. Similarly, we can study the fault
propagation in round 9. Let $\delta_i$ ($i = 1, 2, 3, 4$) denote the differences between the
four correct and faulty bytes in the first column of the cipher state after SubBytes
in round 9. An illustration is shown in Fig. 5.4, where $S_8$ (respectively, $S_9$) denotes
the cipher state at the end of round 8 (respectively, round 9). After ShiftRows, those
four $\delta_i$s move to different positions, as shown in the third cipher state in the figure.

Fig. 5.2 Visual illustration of how the fault propagates when a fault is injected at the beginning
of one AES round (not the last round) in byte $s_{00}$. Blue squares correspond to bytes that can be
affected by the fault

Fig. 5.4 Visual illustration of fault propagation in the 9th round of AES when the fault was
injected in the diagonal $s_{00}, s_{11}, s_{22}, s_{33}$ of the AES cipher state at the end of round 7
Recall that MixColumns multiplies one column by the following matrix (see
Eq. 3.6):

$$\begin{pmatrix} 02 & 03 & 01 & 01\\ 01 & 02 & 03 & 01\\ 01 & 01 & 02 & 03\\ 03 & 01 & 01 & 02 \end{pmatrix}.$$
Since this is a linear operation, the differences will also be multiplied by the
corresponding coefficients in the matrix. Consequently, we get the last state .S9 as
shown in Fig. 5.4.
Let us represent the cipher state at the end of round 9, $S_9$, the correct ciphertext
$c$, and the last round key $K_{10}$ with the following matrices:

$$S_9 = \begin{pmatrix} a_{00} & a_{01} & a_{02} & a_{03}\\ a_{10} & a_{11} & a_{12} & a_{13}\\ a_{20} & a_{21} & a_{22} & a_{23}\\ a_{30} & a_{31} & a_{32} & a_{33} \end{pmatrix}, \quad c = \begin{pmatrix} c_{00} & c_{01} & c_{02} & c_{03}\\ c_{10} & c_{11} & c_{12} & c_{13}\\ c_{20} & c_{21} & c_{22} & c_{23}\\ c_{30} & c_{31} & c_{32} & c_{33} \end{pmatrix}, \quad K_{10} = \begin{pmatrix} k_{00} & k_{01} & k_{02} & k_{03}\\ k_{10} & k_{11} & k_{12} & k_{13}\\ k_{20} & k_{21} & k_{22} & k_{23}\\ k_{30} & k_{31} & k_{32} & k_{33} \end{pmatrix}.$$
In round 10, the ciphertext is computed by applying SubBytes, ShiftRows, and
AddRoundKey (with $K_{10}$) to $S_9$, which gives

$$a_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{00} \oplus k_{00}), \quad a_{10} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{13} \oplus k_{13}), \quad a_{20} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{22} \oplus k_{22}), \quad a_{30} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{31} \oplus k_{31}).$$
Then, for the faulty ciphertext $c'$,

$$a'_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c'_{00} \oplus k_{00}), \quad a'_{10} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c'_{13} \oplus k_{13}), \quad a'_{20} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c'_{22} \oplus k_{22}), \quad a'_{30} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c'_{31} \oplus k_{31}).$$
Then for each value of $\delta$, the possible values for $k_{00}, k_{13}, k_{22}, k_{31}$ will be restricted
by the above four equations. In particular, $a_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{00} \oplus k_{00})$
can be considered as an AES Sbox input that corresponds to input difference $2\delta$ and
output difference $c_{00} \oplus c'_{00}$. Similarly,

$$a_{10} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{13} \oplus k_{13}), \quad a_{20} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{22} \oplus k_{22}), \quad a_{30} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{31} \oplus k_{31})$$

are Sbox inputs corresponding to input differences $\delta$, $\delta$, $3\delta$ and output differences
$c_{13} \oplus c'_{13}$, $c_{22} \oplus c'_{22}$, $c_{31} \oplus c'_{31}$, respectively. It was shown in [SMR09] that, on
average, the key hypotheses for $(k_{00}, k_{13}, k_{22}, k_{31})$ can be reduced to $2^8$.
By the AES encryption and key schedule (Sect. 3.1.2), we can find that (see [NIS01]
Appendix C)

$$S_7 = \begin{pmatrix} D1 & 79 & B4 & D6\\ 87 & C4 & 55 & 6F\\ 6C & 30 & 94 & F4\\ 0F & 0A & AD & 1F \end{pmatrix},$$

$$K_8 = \begin{pmatrix} 47 & A4 & E0 & AE\\ 43 & 1C & 16 & BF\\ 87 & 65 & BA & 7A\\ 35 & B9 & F4 & D2 \end{pmatrix}, \quad K_9 = \begin{pmatrix} 54 & F0 & 10 & BE\\ 99 & 85 & 93 & 2C\\ 32 & 57 & ED & 97\\ D1 & 68 & 9C & 4E \end{pmatrix}, \quad K_{10} = \begin{pmatrix} 13 & E3 & F3 & 4D\\ 11 & 94 & 07 & 2B\\ 1D & 4A & A7 & 30\\ 7F & 17 & 8B & C5 \end{pmatrix}.$$
The computations in round 8 are

$$S_7 \xrightarrow{\mathrm{SB}} \begin{pmatrix} 3E & B6 & 8D & F6\\ 17 & 1C & FC & A8\\ 50 & 04 & 22 & BF\\ 76 & 67 & 95 & C0 \end{pmatrix} \xrightarrow{\mathrm{SR}} \begin{pmatrix} 3E & B6 & 8D & F6\\ 1C & FC & A8 & 17\\ 22 & BF & 50 & 04\\ C0 & 76 & 67 & 95 \end{pmatrix} \xrightarrow{\mathrm{MC}} \begin{pmatrix} BA & A1 & D5 & 5F\\ A0 & F9 & 51 & 41\\ 3D & B5 & 2C & 4D\\ E7 & 6E & BA & 23 \end{pmatrix} \xrightarrow{\mathrm{AK}} \begin{pmatrix} FD & 05 & 35 & F1\\ E3 & E5 & 47 & FE\\ BA & D0 & 96 & 37\\ D2 & D7 & 4E & F1 \end{pmatrix} = S_8,$$
where SB, SR, MC, and AK stand for SubBytes (Table 3.9), ShiftRows, Mix-
Columns, and AddRoundKey, respectively. The operations in round 9 compute

$$S_8 \xrightarrow{\mathrm{SB}} \begin{pmatrix} 54 & 6B & 96 & A1\\ 11 & D9 & A0 & BB\\ F4 & 70 & 90 & 9A\\ B5 & 0E & 2F & A1 \end{pmatrix} \xrightarrow{\mathrm{SR}} \begin{pmatrix} 54 & 6B & 96 & A1\\ D9 & A0 & BB & 11\\ 90 & 9A & F4 & 70\\ A1 & B5 & 0E & 2F \end{pmatrix} \xrightarrow{\mathrm{MC}} \begin{pmatrix} E9 & 02 & 1B & 35\\ F7 & 30 & F2 & 3C\\ 4E & 20 & CC & 21\\ EC & F6 & F2 & C7 \end{pmatrix} \xrightarrow{\mathrm{AK}} \begin{pmatrix} BD & F2 & 0B & 8B\\ 6E & B5 & 61 & 10\\ 7C & 77 & 21 & B6\\ 3D & 9E & 6E & 89 \end{pmatrix} = S_9.$$
In round 10 we have

$$S_9 \xrightarrow{\mathrm{SB}} \begin{pmatrix} 7A & 89 & 2B & 3D\\ 9F & D5 & EF & CA\\ 10 & F5 & FD & 4E\\ 27 & 0B & 9F & A7 \end{pmatrix} \xrightarrow{\mathrm{SR}} \begin{pmatrix} 7A & 89 & 2B & 3D\\ D5 & EF & CA & 9F\\ FD & 4E & 10 & F5\\ A7 & 27 & 0B & 9F \end{pmatrix} \xrightarrow{\mathrm{AK}} \begin{pmatrix} 69 & 6A & D8 & 70\\ C4 & 7B & CD & B4\\ E0 & 04 & B7 & C5\\ D8 & 30 & 80 & 5A \end{pmatrix} = c.$$
Suppose a fault is injected in byte $s_{00}$ of $S_7$ with fault mask D8. We have

$$s'_{00} = D1 \oplus D8 = 09.$$

The computations in round 8 then become

$$S'_7 = \begin{pmatrix} 09 & 79 & B4 & D6\\ 87 & C4 & 55 & 6F\\ 6C & 30 & 94 & F4\\ 0F & 0A & AD & 1F \end{pmatrix} \xrightarrow{\mathrm{SB}} \begin{pmatrix} 01 & B6 & 8D & F6\\ 17 & 1C & FC & A8\\ 50 & 04 & 22 & BF\\ 76 & 67 & 95 & C0 \end{pmatrix} \xrightarrow{\mathrm{SR}} \begin{pmatrix} 01 & B6 & 8D & F6\\ 1C & FC & A8 & 17\\ 22 & BF & 50 & 04\\ C0 & 76 & 67 & 95 \end{pmatrix}$$
$$\xrightarrow{\mathrm{MC}} \begin{pmatrix} C4 & A1 & D5 & 5F\\ 9F & F9 & 51 & 41\\ 02 & B5 & 2C & 4D\\ A6 & 6E & BA & 23 \end{pmatrix} \xrightarrow{\mathrm{AK}} \begin{pmatrix} 83 & 05 & 35 & F1\\ DC & E5 & 47 & FE\\ 85 & D0 & 96 & 37\\ 93 & D7 & 4E & F1 \end{pmatrix} = S'_8.$$
Round 9 computes

$$S'_8 \xrightarrow{\mathrm{SB}} \begin{pmatrix} EC & 6B & 96 & A1\\ 86 & D9 & A0 & BB\\ 97 & 70 & 90 & 9A\\ DC & 0E & 2F & A1 \end{pmatrix} \xrightarrow{\mathrm{SR}} \begin{pmatrix} EC & 6B & 96 & A1\\ D9 & A0 & BB & 86\\ 90 & 9A & 97 & 70\\ A1 & DC & 0E & 2F \end{pmatrix} \xrightarrow{\mathrm{MC}} \begin{pmatrix} 82 & 6B & 78 & 97\\ 4F & 59 & 57 & 09\\ F6 & 9B & 0A & B6\\ 3F & 24 & 91 & 50 \end{pmatrix} \xrightarrow{\mathrm{AK}} \begin{pmatrix} D6 & 9B & 68 & 29\\ D6 & DC & C4 & 25\\ C4 & CC & E7 & 21\\ EE & 4C & 0D & 1E \end{pmatrix} = S'_9.$$
Applying the round 10 operations to $S'_9$, we obtain the faulty ciphertext bytes
$c'_{00} = E5$, $c'_{13} = DD$, $c'_{22} = BB$, and $c'_{31} = 3F$. By the above discussion,

$$2\delta = \mathrm{SB}^{-1}_{\mathrm{AES}}(69 \oplus k_{00}) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(E5 \oplus k_{00}),$$
$$\delta = \mathrm{SB}^{-1}_{\mathrm{AES}}(B4 \oplus k_{13}) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(DD \oplus k_{13}),$$
$$\delta = \mathrm{SB}^{-1}_{\mathrm{AES}}(B7 \oplus k_{22}) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(BB \oplus k_{22}),$$
$$3\delta = \mathrm{SB}^{-1}_{\mathrm{AES}}(30 \oplus k_{31}) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(3F \oplus k_{31}).$$
That is,

$$\mathrm{SB}^{-1}_{\mathrm{AES}}(69 \oplus k_{00}), \quad \mathrm{SB}^{-1}_{\mathrm{AES}}(B4 \oplus k_{13}), \quad \mathrm{SB}^{-1}_{\mathrm{AES}}(B7 \oplus k_{22}), \quad \mathrm{SB}^{-1}_{\mathrm{AES}}(30 \oplus k_{31})$$

are AES Sbox inputs corresponding to input differences

$$2\delta, \quad \delta, \quad \delta, \quad 3\delta$$

and output differences

$$69 \oplus E5 = \mathrm{8C}, \quad B4 \oplus DD = 69, \quad B7 \oplus BB = \mathrm{0C}, \quad 30 \oplus 3F = \mathrm{0F},$$

respectively.

Table 5.2 Part of the difference distribution table for the AES Sbox (Table 3.9) corresponding to
output differences 0C, 69, 8C, and 0F
Part of the AES Sbox difference distribution table corresponding to output
differences 8C, 69, 0C, and 0F is shown in Table 5.2. We can see that $\delta \neq 02$,
since for input difference 02 the entry in row 69 is empty. In other words, there
are no inputs that have output difference 69 for input difference 02. Thus $\delta$ can only
take values that give nonempty entries in the DDT for columns $2\delta$, $\delta$, $\delta$, $3\delta$ and
the corresponding rows 8C, 69, 0C, 0F. By searching these rows, we can find all
possible values of $\delta$:

$$01, 06, 0B, 28, 3D, 49, 6B, 76, 8F, 90, A6, B2, B8, D0, EE,$$
in total 15 choices. In most of the entries of the AES Sbox DDT, there are only two
values; thus the remaining number of key hypotheses is roughly

$$2^4 \times 15 \approx 2^8.$$
We can also check that for the correct values $k_{00} = 13$, $k_{13} = 2B$, $k_{22} = A7$,
$k_{31} = 17$ (see Table 3.10 for $\mathrm{SB}^{-1}_{\mathrm{AES}}$),

$$2\delta = \mathrm{SB}^{-1}_{\mathrm{AES}}(7A) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(F6) = BD \oplus D6 = 6B,$$
$$\delta = \mathrm{SB}^{-1}_{\mathrm{AES}}(9F) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(F6) = 6E \oplus D6 = B8,$$
$$\delta = \mathrm{SB}^{-1}_{\mathrm{AES}}(10) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(1C) = 7C \oplus C4 = B8,$$
$$3\delta = \mathrm{SB}^{-1}_{\mathrm{AES}}(27) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(28) = 3D \oplus EE = D3.$$
The other three columns of $S_9$ in Fig. 5.4 provide similar results, reducing
the key hypotheses for the other key bytes of $K_{10}$. Consequently, with just one pair
of correct and faulty ciphertexts, the key hypotheses for $K_{10}$ can be reduced to
$2^{32}$, as opposed to the original $2^{128}$.
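The δ-filtering step of this attack can be reproduced directly from the worked example's ciphertext bytes. The sketch below assumes the standard AES Sbox (Table 3.9); the function names are ours.

```python
# Diagonal DFA candidate filtering: for a hypothesis δ, keep the key bytes
# consistent with the observed input differences 2δ, δ, δ, 3δ.

SBOX = [
    0x63,0x7C,0x77,0x7B,0xF2,0x6B,0x6F,0xC5,0x30,0x01,0x67,0x2B,0xFE,0xD7,0xAB,0x76,
    0xCA,0x82,0xC9,0x7D,0xFA,0x59,0x47,0xF0,0xAD,0xD4,0xA2,0xAF,0x9C,0xA4,0x72,0xC0,
    0xB7,0xFD,0x93,0x26,0x36,0x3F,0xF7,0xCC,0x34,0xA5,0xE5,0xF1,0x71,0xD8,0x31,0x15,
    0x04,0xC7,0x23,0xC3,0x18,0x96,0x05,0x9A,0x07,0x12,0x80,0xE2,0xEB,0x27,0xB2,0x75,
    0x09,0x83,0x2C,0x1A,0x1B,0x6E,0x5A,0xA0,0x52,0x3B,0xD6,0xB3,0x29,0xE3,0x2F,0x84,
    0x53,0xD1,0x00,0xED,0x20,0xFC,0xB1,0x5B,0x6A,0xCB,0xBE,0x39,0x4A,0x4C,0x58,0xCF,
    0xD0,0xEF,0xAA,0xFB,0x43,0x4D,0x33,0x85,0x45,0xF9,0x02,0x7F,0x50,0x3C,0x9F,0xA8,
    0x51,0xA3,0x40,0x8F,0x92,0x9D,0x38,0xF5,0xBC,0xB6,0xDA,0x21,0x10,0xFF,0xF3,0xD2,
    0xCD,0x0C,0x13,0xEC,0x5F,0x97,0x44,0x17,0xC4,0xA7,0x7E,0x3D,0x64,0x5D,0x19,0x73,
    0x60,0x81,0x4F,0xDC,0x22,0x2A,0x90,0x88,0x46,0xEE,0xB8,0x14,0xDE,0x5E,0x0B,0xDB,
    0xE0,0x32,0x3A,0x0A,0x49,0x06,0x24,0x5C,0xC2,0xD3,0xAC,0x62,0x91,0x95,0xE4,0x79,
    0xE7,0xC8,0x37,0x6D,0x8D,0xD5,0x4E,0xA9,0x6C,0x56,0xF4,0xEA,0x65,0x7A,0xAE,0x08,
    0xBA,0x78,0x25,0x2E,0x1C,0xA6,0xB4,0xC6,0xE8,0xDD,0x74,0x1F,0x4B,0xBD,0x8B,0x8A,
    0x70,0x3E,0xB5,0x66,0x48,0x03,0xF6,0x0E,0x61,0x35,0x57,0xB9,0x86,0xC1,0x1D,0x9E,
    0xE1,0xF8,0x98,0x11,0x69,0xD9,0x8E,0x94,0x9B,0x1E,0x87,0xE9,0xCE,0x55,0x28,0xDF,
    0x8C,0xA1,0x89,0x0D,0xBF,0xE6,0x42,0x68,0x41,0x99,0x2D,0x0F,0xB0,0x54,0xBB,0x16,
]
INV_SBOX = [0] * 256
for i, v in enumerate(SBOX):
    INV_SBOX[v] = i

def xtime(b: int) -> int:
    """Multiplication by 02 in the AES field GF(2^8)."""
    b <<= 1
    return (b ^ 0x1B) & 0xFF if b & 0x100 else b

# (correct, faulty) ciphertext bytes (c00, c'00), (c13, c'13), (c22, c'22),
# (c31, c'31) from the worked example.
PAIRS = [(0x69, 0xE5), (0xB4, 0xDD), (0xB7, 0xBB), (0x30, 0x3F)]

def key_candidates(delta):
    """Per-byte key candidates for k00, k13, k22, k31; None if δ is impossible."""
    diffs = [xtime(delta), delta, delta, xtime(delta) ^ delta]  # 2δ, δ, δ, 3δ
    result = []
    for (c, cf), d in zip(PAIRS, diffs):
        ks = [k for k in range(256)
              if INV_SBOX[c ^ k] ^ INV_SBOX[cf ^ k] == d]
        if not ks:
            return None
        result.append(ks)
    return result

# The true δ = B8 is feasible, and the correct key bytes survive.
cand = key_candidates(0xB8)
assert cand is not None
assert 0x13 in cand[0] and 0x2B in cand[1] and 0xA7 in cand[2] and 0x17 in cand[3]
```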
Fig. 5.5 Fault propagation ($S_7 \to S_8 \to S_9$) for random byte faults injected in each of the four
“diagonals” of the cipher state at the end of round 7; the differences in $S_9$ form the pattern

$$\begin{pmatrix} 3\delta_4 & 2\delta_3 & \delta_2 & \delta_1\\ 2\delta_4 & \delta_3 & \delta_2 & 3\delta_1\\ \delta_4 & \delta_3 & 3\delta_2 & 2\delta_1\\ \delta_4 & 3\delta_3 & 2\delta_2 & \delta_1 \end{pmatrix}$$
We note that in this attack, we assume the attacker has the knowledge of the fault
location (a diagonal of the cipher state at the end of round 7), the fault model (random
byte), and the output of AES (correct and faulty ciphertexts). Since the attack targets
the diagonals of the cipher state, it is also called the diagonal DFA. Similar attacks can
be carried out if the fault is injected in the other three “diagonals” of the cipher state
at the end of round 7. The corresponding fault propagations are depicted in Fig. 5.5,
where $S_i$ denotes the cipher state at the end of round $i$.
SFA [FJLT13] exploits fault models that are biased. Let $X$ and $X'$ denote the
random variables corresponding to the correct and faulty values of a $b$-bit
intermediate variable. We call a fault model nonuniform if

$$P(X' = x' \mid X = x) \neq \frac{1}{2^b}$$

for some $x$ and $x'$. For example, when $x$ is 1 bit, both stuck-at-0 and bit flip give

$$P(X' = x' \mid X = x) \neq \frac{1}{2}$$

for some $x$ and $x'$. Thus both stuck-at-0 and bit-flip fault models are nonuniform.
Example 5.1.7 We again consider the case when $x$ is 1 bit. We discuss two more
complicated nonuniform fault models. Stuck-at-0 with probability 0.5 changes $x$
to 0 with probability 0.5. The corresponding fault distribution table is shown in
Table 5.4(a). Random-AND with $\delta$, where $\delta$ follows a uniform distribution, has the
same fault distribution table. For example, $P(X' = 0 \mid X = 1) = P(\delta = 0) = 0.5$.
In this part, we will discuss an SFA attack on AES-128. We represent the cipher
state at the end of round 9, $S_9$, the correct ciphertext $c$, and the last round key $K_{10}$
with the following matrices:

$$S_9 = \begin{pmatrix} s_{00} & s_{01} & s_{02} & s_{03}\\ s_{10} & s_{11} & s_{12} & s_{13}\\ s_{20} & s_{21} & s_{22} & s_{23}\\ s_{30} & s_{31} & s_{32} & s_{33} \end{pmatrix}, \quad c = \begin{pmatrix} c_{00} & c_{01} & c_{02} & c_{03}\\ c_{10} & c_{11} & c_{12} & c_{13}\\ c_{20} & c_{21} & c_{22} & c_{23}\\ c_{30} & c_{31} & c_{32} & c_{33} \end{pmatrix}, \quad K_{10} = \begin{pmatrix} k_{00} & k_{01} & k_{02} & k_{03}\\ k_{10} & k_{11} & k_{12} & k_{13}\\ k_{20} & k_{21} & k_{22} & k_{23}\\ k_{30} & k_{31} & k_{32} & k_{33} \end{pmatrix}.$$
We consider a fault in $s_{00}$ with a nonuniform fault model. Let $S_{00}$ and $S'_{00}$ denote
the random variables corresponding to $s_{00}$ and its faulty value $s'_{00}$, respectively.
Suppose the attacker has the knowledge of the fault location and the fault
distribution table, i.e., the probabilities

$$P(S'_{00} = s'_{00} \mid S_{00} = s_{00}).$$

We also assume that the correct value is uniformly distributed:

$$P(S_{00} = s_{00}) = \frac{1}{256}, \quad \forall s_{00} \in \mathbb{F}_2^8.$$
Then by Lemma 1.7.2,

$$P(S'_{00} = s'_{00}) = \sum_{s_{00}=0}^{255} P(S'_{00} = s'_{00} \mid S_{00} = s_{00})\, P(S_{00} = s_{00}) = \frac{1}{256} \sum_{s_{00}=0}^{255} P(S'_{00} = s'_{00} \mid S_{00} = s_{00}). \tag{5.10}$$
Suppose the attacker collects $m$ faulty ciphertexts, with first bytes $c_{00}'^{\,i}$ for
$i = 1, \ldots, m$. For each key hypothesis $\hat{k}_{00}$, they compute the hypothetical faulty
values

$$\hat{s}^i_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{00}'^{\,i} \oplus \hat{k}_{00}). \tag{5.11}$$

The probability that the faulty value of $s_{00}$ in the $i$th encryption equals $\hat{s}^i_{00}$ can be
found using the fault distribution table with Eq. 5.10:

$$P(S'_{00} = \hat{s}^i_{00}) = \frac{1}{256} \sum_{s_{00}=0}^{255} P(S'_{00} = \hat{s}^i_{00} \mid S_{00} = s_{00}).$$
Define $\ell(\hat{k}_{00})$ to be the probability that the faulty value of $s_{00}$ in the $i$th encryption
equals the hypothetical value $\hat{s}^i_{00}$ for all $i$, i.e.,

$$\ell(\hat{k}_{00}) := \prod_{i=1}^{m} P(S'_{00} = \hat{s}^i_{00}). \tag{5.12}$$
Then the correct key can be found using the maximum likelihood approach, i.e., by
taking the hypothesis that maximizes $\ell(\hat{k}_{00})$. As a simple illustration, with a
stuck-at-0 fault in $s_{00}$, the faulty value is always 00, and

$$k_{00} = c'_{00} \oplus \mathrm{SB}_{\mathrm{AES}}(00) = c'_{00} \oplus 63.$$
Example 5.1.9 In this example, we consider a random-AND fault model such that

$$P(S'_{00} = s'_{00} \mid S_{00} = s_{00}) = \begin{cases} 1 & s'_{00} = s_{00}\ \mathrm{AND}\ \delta,\\ 0 & \text{otherwise,} \end{cases}$$

where

$$P(\delta = x) = \frac{1}{256}, \quad \forall x \in \mathbb{F}_2^8.$$
By Eq. 5.10,

$$P(S'_{00} = s'_{00}) = \frac{1}{256} \sum_{s_{00}=0}^{255} P(S'_{00} = s'_{00} \mid S_{00} = s_{00}) = \frac{1}{256} \sum_{s_{00}=0}^{255} \sum_{x=0}^{255} P(S'_{00} = s'_{00} \mid S_{00} = s_{00}, \delta = x)\, P(\delta = x)$$

$$= \frac{1}{256^2} \sum_{s_{00}=0}^{255} \big|\{\, \delta \mid s'_{00} = \delta\ \mathrm{AND}\ s_{00} \,\}\big| = \frac{\big|\{\, (\delta, s_{00}) \mid s'_{00} = \delta\ \mathrm{AND}\ s_{00},\ s_{00} \in \mathbb{F}_2^8,\ \delta \in \mathbb{F}_2^8 \,\}\big|}{256^2} = \frac{3^{8 - wt(s'_{00})}}{256^2}, \tag{5.13}$$
where $wt(s'_{00})$ denotes the Hamming weight of $s'_{00}$ (see Definition 1.6.10). To
derive the last equality, we note that if a bit of $s'_{00}$ is 0, then the corresponding
bits of $\delta$ and $s_{00}$ can each be 0 or 1 but not both 1, giving us three choices. If a bit
of $s'_{00}$ is 1, then the corresponding bits of $\delta$ and $s_{00}$ must both be 1.
Let $s_{00} = AB$, $k_{00} = 00$. Then (see Table 3.9 for $\mathrm{SB}_{\mathrm{AES}}$) the correct ciphertext
byte is $c_{00} = \mathrm{SB}_{\mathrm{AES}}(AB) \oplus k_{00} = 62$. Suppose five injected faults result in values
of $\delta = 0F, F0, FF, 54, CD$, respectively. Then the corresponding faulty values
$s_{00}'^{\,i}$ are $0B, A0, AB, 00, 89$, and the faulty ciphertext bytes $c_{00}'^{\,i}$ are
$2B, E0, 62, 63, A7$.
Take $\hat{k}_{00} = 1A$; by Eq. 5.11, we have (see Table 3.10 for $\mathrm{SB}^{-1}_{\mathrm{AES}}$)

$$\hat{s}^1_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}(2B \oplus 1A) = \mathrm{SB}^{-1}_{\mathrm{AES}}(31) = 2E,$$
$$\hat{s}^2_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}(E0 \oplus 1A) = \mathrm{SB}^{-1}_{\mathrm{AES}}(FA) = 14,$$
$$\hat{s}^3_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}(62 \oplus 1A) = \mathrm{SB}^{-1}_{\mathrm{AES}}(78) = C1,$$
$$\hat{s}^4_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}(63 \oplus 1A) = \mathrm{SB}^{-1}_{\mathrm{AES}}(79) = AF,$$
$$\hat{s}^5_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}(A7 \oplus 1A) = \mathrm{SB}^{-1}_{\mathrm{AES}}(BD) = CD.$$

Hence

$$\ell(1A) = \prod_{i=1}^{5} P(S'_{00} = \hat{s}^i_{00}) = P(S'_{00} = 2E)\,P(S'_{00} = 14)\,P(S'_{00} = C1)\,P(S'_{00} = AF)\,P(S'_{00} = CD)$$
$$= \frac{1}{256^{10}} \times 3^{8 \times 5 - wt(2E) - wt(14) - wt(C1) - wt(AF) - wt(CD)} = \frac{1}{256^{10}} \times 3^{40-4-2-3-6-5} = \frac{3^{20}}{256^{10}}.$$
And

$$\ell(00) = \prod_{i=1}^{5} P(S'_{00} = \hat{s}^i_{00}) = P(S'_{00} = 0B)\,P(S'_{00} = A0)\,P(S'_{00} = AB)\,P(S'_{00} = 00)\,P(S'_{00} = 89)$$
$$= \frac{1}{256^{10}} \times 3^{8 \times 5 - wt(0B) - wt(A0) - wt(AB) - wt(00) - wt(89)} = \frac{1}{256^{10}} \times 3^{40-3-2-5-0-3} = \frac{3^{27}}{256^{10}}.$$

We can see that $\ell(00) > \ell(1A)$.
It was shown in [FJLT13] that with high probability, the correct key byte can be
found with only a few faults. The same method can recover other bytes of .K10 . We
note that each byte can be recovered in parallel; hence the number of faults required
to get the full round key depends on the number of bytes that can be faulted with
one fault injection.
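The likelihood computation of Example 5.1.9 can be reproduced directly. The sketch assumes the standard AES Sbox (Table 3.9) and the random-AND fault distribution of Eq. 5.13; helper names are ours.

```python
# SFA maximum-likelihood step under the random-AND model: the correct key
# hypothesis 00 scores higher than the wrong hypothesis 1A.

SBOX = [
    0x63,0x7C,0x77,0x7B,0xF2,0x6B,0x6F,0xC5,0x30,0x01,0x67,0x2B,0xFE,0xD7,0xAB,0x76,
    0xCA,0x82,0xC9,0x7D,0xFA,0x59,0x47,0xF0,0xAD,0xD4,0xA2,0xAF,0x9C,0xA4,0x72,0xC0,
    0xB7,0xFD,0x93,0x26,0x36,0x3F,0xF7,0xCC,0x34,0xA5,0xE5,0xF1,0x71,0xD8,0x31,0x15,
    0x04,0xC7,0x23,0xC3,0x18,0x96,0x05,0x9A,0x07,0x12,0x80,0xE2,0xEB,0x27,0xB2,0x75,
    0x09,0x83,0x2C,0x1A,0x1B,0x6E,0x5A,0xA0,0x52,0x3B,0xD6,0xB3,0x29,0xE3,0x2F,0x84,
    0x53,0xD1,0x00,0xED,0x20,0xFC,0xB1,0x5B,0x6A,0xCB,0xBE,0x39,0x4A,0x4C,0x58,0xCF,
    0xD0,0xEF,0xAA,0xFB,0x43,0x4D,0x33,0x85,0x45,0xF9,0x02,0x7F,0x50,0x3C,0x9F,0xA8,
    0x51,0xA3,0x40,0x8F,0x92,0x9D,0x38,0xF5,0xBC,0xB6,0xDA,0x21,0x10,0xFF,0xF3,0xD2,
    0xCD,0x0C,0x13,0xEC,0x5F,0x97,0x44,0x17,0xC4,0xA7,0x7E,0x3D,0x64,0x5D,0x19,0x73,
    0x60,0x81,0x4F,0xDC,0x22,0x2A,0x90,0x88,0x46,0xEE,0xB8,0x14,0xDE,0x5E,0x0B,0xDB,
    0xE0,0x32,0x3A,0x0A,0x49,0x06,0x24,0x5C,0xC2,0xD3,0xAC,0x62,0x91,0x95,0xE4,0x79,
    0xE7,0xC8,0x37,0x6D,0x8D,0xD5,0x4E,0xA9,0x6C,0x56,0xF4,0xEA,0x65,0x7A,0xAE,0x08,
    0xBA,0x78,0x25,0x2E,0x1C,0xA6,0xB4,0xC6,0xE8,0xDD,0x74,0x1F,0x4B,0xBD,0x8B,0x8A,
    0x70,0x3E,0xB5,0x66,0x48,0x03,0xF6,0x0E,0x61,0x35,0x57,0xB9,0x86,0xC1,0x1D,0x9E,
    0xE1,0xF8,0x98,0x11,0x69,0xD9,0x8E,0x94,0x9B,0x1E,0x87,0xE9,0xCE,0x55,0x28,0xDF,
    0x8C,0xA1,0x89,0x0D,0xBF,0xE6,0x42,0x68,0x41,0x99,0x2D,0x0F,0xB0,0x54,0xBB,0x16,
]
INV_SBOX = [0] * 256
for i, v in enumerate(SBOX):
    INV_SBOX[v] = i

def p_faulty(s: int) -> float:
    """Eq. 5.13: P(S'00 = s) = 3^(8 - wt(s)) / 256^2 under random-AND."""
    return 3 ** (8 - bin(s).count("1")) / 256**2

def likelihood(k_hat: int, faulty_c00) -> float:
    """Eq. 5.12, with hypothetical faulty values from Eq. 5.11."""
    l = 1.0
    for c in faulty_c00:
        l *= p_faulty(INV_SBOX[c ^ k_hat])
    return l

faulty_c00 = [0x2B, 0xE0, 0x62, 0x63, 0xA7]  # the five faulty bytes from the text
assert likelihood(0x00, faulty_c00) > likelihood(0x1A, faulty_c00)
```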
In case the attacker only knows that the fault model is nonuniform, without the
knowledge of its fault distribution table, a metric based on the Square Euclidean
Imbalance (SEI) can be used. Define

$$\mathrm{SEI}(\hat{k}_{00}) := \sum_{j=0}^{255} \left( \frac{\big|\{\, i \mid \hat{s}^i_{00} = j \,\}\big|}{m} - \frac{1}{256} \right)^2.$$
We can see that, by definition, SEI measures a certain distance between the obtained
hypothetical distribution of $S'_{00}$ and the uniform distribution. Since we know that
the fault model is nonuniform, we expect the distribution induced by $S'_{00}$ to be far
from the uniform distribution. Thus we take the correct key to be the hypothesis
that maximizes $\mathrm{SEI}(\hat{k}_{00})$.
In this part, we consider the fault to be injected in the output of round 8, $S_8$. Similar
to before, we represent the cipher state at the end of round 8, $S_8$, the correct
ciphertext $c$, and the last round key $K_{10}$ with the following matrices:

$$S_8 = \begin{pmatrix} s_{00} & s_{01} & s_{02} & s_{03}\\ s_{10} & s_{11} & s_{12} & s_{13}\\ s_{20} & s_{21} & s_{22} & s_{23}\\ s_{30} & s_{31} & s_{32} & s_{33} \end{pmatrix}, \quad c = \begin{pmatrix} c_{00} & c_{01} & c_{02} & c_{03}\\ c_{10} & c_{11} & c_{12} & c_{13}\\ c_{20} & c_{21} & c_{22} & c_{23}\\ c_{30} & c_{31} & c_{32} & c_{33} \end{pmatrix}, \quad K_{10} = \begin{pmatrix} k_{00} & k_{01} & k_{02} & k_{03}\\ k_{10} & k_{11} & k_{12} & k_{13}\\ k_{20} & k_{21} & k_{22} & k_{23}\\ k_{30} & k_{31} & k_{32} & k_{33} \end{pmatrix}.$$

Fig. 5.6 Illustration of fault propagation for a fault injected in the first byte of $S_8$ (the cipher state
at the end of round 8)
Suppose a fault is injected in $s_{00}$ with a nonuniform fault model. The fault
propagation is shown in Fig. 5.6. We can see that $s_{00}$ is related to $c_{00}, c_{13}, c_{22}, c_{31}$
and $k_{00}, k_{13}, k_{22}, k_{31}$: inverting the last round and applying InvMixColumns to the
first column, we have

$$s_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}\big(a_{00} \oplus 0E \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{00} \oplus k_{00}) \oplus 0B \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{13} \oplus k_{13}) \oplus 0D \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{22} \oplus k_{22}) \oplus 09 \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{31} \oplus k_{31})\big),$$

where $0E, 0B, 0D, 09$ are the coefficients of the first row of the InvMixColumns
matrix and $a_{00}$ is an unknown constant determined by the round key $K_9$.
With a set of $m$ faulty ciphertexts $\{c'^1, c'^2, \ldots, c'^m\}$, the attacker can make
hypotheses on the values of $k_{00}, k_{13}, k_{22}, k_{31}$, and $a_{00}$, denoted by
$\hat{k}_{00}, \hat{k}_{13}, \hat{k}_{22}, \hat{k}_{31}$, and $\hat{a}_{00}$. Then they can compute the corresponding
hypothetical values for $s'_{00}$, denoted $\hat{s}^i_{00}$, as follows:

$$\hat{s}^i_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}\big(\hat{a}_{00} \oplus 0E \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{00}'^{\,i} \oplus \hat{k}_{00}) \oplus 0B \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{13}'^{\,i} \oplus \hat{k}_{13}) \oplus 0D \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{22}'^{\,i} \oplus \hat{k}_{22}) \oplus 09 \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{31}'^{\,i} \oplus \hat{k}_{31})\big).$$
The correct key bytes can be recovered with either maximum likelihood (when
the fault distribution table is known) or SEI (when the fault distribution table is
unknown) as discussed above.
We refer the reader to the original paper [FJLT13] for other methods of obtaining
the correct key hypothesis and attacks in even earlier rounds of AES.
The main advantage of SFA is that only faulty ciphertexts are required, and
there is no need for repeated plaintexts. However, the attack assumes that each fault
injection is successful.
Persistent Fault Analysis (PFA) [ZLZ+ 18] considers a fault in the memory, normally
where the Sbox lookup table is stored. As the table is not expected to be rewritten
during the computation, the fault stays until the device is reset; hence the name
“persistent” is used for the attack method.
We will use AES-128 as a running example to show how the attack works. The
methodology also applies to other block ciphers.
We consider a random byte fault model. Suppose the fault location is the first
byte of the Sbox lookup table. Then the output $v = \mathrm{SB}_{\mathrm{AES}}(00)$ is changed to $v'$,
and the fault mask $\varepsilon \in \mathbb{F}_2^8$ is given by

$$\varepsilon = v \oplus v'.$$

By Table 3.9, we know that $v = 63$. We assume the attacker has the knowledge of
the output of AES (correct and faulty ciphertexts), fault model, and fault location.
However, the attacker does not know the fault mask. The attacker aims to recover
the last round key $K_{10}$.
Recall that in round 10, the operations for AES encryption include SubBytes,
ShiftRows, and AddRoundKey. We represent the cipher state right before
AddRoundKey in round 10, denoted $S$, the ciphertext $c$, and $K_{10}$ with the following
matrices:

$$S = \begin{pmatrix} s_{00} & s_{01} & s_{02} & s_{03}\\ s_{10} & s_{11} & s_{12} & s_{13}\\ s_{20} & s_{21} & s_{22} & s_{23}\\ s_{30} & s_{31} & s_{32} & s_{33} \end{pmatrix}, \quad c = \begin{pmatrix} c_{00} & c_{01} & c_{02} & c_{03}\\ c_{10} & c_{11} & c_{12} & c_{13}\\ c_{20} & c_{21} & c_{22} & c_{23}\\ c_{30} & c_{31} & c_{32} & c_{33} \end{pmatrix}, \quad K_{10} = \begin{pmatrix} k_{00} & k_{01} & k_{02} & k_{03}\\ k_{10} & k_{11} & k_{12} & k_{13}\\ k_{20} & k_{21} & k_{22} & k_{23}\\ k_{30} & k_{31} & k_{32} & k_{33} \end{pmatrix}.$$
Since the value $v$ never appears as an output of the faulty Sbox, we have $s_{00} \neq v$
and hence

$$k_{00} \neq c_{00} \oplus v.$$

Thus the attacker can collect a set of $m$ (faulty) ciphertexts $\{c^1, c^2, \ldots, c^m\}$ and
eliminate all key hypotheses for $k_{00}$ of the form

$$c^i_{00} \oplus v.$$
Example 5.1.10 Let the key byte $k_{00} = 45$ and the fault mask $\varepsilon = 12$. Then the
faulty output of the AES Sbox for input 00 becomes

$$63 \oplus 12 = 71.$$

In this case, no matter what input the AES Sbox gets during the computations, the
output will never be 63. In particular, we have $s_{00} \neq 63$. Equivalently,

$$k_{00} \oplus 63 = 45 \oplus 63 = 26$$

would never appear in the first byte of the ciphertexts; otherwise, we would have
$s_{00} = c_{00} \oplus k_{00} = 26 \oplus 45 = 63$, a contradiction. Note that for random fault
masks $\varepsilon = 00, 12, FE$, the faulty Sbox output for input 00 would be
$00 \oplus 63 = 63$, $12 \oplus 63 = 71$, and $FE \oplus 63 = 9D$, respectively.

Let $Y$ denote the random variable corresponding to the first byte of the (faulty)
ciphertext. Then

$$P(Y = c_{00}) \begin{cases} \approx \frac{2}{256} & c_{00} = v \oplus \varepsilon \oplus k_{00},\\ = 0 & c_{00} = v \oplus k_{00},\\ \approx \frac{1}{256} & \text{otherwise.} \end{cases}$$
Given a set of ciphertexts $\{c^1, c^2, \ldots, c^m\}$, if we look at the first bytes of those
ciphertexts, we expect $v \oplus \varepsilon \oplus k_{00}$ to appear with the highest frequency. Thus the
attacker computes

$$y_{\max} := \arg\max_y \big|\{\, c^i_{00} \mid c^i_{00} = y \,\}\big|,$$

whose empirical probability is expected to be

$$\hat{P}(y_{\max}) \approx \frac{2}{256}.$$
The simulated results in [ZLZ+ 18] show that with about 4000 faulty ciphertexts, the
empirical probability of $y_{\max}$ is high enough to be distinguished from that of the
other values of $y$.
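The counting argument behind PFA is easy to check. For brevity, the sketch below uses a stand-in bijection with `sbox[0x00] = 0x63` in place of the full AES Sbox; only bijectivity and the faulted entry matter for the argument, and the key and mask values mirror Example 5.1.10.

```python
# PFA observation: with the Sbox entry for input 00 changed from 63 to 71,
# the ciphertext-byte value v ⊕ k00 = 26 never occurs, and
# v' ⊕ k00 = 34 occurs twice as often as the others.
from collections import Counter

sbox = [(5 * x + 0x63) % 256 for x in range(256)]  # stand-in bijection, sbox[0] = 0x63
k00, eps = 0x45, 0x12
faulty = list(sbox)
faulty[0x00] ^= eps          # persistent fault: 63 -> 71

# One lookup per (uniformly distributed) Sbox input, followed by AddRoundKey.
counts = Counter(faulty[x] ^ k00 for x in range(256))

assert counts[0x63 ^ k00] == 0        # 26 never appears -> eliminates hypotheses
assert counts[0x63 ^ eps ^ k00] == 2  # 34 appears with doubled frequency
assert all(counts[y] == 1 for y in range(256) if y not in (0x26, 0x34))
```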
In line 2, we load the address of Table 1, then line 3 looks up the table, and finally,
line 4 stores the table output in the register r21. These three lines implement line 1
in Algorithm 3.5. Afterward, in line 5, the leftmost 2 bits are extracted from r21 and
stored in r21. This corresponds to line 5 in Algorithm 3.5. As we have explained in
Sect. 3.2.2.1, these 2 bits correspond to bits at positions 0 and 1 of pLayer output.
Similarly, lines 6–8 extract the 2nd and 3rd bits of pLayer output using Table 2 and
store them in r23. Line 9 then combines bits 0, 1, 2, and 3 with a bitwise OR. The
implementation continues to extract pLayer output bits at positions 4 and 5 with
Table 3 and at positions 6 and 7 with Table 4. Those bits are all combined through
bitwise OR into register r21 (lines 12 and 15).
The fault attack on this implementation injects a fault in register r23 between
lines 14 and 15 in the final round of PRESENT. The fault model used is a bit flip.
Denote the corresponding byte of the last round key $K_{32}$ by $\kappa_7\kappa_6\ldots\kappa_1\kappa_0$ and the
correct value of r21 after line 15 by $b_7b_6b_5\ldots b_0$, so that the correct ciphertext byte
is

$$c_7c_6c_5\ldots c_0 = b_7b_6b_5\ldots b_0 \oplus \kappa_7\kappa_6\ldots\kappa_1\kappa_0.$$

Suppose the fault flips bits of r23 such that, after the OR in line 15, the value in
register r21 becomes $111111b_1b_0$. The faulty ciphertext byte $c'_7c'_6c'_5\ldots c'_0$ is then
given by

$$c'_7c'_6c'_5\ldots c'_0 = 111111b_1b_0 \oplus \kappa_7\kappa_6\ldots\kappa_1\kappa_0.$$

Since the faulty ciphertext byte $c'_7c'_6c'_5\ldots c'_0$ is known, the attacker can recover 6
bits of $K_{32}$ by computing

$$\kappa_7\kappa_6\ldots\kappa_2 = c'_7c'_6\ldots c'_2 \oplus 111111.$$
A simple countermeasure one might consider to protect against certain fault attacks
would be to repeat the encryption, compare the two outputs, and only return the
ciphertext if the two outputs are equal [BECN+ 06]. For example, for the DFA
attacks described in Sect. 5.1.1, such a countermeasure is successful, since those
attacks require the knowledge of the faulty ciphertext. However, an easy attack on
this countermeasure is to inject the same fault in both computations, so that the
comparison passes and the faulty ciphertext is released.
We recall from Definition 1.6.3 that the (minimum) distance of a binary code $C$,
denoted $\mathrm{dis}(C)$, is given by

$$\mathrm{dis}(C) = \min\{\, \mathrm{d}(x, y) \mid x, y \in C,\ x \neq y \,\},$$

where $\mathrm{d}$ denotes the Hamming distance. We have seen that a binary code with
minimum distance $\mathrm{dis}(C)$ can detect $\mathrm{dis}(C) - 1$ bit flips (see Definition 1.6.5 and
Theorem 1.6.1). Thus a natural choice for a fault countermeasure is to encode the
intermediate values during the computation. The question is, which code to choose
and how to implement it?
As an example of what kind of code to use, we will discuss one proposal of
using an anticode (see Definition 1.6.12) as a countermeasure against bit flips and
instruction skips [BHL19]. Recall that a binary $(n, M, d, \delta)$-anticode has length $n$,
cardinality $M$, minimum distance $d$, and maximum distance $\delta$, where the maximum
distance of a binary code $C$ (see Definition 1.6.12) is given by

$$\max\{\, \mathrm{d}(x, y) \mid x, y \in C \,\}.$$

If a fault flips exactly the bits in which two codewords differ, then the resulting
faulty value is still a codeword and cannot be detected. Since there are in total $M$
codewords, the possibility for such a fault to go undetected is at least $2/M$. Thus, a
very big maximum distance is also not desirable.
We refer the reader to the original paper [BHL19] for the formalization of
encoding-based countermeasures for symmetric block ciphers and calculations of
the probability of detecting any m-bit flips and instruction skips given a binary code.
The authors also provide a theoretical analysis which concludes that to have overall
good protection against all possible bit flips, it is better to use code with not too
small minimum distance and not too big maximum distance.
Example 5.2.1 As a simple example, let us consider .{01, 10}, a binary .(2, 2, 2, 2)-
anticode. Since there are two codewords, it can be used to encode 1 bit of
information. Let 01 be the codeword for 0 and 10 be the codeword for 1. The
lookup table for carrying out XOR between .a, b (.a, b ∈ F2 ) is shown in Table 5.5.
As mentioned before, 00 indicates an error. Thus the table outputs 00 if one input is
not a codeword.
Example 5.2.2 Let us consider bit flip attacks on the inputs of XOR operation from
Example 5.2.1. We can see that any 1-bit flip will be detected: If the fault is injected
in input 01, with 1-bit flip, we get either 00 or 11; both will give output 00. Similarly,
if 1-bit flip is injected in input 10, we will have 00 or 11, and the output will again
be 00.
On the other hand, a 2-bit flip will be undetected. For example, suppose we would
like to compute .0 ⊕ 0. Then the inputs for the table lookup will be 01 and 01, and
the output will be 01, which corresponds to 0. If a 2-bit flip is injected in the first
input, we get 10 and 01 for the table lookup. The result will be 10. Such a fault will
not be detected and can successfully change the output of the operation.
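The behavior described in Examples 5.2.1 and 5.2.2 can be sketched in a few lines (an illustrative Python model with assumed helper names, not the implementation from [BHL19]):

```python
# Sketch: the XOR lookup table for the binary (2, 2, 2, 2)-anticode {01, 10},
# with 00 as the error indicator.
CODEWORDS = {0: 0b01, 1: 0b10}          # encode(0) = 01, encode(1) = 10
DECODE = {v: k for k, v in CODEWORDS.items()}

def xor_table(x, y):
    """Table lookup for encoded XOR; returns 00 if an input is not a codeword."""
    if x not in DECODE or y not in DECODE:
        return 0b00                      # error indicator
    return CODEWORDS[DECODE[x] ^ DECODE[y]]

# Any 1-bit flip of a codeword yields 00 or 11, which the table rejects:
for fault in (0b00, 0b11):
    assert xor_table(fault, 0b01) == 0b00
# A 2-bit flip maps 01 to 10, another codeword, so the fault goes undetected:
assert xor_table(0b10, 0b01) == 0b10     # should have been encode(0) = 01
```

The last assertion mirrors Example 5.2.2: the 2-bit flip silently changes the result of the operation.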
We recall the notion of Quotient group and Remainder group for PRESENT
Sboxes from Sect. 4.5.2.3. We have discussed that pLayer can be considered as four
identical parallel bitwise operations where each is a function .p : F_2^16 → F_2^16 that
takes one Quotient group output and permutes it to the corresponding Remainder
group input. Furthermore, we have seen in Sect. 3.1.3 that addRoundKey is a
function .F_2^64 → F_2^64. Each Sbox in the sBoxLayer is a function SB .: F_2^4 → F_2^4.
Thus, one convenient code choice would be those with cardinality 16, encoding 4
bits of information. In particular, we are looking for a binary .(n, 16, d, δ)-anticode,
where d is the minimum distance of the code and .δ is the maximum distance of the
code.
We refer the readers to [BHL19] for an algorithm for finding anticodes that
achieve a low probability of undetected faults with given length, minimum distance,
and maximum distance. In the rest of this subsection, we will use the following
binary .(8, 16, 2, 7)-anticode as a running example
. {01, 08, 02, 0B, 04, 1D, 1E, 30, 07, 65, 6A, AD, B3, CE, D9, F6} . (5.14)
In particular, 01 is the codeword for 0000, 08 is the codeword for 0001, etc., and we
write

.01 = encode(0000).
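The anticode parameters claimed in Eq. 5.14 can be checked directly (a small sketch; `hamming` is a hypothetical helper name):

```python
# Sketch: verify that the 16 codewords of Eq. 5.14 form a binary
# (8, 16, 2, 7)-anticode, i.e., minimum distance 2 and maximum distance 7.
CODE = [0x01, 0x08, 0x02, 0x0B, 0x04, 0x1D, 0x1E, 0x30,
        0x07, 0x65, 0x6A, 0xAD, 0xB3, 0xCE, 0xD9, 0xF6]

def hamming(x, y):
    return bin(x ^ y).count("1")

dists = [hamming(x, y) for i, x in enumerate(CODE) for y in CODE[i + 1:]]
assert (len(CODE), min(dists), max(dists)) == (16, 2, 7)
```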
Example 5.2.3 Using the anticode given in Eq. 5.14, the table entry corresponding
to 01 and 08 will be encode(0000 ⊕ 0001) = encode(0001) = 08, and we write

.01 ⊕̃ 08 = 08.
. x3 = 1, x2 = 1, x1 = 0, x0 = 1.
.T0 : C → C × C × C × C
     encode(x_3 x_2 x_1 x_0) ↦ (encode(000x_3^s), encode(000x_2^s), encode(000x_1^s), encode(000x_0^s))
.T1 : C → C × C × C × C
     encode(x_3 x_2 x_1 x_0) ↦ (encode(00x_3^s 0), encode(00x_2^s 0), encode(00x_1^s 0), encode(00x_0^s 0))
.T2 : C → C × C × C × C
     encode(x_3 x_2 x_1 x_0) ↦ (encode(0x_3^s 00), encode(0x_2^s 00), encode(0x_1^s 00), encode(0x_0^s 00))
.T3 : C → C × C × C × C
     encode(x_3 x_2 x_1 x_0) ↦ (encode(x_3^s 000), encode(x_2^s 000), encode(x_1^s 000), encode(x_0^s 000)).
Thus, each table extracts the bits of the Sbox output, permutes them, and outputs
the corresponding codeword. It is easy to see that each entry of the outputs of each
table can be either encode.(0000) or encode.(0001) for T 0, encode.(0010) for T 1,
encode.(0100) for T 2, and encode.(1000) for T 3.
Example 5.2.5 Suppose the input is .01 = encode(0000). The corresponding Sbox
output would be .C = 1100 (see Table 3.11), i.e., .x_3^s x_2^s x_1^s x_0^s = 1100. Using the
anticode given in Eq. 5.14, the output of T0 will be (08, 08, 01, 01). The output of T1
is (02, 02, 01, 01). T2 gives (04, 04, 01, 01). Finally, T3 produces (07, 07, 01, 01).
Example 5.2.6 Suppose the input is .08 = encode(0001). The corresponding Sbox
output would be .5 = 0101, i.e., .x_3^s x_2^s x_1^s x_0^s = 0101. Using the anticode given in
Eq. 5.14, the output of T0 will be (01, 08, 01, 08). The output of T1 is (01, 02, 01, 02).
T2 gives (01, 04, 01, 04). Finally, T3 produces (01, 07, 01, 07).
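A minimal model of the tables T0–T3, reproducing Examples 5.2.5 and 5.2.6 (the names `T`, `encode`, and `decode` are illustrative; the PRESENT Sbox is the one from Table 3.11):

```python
# Sketch: combined Sbox + bit-extraction tables T0..T3 for the anticode of Eq. 5.14.
CODE = [0x01, 0x08, 0x02, 0x0B, 0x04, 0x1D, 0x1E, 0x30,
        0x07, 0x65, 0x6A, 0xAD, 0xB3, 0xCE, 0xD9, 0xF6]
encode = {x: CODE[x] for x in range(16)}            # encode(0000) = 01, ...
decode = {v: k for k, v in encode.items()}
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]  # Table 3.11

def T(k, cw):
    """T_k maps a codeword to the four encoded, k-shifted Sbox output bits."""
    y = PRESENT_SBOX[decode[cw]]
    bits = [(y >> b) & 1 for b in (3, 2, 1, 0)]     # x3^s, x2^s, x1^s, x0^s
    return tuple(encode[bit << k] for bit in bits)

# Example 5.2.5: input 01 = encode(0000), Sbox output C = 1100
assert T(0, 0x01) == (0x08, 0x08, 0x01, 0x01)
assert T(3, 0x01) == (0x07, 0x07, 0x01, 0x01)
# Example 5.2.6: input 08 = encode(0001), Sbox output 5 = 0101
assert T(1, 0x08) == (0x01, 0x02, 0x01, 0x02)
```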
Now, let the original cipher state at sBoxLayer input be .b_63 b_62 . . . b_0. For the
encoding-based implementation, the corresponding cipher state will be

.encode(b_63 b_62 b_61 b_60) · · · encode(b_7 b_6 b_5 b_4) encode(b_3 b_2 b_1 b_0).

Each codeword in this cipher state will be passed to tables .T0, T1, T2, T3, and the
outputs will be recorded. Then the output of pLayer will be computed by combining
those table outputs through .⊕̃.
Example 5.2.7 By Table 3.12, the pLayer output bits at positions .0, 1, 2, 3 come
from the bits at positions .0, 4, 8, 12 of the input of pLayer. Thus, we first get
encode.(000b_0^s) from the T0 output, encode.(00b_4^s 0) from T1, encode.(0b_8^s 00) from
T2, and encode.(b_12^s 000) from T3, and then the 0th nibble of the pLayer output will be

.encode(000b_0^s) ⊕̃ encode(00b_4^s 0) ⊕̃ encode(0b_8^s 00) ⊕̃ encode(b_12^s 000).

As another example, the 4th nibble (bits .16, 17, 18, 19) of the pLayer output is given by

.encode(000b_1^s) ⊕̃ encode(00b_5^s 0) ⊕̃ encode(0b_9^s 00) ⊕̃ encode(b_13^s 000).
Remark 5.2.1 By the design of our implementation, when the faulty intermediate value
is not a codeword, the table lookup returns the error indicator 00, and the attacker will
not be able to tell what the original faulty ciphertext is. Since both DFA and SFA require
analysis of the faulty ciphertexts, they can be prevented when the fault model is bit flip
and the number of bit flips is lower than the minimum distance of the binary code.
We have also seen that binary codes can correct errors. According to Theo-
rem 1.6.2, if m bits are flipped during the computation, a binary code C used for an
encoding-based countermeasure can correct this fault as long as .m ≤ ⌊(d − 1)/2⌋,
where d is the minimum distance of C. Note that to realize the incomplete decoding
rule, we need an error message to indicate that more than one codeword is at the same
smallest distance from the input word.
For example, let us consider the 3-repetition code .C[3,1,3] = {000, 111}, which is
a .[3, 1, 3]-linear code (see Example 1.6.8). Since .C[3,1,3] contains two codewords,
it can be used to encode 1 bit of information. As 000 is a codeword of .C[3,1,3] , we
cannot use it as the error message. On the other hand, we note that no word in .F32 is
at the same distance from 000 and 111, which means we will always be able to find a
codeword using the minimum distance decoding rule. .C[3,1,3] has minimum distance
3, so it can correct up to .⌊(3 − 1)/2⌋ = 1 bit flip.
Table 5.6 Lookup table for error-correcting code-based computation of AND between
.a, b (.a, b ∈ F2), using the 3-repetition code .{000, 111}. 000 is the codeword for 0, and
111 is the codeword for 1

 &    000  001  010  011  100  101  110  111
 000  000  000  000  000  000  000  000  000
 001  000  000  000  000  000  000  000  000
 010  000  000  000  000  000  000  000  000
 011  000  000  000  111  000  111  111  111
 100  000  000  000  000  000  000  000  000
 101  000  000  000  111  000  111  111  111
 110  000  000  000  111  000  111  111  111
 111  000  000  000  111  000  111  111  111
Let 000 be the codeword for 0 and 111 be the codeword for 1. The lookup table
for computation of AND between .a, b (.a, b ∈ F2 ) with error correction is shown in
Table 5.6. For example, if the inputs are 0 (000) and 1 (111), the correct output
should be 0, which corresponds to codeword 000.
We can also see that if there are more bit flips, the faulty output might be
corrected to a wrong codeword. For example, if the inputs are 111 and 111, but the
second 111 is faulted to 001 with a 2-bit flip attack, then the table lookup gives output
000. However, since .1 & 1 = 1, the output should be 111. Thus, it is better to use an
error-correcting code-based countermeasure only when we know at most .⌊(d − 1)/2⌋
bits can be flipped, where d is the minimum distance of the binary code.
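The error-correcting lookup of Table 5.6 can be sketched as follows (an illustrative model; `md_decode` is an assumed helper implementing minimum distance decoding):

```python
# Sketch: table-based AND over the 3-repetition code {000, 111} with
# minimum-distance decoding, as in Table 5.6.
def md_decode(w):
    """Decode a 3-bit word to the nearest codeword's information bit."""
    return 1 if bin(w).count("1") >= 2 else 0

def and_table(x, y):
    # Entry = 111 iff both inputs decode to 1, else 000 (matches Table 5.6).
    return 0b111 if md_decode(x) & md_decode(y) else 0b000

assert and_table(0b000, 0b111) == 0b000   # 0 & 1 = 0
assert and_table(0b110, 0b111) == 0b111   # 1-bit flip of 111 is corrected
assert and_table(0b111, 0b001) == 0b000   # 2-bit flip is mis-corrected to 0
```

The last line shows the mis-correction discussed above: with a 2-bit flip the table returns 000 although the correct result of .1 & 1 is 111.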
We refer the readers to [BKHL20] for an encoding-based hardware implemen-
tation of PRESENT using the 3-repetition code .C[3,1,3] .
.κ0 = 0000000000000000.
Furthermore,
and
.κ10 = β ⊕ ShiftRows(SubBytes(β)).
The total round counter j is increased by .λ at the end of each loop (line 15).
When .λ = 0, only a dummy round is computed.
Up to now, we have seen how the AES rounds and dummy rounds are computed.
Next, we discuss how fault is handled in the algorithm.
.1_0 : F_2^128 → F_2,  x ↦ ∏_i (1 − x_i).

In other words,

.1_0(x) = 1 if x = 0, and 0 otherwise;
.¬1_0(x) = 0 if x = 0, and 1 otherwise.
Thus, when j is odd and .λ = 1 (i.e., in the loop when the redundant AES round
is computed), .γ indicates whether the cipher state in the AES round computation, .R0, is
equal to the redundant cipher state, .R1, or equivalently, whether a fault happened in the
AES round or in the redundant round computation. If there was no fault, .γ = 0;
otherwise, .γ = 1.
Thus, when .λ = 0, i.e., in the loop when the dummy round is computed, .δ indicates
whether there is a fault injected in the computation of the dummy round state .R2. By the
design of the dummy round keys and .β (see Eq. 5.15), if there are no faults, .R2 = β and
.δ = 0. Otherwise, .R2 ≠ β and .δ = 1.
This line guarantees that .R0 will be changed to a random number .R2 if a fault is
detected in any of the computations. Consequently, the output will be a random
number or infected ciphertext.
The computations for the AES round, the redundant round, and the dummy round
are shown in Algorithms 5.4, 5.5, and 5.6.
As discussed in Sect. 2.1.2, a public key cryptosystem has a public key and a private
key. For fault attacks that will be discussed in this section, the attacker’s goal will
be the recovery of the secret key.
Unlike fault attacks on symmetric block ciphers, attacks on public key ciphers
depend on the underlying intractable problem, and we do not have a systematic
methodology. However, the general attack concept can be applied to ciphers
based on similar intractable problems. This section will focus on fault attacks on
implementations of RSA signatures. We will discuss a few fault attacks during the
signature signing procedure to recover the private key.
Note
We note that the attacks on the RSA signature signing procedure can also be applied
to the RSA decryption process.
For the rest of this section, let p and q be two distinct odd primes. Let .n = pq
and .e ∈ Z*_ϕ(n) be the public key for RSA signatures. .d = e^{−1} mod ϕ(n) denotes the
private key. The goal of the attacker is to recover d. As discussed in Sect. 3.5.1.3, the
signature is computed on the hash value, .h(m), of the intended message m, where
h is a fast public hash function (see Sect. 2.1.1). For simplicity, we will use m to
denote the hash value .h(m).
Let .𝓁_d and .𝓁_n denote the bit length of d and n, respectively. We have the following
binary representation (see Theorem 1.1.1) of d:

.d = Σ_{i=0}^{𝓁_d − 1} d_i 2^i.
We recap here the CRT-based implementation for RSA signatures. Following the
discussions in Sect. 3.5.1.3, to sign the signature for m, the owner of the private key,
say Alice, computes

.s = s_p y_q q + s_q y_p p mod n,

or by Garner's algorithm,

.s = s_p + ((s_q − s_p) y_p mod q) p,

where

.s_p = m^{d mod (p−1)} mod p,  s_q = m^{d mod (q−1)} mod q,   (5.17)

and .y_p = p^{−1} mod q, .y_q = q^{−1} mod p (Eq. 5.18).
Alice sends s and m to Bob. To verify the signature, Bob computes and checks if

.s^e mod n = m.
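The two recombination formulas above can be checked with a short sketch using the toy parameters of Example 5.3.1 (illustrative only; real RSA uses much larger primes):

```python
# Sketch of CRT-based RSA signing with both recombinations,
# using p = 5, q = 7, d = 5, m = 6 (Example 5.3.1).
p, q, d, m = 5, 7, 5, 6
n = p * q
y_p = pow(p, -1, q)                     # p^{-1} mod q
y_q = pow(q, -1, p)                     # q^{-1} mod p

s_p = pow(m, d % (p - 1), p)
s_q = pow(m, d % (q - 1), q)
s_crt = (s_p * y_q * q + s_q * y_p * p) % n
s_garner = s_p + ((s_q - s_p) * y_p % q) * p

assert s_crt == s_garner == pow(m, d, n)  # both give m^d mod n = 6
```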
We first describe an attack that recovers the private key of RSA signatures
by exploiting a faulty signature. The attack was first introduced by Boneh et
al. [BDL97]. The name “Bellcore” comes from the company the authors were
working for at the time of the publication. This paper is also the very first paper
that introduced fault attacks to cryptographic implementations.
As mentioned in Sect. 3.5.1.3, .yq and .yp (Eq. 5.18) can be precomputed. We
assume there are no faults in their computations.
By the design of .s_p, .s_q, .y_p, and .y_q, we have

.s ≡ s_p mod p,  s ≡ s_q mod q,   (5.19)

which gives

.s^e ≡ m mod p,  s^e ≡ m mod q.
Suppose a malicious fault was induced during the signing of the signature and
the computation of .s_p or .s_q (Eq. 5.17), but not both, is corrupted. Let us assume that
.s_p is faulty and .s_q is computed correctly. A similar attack applies if .s_q is faulty and
.s_p is correct. Let .s' denote the faulty signature. By Eq. 5.19,

.s' ≡ s ≡ s_q mod q,  s' ≢ s mod p.

In other words,

.q | (s' − s),  p ∤ (s' − s).
Recall that n and e are public. If the attacker further has the knowledge of s and .s',
then they can compute

.q = gcd(s' − s, n),  p = n/q.
With p and q, the attacker can then compute

.ϕ(n) = (p − 1)(q − 1)

and recover the private key .d = e^{−1} mod ϕ(n). In fact, the correct signature s is not
even needed: we have

.s'^e ≡ m mod q,  s'^e ≢ m mod p,

i.e.,

.q | (s'^e − m),  p ∤ (s'^e − m),

so the attacker can also compute .q = gcd(s'^e − m, n) from .s' and m alone.
.s^e mod n = 6^5 mod 35 = 6 = m.
Now suppose the computation of .s_p is faulty and .s_p' = 3. Then we have

.s' = s_p' + ((s_q − s_p') y_p mod q) p = 3 + ((6 − 3) × 3 mod 7) × 5 = 3 + 2 × 5 = 13.
If the attacker has the knowledge of .s = 6 and .s' = 13, they can compute

.q = gcd(s' − s, n) = gcd(13 − 6, 35) = 7,  p = n/q = 5.

If the attacker has the knowledge of .s' = 13 and .m = 6, they can compute

.gcd(s'^e − m, n) = gcd(13^5 − 6, 35) = 7

and .q = 7.
Similarly, suppose the computation of .s_q is faulty and .s_q' = 2. Then

.s' = s_p + ((s_q' − s_p) y_p mod q) p = 1 + ((2 − 1) × 3 mod 7) × 5 = 16.

If the attacker has the knowledge of .s = 6 and .s' = 16, they can compute

.p = gcd(s' − s, n) = gcd(16 − 6, 35) = 5.

If the attacker has the knowledge of .s' = 16 and .m = 6, they can compute

.gcd(s'^e − m, n) = gcd(16^5 − 6, 35) = 5.

Hence .p = 5.
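The attack can be replayed on the numbers of Example 5.3.1 (a sketch; variable names are illustrative):

```python
# Sketch of the Bellcore attack: a fault corrupts s_p, and a gcd reveals q.
from math import gcd

p, q, d, m, e = 5, 7, 5, 6, 5            # e = d^{-1} mod phi(n) = 5
n = p * q
y_p = pow(p, -1, q)

s_q = pow(m, d % (q - 1), q)             # correct half of the CRT computation
s_p_faulty = 3                           # faulty s_p' from the example
s_faulty = s_p_faulty + ((s_q - s_p_faulty) * y_p % q) * p
s = pow(m, d, n)                         # correct signature

assert (s, s_faulty) == (6, 13)
assert gcd(s_faulty - s, n) == 7              # recovers q from s and s'
assert gcd(pow(s_faulty, e, n) - m, n) == 7   # or from s' and m alone
```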
Example 5.3.2 Let .p = 11, .q = 13. Then .n = 143,

.ϕ(n) = 10 × 12 = 120.

Choose .e = 11, which is coprime with .ϕ(n). By the extended Euclidean algorithm,
we have .d = 11^{−1} mod 120 = 11. Again, by the extended Euclidean algorithm, we
have .y_p = 11^{−1} mod 13 = 6 and .y_q = 13^{−1} mod 11 = 6. Let .m = 2; then
.s_p = 2^{11 mod 10} mod 11 = 2, .s_q = 2^{11 mod 12} mod 13 = 7, and

.s = s_p + ((s_q − s_p) y_p mod q) p = 2 + ((7 − 2) × 6 mod 13) × 11 = 46.
We have

.46^2 mod 143 = 114,  46^3 mod 143 = 114 × 46 mod 143 = 96,
.46^5 mod 143 = 114 × 96 mod 143 = 76,  46^10 mod 143 = 76^2 mod 143 = 56,

and

.s^e mod n = 46^11 mod 143 = 56 × 46 mod 143 = 2 = m.
Now suppose the computation of .s_p is faulty and .s_p' = 7. Then we have

.s' = s_p' + ((s_q − s_p') y_p mod q) p = 7 + ((7 − 7) × 6 mod 13) × 11 = 7.

If the attacker has knowledge of .s = 46 and .s' = 7, they can compute
.gcd(s' − s, n) = gcd(7 − 46, 143) = 13. If the attacker has knowledge of .s' = 7 and
.m = 2, they can compute .gcd(s'^e − m, n) = gcd(7^11 − 2, 143) = 13. Hence .q = 13.

Similarly, suppose the computation of .s_q is faulty and .s_q' = 2. Then

.s' = s_p + ((s_q' − s_p) y_p mod q) p = 2 + ((2 − 2) × 6 mod 13) × 11 = 2,

and the same computations with .s' = 2 yield .p = 11.
In this subsection, we will look at fault attacks on the square and multiply algorithm.
We will first detail the bit flip attack proposed in [BDH+ 97], and then we will
discuss an improved version proposed in [JQBD97].
Instead of a CRT-based implementation, we assume the implementation com-
putes the signature with the right-to-left square and multiply algorithm. Following
Algorithm 3.7, to compute .md mod n, we have Algorithm 5.7, where .𝓁d is the bit
length of d .
For the attack, we assume a bit-flip fault model so that 1 bit of d, say .d_i, is
flipped. Let .d' denote the faulty value of d. Then the faulty signature is given by
.s' = m^{d'} mod n. From Algorithm 5.7, lines 4 and 5, we can see that the computations
Algorithm 5.7: Computing RSA signature with the right-to-left square and
multiply algorithm
Input: n, m, d// n is the RSA modulus; m is hash value of the message;
d is the private key of bit length 𝓁d
Output: s = md mod n
1 s=1
2 t =m
3 for i = 0, i < 𝓁d , i + + do
// ith bit of d is 1
4 if di = 1 then
i
// multiply by m2
5 s = s ∗ t mod n
i+1
// t = m2
6 t = t ∗ t mod n
7 return s
.s'/s ≡ m^{−2^i} mod n,  if d_i = 1, d_i' = 0,
        m^{2^i} mod n,   if d_i = 0, d_i' = 1.   (5.21)
Suppose the attacker has the knowledge of s, .s', and m; then they can compute

.s'/s mod n

and compare it with

.m^{±2^i} mod n

for each bit position i: by Eq. 5.21, a match with .m^{−2^i} reveals .d_i = 1, and a match
with .m^{2^i} reveals .d_i = 0. If s is unknown, the attacker can use the public exponent e
instead. Since .s^e ≡ m mod n,

.s'^e/s^e ≡ s'^e/m mod n,

and Eq. 5.21 gives

.s'^e/m ≡ m^{−e2^i} mod n,  if d_i = 1, d_i' = 0,
          m^{e2^i} mod n,   if d_i = 0, d_i' = 1.   (5.22)

Thus the attacker computes

.s'^e/m mod n

and compares it with

.m^{±e2^i} mod n.
.s = m^d mod n = 8.

The intermediate values for the computation with Algorithm 5.7 will be

 i  d_i  t  result
 0  1    4  2
 1  1    1  8

Suppose .d_0 is flipped; then .d' = 2 = 10_2. The resulting computation following
Algorithm 5.7 will then have the intermediate values as follows:

 i  d_i'  t  result
 0  0     4  1
 1  1     1  4

Thus .s' = 4.
With the knowledge of .s = 8, .s' = 4, and .m = 2, the attacker computes

.s'/s ≡ 4/8 ≡ 2^{−1} mod 15  and  m^{2^0} ≡ 2^1 ≡ 2 mod 15  ⟹  s'/s ≡ m^{−2^0} mod n,

revealing .d_0 = 1, .d_0' = 0. Without the knowledge of s, the attacker instead computes

.s'^e/m ≡ 4^3/2 ≡ 32 ≡ 2 mod 15.
And we have

.m^{−e2^0} ≡ 2^{−3} ≡ 8^3 ≡ 2 mod 15.

Thus

.s'^e/m ≡ m^{−e2^0} mod n,

and the attacker again concludes .d_0 = 1, .d_0' = 0.
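The whole attack can be replayed on the toy parameters above (n = 15, d = 3, m = 2, e = 3); `rtl_square_multiply` is an assumed helper mirroring Algorithm 5.7:

```python
# Sketch: bit-flip attack on the right-to-left square and multiply algorithm.
n, d, m, e = 15, 3, 2, 3

def rtl_square_multiply(m, d, n, bits=2):
    # Algorithm 5.7
    s, t = 1, m
    for i in range(bits):
        if (d >> i) & 1:
            s = s * t % n
        t = t * t % n
    return s

s = rtl_square_multiply(m, d, n)           # correct signature
s_f = rtl_square_multiply(m, d ^ 1, n)     # bit d_0 flipped: d' = 2

# Eq. 5.22 with i = 0: s'^e / m matches m^{-e*2^0}, so d_0 = 1 was flipped to 0
lhs = pow(s_f, e, n) * pow(m, -1, n) % n
assert lhs == pow(pow(m, e, n), -1, n)
```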
In this subsection, we will discuss an attack [BCG08] that injects faults into the
RSA public key n during the signature signing and recovers the private key d.
Since the value n is big, it will be stored in a few registers. The fault can be injected
while n is being loaded or prepared. The attack is specific to the right-to-left square
and multiply algorithm.
The RSA signature computation with the right-to-left square and multiply
algorithm is detailed in Algorithm 5.7. Let .n' denote the faulty RSA modulus and

.ε := n ⊕ n'

be the fault mask. Suppose the fault is injected in round j (.1 ≤ j ≤ 𝓁_d − 2), resulting
in a faulty square computation in line 6,

.t = t ∗ t mod n',

and this faulty .n' is also used for the rest of the computation. Then the faulty
signature is given by

.s' = [ ( ∏_{i=0}^{j−1} m^{2^i d_i} mod n ) ∏_{i=j}^{𝓁_d−1} ( m^{2^{j−1}} mod n )^{2^{i−j+1} d_i} ] mod n'.   (5.23)
If .j = 0, the whole computation is carried out modulo .n', and the attacker could simply
brute force all possible values of d to find out which one gives the faulty signature. Hence
we assume .j ≥ 1.
Recall that the correct signature is given by

.s = ∏_{i=0}^{𝓁_d−1} m^{2^i d_i} mod n.

Define

.d(j) := Σ_{i=j}^{𝓁_d−1} d_i 2^i,

then

.m^{d(j)} = ∏_{i=j}^{𝓁_d−1} m^{2^i d_i} mod n,

and

.s' = [ ( s m^{−d(j)} mod n ) ∏_{i=j}^{𝓁_d−1} ( m^{2^{j−1}} mod n )^{2^{i−j+1} d_i} ] mod n'.   (5.24)
The attacker guesses a value .d̂(j) for .d(j), computes the corresponding candidate .ŝ'
with Eq. 5.24, and compares it with .s'. Then they record the values of .d̂(j) that satisfy

.ŝ' = s'.
.s = m^d mod n = 2^5 mod 15
with Algorithm 5.7, we have the following intermediate values in each loop:

 i  d_i  t  s
 0  1    4  2
 1  0    1  2
 2  1    1  2

Suppose the fault is injected in round .j = 1 and the faulty modulus is .n' = 13. The
intermediate values become

 i  d_i  t  s
 0  1    4  2
 1  0    3  2
 2  1    9  6
and the faulty signature .s' = 6, which agrees with Eq. 5.23:

.s' = [ ( ∏_{i=0}^{j−1} m^{2^i d_i} mod n ) ∏_{i=j}^{𝓁_d−1} ( m^{2^{j−1}} mod n )^{2^{i−j+1} d_i} ] mod n'
    = [ ( m^{2^0 d_0} mod n ) ∏_{i=1}^{2} ( m^{2^0} mod n )^{2^i d_i} ] mod n'
    = [ ( m^{d_0} mod n )( m mod n )^{2d_1 + 2^2 d_2} ] mod n'
    = 2 × 2^4 mod 13 = 6.
To recover the secret key d, the attacker takes all possible values for .d(1) = d_2 d_1 0
and computes the corresponding possible faulty signatures with Eq. 5.24:

.ŝ' = [ ( s m^{−d̂(1)} mod n ) ∏_{i=1}^{2} ( m^{2^0} mod n )^{2^i d̂_i} ] mod n'
    = [ ( 2 m^{−d̂(1)} mod n )( m mod n )^{2d̂_1 + 2^2 d̂_2} ] mod n'
    = ( 2^{1−d̂(1)} mod 15 ) × 2^{2d̂_1 + 2^2 d̂_2} mod 13.

Evaluating this for .d̂(1) = 000, 010, 100, 110 gives .ŝ' = 2, 6, 6, 5, respectively.
Thus the attacker can conclude that .d(1) = 010 or 100, i.e., .d1 d2 = 01 or 10.
In case the attacker does not know the exact fault mask .ε (and hence .n') but only a
range for .ε, they can brute force all possible values of .ε and .d̂(j) to reduce the key
candidates. We refer the readers to [BCG08] for more details.
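The key-recovery step can be simulated on the toy example above (n = 15, m = 2, d = 5, fault at round j = 1 with n' = 13); `guess_faulty_sig` is an illustrative implementation of Eq. 5.24:

```python
# Sketch: recovering candidates for d(j) from a modulus-fault signature.
n, n_f, m, s, s_f, j, bits = 15, 13, 2, 2, 6, 1, 3

def guess_faulty_sig(d_guess_high):
    """Eq. 5.24: candidate s'^ for a guess of d(j) = sum_{i >= j} d_i 2^i."""
    first = s * pow(pow(m, d_guess_high, n), -1, n) % n   # s * m^{-d^(j)} mod n
    acc = first % n_f
    base = pow(m, 2 ** (j - 1), n)                        # m^{2^{j-1}} mod n
    for i in range(j, bits):
        if (d_guess_high >> i) & 1:
            acc = acc * pow(base, 2 ** (i - j + 1), n_f) % n_f
    return acc

candidates = [g << j for g in range(1 << (bits - j))]     # d(1) in {000, 010, 100, 110}
survivors = [g for g in candidates if guess_faulty_sig(g) == s_f]
print([format(g, "03b") for g in survivors])              # ['010', '100']
```

The surviving guesses match the conclusion of the example: d(1) is 010 or 100.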
This part looks into implementations that are either based on the right-to-left square
and multiply algorithm (Sect. 3.5.1.1) or on the Montgomery powering ladder
(Sect. 3.5.1.2). We further require that the modular multiplication is implemented
with Blakley's method (Sect. 3.5.2.1). We will discuss a fault attack that is specific
to such a setting.
The attack exploits the knowledge of whether an intermediate faulty value is
used or not by observing whether the final output is changed, thus the name safe
error attack [YJ00]. Since only knowing whether the output is changed or not is
enough, if we implement a countermeasure that repeats the computation, compares
the final results, and outputs an error when a fault is detected, the safe error attack
still applies.
Let .ω be the computer’s word size (see Sect. 2.1.2). Take .κ = ⎾𝓁n /ω⏋, i.e.,
With the Montgomery powering ladder and Blakley’s method, the signature .s =
md mod n is computed with Algorithm 5.8 (see Algorithm 3.15).
Since .𝓁_n is the bit length of n, the bit lengths of the variables .R_0 and .R_1 are at
most .𝓁_n. We can write

.R_0 = Σ_{i=0}^{κ−1} R_{0i} (2^ω)^i,  R_1 = Σ_{i=0}^{κ−1} R_{1i} (2^ω)^i.
We can also assume each of .R0i and .R1i is stored in one register.
Suppose .dj = 0, and a fault is injected during the j th iteration of the outer loop,
when .i < i0 in the loop starting from line 6, in the variable .R0i0 , for some .i0 such
that .0 ≤ i0 ≤ κ − 1. Then the value in .R1 in line 9 will not be affected since .R0i0 is
used when .i = i0 . However, the value in .R0 in line 14 will be faulty. Hence the final
output will be faulty.
On the other hand, suppose .dj = 1, and a fault is injected during the j th iteration
of the outer loop and when .i < i0 in the loop starting from line 17, in the variable
.R0i0 , for some .i0 such that .0 ≤ i0 ≤ κ − 1. Then the fault will go unnoticed since
.R_{0i_0} is used when .i = i_0 and the value in .R_0 will be rewritten in line 20. Thus the
final output will not be affected.
j = 1, d_1 = 1:
  loop line 17:  i = 1:  R = 2^ω R + R_{01} R_1 mod n = 0
                 i = 0:  R = 2^ω R + R_{00} R_1 mod n = 2 mod 15 = 2
  line 20:       R_0 = 2,  R_{00} = 10, R_{01} = 00
  loop line 22:  i = 1:  R = 2^ω R + R_{11} R_1 mod n = 0
                 i = 0:  R = 2^ω R + R_{10} R_1 mod n = 2 × 2 mod 15 = 4
  line 25:       R_1 = 4,  R_{10} = 00, R_{11} = 01
j = 0, d_0 = 1:
  loop line 17:  i = 1:  R = 2^ω R + R_{01} R_1 mod n = 0
                 i = 0:  R = 2^ω R + R_{00} R_1 mod n = 2 × 4 mod 15 = 8
  line 20:       R_0 = 8,  R_{00} = 00, R_{01} = 10
Then they inject a fault into .R_{01} at this time. We note that .R_{01} is used (blue .R_{01}
in the above equations) before .i = 0 and is reassigned a value in line 20 (orange .R_{01}
in the above equations). Thus the computations are not affected, and the signature is
correct. The attacker can conclude that .d_0 = 1.
Example 5.3.6 Let .d = 2 = 102 , and keep the other parameters the same as in
Example 5.3.5. Then
.s = m^d mod n = 2^2 mod 15 = 4.
j = 1, d_1 = 1:
  loop line 17:  i = 1:  R = 2^ω R + R_{01} R_1 mod n = 0
                 i = 0:  R = 2^ω R + R_{00} R_1 mod n = 2 mod 15 = 2
  line 20:       R_0 = 2,  R_{00} = 10, R_{01} = 00
  loop line 22:  i = 1:  R = 2^ω R + R_{11} R_1 mod n = 0
                 i = 0:  R = 2^ω R + R_{10} R_1 mod n = 2 × 2 mod 15 = 4
  line 25:       R_1 = 4,  R_{10} = 00, R_{11} = 01
j = 0, d_0 = 0:
  loop line 6:   i = 1:  R = 2^ω R + R_{01} R_1 mod n = 0
                 i = 0:  R = 2^ω R + R_{00} R_1 mod n = 8
  line 9:        R_1 = 8,  R_{10} = 00, R_{11} = 10
  loop line 11:  i = 1:  R = 2^ω R + R_{01} R_0 mod n = 0
                 i = 0:  R = 2^ω R + R_{00} R_0 mod n = 2 × 2 mod 15 = 4
  line 14:       R_0 = 4
Then they inject a fault into .R_{01} at this time. Suppose the faulty .R_{01} has a value 01.
The intermediate values will be as follows:
j = 1, d_1 = 1:
  loop line 17:  i = 1:  R = 0
                 i = 0:  R = R_{00} R_1 mod n = 2 mod 15 = 2
  line 20:       R_0 = 2,  R_{00} = 10, R_{01} = 00
  loop line 22:  i = 1:  R = 0
                 i = 0:  R = R_{10} R_1 mod n = 2 × 2 mod 15 = 4
  line 25:       R_1 = 4,  R_{10} = 00, R_{11} = 01
j = 0, d_0 = 0:
  loop line 6:   i = 1:  R = 2^ω R + R_{01} R_1 mod n = 0
                 i = 0:  R = 2^ω R + R_{00} R_1 mod n = 8
  line 9:        R_1 = 8,  R_{10} = 00, R_{11} = 10
  loop line 11:  i = 1:  R = 2^ω R + R_{01} R_0 mod n = 1 × 4 mod 15 = 4
                 i = 0:  R = 2^ω R + R_{00} R_0 mod n = 2^2 × 4 + 2 × 2 mod 15 = 5
  line 14:       R_0 = 5
where the blue .R01 is used before the fault injection and the green .R01 carries the faulty
value of .R01 . Thus the final result will be changed, and the attacker can conclude .d0 = 0.
Before detailing the safe error attack on the square and multiply algorithm, we first
consider a fault attack on Algorithm 5.9, where Blakley's method (Algorithm 3.11)
is used for computing modular multiplication.
Let .a, b ∈ Z_n be two integers. Since .𝓁_n is the bit length of n, the bit length of a
is at most .𝓁_n. Recall that .κ = ⌈𝓁_n/ω⌉. We can store a in .κ registers, each containing
one .a_i, and (see also Eq. 3.22)

.a = Σ_{i=0}^{κ−1} a_i (2^ω)^i.   (5.25)
We assume the attacker has the knowledge of the correct output for a pair of
a and b. And they can rerun the algorithm with the same input, inject fault, and
observe the output. Suppose .c = 1, and a fault is injected during the loop starting
from line 3 in the register containing .ai0 (.0 ≤ i0 ≤ κ − 1), when .i < i0 . In this case,
the fault in .ai0 will not affect the output since .ai0 is used when i is equal to .i0 . On
the other hand, if .c = 0 and a fault is injected in the register containing .ai0 during
the computation, then the final result will be faulty since the faulty value in a will
be returned.
Now, if the attacker does not know the value of c and would like to recover it by
fault injection attacks, they can assume that .c = 1, and the loop in line 3 is executed.
Then they inject a fault in .a_{i_0} at the time when i is less than .i_0. Finally, they compare
the output with the correct one and recover the value of c: if the output is correct,
.c = 1; otherwise .c = 0.
The same attack idea can be applied to the square and multiply algorithm to
recover the secret key. With the right-to-left square and multiply algorithm and
Blakley's method, the signature .s = m^d mod n is computed with Algorithm 5.10
(see Algorithms 3.13 and 5.7). Since .𝓁_n is the bit length of n, the bit lengths of the
variables s and t are at most .𝓁_n. We can write

.s = Σ_{j=0}^{κ−1} s_j (2^ω)^j,  t = Σ_{j=0}^{κ−1} t_j (2^ω)^j.
Similar techniques can also be applied to attack the left-to-right square and
multiply algorithm with Blakley's method. We refer the interested reader to [YJ00].
Example 5.3.7 Let us repeat the computations in Example 5.3.5 with Algorithm 5.10.
We have
.p = 3, q = 5, n = 15, d = 3 = 11_2, m = 2,  𝓁_n = 4, 𝓁_d = 2, ω = 2, κ = 2.
i = 0, d_0 = 1:
  loop line 6:   j = 1:  R = 2^ω R + s_1 t mod n = 0
                 j = 0:  R = 2^ω R + s_0 t mod n = 0 + 2 mod 15 = 2
  line 9:        s = 2,  s_0 = 10, s_1 = 00
  loop line 11:  j = 1:  R = 2^ω R + t_1 t mod n = 0
                 j = 0:  R = 2^ω R + t_0 t mod n = 0 + 2 × 2 mod 15 = 4
  line 14:       t = 4,  t_0 = 00, t_1 = 01
i = 1, d_1 = 1:
  loop line 6:   j = 1:  R = 2^ω R + s_1 t mod n = 0
                 j = 0:  R = 2^ω R + s_0 t mod n = 0 + 2 × 4 mod 15 = 8
  line 9:        s = 8
Suppose the attacker would like to recover .d_0. They assume .d_0 = 1 and inject a fault
into .s_1 when .j = 0 in the loop starting from line 6. We note that .s_1 is used (blue .s_1 in
the above equations) before .j = 0 and is reassigned a value in line 9 (orange .s_1 in the
above equations). Thus the computations are not affected, and the final result is
unchanged. The attacker can conclude that .d_0 = 1.
Example 5.3.8 Let .d = 2 = 102 , and keep the rest of the parameters as in
Example 5.3.7. Then
.s = m^d mod n = 2^2 mod 15 = 4.
i = 0, d_0 = 0:
  loop line 11:  j = 1:  R = 2^ω R + t_1 t mod n = 0
                 j = 0:  R = 2^ω R + t_0 t mod n = 0 + 2 × 2 mod 15 = 4
  line 14:       t = 4,  t_0 = 00, t_1 = 01
i = 1, d_1 = 1:
  loop line 6:   j = 1:  R = 2^ω R + s_1 t mod n = 0
                 j = 0:  R = 2^ω R + s_0 t mod n = 0 + 1 × 4 mod 15 = 4
  line 9:        s = 4
The attacker again assumes .d_0 = 1 and injects a fault into .s_1; however, since .d_0 = 0,
.s_1 is not used in the iteration for .i = 0. We can assume the fault is injected before the
start of the next iteration, as the computation time for lines 10–14 is similar to that for
lines 5–9. Suppose the faulty .s_1 has a value 01. The intermediate values will be as
follows:
i = 0, d_0 = 0:
  loop line 11:  j = 1:  R = 2^ω R + t_1 t mod n = 0
                 j = 0:  R = 2^ω R + t_0 t mod n = 0 + 2 × 2 mod 15 = 4
  line 14:       t = 4,  t_0 = 00, t_1 = 01
i = 1, d_1 = 1:
  loop line 6:   j = 1:  R = 2^ω R + s_1 t mod n = 0 + 1 × 4 mod 15 = 4
                 j = 0:  R = 2^ω R + s_0 t mod n = 2^2 × 4 + 1 × 4 mod 15 = 5
  line 9:        s = 5
where the green .s1 is the faulty .s1 . The final result is changed, and the attacker can
conclude .d0 = 0.
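The two scenarios of Examples 5.3.7 and 5.3.8 can be simulated with a register-level sketch (illustrative names; the fault model assumes the word register s_1 is corrupted right after it is read):

```python
# Sketch: safe error attack. s is held in kappa = 2 word registers; a fault
# in word s_1 changes the output iff s_1 is still needed, revealing bit d_0.
n, m, omega, kappa = 15, 2, 2, 2

def words(x):
    return [(x >> (omega * j)) & ((1 << omega) - 1) for j in range(kappa)]

def blakley_mul(a_words, b, n):
    # Blakley's word-serial modular multiplication (Algorithm 3.11)
    R = 0
    for j in reversed(range(kappa)):
        R = ((R << omega) + a_words[j] * b) % n
    return R

def sign(d, inject=False):
    s_regs, t = words(1), m                   # Algorithm 5.10, l_d = 2
    for i in range(2):
        if (d >> i) & 1:
            R = 0
            for j in reversed(range(kappa)):
                R = ((R << omega) + s_regs[j] * t) % n
                if inject and i == 0 and j == 1:
                    s_regs[1] = 0b01          # fault after s_1 was read: harmless
            s_regs = words(R)                 # line 9 overwrites the faulty word
        elif inject and i == 0:
            s_regs[1] = 0b01                  # d_0 = 0: faulty s_1 survives
        t = blakley_mul(words(t), t, n)
    return s_regs[0] + (s_regs[1] << omega)

assert sign(3) == sign(3, inject=True) == 8       # d_0 = 1: output unchanged
assert sign(2) == 4 and sign(2, inject=True) == 5  # d_0 = 0: output changed
```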
In this section, we will discuss a few countermeasures for the attacks presented in
Sect. 5.3. We keep the same notations as before. p and q are two distinct odd primes,
and .n = pq. .d ∈ Z*_ϕ(n) is the private key for RSA signatures and .e = d^{−1} mod ϕ(n).
n has bit length .𝓁_n. d has bit length .𝓁_d with the following binary representation (see
Theorem 1.1.1):

.d = Σ_{i=0}^{𝓁_d − 1} d_i 2^i.
Recall that the signature is computed as

.s = s_p y_q q + s_q y_p p mod n

or by Garner's algorithm

.s = s_p + ((s_q − s_p) y_p mod q) p,

where

.s_p = m^{d mod (p−1)} mod p,  s_q = m^{d mod (q−1)} mod q.

For Shamir's countermeasure, we choose a small random prime r and compute

.s_p^∗ = m^{d mod (p−1)(r−1)} mod pr,  s_q^∗ = m^{d mod (q−1)(r−1)} mod qr.   (5.28)

Then we check if

.s_p^∗ ≡ s_q^∗ mod r.   (5.29)
By Eq. 5.28,

.s_p^∗ ≡ m^{d mod (p−1)(r−1)} mod p.

Let

.d = a + b(p − 1)(r − 1)

for some integers a and b, where .a = d mod (p − 1)(r − 1). By Corollary 1.4.3,

.s_p^∗ ≡ m^{d mod (p−1)} mod p.   (5.31)

Hence,

.s_p^∗ ≡ m^{d mod (p−1)} ≡ s_p mod p.

Similarly,

.s_q^∗ ≡ m^{d mod (q−1)} ≡ s_q mod q.
Moreover,

.s_p^∗ ≡ m^{d mod (r−1)} mod r,  s_q^∗ ≡ m^{d mod (r−1)} mod r,

which gives

.s_p^∗ ≡ s_q^∗ mod r.
Suppose the Bellcore attack is to be carried out, and a malicious fault is injected
during the computation (Eq. 5.28) of .sp∗ or .sq∗ , but not both. Without loss of
generality, let us assume .sp∗ is faulty and .sq∗ is computed correctly. Let .sp∗' denote
the faulty .s_p^∗. The fault will be detected if

.s_p^{∗'} ≢ s_q^∗ mod r,

and it will stay undetected if

.s_p^{∗'} ≡ s_q^∗ mod r.   (5.32)
If we assume the fault is injected so that the resulting value of .sp∗' is random and
follows a uniform distribution in .Zpr , then the probability for .sp∗' to satisfy Eq. 5.32
is .1/r . Thus, with Shamir’s countermeasure, the Bellcore attack will be successful
with probability .1/r . When the bit length of r is around 32 bits, this probability is
about .2−32 .
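Shamir's countermeasure can be sketched with the toy numbers of Example 5.4.1 (illustrative only; in practice r is a random prime of around 32 bits):

```python
# Sketch of Shamir's countermeasure with p = 5, q = 7, r = 3, d = 5, m = 6.
p, q, r, d, m = 5, 7, 3, 5, 6
n = p * q
y_p, y_q = pow(p, -1, q), pow(q, -1, p)

s_p_star = pow(m, d % ((p - 1) * (r - 1)), p * r)   # Eq. 5.28
s_q_star = pow(m, d % ((q - 1) * (r - 1)), q * r)

assert s_p_star % r == s_q_star % r                 # Eq. 5.29: check passes
s = (s_p_star * y_q * q + s_q_star * y_p * p) % n
assert s == pow(m, d, n) == 6                       # correct signature

# A faulty s_p* slips through only if it happens to match s_q* mod r:
assert 9 % r == s_q_star % r                        # s_p*' = 9 goes undetected
```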
Example 5.4.1 Let us compute the signature from Example 5.3.1 with Shamir's
countermeasure. We have

.p = 5, q = 7, n = 35, d = 5, m = 6.

Choose .r = 3. By Eq. 5.28,

.s_p^∗ = 6^{5 mod 8} mod 15 = 6,  s_q^∗ = 6^{5 mod 12} mod 21 = 6,

and the check of Eq. 5.29 passes:

.s_p^∗ ≡ s_q^∗ ≡ 0 mod 3.

We have shown in Example 3.5.8 that .y_q = 3 and .y_p = 3. By Eq. 5.30, the signature is
given by

.s = s_p^∗ y_q q + s_q^∗ y_p p mod n = 6 × 3 × 7 + 6 × 3 × 5 mod 35 = 6.
Now suppose the computation of .s_p^∗ is faulty. The fault will be detected if

.s_p^{∗'} ≢ s_q^∗ mod r.

However, suppose the faulty value is .s_p^{∗'} = 9. Since

.s_p^{∗'} ≡ s_q^∗ ≡ 0 mod 3,

the fault goes undetected, and the faulty signature is

.s' = s_p^{∗'} y_q q + s_q^∗ y_p p mod n = 9 × 3 × 7 + 6 × 3 × 5 mod 35 = 34.

In this case, the attacker can repeat the Bellcore attack by computing

.q = gcd(s' − s, n) = gcd(34 − 6, 35) = 7.
Example 5.4.2 Let us compute the signature from Example 5.3.2 with Shamir's
countermeasure. We have

.p = 11, q = 13, n = 143, d = 11, m = 2.

Choose .r = 5. By Eq. 5.28,

.s_p^∗ = 2^{11 mod 40} mod 55 = 13,  s_q^∗ = 2^{11 mod 48} mod 65 = 33,

and the check of Eq. 5.29 passes:

.s_p^∗ ≡ s_q^∗ ≡ 3 mod 5.

We have shown in Example 5.3.2 that .y_q = 6 and .y_p = 6. By Eq. 5.30, the signature is
given by

.s = s_p^∗ y_q q + s_q^∗ y_p p mod n = 13 × 6 × 13 + 33 × 6 × 11 mod 143 = 46.
Now suppose the computation of .s_p^∗ is faulty. The fault will be detected if

.s_p^{∗'} ≢ s_q^∗ mod r.

However, suppose the faulty value is .s_p^{∗'} = 43. Since

.s_p^{∗'} ≡ s_q^∗ ≡ 3 mod 5,

the fault goes undetected, and the faulty signature is

.s' = s_p^{∗'} y_q q + s_q^∗ y_p p mod n = 43 × 6 × 13 + 33 × 6 × 11 mod 143 = 98.

In this case, the attacker can repeat the Bellcore attack by computing

.q = gcd(s' − s, n) = gcd(98 − 46, 143) = 13.
Recall that
We select a random integer r such that .gcd(d_r, ϕ(n)) = 1 and .e_r is a small integer,
where

.d_r = d − r

and

.e_r = d_r^{−1} mod ϕ(n).

Let

.k_p = ⌊m/p⌋,  k_q = ⌊m/q⌋.
In Lemma 5.4.1, we will show that the signature s computed above is indeed
equal to the signature given by .m^d mod n.

Lemma 5.4.1

.s ≡ m^d mod n.   (5.40)
Proof By Corollary 1.4.3,

.m^{d_r e_r} mod p = m mod p.

Furthermore,

.⌊m/p⌋ p = m − (m mod p).

Hence,

.(m^{d_r e_r} mod p) + ⌊m/p⌋ p = m.   (5.41)

Taking Eq. 5.41 modulo q, by Eq. 5.35,

.m̂ = m mod q.

By Eq. 5.36, .s_q ≡ m^{d_r} mod q. Together with Eq. 5.34, it follows from the Chinese
Remainder Theorem (see Theorem 1.4.7 and Example 1.4.19) that .s_{d_r} ≡ m^{d_r} mod n,
and similarly .m̃ = m, so

.s = s_{d_r} m̃^r mod n = m^{d_r} m^r mod n = m^{d−r+r} = m^d mod n.   ⨆⨅
Next, we show that the Bellcore attack cannot succeed if s is calculated using
Eqs. 5.34–5.39.

Proposition 5.4.1 Suppose .p < q. If .s_p is faulty, then .s_q is also faulty.

Proof Let .s_p' denote the faulty value of .s_p, then .s_p ≠ s_p'. By Corollary 1.4.4,

.s_p^{e_r} ≢ s_p'^{e_r} mod p.

Since .p < q,

.(s_p^{e_r} mod p) mod q ≠ (s_p'^{e_r} mod p) mod q.

By Eq. 5.35, .m̂ is faulty. Thus .s_q is also faulty by Eq. 5.36.   ⨆⨅
Lemma 5.4.2 Suppose .p > q. The cardinality of the set

.{(a, b) | a, b ∈ Z_p, a ≠ b, a ≡ b mod q}

is given by

.E := 2(p mod q)⌊p/q⌋ + q⌊p/q⌋(⌊p/q⌋ − 1).

Proof Take any .a ∈ Z_p and first suppose

.0 ≤ a mod q ≤ (p mod q) − 1.

In this case, there are .⌊p/q⌋ values of .b ∈ Z_p, .b ≠ a, such that .b ≡ a mod q, and
there are .(p mod q)(⌊p/q⌋ + 1) such values of a. There are

.p − (p mod q)(⌊p/q⌋ + 1) = (q − p mod q)⌊p/q⌋

values of .a ∈ Z_p such that

.p mod q ≤ a mod q ≤ q − 1.

In this case, there are .⌊p/q⌋ − 1 values of .b ∈ Z_p, .b ≠ a, such that .b ≡ a mod q.
We have

.E = (p mod q)(⌊p/q⌋ + 1)⌊p/q⌋ + (q − p mod q)⌊p/q⌋(⌊p/q⌋ − 1)
   = (p mod q)⌊p/q⌋^2 + (p mod q)⌊p/q⌋ + q⌊p/q⌋(⌊p/q⌋ − 1) − (p mod q)⌊p/q⌋^2 + (p mod q)⌊p/q⌋
   = 2(p mod q)⌊p/q⌋ + q⌊p/q⌋(⌊p/q⌋ − 1).   ⨆⨅
Those values of a are given by .{0, 1, 5, 6}. In this case, there are

.⌊p/q⌋ = ⌊7/5⌋ = 1

many .b ∈ Z_7 such that .b ≡ a mod q. In particular, all possible values of .(a, b) are
given by

.(0, 5), (5, 0), (1, 6), (6, 1).

There are

.(q − p mod q)⌊p/q⌋ = (5 − 2) × 1 = 3

values of a such that .p mod q ≤ a mod q ≤ q − 1. The values of a are given by
.{2, 3, 4}. In this case, there are

.⌊p/q⌋ − 1 = 0

many .b ∈ Z_7 such that .b ≡ a mod q. For example, there is no other number except
for 2 in .Z_7 that is congruent to .2 mod 5.

Thus the total number of pairs .(a, b) is 4. We can check that

.E = 2(p mod q)⌊p/q⌋ + q⌊p/q⌋(⌊p/q⌋ − 1) = 2 × 2 × 1 + 5 × 1 × 0 = 4.
Proposition 5.4.2 Suppose .p > q. If .s_p is faulty, the probability for .s_q to be also
faulty is

.1 − E/(p(p − 1)).
Proof Let .s_p' denote the faulty value of .s_p, then .s_p ≠ s_p'. By Corollary 1.4.4,

.s_p^{e_r} ≢ s_p'^{e_r} mod p.

There are .p(p − 1) distinct pairs .(s_p, s_p'). By Eqs. 5.35 and 5.36 and Lemma 5.4.2,
there are E possible pairs .(s_p, s_p') that produce the same .m̂, hence the same .s_q.
Thus the probability for .s_q to be faulty is

.1 − E/(p(p − 1)).   ⨆⨅
We note that, in practice, p is large, and p and q are of similar bit lengths. Then E
will be small compared to .p(p − 1).
Example 5.4.4 Let .p = 421, .q = 419, then

.E = 2(p mod q)⌊p/q⌋ + q⌊p/q⌋(⌊p/q⌋ − 1) = 2 × 2 × 1 + 419 × 1 × 0 = 4

and

.1 − E/(p(p − 1)) = 1 − 4/(421 × 420) ≈ 0.99998.
Proposition 5.4.3 If .s_q is faulty and .s_p is computed correctly, the attacker cannot
compute .q = gcd(s_{d_r}'^{e_r} − m, n) without brute force.

Proof Suppose .s_q is faulty, and .s_p is computed correctly. Let .s_{d_r}', .m̃', .s_q', and .s'
denote the faulty values of .s_{d_r}, .m̃, .s_q, and s, respectively.
To carry out the Bellcore attack, the attacker needs to compute

.q = gcd(s_{d_r}'^{e_r} − m, n).

However, the attacker does not have the knowledge of .s_{d_r}'. Instead, we can assume
that the attacker knows .s'. To get .s_{d_r}', the attacker needs to compute .m̃'^r.
We note that there are .q − 1 possible values for .s_q'. By Corollary 1.4.4 and Eq. 5.38,
there are .q − 1 possible values for .m̃'. And by Corollary 1.4.6, there are .q − 1 possible
values for .m̃'^r mod n. Thus the attacker cannot tell which value in .Z_n is .m̃'^r mod n
even with the knowledge of m and r because of the unknown .m̃'. In conclusion, the
attacker needs to brute force all possible values for .m̃'^r mod n in .Z_n.   ⨆⨅
In summary, the Bellcore attack assumes one of .sp and .sq is faulty, but not both. For
the infective countermeasure, we have shown:
• When .p < q , if .sp is faulty, .sq will also be faulty.
• When .p > q , if .sp is faulty, then .sq has a high probability to be faulty.
420 5 Fault Attacks and Countermeasures
• If .sq is faulty and .sp is not faulty, the attacker cannot repeat the attack without
brute force.
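The Bellcore attack that the countermeasure defends against can be replayed with toy CRT-RSA parameters; the sketch below uses illustrative values, not the book's examples:

```python
from math import gcd

# Toy CRT-RSA signature (illustrative parameters)
p, q = 11, 13
n, m = p * q, 2
e, d = 7, 103                # e*d = 721 ≡ 1 (mod φ(n) = 120)
yp = pow(p, -1, q)           # yp * p ≡ 1 (mod q)
yq = pow(q, -1, p)           # yq * q ≡ 1 (mod p)

sp, sq = pow(m, d, p), pow(m, d, q)
s = (sp * yq * q + sq * yp * p) % n   # CRT recombination
assert pow(s, e, n) == m              # the correct signature verifies

# Fault: corrupt sp only; sq stays correct
sp_faulty = sp ^ 1
s_faulty = (sp_faulty * yq * q + sq * yp * p) % n
# s_faulty^e ≡ m (mod q) still holds, but not (mod p), so the gcd leaks q
assert gcd(pow(s_faulty, e, n) - m, n) == q
```

If both halves are faulty, as the infective countermeasure ensures, the gcd is 1 with overwhelming probability and no factor leaks.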
Example 5.4.5 Let .p = 3, .q = 5, and .m = 3. Then, .n = 15, .ϕ(n) = 8. As discussed
in Example 3.5.5, .yp = 2, .yq = 2. Suppose .d = 5.
To compute the signature with the infective countermeasure, choose .r = 2, and we
have
.s = sdr m~^r mod n = 12 × 3^2 mod 15 = 108 mod 15 = 3.
Now, in the presence of a fault, we have
.m̂ = ((sp'^er mod p) + kp p) mod q = (1 + 3) mod 5 = 4,
.sdr' = (sp' yq q + sq' yp p) mod n = (0 + 1 × 2 × 3) mod 15 = 6.
We note that
.p = gcd(sdr'^er − m, n) = gcd(6^3 − 3, 15) = gcd(213, 15). (5.42)
However, the attacker does not have the knowledge of .m~'^r to get the value of .sdr'
from .s' . Thus from their point of view, any value in .Zn = Z15 might be .m~'^r . And
they cannot compute p as in Eq. 5.42.
Example 5.4.6 Let us compute the signature from Example 5.3.2 with the infective
countermeasure. Choose .r = 4; then
.dr = d − r = 11 − 4 = 7.
We also have
.kp = ⌊m/p⌋ = ⌊2/11⌋ = 0, kq = ⌊m/q⌋ = ⌊2/13⌋ = 0.
And
.m̂ = ((sp'^er mod p) + kp p) mod q = (2^103 mod 11 + 0) mod 13 = 8 mod 13 = 8,
.m̂' = ((sq'^er mod q) + kq q) mod q = (2^103 mod 13 + 0) mod 13 = 11 mod 13 = 11.
By Proposition 5.4.2, the probability for .sp to be faulty and .sq to be computed
correctly is given by
.E/(p(p − 1)) = 4/(13 × (13 − 1)) = 1/39 ≈ 0.0256.
.s = z mod n.
The fault goes undetected only if
.z' ≡ y' mod r, (5.43)
where .z' and .y' denote the values of z and y when a fault is present during the
computation. If we assume the fault is random, then the probability of achieving
Eq. 5.43 can be approximated by the probability that two random numbers are
congruent modulo r, which is .1/r. If r is an integer of bit length 20, the probability
is less than .10^−6 .
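The checked computation described above can be sketched as follows (toy parameters; the variable roles follow the z/y construction in the text):

```python
# z is computed modulo n*r, y modulo r; z ≡ y (mod r) is verified
# before s = z mod n is released. Parameters are illustrative.
n, d, m, r = 15, 3, 2, 5

z = pow(m, d, n * r)        # 2^3 mod 75 = 8
y = pow(m, d, r)            # 2^3 mod 5  = 3
assert z % r == y           # no fault: the check passes
s = z % n
assert s == pow(m, d, n)    # the released signature is correct

# A fault on z passes the check only with probability ≈ 1/r
z_faulty = z ^ 4            # 12; 12 mod 5 = 2 ≠ 3, so the fault is detected
assert z_faulty % r != y
```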
Example 5.4.8 Let us consider the computation from Example 5.3.3. We have
.p = 3, q = 5, n = 15, d = 3 = (d1 d0 )2 = (11)2 , m = 2,
.z ≡ y ≡ 2 mod r,
and
.s = z mod n = 8 mod 15 = 8.
Now if there is a bit flip on the least significant bit of d, .d0 , resulting in .d' = 2, then
we have
.y' ≢ z' mod r.
On the other hand, if the bit flip is on .d1 and we get .d' = 1, then we have
.y' ≡ z' ≡ 2 mod r
and
.s' = z' mod n = 2 mod 15 = 2.
In this case, the attack described in Sect. 5.3.2 can be repeated. In particular, the
attacker computes
.s'/s = 2/8 mod 15 = 2^(−2) mod 15, and since .m^(2^i) = 2^2 mod 15 for .i = 1, we
have .s'/s ≡ m^(−2^i) mod n.
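The bit-flip localization in this example can be replayed programmatically; a sketch of the attacker's test .s' · m^(2^i) ≡ s mod n:

```python
n, m, d = 15, 2, 3              # d = (11)_2, as in Example 5.4.8
s = pow(m, d, n)                # correct signature: 8

s_faulty = pow(m, d ^ 0b10, n)  # bit flip on d_1: d' = 1, s' = 2

# Bit i of d was 1 and got flipped to 0 iff s' * m^(2^i) ≡ s (mod n)
recovered = [i for i in range(d.bit_length())
             if (s_faulty * pow(m, 2 ** i, n)) % n == s]
assert recovered == [1]         # the flipped exponent bit is identified
```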
We note that a simple countermeasure exists for the safe error attack presented in
Sect. 5.3.4.
We first consider protecting the simple algorithm in Algorithm 5.9. Recall that
.a, b ∈ Zn , .𝓁n is the bit length of n, and the bit lengths of a, b are at most .𝓁n . Let
.κ = ⌈𝓁n /ω⌉,
where .ω is the word size of the computer. We can store a in .κ registers, each
containing one .ai , and
.a = Σ_{i=0}^{κ−1} ai (2^ω)^i .
Similarly,
.b = Σ_{i=0}^{κ−1} bi (2^ω)^i ,
where each .bi is stored in one register. Then we can modify Algorithm 5.9 to
Algorithm 5.11. Suppose .c = 1 and the fault is in .bi0 when .i < i0 , for some .i0
that satisfies .0 ≤ i0 ≤ κ − 1. Since .bi0 is used before the fault happens, the final
result will not be affected. Suppose .c = 0, then a fault in .bi0 at any time will not
change the final output either. If a fault is injected in a , the output will be faulty no
matter what value c takes. Thus, Algorithm 5.11 is not vulnerable to the safe error
attack discussed in Sect. 5.3.4.
Algorithm 5.11: Modified Algorithm 5.9 to counter the safe error attack
Input: n, a, b, c// n ∈ Z, n ≥ 2 has bit length 𝓁n ; a, b ∈ Zn ; c = 0, 1
Output: ab mod n if c = 1 and a otherwise
1 if c = 1 then
2 R=0
// κ = ⌈𝓁n /ω⌉, where ω is the computer’s word size
3 for i = κ − 1, i ≥ 0, i − − do
4 R = 2^ω R + bi a
5 R = R mod n
6 a=R
7 return a
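Algorithm 5.11 can be sketched in Python (the word size and function name are our choices):

```python
def mul_or_copy(n, a, b, c, omega=8):
    """Return a*b mod n if c == 1, else a (Algorithm 5.11 sketch).
    b is split into kappa words of omega bits, and every word b_i enters
    the computation, so a fault in any b_i shows up in the output."""
    if c == 1:
        kappa = -(-n.bit_length() // omega)    # ceil(l_n / omega)
        mask = (1 << omega) - 1
        b_words = [(b >> (omega * i)) & mask for i in range(kappa)]
        R = 0
        for i in range(kappa - 1, -1, -1):     # most significant word first
            R = (R << omega) + b_words[i] * a  # R = 2^omega * R + b_i * a
            R %= n
        a = R
    return a

assert mul_or_copy(143, 63, 9, 1) == (63 * 9) % 143
assert mul_or_copy(143, 63, 9, 0) == 63
```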
Similarly, we can change the corresponding multiplication step to
.R = 2^ω R + R1i R0 .
We get Algorithm 5.12. In this case, suppose .dj = 0, and a fault is injected in the
variable .R0i0 , during the j th iteration of the outer loop and at the time .i < i0 in
the loop starting from line 6, where .0 ≤ i0 ≤ κ − 1. The final output will be faulty
because the faulty .R0i0 will be used in line 12. If a fault is injected in .R0i0 in the loop
starting from line 11, since the value in the whole variable .R0 is used in line 12, the
fault will propagate to the output.
On the other hand, if .dj = 1 and a fault is injected in .R0i0 during the j th iteration
of the outer loop, specifically at the time .i < i0 in the loop starting from line 17,
where .0 ≤ i0 ≤ κ − 1, the final output will also be faulty because the faulty .R0i0 will
be used in line 18. If the fault is injected in the j th iteration of the outer loop in .R0i0
in the loop starting from line 22, the fault will stay till the next iteration of the outer
loop and affect the output.
If a fault is injected in .R1i0 for some .i0 , by a similar argument, the signature will
always be faulty whether .dj = 0 or .dj = 1.
Thus Algorithm 5.12 is resistant to the safe error attack discussed in Sect. 5.3.4.1.
5.5 Further Reading 427
In the same manner, to protect Algorithm 5.10 against the safe error attack, we
can just change line 7 to
.R = 2^ω R + tj s.
We get Algorithm 5.13. If .di = 1, a fault during the ith iteration in .sj0 (.0 ≤ j0 ≤
κ − 1) will affect the result since the faulty .sj0 will be used in line 7. If .di = 0, s is
not used in the ith iteration, but the faulty .sj0 will be used in the next iteration and
affect the final output. On the other hand, a fault in t will always propagate to the
output since the faulty value will be used in line 12. Thus Algorithm 5.13 is resistant
to the safe error attack discussed in Sect. 5.3.4.2.
Differential fault analysis We have seen the diagonal DFA attack on AES in
Sect. 5.1.1.2. Tunstall et al. [TMA11] demonstrated that using this attack and
by exploiting the relation between .K10 (the last round key) and .K9 (the second
last round key), the key guesses for .K10 can be further reduced to .212 . Piret and
Quisquater [PQ03] discussed another DFA attack on AES that injects a fault into
the input of MixColumns in round 9. Phan et al. [PY06] proposed to combine
cryptanalysis techniques with DFA to recover the secret key of AES.
DFA attacks on PRESENT implementations can be found in, e.g., [BEG13,
WW10, BH15]. A generalization of DFA to SPN ciphers is given in [KHN+ 19].
Persistent fault analysis (PFA) We discussed the PFA attack on AES in Sect. 5.1.3.
PFA can also be applied to other block ciphers, e.g., PRESENT [ZZJ+ 20] and Feistel
ciphers [CB19]. In [ZZY+ 19], the authors demonstrated a practical fault injection
in the AES Sbox lookup table. In 2020, Xu et al. [XZY+ 20] discussed PFA attacks
in earlier rounds of AES and other SPN ciphers. Notably, AI has also been adopted
for PFA to recover the key for AES [COZZ23].
Other fault attack methodologies on symmetric block ciphers There are many
other fault attack methods. Here we give more information on a few of them.
Ineffective fault analysis (IFA) was first introduced in [Cla07], where the faults
that do not change the intermediate values are exploited. Those faults are called
ineffective faults. Normally a particular fault model is assumed, e.g., a stuck-at-0
fault model. We note that IFA is dependent on the effect a fault has on the corrupted
data. In comparison, the safe error attack (Sect. 5.3.4) does not require a specific
fault model: an intermediate value is changed, and the attacker exploits the knowledge
of whether the faulty value is used or not.
Statistical ineffective fault attack (SIFA) [DEK+ 18] combines both SFA
(Sect. 5.1.2) and IFA. A nonuniform fault model is assumed, and the attack exploits
ineffective faults. More precisely, the dependency between the fault induction being
ineffective and the data that is processed is exploited. Different from SFA, SIFA
does not require each fault to be successful, but the attack requires repeated plaintext
and knowledge of the correct ciphertext (or whether each ciphertext is correct or
not). The fault injection is the same as described in Sect. 5.1.2.2. After the attacker
obtains a set of ciphertexts, they filter out the correct ones. With each hypothesis of
4 bytes of K10 , the attacker can compute a hypothesis of the original byte value s00 .
Then, statistical methods, such as maximum likelihood as discussed in Sect. 5.1.2.1,
can be applied to find the correct key hypothesis. In [DEK+ 18], the authors provide
a detailed theoretical analysis of the number of ciphertexts needed and extensive
experimental results.
Collision fault analysis [BK06] injects a fault in the early rounds of a block cipher
implementation. Then the attacker records the faulty ciphertext and finds plaintext
that produces the same ciphertext, but without fault. Further analysis using those
plaintexts can recover the secret key. If the fault only changes 1 bit or 1 byte of the
intermediate value, the attacker can try different plaintexts that only differ at 1 bit
or 1 byte.
Algebraic fault analysis (AFA) [CJW10] is similar to DFA. It also exploits
differences between correct and faulty ciphertexts. But while DFA relies on manual
analysis, AFA encodes the cipher and the injected fault as a system of equations and
uses an automated solver, e.g., a SAT solver,1 to recover the key.
1 A SAT solver solves Boolean satisfiability problems. It takes a Boolean logic formula and checks
whether there exists an assignment to its variables that makes the formula true.
There are also other proposals for different code designs. For example, in
[KKT04], the authors proposed a special type of code for hardware countermeasures.
The code is defined by
.C = {(x, w) | x ∈ F2^k , w = (P x)^3 ∈ F2^r },
and it satisfies
.max_{e≠0} |C ∩ (C + e)| = R,
where
.C + e = {c + e | c ∈ C}, e ∈ F2^n .
In [GGP09], the authors proposed to use a digest value for the cipher state
and update it after each operation. The fault can be detected through the digest
values. See also [MSY06] for a comparative study on a few detection-based
countermeasures for symmetric block ciphers.
Infective countermeasure The infective countermeasure was first introduced for RSA
[SMKLM02] (see Sect. 5.4.2). It was then adopted for symmetric block ciphers in
2012 [GST12], where the authors discussed the implementation for both SPN and
Feistel ciphers. In 2014, Tupsamudre et al. broke this countermeasure for AES and
proposed an improved version [TBM14] for AES implementations.
As we have seen in Sect. 5.2.2, the infective countermeasure returns a ciphertext
in such a way that an attacker who does not know the correct ciphertext cannot tell
whether the fault injection was successful. However, as shown in [DEK+ 18],
for SIFA, even though the attacker does not know whether the ineffective fault
occurred in the target AES round or anywhere else, they can precalculate the
probability of faulting the target round and analyze the obtained ciphertext utilizing
this probability.
Generally applicable fault countermeasures Besides the fault attack countermeasures
introduced in this chapter, there are many other techniques.
Similarly to SCA countermeasures (Sect. 4.6), we can divide them according to the
levels of protection.
Protocol-level countermeasures involve designing the usage of cryptographic
primitives in a way that renders certain fault attacks impossible, e.g.,
rekeying [MSGR10] or the tweak-in-plaintext strategy [BBB+ 18].
Cryptographic primitive level approaches provide some sort of fault protection
directly in the cipher design [BLMR19, BBB+ 21]. The main advantage is to
unburden the implementer from the need to apply additional countermeasures.
However, at this point, the fault models covered directly in the design are limited.
In the first part of this section, we will explain how information leakage is created
by the operation of integrated circuits. In the second part, we will detail the main
components of a measurement setup—oscilloscopes and probes.
1 While we use the term “fault attacks” throughout the book, one can also find the term “fault
injection attacks” in the literature.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 433
X. Hou, J. Breier, Cryptography and Embedded Systems Security,
https://doi.org/10.1007/978-3-031-62205-2_6
434 6 Practical Aspects of Physical Attacks
Fig. 6.2 Switching of the CMOS circuit, showing (a) the charging path from .VDD to .CL and (b)
the discharging path .CL to GN D of the capacitive load
to noise and decrease the static power dissipation, compared to implementing each
of these types separately [Cal75]. A circuit consisting of NMOS and PMOS
transistors is called a complementary metal oxide semiconductor (CMOS).
The side-channel leakage that comes in the form of electromagnetic leakage or
power consumption originates from the physical characteristics of data processing
by CMOS-based circuits. Based on these characteristics, leakage models were
developed to recover the processed information (see Sect. 4.2.1, part Leakage
Models). There are two types of power dissipation in CMOS gates: static and
dynamic (see Fig. 6.1). Static power is consumed even if there is no circuit activity.
It is primarily caused by leakage currents that flow when the transistor is in the off
state. While this type of power dissipation is not as investigated in the world of
SCA as the dynamic one, some works focus on exploiting it [MMR19]. Dynamic
power dissipation comes in two forms: short-circuit currents (a short time during
the switching of a gate when PMOS and NMOS are conducting simultaneously)
and switching power consumption (charge and discharge of the load capacitance).
When considering side channels, the switching power is the most relevant as
it directly correlates the processed data with the observable changes in power
consumption [Sta10]. This behavior of the CMOS circuit is depicted in Fig. 6.2.
Generally, the energy delivery to a CMOS is split into two parts—the charging and
the discharging of the load capacitance .CL . During the charging phase, the input
gate signal makes a .1 → 0 switch, resulting in switching the PMOS transistor
6.1 Side-Channel Attacks 435
on and its NMOS counterpart off. As shown in Fig. 6.2a, in this scenario, the load
capacitance .CL is connected to the supply voltage (.VDD ) via the PMOS transistor,
thus allowing the current .I (t) to charge .CL . There are two important equations
defining the transition from the energy point of view [JAB+ 03]:
.Ed = CL VDD^2 ,
.Ec = ∫_0^∞ I (t)V (t) dt = (1/2) CL VDD^2 ,
where .Ed is the delivered energy and .Ec is the energy stored in the .CL . From
these equations, it can be seen that only half of the delivered energy is stored in
the capacitor—the PMOS transistor dissipates the other half. Therefore, this power
loss during the logic transition can be measured and correlated with the switching
activity, resulting in SCA leakage. When the gate signal changes from 0 to 1, the
opposite scenario happens—the PMOS transistor is switched off, and NMOS is
switched on. The energy .Ec stored in .CL is drained to the ground via the NMOS
transistor, as can be seen in Fig. 6.2b, thus causing SCA leakage. For more details
regarding the power consumption of CMOS circuits, we refer the interested reader
to [NYGD22], for example.
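A quick numeric check of the two energy equations, using assumed (typical order-of-magnitude) component values:

```python
# Illustrative values: a 10 fF load switching at a 1.2 V supply
C_L = 10e-15     # load capacitance in farads
V_DD = 1.2       # supply voltage in volts

E_d = C_L * V_DD ** 2        # energy delivered by the supply
E_c = 0.5 * C_L * V_DD ** 2  # energy stored in the load capacitance

# Exactly half the delivered energy is stored; the PMOS dissipates the rest
assert abs(E_d - 2 * E_c) < 1e-25
```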
It is important to note that there is usually more than one switch during one clock
cycle. This is because the input signals to the (multi-input) gate normally do not
arrive at the same time, resulting in several switches before the correct output is
generated. The output transitions before the stable state are called glitches. They are
unnecessary for the correct functioning of the circuit and consume a non-negligible
amount of dynamic power, ranging between .20% and .70% [SM12]. Glitches are
the reason why Boolean masking in hardware, although theoretically secure, can be
broken [MPG05]. An approach called threshold implementation [NRS11] solves
the problem of secure Boolean masking by utilizing multiparty computation and
secret sharing.
The core of the measurement setup for SCA is the oscilloscope. It can either be
connected to the power supply of the DUT for power measurements or can measure
electromagnetic (EM) signals through an EM probe.
6.1.2.1 Oscilloscopes
Fig. 6.3 Digital sampling of a continuous signal with ten samples of (a) low-frequency signal and
(b) high-frequency signal
the measured signal (voltage, in our case) at the specified sampling rate and changes
it into a digital value. The precision of this value is generally between 8 and 12 bits
for midrange oscilloscopes, and the sampling rate ranges from hundreds of mega
samples per second (MS/s) to several giga samples per second (GS/s).
When measuring analog signals, such as voltage, with digital devices, it is
important to note that we are measuring a continuous value with equipment that
samples such a value at periodic intervals (which is why we call it a time sample)
and stores it in a binary format with limited precision. Therefore, discretization is
applied twice—first in the time domain and then to the value itself. According to the
Nyquist–Shannon sampling theorem [Vai01], the sampling rate of the measurement
device should be at least twice the highest frequency component of the measured
signal. It is a good rule of thumb to have the oscilloscope sampling rate at least .4×
the target device frequency when doing measurements for power analysis attacks.
Figure 6.3 shows this phenomenon. The red curve denotes the original analog signal,
while the black lines show a sampling of this signal with ten samples over the given
time interval. While we can easily reconstruct the original signal if the frequency
is low (Fig. 6.3a), it becomes much harder with a high-frequency signal (Fig. 6.3b).
The precision of the oscilloscope specifies how many values the sampled output
can take, e.g., an 8-bit ADC gives a range of 256 values, which is sufficient
for an SCA attack in most cases.
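The aliasing behind the Nyquist–Shannon requirement can be demonstrated directly: two sines whose frequencies differ by a multiple of the sampling rate produce identical samples and cannot be told apart. A sketch with illustrative numbers:

```python
import math

fs = 10.0                  # sampling rate in Hz (illustrative)
f_low, f_high = 2.0, 12.0  # f_high = f_low + fs

samples_low = [math.sin(2 * math.pi * f_low * k / fs) for k in range(10)]
samples_high = [math.sin(2 * math.pi * f_high * k / fs) for k in range(10)]

# The 12 Hz signal aliases onto the 2 Hz one at this sampling rate
assert all(abs(a - b) < 1e-9 for a, b in zip(samples_low, samples_high))
```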
Another important parameter of the oscilloscope is the analog bandwidth. It is
defined as the frequency at which the amplitude measured by the oscilloscope has
reduced by 3 dB. To avoid the unnecessary modification of the measured signal, the
bandwidth should be at least .3× the target device frequency.
An important task during the acquisition is capturing the correct time window
corresponding to the operations we want to measure. In laboratory conditions, it
is common to use an artificial trigger signal that indicates the start/end of the
encryption. In real-world settings, it is necessary to identify the correct position
by examining the captured signal—this is usually done based on the evaluator’s
expertise.
6.2 Fault Attacks 437
6.1.2.2 Probes
Near-field electric and magnetic probes are an essential part of the setup when doing
electromagnetic side-channel analysis. They can be connected to the oscilloscope in
a passive way or with an amplifier. Optionally, a bandpass filter can be used to only
pass the relevant frequencies and discard the rest. Several established companies,
such as Riscure and Langer, provide probes suitable for EM SCA. Due to the
simplicity of the probe design, researchers have also been building their own probes
since the early days of SCA [GMO01]. Generally, a coiled copper wire is sufficient,
with a coil diameter of at most a few hundred microns. More details on designing
near-field probes can be found, for example, in [Siv17].
An interesting aspect of fault attacks is that, unlike with side-channel attacks, the
adversary can break the cryptographic security even without the knowledge of the
underlying algorithm, for example, by skipping the entire encryption routine by
injecting faults in the conditional branches [SWM18].
In this section, we will look into the practical aspects of fault attacks (FAs), such
as sample preparation, fault injection techniques and devices, and mechanisms to
trigger faults in integrated circuits.
In this subsection, we will outline the most popular techniques for FA testing of
integrated circuits [BH22].
Clock and voltage glitching techniques are the most accessible in terms of cost
as they do not need sophisticated equipment. Initially, they were only performed
locally with a device at hand, but with power management techniques such as
dynamic voltage and frequency scaling (DVFS), they can also be performed
remotely on chips that utilize that technology.
In the case of a voltage glitch, the faults are caused by precisely timed variations in
the power supply or by underpowering the device. Power supply variations, or spikes,
modify the state of latches or flip-flops, influencing the control and data path logic
of the circuit [KJP14]. For example, if the voltage spike happens during memory
reading, wrong data may be retrieved. It was also shown that a different shape of the
glitch waveform affects the success of the attack [BFP19]. Underpowering, on the
other hand, affects the algorithm continuously and might cause faults throughout
the computation. Single faults are possible when the insufficient power supply
causes small enough stress so that dysfunctions do not occur immediately after
the computation starts and multiple faults do not happen [SGD08]. When the
attacker can physically access the target device, voltage glitching is generally easy
to implement. It is also the most inexpensive fault injection method as the necessary
equipment is wires for connecting to the device and a power source. A local voltage
glitch on a smart card is depicted in Fig. 6.4. Voltage glitching attacks were shown to
be effective even against security enclaves of Intel [CVM+ 21] and AMD [BJKS21].
An inexpensive Teensy 4.0 board (.≈30 USD) was used for the abovementioned
attacks, making them highly practical in terms of equipment cost.
Clock glitch is another technique that can be performed with low-cost equipment.
For digital computing devices, it is necessary to synchronize the calculations with
either an internal or an external clock. If the clock signal changes, the resulting com-
putation might have a wrong instruction executed or data corrupted. Devices that
require an external clock generator can be faulted by supplying a bad clock signal—
containing fewer pulses than the normal one [KSV13]. On the other hand, devices
that are configured to use an internal clock signal cannot be easily faulted. Clock
glitches are generally considered the simplest fault injection method as the attack
devices are easy to operate. For example, clock glitches can be achieved by
using low-end field-programmable gate array (FPGA) boards [BGV11, ESH+ 11].
A relatively new direction in clock/voltage glitching is remote attacks that
take advantage of power management systems of modern processors. The security
aspects of these systems are rarely considered due to the complexity of the devices,
both in hardware and in the software they execute, as well as cost and time-to-market
constraints [PS19]. CLKSCREW is the first attack in this direction, targeting
frequency and voltage manipulation of the Nexus 6 phone, forcing the processor
to operate beyond recommended limits [TSS17]. The researchers experimentally
injected a one-byte random fault. CLKSCREW can be achieved just by utilizing
software control of energy management hardware regulators in the target devices.
Similar attacks were also proposed for ARM-based Krait processor [QWLQ19]
and Intel SGX [QWL+ 20]. The main advantage of these attacks is that they are
software-based, therefore allowing the threat model to shift from local to remote.
Fig. 6.5 Depiction of (a) laser fault injection on an AVR microcontroller mounted on Arduino
UNO board and (b) zoomed infrared image of the chip
focused ion beam (FIB), but these techniques are generally outside of the budget of a
standard testing laboratory, so in this part, we will focus on the two abovementioned
methods.
Mechanical techniques are relatively straightforward. They are mostly used for
backside decapsulation as the front side of the chip is too sensitive to any physical
tampering. They can, for example, involve using inexpensive manual rotary milling
machines that grind down the epoxy package. This is recommended mostly for low-
cost chips as there is a high risk of overheating or mechanically damaging the die.
Another way is to use specialized tools for decapsulation, thinning, and polishing
(e.g., Ultra Tec ASAP-1). These tools work in an automated way by slowly milling
down the package layers to avoid any damage. Naturally, the main drawback is the
cost, which typically ranges in the tens of thousands of dollars.
Chemical techniques are recommended when the front side of the chip needs
to be accessed. In some cases, such as smart cards, acetone is enough to remove
the protective plastic (after the outer hard plastic case is removed, e.g., by using a
scalpel). When removing the black epoxy package, one might need to use strong
acids, such as fuming nitric acid (.HNO3 with a concentration of at least .86%). This
typically involves operation in a safe laboratory environment equipped with a fume
hood and a proper acid disposal facility. A depiction of such a setup is shown in
Fig. 6.6. When using such aggressive acids, there is also a risk of removing the
bonding wires, unless they are either gold or at least gold-plated. More details
on decapsulation techniques can be found in [BC16].
When using optical fault injection techniques, it is important to know the
absorption depth in silicon as a function of wavelength. This is depicted in Fig. 6.7.
The green laser (532 nm) has an absorption depth of .≈ 1.3 μm; therefore, it can be
utilized either for front-side injection (where it can directly access the components)
or for backside injection when the silicon substrate has been almost fully removed.
As the latter might damage the chip, it makes sense to use lasers with a deeper
absorption depth, such as 808 nm or 1064 nm, both from the near-infrared light
spectrum. The 1064 nm laser allows a penetration depth of up to 1 mm, which is
often sufficient even for a non-thinned substrate.
There are other fault injection techniques that are related to optical techniques in
the way they work. In the area of failure analysis, electron and ion beams have been
successfully used to test the reliability of circuits [SA93]. X-ray beams were used
to tamper with memories of a microcontroller [ABC+ 17].
All in all, optical fault injection offers precision and repeatability at a relatively
high cost (considering commercial off-the-shelf setups). With specific expertise, it is
possible to construct a DIY setup for a much lower price. The main drawback of this
technique is the necessity to “see” the chip, which normally requires depackaging
and delayering of the chip, making it often impractical outside of laboratory
environments. As it is a powerful technique, it is a de facto standard for security
testing and certification labs which need to consider strong attacker models.
2 https://www.newae.com/chipshouter
3 We use currencies stated in original papers, which is why some prices are in USD and some in
EUR.
4 https://www.avtechpulse.com/medium/
Rowhammer is a remote fault injection technique that exploits the physical
characteristics of DRAM (dynamic random access memory) technology. The attack
works by aggressively reading/writing memory cells adjacent to the target cell,
which causes bit flips [KDK+ 14]. The attack is made possible by advancing
technology, which allows shrinking the cells and placing them closer to each other.
A smaller cell can hold less charge and therefore provides less tolerance to noise and
greater vulnerability to errors [MDB+ 02]. High cell density further extends this
vulnerability by creating electromagnetic coupling effects between them, producing
unwanted interactions [KKY+ 89]. The Rowhammer access patterns are depicted in
Fig. 6.10. The aggressor row refers to a row that is being hammered by the attacker
to flip the bits in the victim row. According to [JVDVF+ 22], three common patterns
were shown effective in flipping bits. The single-sided pattern uses one aggressor
row next to the victim row and the other one far apart. The double-sided pattern
tightly surrounds the victim row with aggressor rows, increasing the chance of bit
flips. Finally, there is an n-sided pattern where n refers to .n − 1 victim rows being
hammered by n aggressor rows. The figure shows an example for .n = 4.
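The three access patterns can be expressed as simple row-index computations; a sketch (the helper names and the “far away” offset are illustrative):

```python
def single_sided(victim):
    """One aggressor row adjacent to the victim, one far away."""
    return [victim - 1, victim + 8]   # the offset 8 is an arbitrary choice

def double_sided(victim):
    """Aggressor rows tightly surrounding the victim row."""
    return [victim - 1, victim + 1]

def n_sided(first_aggressor, n):
    """n aggressor rows interleaved with n - 1 victim rows."""
    aggressors = [first_aggressor + 2 * k for k in range(n)]
    victims = [first_aggressor + 2 * k + 1 for k in range(n - 1)]
    return aggressors, victims

# The 4-sided example from Fig. 6.10: 4 aggressors hammering 3 victims
aggressors, victims = n_sided(10, 4)
assert aggressors == [10, 12, 14, 16] and victims == [11, 13, 15]
```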
Fig. 6.10 Different ways of spatial arrangement of aggressor rows (black) and target/victim rows
(red/pink) in DRAM. (a) Single-sided. (b) Double-sided. (c) 4-sided
5 https://www.commoncriteriaportal.org
It is related to risk analysis methods and considers the following rating factors:
elapsed time, expertise, knowledge of TOE, access to TOE, used equipment, and
open samples.
• Attack Methods for Smart Cards and Similar Devices [SI20b]: This is a com-
panion document, under limited distribution. It describes the attacks themselves.
The rating method from the listed documents is also adopted in the security
evaluation specified by EMVCo, an organization managed by the major payment
security players (American Express, Discover, JCB, MasterCard, UnionPay, and
Visa). Their aim is to maintain the standardized security level of contact and
contactless payment systems by managing and evolving the security requirements
and related testing processes.
A.1 Matrices
.det(A) = Σ_{j=0}^{n−1} (−1)^{i0 +j} ai0 j det(Ai0 j ) = Σ_{j=0}^{n−1} (−1)^j a0j det(A0j ).
For .n = 2, take .i0 = 0; then
.Σ_{j=0}^{1} (−1)^{i0 +j} ai0 j det(Ai0 j ) = Σ_{j=0}^{1} (−1)^j a0j det(A0j ) = a00 a11 − a01 a10 .
Take .i0 = 1,
.Σ_{j=0}^{1} (−1)^{i0 +j} ai0 j det(Ai0 j ) = Σ_{j=0}^{1} (−1)^{1+j} a1j det(A1j ) = −a10 a01 + a11 a00 .
Since R is a commutative ring, .a00 a11 − a01 a10 = −a10 a01 + a11 a00 , the lemma is
true for .n = 2.
448 A Proofs
Now suppose the lemma is true for .n = k, where .k ≥ 2. In particular, for any
.0 ≤ j < k and .i0 ≠ 0, we have
.det(A0j ) = Σ_{𝓁=0}^{k−1} (−1)^𝓁 a0𝓁 det(A00,j 𝓁 ) = Σ_{𝓁=0}^{k−1} (−1)^{i0 +𝓁} ai0 𝓁 det(A0i0 ,j 𝓁 ), (A.1)
where .A0i,j 𝓁 is obtained from .A0j by deleting the ith row and .𝓁th column. We will
show that the lemma is true for .n = k + 1. Take .i0 = 0; then trivially
.Σ_{j=0}^{k} (−1)^{i0 +j} ai0 j det(Ai0 j ) = Σ_{j=0}^{k} (−1)^j a0j det(A0j ).
For .i0 ≠ 0, by Eq. A.1,
.Σ_{j=0}^{k} (−1)^{i0 +j} ai0 j det(Ai0 j ) = Σ_{j=0}^{k} Σ_{𝓁=0}^{k−1} (−1)^{i0 +j} ai0 j (−1)^𝓁 a0𝓁 det(Ai0 0,j 𝓁 )
= Σ_{j=0}^{k} (−1)^j a0j { Σ_{𝓁=0}^{k−1} (−1)^{i0 +𝓁} ai0 𝓁 det(Ai0 0,j 𝓁 ) }
= Σ_{j=0}^{k} (−1)^j a0j det(A0j ),
as desired.
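The statement being proved — that the cofactor expansion along any row yields the same determinant — can also be checked numerically; a small sketch:

```python
def det(A, row=0):
    """Determinant by cofactor expansion along the given row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # minor: delete the expansion row and column j
        minor = [r[:j] + r[j + 1:] for k, r in enumerate(A) if k != row]
        total += (-1) ** (row + j) * A[row][j] * det(minor)
    return total

A = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
# Expansion along any row gives the same value
assert det(A, 0) == det(A, 1) == det(A, 2) == 25
```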
In this section, we will focus on matrices with coefficients from the field of real
numbers .R. Let .n, m be two positive integers.
Definition A.2.1 For any vector .u = (u0 , u1 , . . . , un−1 ) ∈ Rn , the Euclidean norm
of .u, denoted .||u||2 , is defined to be
.||u||2 = ( Σ_{i=0}^{n−1} ui^2 )^{1/2} .
In other words, the Euclidean norm of .u is the square root of the scalar product (see
Definition 1.3.2) between .u and .uT :
.||u||2 = (u · uT )^{1/2} . (A.2)
Definition A.2.2 For any two vectors $u, v \in \mathbb{R}^n$, the Euclidean distance, denoted
$d(u, v)$, is defined to be the Euclidean norm of the vector $u - v$, i.e., $d(u, v) = \|u - v\|_2$.
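Definitions A.2.1 and A.2.2 translate directly into code. A minimal sketch (the function names are ours, for illustration only):

```python
import math

def euclidean_norm(u):
    """||u||_2 = (sum of u_i^2)^(1/2), as in Definition A.2.1."""
    return math.sqrt(sum(x * x for x in u))

def euclidean_distance(u, v):
    """d(u, v) = ||u - v||_2, as in Definition A.2.2."""
    return euclidean_norm([a - b for a, b in zip(u, v)])

# ||(3, 4)||_2 = 5, and d((1, 1), (4, 5)) = ||(-3, -4)||_2 = 5.
```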
Definition A.2.3 The row rank (resp. column rank) of a matrix A over $\mathbb{R}$ is the
maximum number of rows (resp. columns) in A that constitute a set of linearly
independent vectors.
The following result is very useful for us. For a proof, see, e.g., [Ber09, Section
2.4].
Theorem A.2.1 The column rank of a matrix A is equal to its row rank.
Definition A.2.4 The rank of a matrix A, denoted rank$(A)$, is the row rank of A. An
$n \times m$ matrix A is said to have full column rank if rank$(A) = m$. It is said to have
full row rank if rank$(A) = n$.
For example, the vectors $\{(1, 0, 1), (0, 1, 0), (1, 0, 0)\}$ are independent, so a matrix
having them as its rows has full row rank.
Lemma A.2.1 An .n × m matrix A has full row rank if and only if there does not
exist a nonzero vector .u ∈ Rn such that .uA = 0. A has full column rank if and only
if there does not exist a nonzero vector .u ∈ Rm such that .AuT = 0.
Theorem A.2.2 An .n × n square matrix A is invertible if and only if rank.(A) = n.
Proof We will provide the proof for the necessity. We refer the readers to [Goc11,
Section 3.6] for the proof of the sufficiency.
By Definition 1.3.3, A is invertible if and only if there exists an $n \times n$ matrix B
such that $AB = BA = I_n$, where $I_n$ is the $n$-dimensional identity matrix. Suppose
A is invertible and rank$(A) \neq n$. Then by Lemma A.2.1, there exists a nonzero
vector $u \in \mathbb{R}^n$ such that $uA = 0$. Then we have
$$u = u I_n = u(AB) = (uA)B = 0B = 0,$$
a contradiction. ⊓⊔
Lemma A.2.2 Let M be an $n \times m$ matrix. The matrix $M^T M$ is invertible if and
only if M has full column rank.
Proof Let
$$A = M^T M,$$
an $m \times m$ matrix. Suppose A is not invertible. Then by Theorem A.2.2 and
Lemma A.2.1, there exists a nonzero vector $u \in \mathbb{R}^m$ such that $Au^T = 0$, i.e.,
$$Au^T = M^T M u^T = 0,$$
which gives (see Eq. A.2 and Remark A.2.1)
$$\|M u^T\|_2^2 = u M^T M u^T = 0.$$
Hence $M u^T = 0$ and, by Lemma A.2.1, M does not have full column rank.
Conversely, if M does not have full column rank, there exists a nonzero $u \in \mathbb{R}^m$
with $M u^T = 0$, so $Au^T = M^T (M u^T) = 0$ and A is not invertible. ⊓⊔
Since $M_v$ has $m_v$ columns, by definition, $M_v$ has full column rank. It follows from
Lemma A.2.2 that the matrix $M_v^T M_v$ is invertible.
In particular, with all possible values of $v$ appearing in the rows of $M_v$, we will
have $m_v$ linearly independent rows in $M_v$, given by those with Hamming weight 1.
Then $M_v$ has row rank equal to $m_v$, and $M_v^T M_v$ will be an invertible matrix.
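Lemma A.2.2 can be checked numerically. A small illustrative sketch (helper names are ours) computes the Gram matrix $M^T M$ and ranks with exact rational arithmetic, so no floating-point tolerance is needed:

```python
from fractions import Fraction

def matmul(A, B):
    """Plain matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(M):
    return [list(col) for col in zip(*M)]

def rank(M):
    """Row rank via Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# A 3x2 matrix with full column rank: its Gram matrix M^T M is invertible.
M = [[1, 0], [0, 1], [1, 1]]
G = matmul(transpose(M), M)          # G = [[2, 1], [1, 2]], rank 2
# A rank-deficient matrix: its Gram matrix has rank 1, hence is singular.
M2 = [[1, 1], [2, 2], [3, 3]]
```

By Theorem A.2.2, `rank(G) == 2` certifies that $G$ is invertible, in line with the lemma.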
Appendix B
Long Division
In primary school, we learned to do long division for calculating the quotient and
remainder of dividing one integer by another integer. For example, to compute
$$1346 = 25 \times q + r,$$
we can write

      53
  25 )1346
      125
       96
       75
       21

so $q = 53$ and $r = 21$.
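The schoolbook procedure can be sketched in a few lines (an illustration of ours): the dividend is processed digit by digit while keeping a running remainder, mirroring the written layout above.

```python
def long_division(dividend, divisor):
    """Schoolbook long division, digit by digit, for positive integers."""
    quotient, remainder = 0, 0
    for digit in str(dividend):
        remainder = remainder * 10 + int(digit)
        quotient = quotient * 10 + remainder // divisor
        remainder = remainder % divisor
    return quotient, remainder

# long_division(1346, 25) gives the quotient 53 and remainder 21.
```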
Similarly, we can use long division to divide one polynomial by another. For
example, let
$$f(x) = x^8 + x^4 + x^3 + x + 1 \in \mathbb{F}_2[x]$$
and
$$g(x) = x + 1 \in \mathbb{F}_2[x].$$
We have
              x^7 + x^6 + x^5 + x^4 + x^2 + x
  x + 1 ) x^8 + x^4 + x^3 + x + 1
          x^8 + x^7
          x^7 + x^4 + x^3 + x + 1
          x^7 + x^6
          x^6 + x^4 + x^3 + x + 1
          x^6 + x^5
          x^5 + x^4 + x^3 + x + 1
          x^5 + x^4
          x^3 + x + 1
          x^3 + x^2
          x^2 + x + 1
          x^2 + x
          1

In other words,
$$f(x) = (x + 1)(x^7 + x^6 + x^5 + x^4 + x^2 + x) + 1.$$
(Note that over $\mathbb{F}_2$, subtraction is the same as addition.)
Appendix C
DES Sbox
Appendix D
Algebraic Normal Forms for PRESENT Sbox Output Bits

For $i = 1, 2, 3$, define
$$\varphi_i : \mathbb{F}_2^4 \to \mathbb{F}_2, \qquad x \mapsto \mathrm{SB}_{\mathrm{PRESENT}}(x)_i,$$
where $\mathrm{SB}_{\mathrm{PRESENT}}(x)_i$ is the $i$th bit of $\mathrm{SB}_{\mathrm{PRESENT}}(x)$, the PRESENT Sbox output
corresponding to $x$. In this section, we will compute the algebraic normal forms for
$\varphi_i$. Similarly to Table 3.13, we construct the table for each $\varphi_i$; see Tables D.1, D.2,
and D.3.
The coefficients $\lambda$ are calculated based on Eq. 3.10 and the following equations:
$$\lambda_{0000} = \varphi_i(0000), \qquad \lambda_{0001} = \varphi_i(0000) + \varphi_i(0001), \qquad \ldots$$
Table D.1 The Boolean function ϕ1 takes input x and outputs the 1st bit of SB_PRESENT(x). The
second-to-last row lists the output of ϕ1 for different input values. The last row lists the coefficients
(Eq. 3.10) for the algebraic normal form of ϕ1

x              0 1 2 3 4 5 6 7 8 9 A B C D E F
x3             0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
x2             0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
x1             0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1
x0             0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
SB_PRESENT(x)  C 5 6 B 9 0 A D 3 E F 8 4 7 1 2
ϕ1(x)          0 0 1 1 0 0 1 0 1 1 1 0 0 1 0 1
λx             0 0 1 0 0 0 0 1 1 0 1 1 1 1 0 0
Table D.2 The Boolean function ϕ2 takes input x and outputs the 2nd bit of SB_PRESENT(x). The
second-to-last row lists the output of ϕ2 for different input values. The last row lists the coefficients
(Eq. 3.10) for the algebraic normal form of ϕ2

x              0 1 2 3 4 5 6 7 8 9 A B C D E F
x3             0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
x2             0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
x1             0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1
x0             0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
SB_PRESENT(x)  C 5 6 B 9 0 A D 3 E F 8 4 7 1 2
ϕ2(x)          1 1 1 0 0 0 0 1 0 1 1 0 1 1 0 0
λx             1 0 0 1 1 0 0 0 1 1 1 1 0 1 0 0
Table D.3 The Boolean function ϕ3 takes input x and outputs the 3rd bit of SB_PRESENT(x). The
second-to-last row lists the output of ϕ3 for different input values. The last row lists the coefficients
(Eq. 3.10) for the algebraic normal form of ϕ3

x              0 1 2 3 4 5 6 7 8 9 A B C D E F
x3             0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
x2             0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
x1             0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1
x0             0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
SB_PRESENT(x)  C 5 6 B 9 0 A D 3 E F 8 4 7 1 2
ϕ3(x)          1 0 0 1 1 0 1 1 0 1 1 1 0 0 0 0
λx             1 1 1 0 0 0 1 1 1 0 0 1 0 1 0 0
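All coefficients $\lambda_u$ can be computed at once with the fast Möbius transform. A sketch of ours (the Sbox table is transcribed from the tables above; bit numbering follows Table D.1):

```python
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def anf_coefficients(truth_table):
    """Moebius transform: truth table of a Boolean function on F_2^4
    -> its algebraic-normal-form coefficients lambda_u."""
    lam = list(truth_table)
    for bit in (1, 2, 4, 8):
        for x in range(16):
            if x & bit:
                lam[x] ^= lam[x ^ bit]
    return lam

# phi_1: the 1st output bit of the PRESENT Sbox (cf. Table D.1)
phi1 = [(PRESENT_SBOX[x] >> 1) & 1 for x in range(16)]
lam1 = anf_coefficients(phi1)
# lam1 reproduces the last row of Table D.1:
# [0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0]
```

Running the same transform on `phi2` and `phi3` reproduces the last rows of Tables D.2 and D.3.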
Appendix E
Encoding-Based Countermeasure for Symmetric Block Ciphers

In Table E.1, we list values in $T_{SG}$, which are signals for each integer between
00 and FF with Hamming weight 6, computed with the stochastic leakage model
obtained in Code-SCA Step 6 from Sect. 4.5.1.1. The sorted version of $T_{SG}$ is shown
in Table E.2, where the signals are in ascending order and the words from $\mathbb{F}_2^8$ with
Hamming weight 6 are recorded accordingly.
CF 11001111 −0.01066
D7 11010111 −0.01074
DB 11011011 −0.01061
DD 11011101 −0.01067
DE 11011110 −0.00951
E7 11100111 −0.01156
EB 11101011 −0.01143
ED 11101101 −0.01149
EE 11101110 −0.01033
F3 11110011 −0.01152
F5 11110101 −0.01158
F6 11110110 −0.01042
F9 11111001 −0.01145
FA 11111010 −0.01029
FC 11111100 −0.01035
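Using the fragment of $T_{SG}$ listed above (values transcribed from the table; this sketch is ours, not the book's code), the sorting step that produces the ascending order of Table E.2 is a one-liner:

```python
# Fragment of T_SG: codeword -> estimated signal
T_SG = {
    0xCF: -0.01066, 0xD7: -0.01074, 0xDB: -0.01061, 0xDD: -0.01067,
    0xDE: -0.00951, 0xE7: -0.01156, 0xEB: -0.01143, 0xED: -0.01149,
    0xEE: -0.01033, 0xF3: -0.01152, 0xF5: -0.01158, 0xF6: -0.01042,
    0xF9: -0.01145, 0xFA: -0.01029, 0xFC: -0.01035,
}

# Every listed codeword has Hamming weight 6.
assert all(bin(w).count("1") == 6 for w in T_SG)

# Sort the words by signal, ascending (cf. Table E.2):
sorted_words = sorted(T_SG, key=T_SG.get)
# smallest signal first: 0xF5 (-0.01158), then 0xE7 (-0.01156), ...
```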
References
[AK96] Ross Anderson and Markus Kuhn. Tamper resistance-a cautionary note. In
Proceedings of the second Usenix workshop on electronic commerce, volume 2,
pages 1–11, 1996.
[ANP20] Alexandre Adomnicai, Zakaria Najm, and Thomas Peyrin. Fixslicing: a new GIFT
representation: fast constant-time implementations of GIFT and GIFT-COFB on ARM
Cortex-M. IACR Transactions on Cryptographic Hardware and Embedded Systems,
pages 402–427, 2020.
[AP20] Alexandre Adomnicai and Thomas Peyrin. Fixslicing AES-like ciphers: New
bitsliced AES speed records on ARM Cortex-M and RISC-V. Cryptology ePrint
Archive, 2020.
[APZ21] Melissa Azouaoui, Kostas Papagiannopoulos, and Dominik Zürner. Blind side-
channel SIFA. In 2021 Design, Automation & Test in Europe Conference &
Exhibition (DATE), pages 555–560. IEEE, 2021.
[Atm16] Atmel. AVR Instruction Set Manual. http://ww1.microchip.com/downloads/en/
devicedoc/atmel-0856-avr-instruction-set-manual.pdf, 2016.
[AV13] Kostas Papagiannopoulos and Aram Verstegen. PRESENT speed implementation. https://
github.com/kostaspap88/PRESENT_speed_implementation, 2013.
[AWKS12] Kahraman D Akdemir, Zhen Wang, Mark Karpovsky, and Berk Sunar. Design
of cryptographic devices resilient to fault injection attacks using nonlinear robust
codes. Fault analysis in cryptography, pages 171–199, 2012.
[BBB+ 07] Elaine Barker, William Barker, William Burr, William Polk, and Miles Smid. NIST
special publication 800-57. NIST Special publication, 2007.
[BBB+ 18] Anubhab Baksi, Shivam Bhasin, Jakub Breier, Mustafa Khairallah, and Thomas
Peyrin. Protecting block ciphers against differential fault attacks without re-keying.
In 2018 IEEE International Symposium on Hardware Oriented Security and Trust
(HOST), pages 191–194. IEEE, 2018.
[BBB+ 21] Anubhab Baksi, Shivam Bhasin, Jakub Breier, Mustafa Khairallah, Thomas Peyrin,
Sumanta Sarkar, and Siang Meng Sim. DEFAULT: Cipher Level Resistance
Against Differential Fault Attack. In Mehdi Tibouchi and Huaxiong Wang,
editors, Advances in Cryptology—ASIACRYPT 2021, pages 124–156, Cham, 2021.
Springer International Publishing.
[BBB+ 22] Lejla Batina, Shivam Bhasin, Jakub Breier, Xiaolu Hou, and Dirmanto Jap.
On implementation-level security of edge-based machine learning models. In
Security and Artificial Intelligence: A Crossdisciplinary Approach, pages 335–359.
Springer, 2022.
[BBH+ 20] Shivam Bhasin, Jakub Breier, Xiaolu Hou, Dirmanto Jap, Romain Poussier,
and Siang Meng Sim. SITM: See-in-the-middle side-channel assisted middle
round differential cryptanalysis on SPN block ciphers. IACR Transactions on
Cryptographic Hardware and Embedded Systems, pages 95–122, 2020.
[BBJP19] Lejla Batina, Shivam Bhasin, Dirmanto Jap, and Stjepan Picek. CSI NN:
Reverse engineering of neural network architectures through electromagnetic side
channel. In 28th USENIX Security Symposium (USENIX Security 19), pages 515–
532, 2019.
[BC16] Jakub Breier and Chien-Ning Chen. On determining optimal parameters for testing
devices against laser fault attacks. In 2016 International Symposium on Integrated
Circuits (ISIC), pages 1–4. IEEE, 2016.
[BCDG10] Alexandre Berzati, Cécile Canovas-Dumas, and Louis Goubin. Public key
perturbation of randomized RSA implementations. In Cryptographic Hardware
and Embedded Systems, CHES 2010: 12th International Workshop, Santa Barbara,
USA, August 17–20, 2010. Proceedings 12, pages 306–319. Springer, 2010.
[BCG08] Alexandre Berzati, Cécile Canovas, and Louis Goubin. Perturbating RSA public
keys: An improved attack. In Cryptographic Hardware and Embedded Systems–
CHES 2008: 10th International Workshop, Washington, DC, USA, August 10–13,
2008. Proceedings 10, pages 380–395. Springer, 2008.
[BCMCC06] Eric Brier, Benoît Chevallier-Mames, Mathieu Ciet, and Christophe Clavier. Why
one should also secure RSA public key elements. In Cryptographic Hardware and
Embedded Systems-CHES 2006: 8th International Workshop, Yokohama, Japan,
October 10–13, 2006. Proceedings 8, pages 324–338. Springer, 2006.
[BD00] Dan Boneh and Glenn Durfee. Cryptanalysis of RSA with private key d less than
N^0.292. IEEE Transactions on Information Theory, 46(4):1339–1349, 2000.
[BD16] Elaine Barker and Quynh Dang. NIST special publication 800-57 part 1, revision
4. NIST Special publication, 2016.
[BDF98] Dan Boneh, Glenn Durfee, and Yair Frankel. An attack on RSA given a small
fraction of the private key bits. In International Conference on the Theory and
Application of Cryptology and Information Security, pages 25–34. Springer, 1998.
[BDF+ 09] Shivam Bhasin, Jean-Luc Danger, Florent Flament, Tarik Graba, Sylvain Guilley,
Yves Mathieu, Maxime Nassar, Laurent Sauvage, and Nidhal Selmane. Combined
SCA and DFA countermeasures integrable in a FPGA design flow. In 2009
International Conference on Reconfigurable Computing and FPGAs, pages 213–
218. IEEE, 2009.
[BDH+ 97] Feng Bao, Robert H Deng, Yongfei Han, A Jeng, A Desai Narasimhalu, and
T Ngair. Breaking public key cryptosystems on tamper resistant devices in the
presence of transient faults. In International Workshop on Security Protocols, pages
115–124. Springer, 1997.
[BDL97] Dan Boneh, Richard A DeMillo, and Richard J Lipton. On the importance of
checking cryptographic protocols for faults. In International conference on the
theory and applications of cryptographic techniques, pages 37–51. Springer, 1997.
[BDPA13] Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van Assche. Keccak. In
Annual international conference on the theory and applications of cryptographic
techniques, pages 313–314. Springer, 2013.
[BDPVA07] Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van Assche. Sponge
functions. In ECRYPT hash workshop, 2007.
[BECN+ 06] Hagai Bar-El, Hamid Choukri, David Naccache, Michael Tunstall, and Claire
Whelan. The sorcerer’s apprentice guide to fault attacks. Proceedings of the IEEE,
94(2):370–382, 2006.
[BEG13] Nasour Bagheri, Reza Ebrahimpour, and Navid Ghaedi. New differential fault
analysis on present. EURASIP Journal on Advances in Signal Processing, 2013:1–
10, 2013.
[Bei11] Amos Beimel. Secret-sharing schemes: A survey. In International conference on
coding and cryptology, pages 11–46. Springer, 2011.
[Ber09] Dennis S Bernstein. Matrix mathematics: theory, facts, and formulas. Princeton
university press, 2009.
[BFGV12] Josep Balasch, Sebastian Faust, Benedikt Gierlichs, and Ingrid Verbauwhede.
Theory and practice of a leakage resilient masking scheme. In Advances in
Cryptology–ASIACRYPT 2012: 18th International Conference on the Theory and
Application of Cryptology and Information Security, Beijing, China, December 2–
6, 2012. Proceedings 18, pages 758–775. Springer, 2012.
[BFP19] Claudio Bozzato, Riccardo Focardi, and Francesco Palmarini. Shaping the glitch:
optimizing voltage fault injection attacks. IACR Transactions on Cryptographic
Hardware and Embedded Systems, pages 199–224, 2019.
[BGE+ 17] Jan Burchard, Maël Gay, Ange-Salomé Messeng Ekossono, Jan Horáček, Bernd
Becker, Tobias Schubert, Martin Kreuzer, and Ilia Polian. AutoFault: towards
automatic construction of algebraic fault attacks. In 2017 Workshop on Fault
Diagnosis and Tolerance in Cryptography (FDTC), pages 65–72. IEEE, 2017.
[BGK04] Johannes Blömer, Jorge Guajardo, and Volker Krummel. Provably secure masking
of AES. In International workshop on selected areas in cryptography, pages 69–
83. Springer, 2004.
[BGLP13] Ryad Benadjila, Jian Guo, Victor Lomné, and Thomas Peyrin. Implementing
lightweight block ciphers on x86 architectures. In International Conference on
Selected Areas in Cryptography, pages 324–351. Springer, 2013.
[BGLT04] Marco Bucci, Michele Guglielmo, Raimondo Luzzi, and Alessandro Trifiletti.
A power consumption randomization countermeasure for DPA-resistant crypto-
graphic processors. In Integrated Circuit and System Design. Power and Timing
Modeling, Optimization and Simulation: 14th International Workshop, PATMOS
2004, Santorini, Greece, September 15–17, 2004. Proceedings 14, pages 481–490.
Springer, 2004.
[BGM+ 03] Luca Benini, Angelo Galati, Alberto Macii, Enrico Macii, and Massimo Poncino.
Energy-efficient data scrambling on memory-processor interfaces. In Proceedings
of the 2003 international symposium on Low power electronics and design, pages
26–29, 2003.
[BGN+ 15] Begül Bilgin, Benedikt Gierlichs, Svetla Nikova, Ventzislav Nikov, and Vincent
Rijmen. Trade-offs for threshold implementations illustrated on AES. IEEE
Transactions on Computer-Aided Design of Integrated Circuits and Systems,
34(7):1188–1200, 2015.
[BGV11] Josep Balasch, Benedikt Gierlichs, and Ingrid Verbauwhede. An in-depth and
black-box characterization of the effects of clock glitches on 8-bit MCUs. In
2011 Workshop on Fault Diagnosis and Tolerance in Cryptography, pages 105–
114. IEEE, 2011.
[BH15] Jakub Breier and Wei He. Multiple fault attack on PRESENT with a hardware
trojan implementation in FPGA. In Gabriel Ghinita and Pedro Peris-Lopez, editors,
2015 International Workshop on Secure Internet of Things, SIoT 2015, Vienna,
Austria, September 21–25, 2015, pages 58–64. IEEE Computer Society, 2015.
[BH17] Jakub Breier and Xiaolu Hou. Feeding two cats with one bowl: On designing a fault
and side-channel resistant software encoding scheme. In Topics in Cryptology–CT-
RSA 2017: The Cryptographers’ Track at the RSA Conference 2017, San Francisco,
CA, USA, February 14–17, 2017, Proceedings, pages 77–94. Springer, 2017.
[BH22] Jakub Breier and Xiaolu Hou. How practical are fault injection attacks, really?
IEEE Access, 10:113122–113130, 2022.
[BHJ+ 18] Jakub Breier, Xiaolu Hou, Dirmanto Jap, Lei Ma, Shivam Bhasin, and Yang Liu.
Practical fault attack on deep neural networks. In Proceedings of the 2018 ACM
SIGSAC Conference on Computer and Communications Security, pages 2204–
2206. ACM, 2018.
[BHL18] Jakub Breier, Xiaolu Hou, and Yang Liu. Fault attacks made easy: Differential
fault analysis automation on assembly code. IACR Transactions on Cryptographic
Hardware and Embedded Systems, pages 96–122, 2018.
[BHL19] Jakub Breier, Xiaolu Hou, and Yang Liu. On evaluating fault resilient encoding
schemes in software. IEEE Transactions on Dependable and Secure Computing,
18(3):1065–1079, 2019.
[BHOS22] Jakub Breier, Xiaolu Hou, Martín Ochoa, and Jesus Solano. Foobar: Fault fooling
backdoor attack on neural network training. IEEE Transactions on Dependable
and Secure Computing, 2022.
[BHT01] Eric Brier, Helena Handschuh, and Christophe Tymen. Fast primitives for internal
data scrambling in tamper resistant hardware. In Cryptographic Hardware and
Embedded Systems–CHES 2001: Third International Workshop Paris, France, May
14–16, 2001 Proceedings 3, pages 16–27. Springer, 2001.
[BHvW12] Lejla Batina, Jip Hogenboom, and Jasper GJ van Woudenberg. Getting more
from PCA: first results of using principal component analysis for extensive power
analysis. In Topics in Cryptology–CT-RSA 2012: The Cryptographers’ Track at
the RSA Conference 2012, San Francisco, CA, USA, February 27–March 2, 2012.
Proceedings, pages 383–397. Springer, 2012.
[Bih97] Eli Biham. A fast new DES implementation in software. In International Workshop
on Fast Software Encryption, pages 260–272. Springer, 1997.
[BILT04] Jean-Claude Bajard, Laurent Imbert, Pierre-Yvan Liardet, and Yannick Teglia.
Leak resistant arithmetic. In Cryptographic Hardware and Embedded Systems-
CHES 2004: 6th International Workshop Cambridge, MA, USA, August 11–13,
2004. Proceedings 6, pages 62–75. Springer, 2004.
[BJB18] Jakub Breier, Dirmanto Jap, and Shivam Bhasin. SCADPA: side-channel assisted
differential-plaintext attack on bit permutation based ciphers. In Jan Madsen and
Ayse K. Coskun, editors, 2018 Design, Automation & Test in Europe Conference
& Exhibition, DATE 2018, Dresden, Germany, March 19–23, 2018, pages 1129–
1134. IEEE, 2018.
[BJH+ 21] Jakub Breier, Dirmanto Jap, Xiaolu Hou, Shivam Bhasin, and Yang Liu. Sniff:
reverse engineering of neural networks with fault attacks. IEEE Transactions on
Reliability, 71(4):1527–1539, 2021.
[BJHB19] Jakub Breier, Dirmanto Jap, Xiaolu Hou, and Shivam Bhasin. On side channel
vulnerabilities of bit permutations in cryptographic algorithms. IEEE Transactions
on Information Forensics and Security, 15:1072–1085, 2019.
[BJHB23] Jakub Breier, Dirmanto Jap, Xiaolu Hou, and Shivam Bhasin. A
desynchronization-based countermeasure against side-channel analysis of neural
networks. In International Symposium on Cyber Security, Cryptology, and Machine
Learning, pages 296–306. Springer, 2023.
[BJKS21] Robert Buhren, Hans-Niklas Jacob, Thilo Krachenfels, and Jean-Pierre Seifert.
One glitch to rule them all: Fault injection attacks against AMD’s secure encrypted
virtualization. In Proceedings of the 2021 ACM SIGSAC Conference on Computer
and Communications Security, pages 2875–2889, 2021.
[BJP20] Shivam Bhasin, Dirmanto Jap, and Stjepan Picek. AES HD dataset—50 000 traces.
AISyLab repository, 2020. https://github.com/AISyLab/AES_HD.
[BK06] Johannes Blömer and Volker Krummel. Fault based collision attacks on AES. In
International Workshop on Fault Diagnosis and Tolerance in Cryptography, pages
106–120. Springer, 2006.
[BKH+ 19] Arthur Beckers, Masahiro Kinugawa, Yuichi Hayashi, Daisuke Fujimoto, Josep
Balasch, Benedikt Gierlichs, and Ingrid Verbauwhede. Design considerations for
em pulse fault injection. In International Conference on Smart Card Research and
Advanced Applications, pages 176–192. Springer, 2019.
[BKHL20] Jakub Breier, Mustafa Khairallah, Xiaolu Hou, and Yang Liu. A countermeasure
against statistical ineffective fault analysis. IEEE Transactions on Circuits and
Systems II: Express Briefs, 67(12):3322–3326, 2020.
[BKL+ 07] Andrey Bogdanov, Lars R Knudsen, Gregor Leander, Christof Paar, Axel
Poschmann, Matthew JB Robshaw, Yannick Seurin, and Charlotte Vikkelsoe.
Present: An ultra-lightweight block cipher. In International workshop on cryp-
tographic hardware and embedded systems, pages 450–466. Springer, 2007.
[Bla83] George R Blakley. A computer algorithm for calculating the product AB modulo M.
IEEE Transactions on Computers, C-32(5):497–500, 1983.
[BLMR19] Christof Beierle, Gregor Leander, Amir Moradi, and Shahram Rasoolzadeh. Craft:
lightweight tweakable block cipher with efficient protection against DFA attacks.
IACR Transactions on Symmetric Cryptology, 2019(1):5–45, 2019.
[BMV07] Sanjay Burman, Debdeep Mukhopadhyay, and Kamakoti Veezhinathan. LFSR
based stream ciphers are vulnerable to power attacks. In International Conference
on Cryptology in India, pages 384–392. Springer, 2007.
[Bor06] Michele Boreale. Attacking right-to-left modular exponentiation with timely ran-
dom faults. In Fault Diagnosis and Tolerance in Cryptography: Third International
Workshop, FDTC 2006, Yokohama, Japan, October 10, 2006. Proceedings, pages
24–35. Springer, 2006.
[BOS03] Johannes Blömer, Martin Otto, and Jean-Pierre Seifert. A new CRT-RSA algorithm
secure against bellcore attacks. In Proceedings of the 10th ACM conference on
Computer and communications security, pages 311–320, 2003.
[BP82] HJ Beker and FC Piper. Communications security: a survey of cryptography. IEE
Proceedings A (Physical Science, Measurement and Instrumentation, Management
and Education, Reviews), 129(6):357–376, 1982.
[BPS+ 20] Ryad Benadjila, Emmanuel Prouff, Rémi Strullu, Eleonora Cagli, and Cécile
Dumas. Deep learning for side-channel analysis and introduction to ASCAD
database. Journal of Cryptographic Engineering, 10(2):163–188, 2020.
[BPS+ 21] Ryad Benadjila, Emmanuel Prouff, Rémi Strullu, Eleonora Cagli, and Cécile
Dumas. ASCAD SCA database. https://github.com/ANSSI-FR/ASCAD.git, 2021.
[BRBG16] Erik Bosman, Kaveh Razavi, Herbert Bos, and Cristiano Giuffrida. Dedup est
machina: Memory deduplication as an advanced exploitation vector. In 2016 IEEE
symposium on security and privacy (SP), pages 987–1004. IEEE, 2016.
[BS97] Eli Biham and Adi Shamir. Differential fault analysis of secret key cryptosystems.
In Advances in Cryptology–CRYPTO’97: 17th Annual International Cryptology
Conference Santa Barbara, California, USA August 17–21, 1997 Proceedings 17,
pages 513–525. Springer, 1997.
[BS08] Bhaskar Biswas and Nicolas Sendrier. McEliece cryptosystem implementation:
Theory and practice. In International Workshop on Post-Quantum Cryptography,
pages 47–62. Springer, 2008.
[BS12] Eli Biham and Adi Shamir. Differential cryptanalysis of the data encryption
standard. Springer Science & Business Media, 2012.
[BSH75] Daniel Binder, Edward C Smith, and AB Holman. Satellite anomalies from galactic
cosmic rays. IEEE Transactions on Nuclear Science, 22(6):2675–2680, 1975.
[BT12] Alessandro Barenghi and Elena Trichina. Fault attacks on stream ciphers. In Fault
Analysis in Cryptography, pages 239–255. Springer, 2012.
[Buc04] Johannes Buchmann. Introduction to cryptography, volume 335. Springer, 2004.
[Cal75] Stephen Calebotta. CMOS, the ideal logic family. National Semiconductor CMOS
Databook, Rev, 1:2–3, 1975.
[CB19] Andrea Caforio and Subhadeep Banik. A study of persistent fault analysis.
In Security, Privacy, and Applied Cryptography Engineering: 9th International
Conference, SPACE 2019, Gandhinagar, India, December 3–7, 2019, Proceedings
9, pages 13–33. Springer, 2019.
[CCD+ 21] Pierre-Louis Cayrel, Brice Colombier, Vlad-Florin Drăgoi, Alexandre Menu, and
Lilian Bossuet. Message-recovery laser fault injection attack on the Classic
McEliece cryptosystem. In Annual International Conference on the Theory and
Applications of Cryptographic Techniques, pages 438–467. Springer, 2021.
[CCT+ 18] Samuel Chef, Chung Tah Chua, Jing Yun Tay, Yu Wen Siah, Shivam Bhasin,
J Breier, and Chee Lip Gan. Descrambling of embedded SRAM using a laser probe.
In 2018 IEEE International Symposium on the Physical and Failure Analysis of
Integrated Circuits (IPFA), pages 1–6. IEEE, 2018.
[CD+ 15] Ronald Cramer, Ivan Bjerre Damgård, et al. Secure multiparty computation.
Cambridge University Press, 2015.
[CFGR10] Christophe Clavier, Benoit Feix, Georges Gagnerot, and Mylene Roussellet. Pas-
sive and active combined attacks on AES combining fault attacks and side channel
analysis. In 2010 Workshop on Fault Diagnosis and Tolerance in Cryptography,
pages 10–19. IEEE, 2010.
[CFZK21] Huili Chen, Cheng Fu, Jishen Zhao, and Farinaz Koushanfar. Proflip: Targeted tro-
jan attack with progressive bit flips. In Proceedings of the IEEE/CVF International
Conference on Computer Vision, pages 7718–7727, 2021.
[CG16] Claude Carlet and Sylvain Guilley. Complementary dual codes for counter-
measures to side-channel attacks. Adv. Math. Commun., 10(1):131–150, 2016.
[CH17] Ang Cui and Rick Housley. BADFET: Defeating Modern Secure Boot Using
Second-Order Pulsed Electromagnetic Fault Injection. In 11th USENIX Workshop
on Offensive Technologies (WOOT 17), 2017.
[Cho22] Charles Q. Choi. IBM Unveils 433-Qubit Osprey Chip. IEEE Spectrum, November
2022.
[CJRR99] Suresh Chari, Charanjit S Jutla, Josyula R Rao, and Pankaj Rohatgi. Towards sound
approaches to counteract power-analysis attacks. In Advances in Cryptology—
CRYPTO’99: 19th Annual International Cryptology Conference Santa Barbara,
California, USA, August 15–19, 1999 Proceedings 19, pages 398–412. Springer,
1999.
[CJW10] Nicolas T Courtois, Keith Jackson, and David Ware. Fault-algebraic attacks on
inner rounds of DES. In E-Smart’10 Proceedings: The Future of Digital Security
Technologies. Strategies Telecom and Multimedia, 2010.
[CK09] Jean-Sébastien Coron and Ilya Kizhvatov. An efficient method for random delay
generation in embedded software. In Cryptographic Hardware and Embed-
ded Systems-CHES 2009: 11th International Workshop Lausanne, Switzerland,
September 6–9, 2009 Proceedings, pages 156–170. Springer, 2009.
[CK10] Jean-Sébastien Coron and Ilya Kizhvatov. Analysis and improvement of the
random delay countermeasure of CHES 2009. In Cryptographic Hardware and
Embedded Systems, CHES 2010: 12th International Workshop, Santa Barbara,
USA, August 17–20, 2010. Proceedings 12, pages 95–109. Springer, 2010.
[CK18] Jean-Sébastien Coron and Ilya Kizhvatov. Trace sets with random delays. https://
github.com/ikizhvatov/randomdelays-traces.git, 2018.
[Cla07] Christophe Clavier. Secret external encodings do not prevent transient fault
analysis. In Cryptographic Hardware and Embedded Systems-CHES 2007: 9th
International Workshop, Vienna, Austria, September 10–13, 2007. Proceedings 9,
pages 181–194. Springer, 2007.
[Cor99] Jean-Sébastien Coron. Resistance against differential power analysis for elliptic
curve cryptosystems. In Cryptographic Hardware and Embedded Systems:
First InternationalWorkshop, CHES’99 Worcester, MA, USA, August 12–13, 1999
Proceedings 1, pages 292–302. Springer, 1999.
[COZZ23] Yukun Cheng, Changhai Ou, Fan Zhang, and Shihui Zheng. DLPFA: Deep learning
based persistent fault analysis against block ciphers. Cryptology ePrint Archive,
2023.
[CRR03] Suresh Chari, Josyula R Rao, and Pankaj Rohatgi. Template attacks. In
Cryptographic Hardware and Embedded Systems-CHES 2002: 4th International
Workshop Redwood Shores, CA, USA, August 13–15, 2002 Revised Papers 4, pages
13–28. Springer, 2003.
[CT03] Jean-Sébastien Coron and Alexei Tchulkine. A new algorithm for switching
from arithmetic to boolean masking. In International Workshop on Cryptographic
Hardware and Embedded Systems, pages 89–97. Springer, 2003.
[CVM+ 21] Zitai Chen, Georgios Vasilakis, Kit Murdock, Edward Dean, David Oswald, and
Flavio D Garcia. VoltPillager: Hardware-based fault injection attacks against Intel
SGX Enclaves using the SVID voltage scaling interface. In 30th USENIX Security
Symposium (USENIX Security 21), pages 699–716, 2021.
[DAP+ 22] Anuj Dubey, Afzal Ahmad, Muhammad Adeel Pasha, Rosario Cammarota, and
Aydin Aysu. Modulonet: Neural networks meet modular arithmetic for efficient
hardware masking. IACR Transactions on Cryptographic Hardware and Embedded
Systems, pages 506–556, 2022.
[dBLW03] Bert den Boer, Kerstin Lemke, and Guntram Wicke. A DPA attack against the
modular reduction within a crt implementation of RSA. In Cryptographic Hard-
ware and Embedded Systems-CHES 2002: 4th International Workshop Redwood
Shores, CA, USA, August 13–15, 2002 Revised Papers 4, pages 228–243. Springer,
2003.
[DCA20] Anuj Dubey, Rosario Cammarota, and Aydin Aysu. Maskednet: The first hardware
inference engine aiming power side-channel protection. In 2020 IEEE Inter-
national Symposium on Hardware Oriented Security and Trust (HOST), pages
197–208. IEEE, 2020.
[DCRB+ 16] Thomas De Cnudde, Oscar Reparaz, Begül Bilgin, Svetla Nikova, Ventzislav
Nikov, and Vincent Rijmen. Masking AES with shares in hardware. In
International Conference on Cryptographic Hardware and Embedded Systems,
pages 194–212. Springer, 2016.
[DCSA22] Anuj Dubey, Rosario Cammarota, Vikram Suresh, and Aydin Aysu. Guarding
machine learning hardware against physical side-channel attacks. ACM Journal
on Emerging Technologies in Computing Systems (JETC), 18(3):1–31, 2022.
[DEK+ 18] Christoph Dobraunig, Maria Eichlseder, Thomas Korak, Stefan Mangard, Florian
Mendel, and Robert Primas. SIFA: exploiting ineffective fault inductions on
symmetric cryptography. IACR Transactions on Cryptographic Hardware and
Embedded Systems, pages 547–572, 2018.
[DLM20] Mathieu Dumont, Mathieu Lisart, and Philippe Maurine. Modeling and simulating
electromagnetic fault injection. IEEE Transactions on Computer-Aided Design of
Integrated Circuits and Systems, 40(4):680–693, 2020.
[DO22] Shaked Delarea and Yossi Oren. Practical, low-cost fault injection attacks on
personal smart devices. Applied Sciences, 12(1):417, 2022.
[DPRS11] Julien Doget, Emmanuel Prouff, Matthieu Rivain, and François-Xavier Standaert.
Univariate side channel attacks and leakage modeling. Journal of Cryptographic
Engineering, 1:123–144, 2011.
[DR02] Joan Daemen and Vincent Rijmen. The design of Rijndael, volume 2. Springer,
2002.
[Dud14] Richard M Dudley. Uniform central limit theorems, volume 142. Cambridge
university press, 2014.
[Dur19] Rick Durrett. Probability: theory and examples, volume 49. Cambridge university
press, 2019.
[Dwo15] Morris Dworkin. SHA-3 Standard: Permutation-Based Hash and Extendable-
Output Functions. Federal Information Processing Standards Publication 202, August 2015.
[DZD+ 18] A Adam Ding, Liwei Zhang, François Durvaux, François-Xavier Standaert, and
Yunsi Fei. Towards sound and optimal leakage detection procedure. In Smart Card
Research and Advanced Applications: 16th International Conference, CARDIS
2017, Lugano, Switzerland, November 13–15, 2017, Revised Selected Papers,
pages 105–122. Springer, 2018.
[EJ96] Artur Ekert and Richard Jozsa. Quantum computation and Shor's factoring
algorithm. Reviews of Modern Physics, 68(3):733, 1996.
[ESH+11] Sho Endo, Takeshi Sugawara, Naofumi Homma, Takafumi Aoki, and Akashi
Satoh. An on-chip glitchy-clock generator for testing fault injection attacks.
Journal of Cryptographic Engineering, 1(4):265–270, 2011.
[Far70] PG Farrell. Linear binary anticodes. Electronics Letters, 6(13):419–421, 1970.
[FJLT13] Thomas Fuhr, Éliane Jaulmes, Victor Lomné, and Adrian Thillard. Fault attacks
on AES with faulty ciphertexts only. In 2013 Workshop on Fault Diagnosis and
Tolerance in Cryptography, pages 108–118. IEEE, 2013.
[FMP03] Pierre-Alain Fouque, Gwenaëlle Martinet, and Guillaume Poupard. Attacking
unbalanced RSA-CRT using SPA. In Cryptographic Hardware and Embedded
Systems-CHES 2003: 5th International Workshop, Cologne, Germany, September
8–10, 2003. Proceedings 5, pages 254–268. Springer, 2003.
[Fou98] Electronic Frontier Foundation. Cracking DES: Secrets of encryption research,
wiretap politics and chip design. https://cryptome.org/jya/cracking-des/cracking-
des.htm, 1998.
[FRVD08] Pierre-Alain Fouque, Denis Réal, Frédéric Valette, and Mhamed Drissi. The carry
leakage on the randomized exponent countermeasure. In International Workshop
on Cryptographic Hardware and Embedded Systems. Springer, 2008.
[GST12] Benedikt Gierlichs, Jörn-Marc Schmidt, and Michael Tunstall. Infective com-
putation and dummy rounds: Fault protection for block ciphers without check-
before-output. In Progress in Cryptology–LATINCRYPT 2012: 2nd International
Conference on Cryptology and Information Security in Latin America, Santiago,
Chile, October 7–10, 2012. Proceedings 2, pages 305–321. Springer, 2012.
[Hab65] Donald H Habing. The use of lasers to simulate radiation-induced transients
in semiconductor devices and circuits. IEEE Transactions on Nuclear Science,
12(5):91–100, 1965.
[HBB+16] Wei He, Jakub Breier, Shivam Bhasin, Noriyuki Miura, and Makoto Nagata. Ring
oscillator under laser: Potential of PLL-based countermeasure against laser fault
injection. In Fault Diagnosis and Tolerance in Cryptography (FDTC), 2016
Workshop on, pages 102–113. IEEE, 2016.
[HBB21] Xiaolu Hou, Jakub Breier, and Shivam Bhasin. DNFA: Differential no-fault
analysis of bit permutation based ciphers assisted by side-channel. In 2021 Design,
Automation & Test in Europe Conference & Exhibition (DATE), pages 182–187.
IEEE, 2021.
[HBB22] Xiaolu Hou, Jakub Breier, and Shivam Bhasin. SBCMA: Semi-blind com-
bined middle-round attack on bit-permutation ciphers with application to AEAD
schemes. IEEE Transactions on Information Forensics and Security, 17:3677–
3690, 2022.
[HBJ+21] Xiaolu Hou, Jakub Breier, Dirmanto Jap, Lei Ma, Shivam Bhasin, and Yang Liu.
Physical security of deep learning on edge devices: Comprehensive evaluation of
fault injection attack vectors. Microelectronics Reliability, 120:114116, 2021.
[HBK23] Xiaolu Hou, Jakub Breier, and Mladen Kovačević. Another look at side-channel
resistant encoding schemes. IACR Cryptol. ePrint Arch., page 1698, 2023.
[HBZL19] Xiaolu Hou, Jakub Breier, Fuyuan Zhang, and Yang Liu. Fully automated
differential fault analysis on software implementations of block ciphers. IACR
Transactions on Cryptographic Hardware and Embedded Systems, pages 1–29,
2019.
[HDD11] Philippe Hoogvorst, Guillaume Duc, and Jean-Luc Danger. Software implementa-
tion of dual-rail representation. COSADE, February, pages 24–25, 2011.
[Her96] Israel N Herstein. Abstract algebra. Prentice Hall, 1996.
[HFK+19] Sanghyun Hong, Pietro Frigo, Yiğitcan Kaya, Cristiano Giuffrida, and Tudor
Dumitras. Terminal brain damage: Exposing the graceless degradation in deep
neural networks under hardware fault attacks. In 28th USENIX Security Symposium
(USENIX Security 19), pages 497–514, 2019.
[HH11] Ludger Hemme and Lars Hoffmann. Differential fault analysis on the SHA-1
compression function. In 2011 Workshop on Fault Diagnosis and Tolerance in
Cryptography, pages 54–62. IEEE, 2011.
[HHS+11] Yu-ichi Hayashi, Naofumi Homma, Takeshi Sugawara, Takaaki Mizuki, Takafumi
Aoki, and Hideaki Sone. Non-invasive EMI-based fault injection attack against
cryptographic modules. In 2011 IEEE International Symposium on Electromag-
netic Compatibility, pages 763–767. IEEE, 2011.
[HLMS14] Ronglin Hao, Bao Li, Bingke Ma, and Ling Song. Algebraic fault attack on the
SHA-256 compression function. International Journal of Research in Computer
Science, 4(2):1, 2014.
[HOM06] Christoph Herbst, Elisabeth Oswald, and Stefan Mangard. An AES smart card
implementation resistant to power analysis attacks. In International conference on
applied cryptography and network security, pages 239–252. Springer, 2006.
[HPS98] Jeffrey Hoffstein, Jill Pipher, and Joseph H Silverman. NTRU: A ring-based public
key cryptosystem. In International algorithmic number theory symposium, pages
267–288. Springer, 1998.
[HS13] Michael Hutter and Jörn-Marc Schmidt. The temperature side channel and heating
fault attacks. In International Conference on Smart Card Research and Advanced
Applications, pages 219–235. Springer, 2013.
[HSP20] Max Hoffmann, Falk Schellenberg, and Christof Paar. ARMORY: Fully automated
and exhaustive fault simulation on ARM-M binaries. IEEE Transactions on
Information Forensics and Security, 16:1058–1073, 2020.
[Hun12] Thomas W Hungerford. Algebra, volume 73. Springer Science & Business Media,
2012.
[HZ12] Annelie Heuser and Michael Zohner. Intelligent machine homicide: Breaking
cryptographic devices using support vector machines. In Constructive Side-
Channel Analysis and Secure Design: Third International Workshop, COSADE
2012, Darmstadt, Germany, May 3–4, 2012. Proceedings 3, pages 249–264.
Springer, 2012.
[JAB+03] Jan M Rabaey, Anantha Chandrakasan, and Borivoje Nikolic. Digital integrated
circuits: a design perspective. Prentice Hall, 2003.
[Jea16] Jérémy Jean. TikZ for Cryptographers. https://www.iacr.org/authors/tikz/, 2016.
[JP04] Jean Jacod and Philip Protter. Probability essentials. Springer Science & Business
Media, 2004.
[JPY01] Marc Joye, Pascal Paillier, and Sung-Ming Yen. Secure evaluation of modular
functions. In 2001 International Workshop on Cryptology and Network Security,
pages 227–229. Citeseer, 2001.
[JQBD97] Marc Joye, Jean-Jacques Quisquater, Feng Bao, and Robert H Deng. RSA-type
signatures in the presence of transient faults. In IMA International Conference on
Cryptography and Coding, pages 155–160. Springer, 1997.
[JVDVF+22] Patrick Jattke, Victor Van Der Veen, Pietro Frigo, Stijn Gunter, and Kaveh Razavi.
Blacksmith: Scalable rowhammering in the frequency domain. In 2022 IEEE
Symposium on Security and Privacy (SP), pages 716–734. IEEE, 2022.
[JY03] Marc Joye and Sung-Ming Yen. The Montgomery powering ladder. In
Cryptographic Hardware and Embedded Systems-CHES 2002: 4th International
Workshop Redwood Shores, CA, USA, August 13–15, 2002 Revised Papers, pages
291–302. Springer, 2003.
[KAF+10] Thorsten Kleinjung, Kazumaro Aoki, Jens Franke, Arjen K Lenstra, Emmanuel
Thomé, Joppe W Bos, Pierrick Gaudry, Alexander Kruppa, Peter L Montgomery,
Dag Arne Osvik, et al. Factorization of a 768-bit RSA modulus. In Annual
Cryptology Conference, pages 333–350. Springer, 2010.
[KBJ+22] Niclas Kühnapfel, Robert Buhren, Hans Niklas Jacob, Thilo Krachenfels, Christian
Werling, and Jean-Pierre Seifert. EM-fault it yourself: Building a replicable EMFI
setup for desktop and server hardware. In 2022 IEEE Physical Assurance and
Inspection of Electronics (PAINE), pages 1–7. IEEE, 2022.
[KDB+22] Satyam Kumar, Vishnu Asutosh Dasu, Anubhab Baksi, Santanu Sarkar, Dirmanto
Jap, Jakub Breier, and Shivam Bhasin. Side channel attack on stream ciphers:
A three-step approach to state/key recovery. IACR Transactions on Cryptographic
Hardware and Embedded Systems, 2022(2):166–191, 2022.
[KDK+14] Yoongu Kim, Ross Daly, Jeremie Kim, Chris Fallin, Ji Hye Lee, Donghyuk
Lee, Chris Wilkerson, Konrad Lai, and Onur Mutlu. Flipping bits in memory
without accessing them: An experimental study of DRAM disturbance errors. ACM
SIGARCH Computer Architecture News, 42(3):361–372, 2014.
[KHN+19] Mustafa Khairallah, Xiaolu Hou, Zakaria Najm, Jakub Breier, Shivam Bhasin,
and Thomas Peyrin. SoK: On DFA vulnerabilities of substitution-permutation
networks. In Steven D. Galbraith, Giovanni Russello, Willy Susilo, Dieter
Gollmann, Engin Kirda, and Zhenkai Liang, editors, Proceedings of the 2019
ACM Asia Conference on Computer and Communications Security, AsiaCCS 2019,
Auckland, New Zealand, July 09–12, 2019, pages 403–414. ACM, 2019.
[KJ01] Paul C Kocher and Joshua M Jaffe. Secure modular exponentiation with leak
minimization for smartcards and other cryptosystems, October 2 2001. US Patent
6,298,442.
[KJJ99] Paul Kocher, Joshua Jaffe, and Benjamin Jun. Differential power analysis. In
Advances in Cryptology—CRYPTO’99: 19th Annual International Cryptology
Conference Santa Barbara, California, USA, August 15–19, 1999 Proceedings 19,
pages 388–397. Springer, 1999.
[KJJ10] Paul C Kocher, Joshua M Jaffe, and Benjamin C Jun. Cryptographic computation
using masking to prevent differential power analysis and other attacks, February 23
2010. US Patent 7,668,310.
[KJJR11] Paul Kocher, Joshua Jaffe, Benjamin Jun, and Pankaj Rohatgi. Introduction to
differential power analysis. Journal of Cryptographic Engineering, 1:5–27, 2011.
[KJP14] Raghavan Kumar, Philipp Jovanovic, and Ilia Polian. Precise fault-injections using
voltage and temperature manipulation for differential cryptanalysis. In 2014 IEEE
20th International On-Line Testing Symposium (IOLTS), pages 43–48. IEEE, 2014.
[KKG03] Ramesh Karri, Grigori Kuznetsov, and Michael Goessel. Parity-based con-
current error detection of substitution-permutation network block ciphers. In
Cryptographic Hardware and Embedded Systems-CHES 2003: 5th International
Workshop, Cologne, Germany, September 8–10, 2003. Proceedings 5, pages 113–
124. Springer, 2003.
[KKT04] Mark Karpovsky, Konrad J Kulikowski, and Alexander Taubin. Robust protection
against fault-injection attacks on smart cards implementing the advanced encryp-
tion standard. In International Conference on Dependable Systems and Networks,
2004, pages 93–101. IEEE, 2004.
[KKY+89] Yasuhiro Konishi, Masaki Kumanoya, Hiroyuki Yamasaki, Katsumi Dosaka, and
Tsutomu Yoshihara. Analysis of coupling noise between adjacent bit lines in
megabit DRAMs. IEEE Journal of Solid-State Circuits, 24(1):35–42, 1989.
[KM20] Martin S Kelly and Keith Mayes. High precision laser fault injection using low-
cost components. In 2020 IEEE International Symposium on Hardware Oriented
Security and Trust (HOST), pages 219–228. IEEE, 2020.
[KMBM17] Fatma Kahri, Hassen Mestiri, Belgacem Bouallegue, and Mohsen Machhout. Fault
attacks resistant architecture for Keccak hash function. International Journal of
Advanced Computer Science and Applications, 8(5), 2017.
[Koç94] CK Koç. High-speed RSA implementation. Technical report, RSA Laboratories,
Redwood City, 1994.
[Koc96] Paul C Kocher. Timing attacks on implementations of Diffie-Hellman, RSA,
DSS, and other systems. In Advances in Cryptology—CRYPTO’96: 16th Annual
International Cryptology Conference Santa Barbara, California, USA August 18–
22, 1996 Proceedings 16, pages 104–113. Springer, 1996.
[Kos02] Thomas Koshy. Elementary number theory with applications. Academic press,
2002.
[KPH+19] Jaehun Kim, Stjepan Picek, Annelie Heuser, Shivam Bhasin, and Alan Hanjalic.
Make some noise: Unleashing the power of convolutional neural networks for
profiled side-channel analysis. IACR Transactions on Cryptographic Hardware
and Embedded Systems, pages 148–179, 2019.
[KPP+22] Alexandr Alexandrovich Kuznetsov, Oleksandr Volodymyrovych Potii, Niko-
lay Alexandrovich Poluyanenko, Yurii Ivanovich Gorbenko, and Natalia Kryvin-
ska. Stream Ciphers in Modern Real-time IT Systems. Springer, 2022.
[KQ07] Chong Hee Kim and Jean-Jacques Quisquater. Fault attacks for CRT-based RSA:
New attacks, new results, and new countermeasures. In IFIP International
Workshop on Information Security Theory and Practices, pages 215–228. Springer,
2007.
[KS09] Emilia Käsper and Peter Schwabe. Faster and timing-attack resistant AES-GCM.
In International Workshop on Cryptographic Hardware and Embedded Systems,
pages 1–17. Springer, 2009.
[KSV13] Duško Karaklajić, Jörn-Marc Schmidt, and Ingrid Verbauwhede. Hardware
designer’s guide to fault attacks. IEEE Transactions on Very Large Scale
Integration (VLSI) Systems, 21(12):2295–2306, 2013.
[Kwa00] Matthew Kwan. Reducing the gate count of bitslice DES. IACR Cryptol. ePrint
Arch., 2000(51):51, 2000.
[LBM15] Liran Lerman, Gianluca Bontempi, and Olivier Markowitch. A machine learning
approach against a masked AES: Reaching the limit of side-channel attacks with a
learning model. Journal of Cryptographic Engineering, 5:123–139, 2015.
[Len96] Arjen K Lenstra. Memo on RSA signature generation in the presence of faults.
Technical report, EPFL, 1996.
[LSG+10] Yang Li, Kazuo Sakiyama, Shigeto Gomisawa, Toshinori Fukunaga, Junko Taka-
hashi, and Kazuo Ohta. Fault sensitivity analysis. In Cryptographic Hardware
and Embedded Systems, CHES 2010: 12th International Workshop, Santa Barbara,
USA, August 17–20, 2010. Proceedings 12, pages 320–334. Springer, 2010.
[LWLX17] Yannan Liu, Lingxiao Wei, Bo Luo, and Qiang Xu. Fault injection attack on deep
neural network. In Proceedings of the 36th International Conference on Computer-
Aided Design, pages 131–138. IEEE, 2017.
[LX04] San Ling and Chaoping Xing. Coding theory: a first course. Cambridge University
Press, 2004.
[LZC+21] Xiangjun Lu, Chi Zhang, Pei Cao, Dawu Gu, and Haining Lu. Pay attention to
raw traces: A deep learning architecture for end-to-end profiling attacks. IACR
Transactions on Cryptographic Hardware and Embedded Systems, pages 235–274,
2021.
[Mah45] Patrick Mahon. History of Hut 8 to December 1941 (1945). B. Jack Copeland, page
265, 1945.
[Man03] Stefan Mangard. A simple power-analysis (SPA) attack on implementations of the
AES key expansion. In Information Security and Cryptology—ICISC 2002: 5th
International Conference Seoul, Korea, November 28–29, 2002 Revised Papers 5,
pages 343–358. Springer, 2003.
[May03] Alexander May. New RSA vulnerabilities using lattice reduction methods. PhD
thesis, Citeseer, 2003.
[MBFC22] Saurav Maji, Utsav Banerjee, Samuel H Fuller, and Anantha P Chandrakasan.
A threshold implementation-based neural network accelerator with power and
electromagnetic side-channel countermeasures. IEEE Journal of Solid-State
Circuits, 2022.
[MDB+02] Jack A Mandelman, Robert H Dennard, Gary B Bronner, John K DeBrosse, Rama
Divakaruni, Yujun Li, and Carl J Radens. Challenges and future directions for the
scaling of dynamic random-access memory (DRAM). IBM Journal of Research
and Development, 46(2.3):187–212, 2002.
[MDS99a] Thomas S Messerges, Ezzy A Dabbish, and Robert H Sloan. Investigations of
power analysis attacks on smartcards. Smartcard, 99:151–161, 1999.
[MDS99b] Thomas S Messerges, Ezzy A Dabbish, and Robert H Sloan. Power analysis
attacks of modular exponentiation in smartcards. In Cryptographic Hardware and
Embedded Systems: First International Workshop, CHES'99 Worcester, MA, USA,
August 12–13, 1999 Proceedings 1, pages 144–157. Springer, 1999.
[Mes00] Thomas S Messerges. Securing the AES finalists against power analysis attacks.
In International Workshop on Fast Software Encryption, pages 150–164. Springer,
2000.
[MMR19] Thorben Moos, Amir Moradi, and Bastian Richter. Static power side-channel
analysis—an investigation of measurement factors. IEEE Transactions on Very
Large Scale Integration (VLSI) Systems, 28(2):376–389, 2019.
[MMS01a] David May, Henk L Muller, and Nigel P Smart. Non-deterministic processors.
In Information Security and Privacy: 6th Australasian Conference, ACISP 2001
Sydney, Australia, July 11–13, 2001 Proceedings 6, pages 115–129. Springer, 2001.
[MMS01b] David May, Henk L Muller, and Nigel P Smart. Random register renaming to foil
DPA. In Cryptographic Hardware and Embedded Systems—CHES 2001: Third
International Workshop Paris, France, May 14–16, 2001 Proceedings 3, pages 28–
38. Springer, 2001.
[Mon85] Peter L Montgomery. Modular multiplication without trial division. Mathematics
of Computation, 44(170):519–521, 1985.
[Mon87] Peter L Montgomery. Speeding the Pollard and elliptic curve methods of
factorization. Mathematics of Computation, 48(177):243–264, 1987.
[MOP08] Stefan Mangard, Elisabeth Oswald, and Thomas Popp. Power analysis attacks:
Revealing the secrets of smart cards, volume 31. Springer Science & Business
Media, 2008.
[MPC00] Lauren May, Lyta Penna, and Andrew Clark. An implementation of bitsliced DES
on the Pentium MMX processor. In Australasian Conference on Information
Security and Privacy, pages 112–122. Springer, 2000.
[MPG05] Stefan Mangard, Thomas Popp, and Berndt M Gammel. Side-channel leakage of
masked CMOS gates. In Cryptographers’ Track at the RSA Conference, pages
351–365. Springer, 2005.
[MPP16] Houssem Maghrebi, Thibault Portigliatti, and Emmanuel Prouff. Breaking crypto-
graphic implementations using deep learning techniques. In Security, Privacy, and
Applied Cryptography Engineering: 6th International Conference, SPACE 2016,
Hyderabad, India, December 14–18, 2016, Proceedings 6, pages 3–26. Springer,
2016.
[MS77] Florence Jessie MacWilliams and Neil James Alexander Sloane. The theory of
error correcting codes, volume 16. Elsevier, 1977.
[MS00] Rita Mayer-Sommer. Smartly analyzing the simplicity and the power of simple
power analysis on smartcards. In International Workshop on Cryptographic
Hardware and Embedded Systems, pages 78–92. Springer, 2000.
[MSB16] Houssem Maghrebi, Victor Servant, and Julien Bringer. There is wisdom in
harnessing the strengths of your enemy: Customized encoding to thwart side-
channel attacks. In Fast Software Encryption: 23rd International Conference, FSE
2016, Bochum, Germany, March 20–23, 2016, Revised Selected Papers 23, pages
223–243. Springer, 2016.
[MSGR10] Marcel Medwed, François-Xavier Standaert, Johann Großschädl, and Francesco
Regazzoni. Fresh re-keying: Security against side-channel and fault attacks for
low-cost devices. In International Conference on Cryptology in Africa, pages 279–
296. Springer, 2010.
[MSY06] Tal G Malkin, François-Xavier Standaert, and Moti Yung. A comparative
cost/security analysis of fault attack countermeasures. In Fault Diagnosis and Tol-
erance in Cryptography: Third International Workshop, FDTC 2006, Yokohama,
Japan, October 10, 2006. Proceedings, pages 159–172. Springer, 2006.
[MVOV18] Alfred J Menezes, Paul C Van Oorschot, and Scott A Vanstone. Handbook of
applied cryptography. CRC Press, 2018.
[MWK+22] Catinca Mujdei, Lennert Wouters, Angshuman Karmakar, Arthur Beckers, Jose
Maria Bermudo Mera, and Ingrid Verbauwhede. Side-channel analysis of lattice-
based post-quantum cryptography: Exploiting polynomial multiplication. ACM
Transactions on Embedded Computing Systems, 2022.
[MWM21] Thorben Moos, Felix Wegener, and Amir Moradi. DL-LA: Deep learning leakage
assessment: A modern roadmap for SCA evaluations. IACR Transactions on
Cryptographic Hardware and Embedded Systems, pages 552–598, 2021.
[MZMM16] Zdenek Martinasek, Vaclav Zeman, Lukas Malina, and Josef Martinasek. K-
nearest neighbors algorithm in profiling power analysis attacks. Radioengineering,
25(2):365–382, 2016.
[NIS01] NIST. Federal Information Processing Standards Publication (FIPS) 197. Advanced
Encryption Standard (AES), 2001.
[NIS19] NIST. FIPS 140-3: Security Requirements for Cryptographic Modules, National
Institute of Standards and Technology. Technical report, Federal Inf. Process. Stds.
(NIST FIPS), National Institute of Standards and Technology, Gaithersburg, MD,
2019.
[Nov02] Roman Novak. SPA-based adaptive chosen-ciphertext attack on RSA implemen-
tation. In International Workshop on Public Key Cryptography, pages 252–262.
Springer, 2002.
[NRS11] Svetla Nikova, Vincent Rijmen, and Martin Schläffer. Secure hardware implemen-
tation of nonlinear functions in the presence of glitches. Journal of Cryptology,
24:292–321, 2011.
[NY21] Yusuke Nozaki and Masaya Yoshikawa. Shuffling countermeasure against power
side-channel attack for MLP with software implementation. In 2021 IEEE
4th International Conference on Electronics and Communication Engineering
(ICECE), pages 39–42. IEEE, 2021.
[NYGD22] Len Luet Ng, Kim Ho Yeap, Magdalene Wan Ching Goh, and Veerendra Dakulagi.
Power consumption in CMOS circuits. In Electromagnetic Field in Advancing
Science and Technology. IntechOpen, 2022.
[O’D14] Ryan O’Donnell. Analysis of Boolean functions. Cambridge University Press,
2014.
[O’F23] Colin O’Flynn. PicoEMP: A low-cost EMFI platform compared to BBI and voltage
fault injection using TDC and external VCC measurements. Cryptology ePrint
Archive, 2023.
[Ogg] Frédérique Oggier. Lecture notes. https://feog.github.io/. Accessed: 2012-11-30.
[ORBG17] Marco Oliverio, Kaveh Razavi, Herbert Bos, and Cristiano Giuffrida. Secure page
fusion with VUsion. https://www.vusec.net/projects/VUsion. In Proceedings of
the 26th Symposium on Operating Systems Principles, pages 531–545, 2017.
[Org17] European Cyber Security Organisation. Overview of existing cybersecurity stan-
dards and certification schemes v2, wg1—standardisation, certification, labelling
and supply chain management, 2017.
[ORJ+13] Rachid Omarouayache, Jérémy Raoult, Sylvie Jarrix, Laurent Chusseau, and
Philippe Maurine. Magnetic microprobe design for EM fault attack. In 2013
International Symposium on Electromagnetic Compatibility, pages 949–954. IEEE,
2013.
[OS05] Elisabeth Oswald and Kai Schramm. An efficient masking scheme for AES
software implementations. In International Workshop on Information Security
Applications, pages 292–305. Springer, 2005.
[Osw] David Oswald. Lecture notes: Hardware and embedded systems security. https://
github.com/david-oswald/hwsec_lecture_notes. Accessed: 2012-12-03.
[PBMB17] Sikhar Patranabis, Jakub Breier, Debdeep Mukhopadhyay, and Shivam Bhasin.
One plus one is more than two: a practical combination of power and fault analysis
attacks on PRESENT and PRESENT-like block ciphers. In 2017 Workshop on Fault
Diagnosis and Tolerance in Cryptography (FDTC), pages 25–32. IEEE, 2017.
[PBP21] Guilherme Perin, Ileana Buhan, and Stjepan Picek. Learning when to stop: a mutual
information approach to prevent overfitting in profiled side-channel analysis. In
Constructive Side-Channel Analysis and Secure Design: 12th International Work-
shop, COSADE 2021, Lugano, Switzerland, October 25–27, 2021, Proceedings 12,
pages 53–81. Springer, 2021.
[PCP20] Guilherme Perin, Łukasz Chmielewski, and Stjepan Picek. Strength in numbers:
Improving generalization with ensembles in machine learning-based profiled side-
channel analysis. IACR Transactions on Cryptographic Hardware and Embedded
Systems, 2020.
[Sha97] A Shamir. Method and apparatus for protecting public key schemes from timing
and fault attacks. In EUROCRYPT’97, 1997.
[Sha00] Adi Shamir. Protecting smart cards from passive power analysis with detached
power supplies. In Cryptographic Hardware and Embedded Systems–CHES
2000: Second International Workshop Worcester, MA, USA, August 17–18, 2000
Proceedings 2, pages 71–77. Springer, 2000.
[SHS16] Bodo Selmke, Johann Heyszl, and Georg Sigl. Attack on a DFA protected AES
by simultaneous laser fault injections. In 2016 Workshop on Fault Diagnosis and
Tolerance in Cryptography (FDTC), pages 36–46. IEEE, 2016.
[SI20a] SOG-IS. Application of attack potential to smartcards and similar devices, v3.1,
2020.
[SI20b] SOG-IS. Attack methods for smartcards and similar devices, 2020.
[Sie88] Waclaw Sierpinski. Elementary Theory of Numbers: Second English Edition
(edited by A. Schinzel). Elsevier, 1988.
[Siv17] Nimisha Sivaraman. Design of magnetic probes for near field measurements and
the development of algorithms for the prediction of EMC. PhD thesis, Université
Grenoble Alpes, 2017.
[SJB+18] Sayandeep Saha, Dirmanto Jap, Jakub Breier, Shivam Bhasin, Debdeep
Mukhopadhyay, and Pallab Dasgupta. Breaking redundancy-based countermea-
sures with random faults and power side channel. In 2018 Workshop on Fault
Diagnosis and Tolerance in Cryptography (FDTC), pages 15–22. IEEE, 2018.
[SM12] Pushpa Saini and Rajesh Mehra. A novel technique for glitch and leakage power
reduction in CMOS VLSI circuits. International Journal of Advanced Computer
Science and Applications, 3(10), 2012.
[SM15] Tobias Schneider and Amir Moradi. Leakage assessment methodology: A clear
roadmap for side-channel evaluations. In Cryptographic Hardware and Embedded
Systems–CHES 2015: 17th International Workshop, Saint-Malo, France, Septem-
ber 13–16, 2015, Proceedings 17, pages 495–513. Springer, 2015.
[SMG16] Tobias Schneider, Amir Moradi, and Tim Güneysu. ParTI: Towards combined
hardware countermeasures against side-channel and fault-injection attacks. In
Advances in Cryptology–CRYPTO 2016: 36th Annual International Cryptology
Conference, Santa Barbara, CA, USA, August 14–18, 2016, Proceedings, Part II
36, pages 302–332. Springer, 2016.
[SMKLM02] Sung-Ming Yen, Seungjoo Kim, Seongan Lim, and Sangjae Moon. RSA speedup
with residue number system immune against hardware fault cryptanalysis. In
International Conference on Information Security and Cryptology, pages 397–413.
Springer, 2002.
[SMR09] Dhiman Saha, Debdeep Mukhopadhyay, and Dipanwita RoyChowdhury. A
diagonal fault attack on the advanced encryption standard. Cryptology ePrint
Archive, 2009.
[SMY09] François-Xavier Standaert, Tal G Malkin, and Moti Yung. A unified framework
for the analysis of side-channel key recovery attacks. In Advances in Cryptology-
EUROCRYPT 2009, pages 443–461. Springer, 2009.
[Sor84] Arthur Sorkin. Lucifer, a cryptographic algorithm. Cryptologia, 8(1):22–42, 1984.
[SP06] Kai Schramm and Christof Paar. Higher order masking of the AES. In Topics
in Cryptology–CT-RSA 2006: The Cryptographers’ Track at the RSA Conference
2006, San Jose, CA, USA, February 13–17, 2006. Proceedings, pages 208–225.
Springer, 2006.
[SS16] Peter Schwabe and Ko Stoffelen. All the AES you need on Cortex-M3 and M4.
In International Conference on Selected Areas in Cryptography, pages 180–194.
Springer, 2016.
[Sta10] François-Xavier Standaert. Introduction to side-channel attacks. Secure integrated
circuits and systems, pages 27–42, 2010.
[Sti05] Douglas R Stinson. Cryptography: theory and practice. Chapman and Hall/CRC,
2005.
[SVK+03] H Saputra, N Vijaykrishnan, M Kandemir, MJ Irwin, and R Brooks. Masking
the energy behaviour of encryption algorithms. IEE Proceedings-Computers and
Digital Techniques, 150(5):274–284, 2003.
[SWM18] Robert Schilling, Mario Werner, and Stefan Mangard. Securing conditional
branches in the presence of fault attacks. In 2018 Design, Automation & Test in
Europe Conference & Exhibition (DATE), pages 1586–1591. IEEE, 2018.
[SWP03] Kai Schramm, Thomas Wollinger, and Christof Paar. A new class of collision
attacks and its application to DES. In Fast Software Encryption: 10th International
Workshop, FSE 2003, Lund, Sweden, February 24–26, 2003. Revised Papers 10,
pages 206–222. Springer, 2003.
[TAV02] Kris Tiri, Moonmoon Akmal, and Ingrid Verbauwhede. A dynamic and differential
CMOS logic with signal independent power consumption to withstand differential
power analysis on smart cards. In Proceedings of the 28th European Solid-State
Circuits Conference, pages 403–406. IEEE, 2002.
[TBM14] Harshal Tupsamudre, Shikha Bisht, and Debdeep Mukhopadhyay. Destroying fault
invariant with randomization: A countermeasure for AES against differential fault
attacks. In Cryptographic Hardware and Embedded Systems–CHES 2014: 16th
International Workshop, Busan, South Korea, September 23–26, 2014. Proceedings
16, pages 93–111. Springer, 2014.
[THM07] Stefan Tillich, Christoph Herbst, and Stefan Mangard. Protecting AES software
implementations on 32-bit processors against power analysis. In Applied Cryptog-
raphy and Network Security: 5th International Conference, ACNS 2007, Zhuhai,
China, June 5–8, 2007. Proceedings 5, pages 141–157. Springer, 2007.
[TIA+23] M Caner Tol, Saad Islam, Andrew J Adiletta, Berk Sunar, and Ziming Zhang.
Don’t knock! Rowhammer at the backdoor of DNN models. In 2023 53rd Annual
IEEE/IFIP International Conference on Dependable Systems and Networks (DSN),
pages 109–122. IEEE, 2023.
[Tim19] Benjamin Timon. Non-profiled deep learning-based side-channel attacks with sen-
sitivity analysis. IACR Transactions on Cryptographic Hardware and Embedded
Systems, pages 107–131, 2019.
[TKA+18] Andrei Tatar, Radhesh Krishnan Konoth, Elias Athanasopoulos, Cristiano Giuf-
frida, Herbert Bos, and Kaveh Razavi. Throwhammer: Rowhammer attacks
over the network and defenses. In 2018 USENIX Annual Technical Conference
(USENIX ATC 18), pages 213–226, 2018.
[TMA11] Michael Tunstall, Debdeep Mukhopadhyay, and Subidh Ali. Differential fault
analysis of the advanced encryption standard using a single fault. In Information
Security Theory and Practice. Security and Privacy of Mobile Devices in Wireless
Communication: 5th IFIP WG 11.2 International Workshop, WISTP 2011, Her-
aklion, Crete, Greece, June 1–3, 2011. Proceedings 5, pages 224–233. Springer,
2011.
[TSS+06] Pim Tuyls, Geert Jan Schrijen, Boris Skoric, Jan Van Geloven, Nynke Verhaegh,
and Rob Wolters. Read-proof hardware from protective coatings. In Cryptographic
Hardware and Embedded Systems–CHES 2006, pages 369–383. Springer, 2006.
[TSS17] Adrian Tang, Simha Sethumadhavan, and Salvatore Stolfo. CLKSCREW: Expos-
ing the Perils of Security-Oblivious Energy Management. In 26th USENIX Security
Symposium (USENIX Security 17), pages 1057–1074, 2017.
[TV06] Kris Tiri and Ingrid Verbauwhede. A digital design flow for secure integrated
circuits. IEEE Transactions on Computer-Aided Design of Integrated Circuits and
Systems, 25(7):1197–1208, 2006.
[UXT+22] Rei Ueno, Keita Xagawa, Yutaro Tanaka, Akira Ito, Junko Takahashi, and Naofumi
Homma. Curse of re-encryption: A generic power/EM analysis on post-quantum
KEMs. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2022.
[WPP22] Lichao Wu, Guilherme Perin, and Stjepan Picek. I choose you: Automated
hyperparameter tuning for deep learning-based side-channel analysis. IEEE
Transactions on Emerging Topics in Computing, 2022.
[WvWM11] Marc F Witteman, Jasper GJ van Woudenberg, and Federico Menarini. Defeating
RSA multiply-always and message blinding countermeasures. In Topics in
Cryptology–CT-RSA 2011: The Cryptographers’ Track at the RSA Conference
2011, San Francisco, CA, USA, February 14–18, 2011. Proceedings, pages 77–88.
Springer, 2011.
[WW10] Gaoli Wang and Shaohui Wang. Differential fault analysis on PRESENT key schedule.
In 2010 International Conference on Computational Intelligence and Security,
pages 362–366. IEEE, 2010.
[XIU+21] Keita Xagawa, Akira Ito, Rei Ueno, Junko Takahashi, and Naofumi Homma.
Fault-injection attacks against NIST’s post-quantum cryptography round 3 KEM
candidates. In International Conference on the Theory and Application of
Cryptology and Information Security, pages 33–61. Springer, 2021.
[XLZ+18] Sen Xu, Xiangjun Lu, Kaiyu Zhang, Yang Li, Lei Wang, Weijia Wang, Haihua
Gu, Zheng Guo, Junrong Liu, and Dawu Gu. Similar operation template attack on
RSA-CRT as a case study. Science China Information Sciences, 61:1–17, 2018.
[XZY+20] Guorui Xu, Fan Zhang, Bolin Yang, Xinjie Zhao, Wei He, and Kui Ren. Pushing
the limit of PFA: enhanced persistent fault analysis on block ciphers. IEEE
Transactions on Computer-Aided Design of Integrated Circuits and Systems,
40(6):1102–1116, 2020.
[Yeh14] James J Yeh. Real analysis: theory of measure and integration. World Scientific
Publishing Company, 2014.
[YJ00] Sung-Ming Yen and Marc Joye. Checking before output may not be enough against
fault-based cryptanalysis. IEEE Transactions on Computers, 49(9):967–970, 2000.
[YKM06] Sung-Ming Yen, Dongryeol Kim, and SangJae Moon. Cryptanalysis of two
protocols for RSA with crt based on fault infection. In International Workshop
on Fault Diagnosis and Tolerance in Cryptography, pages 53–61. Springer, 2006.
[YMY+ 20] Honggang Yu, Haocheng Ma, Kaichen Yang, Yiqiang Zhao, and Yier Jin. DeepEM:
Deep neural networks model recovery through EM side-channel information leakage.
In 2020 IEEE International Symposium on Hardware Oriented Security and
Trust (HOST), pages 209–218. IEEE, 2020.
[YRF20] Fan Yao, Adnan Siraj Rakin, and Deliang Fan. DeepHammer: Depleting the
intelligence of deep neural networks through targeted chain of bit flips. In 29th
USENIX Security Symposium (USENIX Security 20), pages 1463–1480, 2020.
[ZBHV20] Gabriel Zaid, Lilian Bossuet, Amaury Habrard, and Alexandre Venelli. Method-
ology for efficient CNN architectures in profiling attacks. IACR Transactions on
Cryptographic Hardware and Embedded Systems, pages 1–36, 2020.
[ZDT+ 14] Loïc Zussa, Amine Dehbaoui, Karim Tobich, Jean-Max Dutertre, Philippe Maurine,
Ludovic Guillaume-Sage, Jessy Clédière, and Assia Tria. Efficiency of a
glitch detector against electromagnetic fault injection. In 2014 Design, Automation
& Test in Europe Conference & Exhibition (DATE), pages 1–6. IEEE, 2014.
[ZLZ+ 18] Fan Zhang, Xiaoxuan Lou, Xinjie Zhao, Shivam Bhasin, Wei He, Ruyi Ding,
Samiya Qureshi, and Kui Ren. Persistent fault analysis on block ciphers. IACR
Transactions on Cryptographic Hardware and Embedded Systems, pages 150–172,
2018.
[ZZJ+ 20] Fan Zhang, Yiran Zhang, Huilong Jiang, Xiang Zhu, Shivam Bhasin, Xinjie Zhao,
Zhe Liu, Dawu Gu, and Kui Ren. Persistent fault attack in practice. IACR
Transactions on Cryptographic Hardware and Embedded Systems, pages 172–195,
2020.
[ZZY+ 19] Yiran Zhang, Fan Zhang, Bolin Yang, Guorui Xu, Bin Shao, Xinjie Zhao, and Kui
Ren. Persistent fault injection in FPGA via BRAM modification. In 2019 IEEE
Conference on Dependable and Secure Computing (DSC), pages 1–6. IEEE, 2019.
Index

B
Bellcore attack, 392
Binary code, 57
  anticode, 65
  binary (n, M)-code, 57
  binary (n, M, d)-code, 58
  binary [n, k, d]-linear code, 61
  codeword, 57
  dimension, 61
  dual code, 62
  error-correcting, 59
  error-detecting, 58
  generator matrix, 63
  length, 57
  linear, 60
  maximum distance, 65
  (minimum) distance, 58
  n-repetition code, 61
  parity-check code, 62

C
Caesar cipher, 108
CBC mode, 127
Chinese Remainder Theorem, 44
Correlation coefficient, 81
Cryptographic primitives, 102
Cryptosystem, 104
  block cipher, 105
    block length, 132
    Feistel cipher, 133
    key length, 132
    key schedule, 132
    master key, 132
    round function, 132
    Sbox, 133
    SPN cipher, 133
  computationally secure, 107
  perfectly secure, 107, 124
  secure in practice, 107
  stream cipher, 105
X. Hou, J. Breier, Cryptography and Embedded Systems Security,
https://doi.org/10.1007/978-3-031-62205-2
D
Data Encryption Standard (DES), 135
Difference distribution table, 276
Differential fault analysis, 354
Distribution, 72
  χ²-distribution, 83
  Gaussian distribution, 80
  multivariate normal distribution, 80
  normal distribution, 77
  standard normal distribution, 74
  t-distribution, 84
  uniform, 74

E
ECB mode, 126
Equivalence class, 33
Equivalence relation, 33
Euler's Theorem, 39
Euler's totient function, 38
Event, 65
  independent, 69

F
Fault mask, 353
Fault model, 353
Fermat's Little Theorem, 40
Field, 18
  characteristic, 19
  F_{p^n}, 20
  finite field, 18
  isomorphism, 20
  subfield, 19
Forgery, 168
  existential forgery, 168
  selective forgery, 168
Frequency analysis, 117
Function, 3
  bijective, 4
  codomain, 3
  composition, 4
  domain, 3
  injective, 4
  inverse, 4
  surjective, 4

G
Garner's algorithm, 176
Gauss's algorithm, 176
Group, 12
  abelian, 12
  cyclic, 15
  order, 14
  order of an element, 15
  symmetric group of degree n, 14
Guessing entropy, 270

H
Hamming distance, 58
Hamming weight, 62
Hash function, 103
Hill cipher, 114

I
Infective countermeasure, 386, 414
Integer
  base-b representation, 6
  Bézout's identity, 7
  binary representation, 6
  bit length, 6
  composite (number), 10
  congruence class modulo n, 34
  congruent modulo n, 33
  Euclidean algorithm, 9
  Euclid's division, 8
  extended Euclidean algorithm, 10
  Fundamental Theorem of Arithmetic, 11
  greatest common divisor, 7
  hexadecimal representation, 6
  modulus, 33
  prime (number), 10
Integral domain, 18
Interval estimator, 88

K
Kerckhoffs' principle, 106

L
Linear congruence, 42
Logical AND, 17
Logical XOR, 13

M
Matrix, 21
  addition, 22
  adjoint matrix, 25
  determinant, 24
  diagonal, 21
  identity matrix, 21
  inverse, 23
  multiplication, 22
N
Nibble, 27

O
OFB mode, 127
One-time pad, 123

P
Permutation, 13
Persistent fault analysis, 375
Point estimator, 88
Polynomial, 48
  congruence class, 51
  congruent modulo f(x), 51
  degree, 48
  Division Algorithm, 49
  greatest common divisor, 52
  polynomial ring, 48
  reducible, 49
PRESENT, 149
Probability measure, 67
  Bayes' Theorem, 70
  conditional probability, 69
  probability, 67
  probability space, 67
  uniform, 68

R
Random variable, 71
  continuous, 73
  covariance, 78
  cumulative distribution function (CDF), 72
  discrete, 72
  expectation, 74, 75
  independent, 78
  normal random variable, 77
  probability density function (PDF), 73
  probability mass function (PMF), 73
  standard normal random variable, 74
  uncorrelated, 79
  variance, 76

S
Safe error attack, 402
Sample mean, 86
Sample space, 65
Sample variance, 86
Set, 1
  cardinality, 1
  Cartesian product, 2
  complement, 2
  difference, 2
  intersection, 2
  power set, 1
  union, 2
Shamir's countermeasure, 411
Shift cipher, 108
Square and multiply algorithm, 171
  left-to-right, 172
  right-to-left, 171
Statistical fault analysis, 368
Student's t-test, 98
Substitution cipher, 111
Success rate, 270
System of simultaneous congruences, 42

T
Test vector leakage assessment (TVLA), 218

V
Vector space, 26
  basis, 30
  dimension, 31
  generating set, 29
  linearly independent, 29
  orthogonal complement, 32
  scalar, 26
  subspace, 28
  vector, 26
Vigenère cipher, 113

W
Welch's t-test, 99
Word size of an architecture, 106