Xiaolu Hou

Jakub Breier

Cryptography and Embedded Systems Security

Xiaolu Hou
Faculty of Informatics and Information Technologies
Slovak University of Technology
Bratislava, Slovakia

Jakub Breier
TTControl GmbH
Vienna, Austria

ISBN 978-3-031-62204-5 ISBN 978-3-031-62205-2 (eBook)


https://doi.org/10.1007/978-3-031-62205-2

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland
AG 2024
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

If disposing of this product, please recycle the paper.


For our Aurel
Foreword

In an era defined by interconnectedness, the importance of security is undeniable.
Across billions of devices and computing systems, cryptographic algorithms and
protocols stand as sentinels, safeguarding the confidentiality, integrity, and non-
repudiation of transactions. However, even with the remarkable capabilities of
cryptographic algorithms, the systems they safeguard are not necessarily immune to
vulnerabilities. These vulnerabilities frequently emerge during the transition from
theory to practical implementations, underscoring the pivotal role of cryptographic
engineering in achieving comprehensive security measures. The present book serves
to nicely bridge this gap and provide practitioners and researchers interested in
the world of embedded security a wide perspective of secure implementations of
cryptographic algorithms.
While strong cryptographic algorithms are an important starting point in the
design of secured systems, they also need to be efficiently implemented for real-
life practical applications. While in the early days they were implemented largely
on general-purpose computers, it was gradually felt necessary to realize them on
hardware and embedded platforms. This shift was an outcome of multiple factors.
The complexity of cryptographic algorithms and their real-time requirements to
ensure practical applications motivated researchers to implement the ciphers on
hardware and embedded platforms. Moreover, because of the various attacks on
software platforms, designing security systems relying on hardware roots of trust
became a popular design choice. Further, the growth of embedded applications, and
with it the advent of Cyber-Physical Systems (CPS) and the Internet of Things (IoT),
necessitated the integration of cryptographic algorithms into special-purpose devices.
However, great care needs to be taken in such implementations, as apart from
the classic design objectives, like power, energy, throughput, and area, designers
also need to tackle side-channel information leakages which can be exploited
by attackers with physical access to the devices. Common side-channel attacks
based on power/electromagnetic analysis and fault analysis have become one of
the biggest threats in deploying crypto algorithms on embedded devices. The
ubiquity of such devices and easy physical access by adversaries offer
novel attack surfaces which can cripple the best of crypto-algorithms if suitable
countermeasures are not implemented alongside them.
The contribution of this book is to address these aspects of secured crypto-design
and provide a vivid description to develop an end-to-end understanding. The designs
of cryptographic algorithms and their analysis are often based on mathematical and
statistical tools. The book starts with a nice summary of important mathematical
principles, which are needed to comprehend the cipher constructions and their attack
analysis. Subsequently, the book provides a summary of both classical and modern
cryptosystems. The following chapters also emphasize implementations of these
modern cryptosystems, before delving into various forms of physical attacks on the
implementations. The book discusses techniques for side-channel analysis of both
symmetric-key and public-key cryptosystems, along with suitable countermeasures.
The book then presents a contemporary summary of various forms of fault attacks
on cryptosystems, and countermeasures against them. The book concludes with
practical aspects of physical attacks, providing much-needed details of physical
setups, useful to develop practical setups for hardware security research.
Engaging and informative, this book is fine reading for anyone fascinated by
the intricate realm of embedded security and cryptographic engineering. It offers
a compelling glimpse into the workings of attacks on cryptosystems in embedded
devices and provides actionable strategies for mitigation. Enjoy the journey into the
captivating world of security engineering!

Debdeep Mukhopadhyay
Kharagpur, India
April 2024

Starting my doctoral studies several decades ago, I found myself immensely
interested in the area of physical side channels and the resulting attacks, which
at the time disrupted the way in which cryptographers approached designing and
analyzing ciphers. This was a fortunate encounter for me: today my research is still
driven by the challenge of efficiently detecting, quantifying, and as far as possible
mitigating physical side channels.
Research in the area of side channels has developed and grown, not only in
volume but also in maturity. In the early days, researchers playfully discovered how
to tap into side channels, as well as how to extract more information from available
side channels, and to make side channels harder to exploit. There was little emphasis
on the development of a methodology. Countermeasures were (re)invented and
applied to different types of cryptosystems, acknowledging, but not systematizing,
that different discoveries were in fact related.
Only when, together with two colleagues, I wrote the first comprehensive
research book on power analysis attacks, did a clearer picture emerge of the factors
that contribute to the success of attacks and how we can mitigate leakage. Other
researchers pushed our initial attempts further, and today, we have sound theories
for many aspects of side-channel attacks and countermeasures. Similarly, the area
of fault attacks has seen significant progress over the past two decades.
This book provides a contemporary summary of techniques for attacks and
countermeasures. There are many good examples provided: I encourage all readers
of this book to pay particular attention to these and implement and extend as many
as possible. The best way to understand the foundational aspects of any field is by
active learning: do as much as you can yourself!

Elisabeth Oswald
Birmingham, UK
April 2024
Preface

Cryptography is an indispensable tool used to protect information in computing
systems. Billions of people all over the world use it in their daily lives without
even noticing there is some cryptographic algorithm running behind the scenes.
Cryptographic computations can be found in any form of electronic communication,
electronic passports, security tokens, payment systems, etc.
Cryptographic algorithms in use nowadays are considered secure in theory. But
in the real world, these algorithms are implemented on physical devices in the
form of integrated circuits. These circuits have their physical properties, such as
power consumption dependent on the processed data, emanation of electromagnetic
waves, and susceptibility to computational errors due to environmental influences.
To evaluate the security level of cryptographic implementations, it is necessary to
include the physical security assessment.
There are various physical attack methods, such as fault attacks, side-channel
attacks, and hardware trojans. Side-channel attacks can be divided into more specific
attacks depending on the exploited information, e.g., electromagnetic/power
analysis, timing analysis, and cache attacks. In this book, we focus on fault attacks and
electromagnetic/power analysis attacks on cryptographic implementations.
We assume the readers have basic knowledge of real numbers, rational numbers,
integers, and complex numbers, which will be denoted by $\mathbb{R}$, $\mathbb{Q}$, $\mathbb{Z}$, and $\mathbb{C}$,
respectively, in this book. We also assume the readers have completed a course in
linear algebra.
This book is primarily aimed at graduate students who take a course on hardware
security and/or cryptography. However, it provides useful resources for anyone
willing to explore the exciting world of physical attacks—designers, implementers,
evaluators, as well as academic scholars.

Xiaolu Hou, Bratislava, Slovakia
Jakub Breier, Vienna, Austria
April 2024

Acknowledgment

We would like to thank Debdeep Mukhopadhyay and Elisabeth Oswald for writing
a nice and motivating foreword for this book.
Our thanks also go to Mladen Kovačević, Romain Poussier, and Dirmanto Jap for
proofreading an earlier version of the book and for their detailed and constructive
comments.
We would also like to acknowledge the editorial team of Springer Nature,
especially Bakiyalakshmi R M and Charles Glaser.
We are grateful for the unwavering support and encouragement from our parents and especially
our son, Aurel, who turned our writing process into a wild adventure.

This project has received funding from the European Union’s Horizon 2020
Research and Innovation Programme under the Programme SASPRO 2 COFUND
Marie Sklodowska-Curie grant agreement No. 945478.

Contents

1 Mathematical and Statistical Background
1.1 Preliminaries
1.1.1 Sets
1.1.2 Functions
1.1.3 Integers
1.2 Abstract Algebra
1.2.1 Groups
1.2.2 Rings
1.2.3 Fields
1.3 Linear Algebra
1.3.1 Matrices
1.3.2 Vector Spaces
1.4 Modular Arithmetic
1.4.1 Solving Linear Congruences
1.5 Polynomial Rings
1.5.1 Bytes
1.6 Coding Theory
1.7 Probability Theory
1.7.1 σ-Algebras
1.7.2 Probabilities
1.7.3 Random Variables
1.8 Statistics
1.8.1 Important Distributions
1.8.2 Estimating Mean and Difference of Means of Normal Distributions
1.8.3 Hypothesis Testing
1.9 Further Reading
2 Introduction to Cryptography
2.1 Cryptographic Primitives
2.1.1 Hash Functions
2.1.2 Cryptosystems
2.1.2.1 Converting Message to Plaintext
2.1.3 Security of Cryptosystems
2.2 Classical Ciphers
2.2.1 Shift Cipher
2.2.2 Affine Cipher
2.2.3 Substitution Cipher
2.2.4 Vigenère Cipher
2.2.5 Hill Cipher
2.2.6 Cryptanalysis of Classical Ciphers
2.2.6.1 Frequency Analysis
2.2.6.2 Kasiski Test: Vigenère Cipher
2.2.6.3 Index of Coincidence: Vigenère Cipher
2.2.7 One-Time Pad
2.3 Encryption Modes
2.4 Further Reading
3 Modern Cryptographic Algorithms and Their Implementations
3.1 Symmetric Block Ciphers
3.1.1 DES
3.1.2 AES
3.1.3 PRESENT
3.2 Implementations of Symmetric Block Ciphers
3.2.1 Implementing Sboxes
3.2.2 Implementing Permutations
3.2.2.1 Implementing pLayer
3.2.2.2 AES T-tables
3.2.3 Bitsliced Implementations
3.2.3.1 Algebraic Normal Form
3.2.3.2 Bitsliced Implementation of PRESENT
3.3 RSA
3.4 RSA Signatures
3.5 Implementations of RSA Cipher and RSA Signatures
3.5.1 Implementing Modular Exponentiation
3.5.1.1 Square and Multiply Algorithm
3.5.1.2 Montgomery Powering Ladder
3.5.1.3 Chinese Remainder Theorem (CRT) Based RSA
3.5.2 Implementing Modular Multiplication
3.5.2.1 Blakely’s Method
3.5.2.2 Montgomery’s Method
3.6 Further Reading
4 Side-Channel Analysis Attacks and Countermeasures
4.1 Experimental Setting
4.1.1 Attack Methods
4.2 Side-Channel Leakages
4.2.1 Distribution of the Leakage
4.2.2 Estimating Leakage Distributions
4.2.3 Leakage Assessment
4.2.4 Signal-to-Noise Ratio
4.3 Side-Channel Analysis Attacks on Symmetric Block Ciphers
4.3.1 Non-profiled Differential Power Analysis Attacks
4.3.1.1 Non-profiled DPA Attack Steps
4.3.1.2 Identity Leakage Model
4.3.1.3 Hamming Weight Leakage Model
4.3.2 Profiled Differential Power Analysis
4.3.2.1 Profiled DPA Attack Steps
4.3.2.2 Stochastic Leakage Model
4.3.2.3 Template-Based DPA
4.3.2.4 Success Rate and Guessing Entropy
4.3.3 Side-Channel Assisted Differential Plaintext Attack
4.4 Side-Channel Analysis Attacks on RSA and RSA Signatures
4.4.1 Simple Power Analysis
4.4.2 Differential Power Analysis
4.5 Countermeasures Against Side-Channel Analysis Attacks
4.5.1 Hiding
4.5.1.1 Encoding-Based Countermeasure for Symmetric Block Ciphers
4.5.1.2 Square and Multiply Always
4.5.2 Masking and Blinding
4.5.2.1 Introduction to Boolean Masking
4.5.2.2 Boolean Masking for AES-128
4.5.2.3 Boolean Masking for PRESENT
4.5.2.4 Blinding for RSA and RSA Signatures
4.6 Further Reading
4.6.1 AI-Assisted SCA
5 Fault Attacks and Countermeasures
5.1 Fault Attacks on Symmetric Block Ciphers
5.1.1 Differential Fault Analysis
5.1.1.1 DFA on DES
5.1.1.2 Diagonal DFA on AES-128
5.1.2 Statistical Fault Analysis
5.1.2.1 SFA Attack on AES-128 Round 9
5.1.2.2 SFA on AES-128 Round 8
5.1.3 Persistent Fault Analysis
5.1.4 Implementation-Specific Fault Attack
5.2 Fault Countermeasures for Symmetric Block Ciphers
5.2.1 Encoding-Based Countermeasure
5.2.2 Infective Countermeasure
5.3 Fault Attacks on RSA and RSA Signatures
5.3.1 Bellcore Attack
5.3.2 Attack on the Square and Multiply Algorithm
5.3.3 Attack on the Public Key
5.3.4 Safe Error Attack
5.3.4.1 Safe Error Attack on the Montgomery Powering Ladder
5.3.4.2 Safe Error Attack on the Square and Multiply Algorithm
5.4 Fault Countermeasures for RSA and RSA Signatures
5.4.1 Shamir’s Countermeasure
5.4.2 Infective Countermeasure
5.4.3 Countermeasure for Attacks on the Square and Multiply Algorithm
5.4.4 Countermeasures Against the Safe Error Attack
5.5 Further Reading
6 Practical Aspects of Physical Attacks
6.1 Side-Channel Attacks
6.1.1 Origins of Leakage
6.1.2 Measurement Setup
6.1.2.1 Oscilloscopes
6.1.2.2 Probes
6.2 Fault Attacks
6.2.1 Fault Injection Techniques
6.2.1.1 Clock/Voltage Glitching
6.2.1.2 Optical Fault Injection
6.2.1.3 Electromagnetic Fault Injection
6.2.1.4 Rowhammer Attacks
6.3 Industry Standards
6.3.1 Common Criteria
6.3.2 FIPS 140-3

A Proofs
A.1 Matrices
A.2 Invertible Matrices for the Stochastic Leakage Model
B Long Division
C DES Sbox
D Algebraic Normal Forms for PRESENT Sbox Output Bits
E Encoding-Based Countermeasure for Symmetric Block Ciphers
References
Index
List of Figures

Fig. 1.1 Probability density function of the standard normal random variable
Fig. 1.2 Probability density function of a normal random variable
Fig. 1.3 Probability density function $f(z)$ for $Z \sim N(0, 1)$. $P(Z > z_\alpha) = \alpha$, $\alpha$ corresponds to the area under $f(z)$ for $z > z_\alpha$
Fig. 1.4 Probability density function for $X \sim \chi^2_8$. $P(X \geq \chi^2_{\alpha,8}) = \alpha$
Fig. 1.5 Probability density functions for $T_n \sim t_n$ ($n = 2, 5, 10$) and for the standard normal random variable $Z$
Fig. 1.6 Probability density function for $T_5$, $P(T_5 \geq t_{\alpha,5}) = \alpha$
Fig. 2.1 Categorization of cryptographic primitives. The ones highlighted in blue color will be discussed in this book
Fig. 2.2 ECB mode for encryption
Fig. 2.3 ECB mode for decryption
Fig. 2.4 Original picture and encrypted picture with ECB and CBC modes
Fig. 2.5 CBC mode for encryption
Fig. 2.6 CBC mode for decryption
Fig. 2.7 OFB mode for encryption
Fig. 2.8 OFB mode for decryption
Fig. 3.1 An illustration of Feistel cipher encryption algorithm
Fig. 3.2 An illustration of SPN cipher encryption algorithm
Fig. 3.3 An illustration of DES encryption algorithm
Fig. 3.4 Function f in DES round function
Fig. 3.5 DES key schedule
Fig. 3.6 AES round function for round $i$, $1 \leq i \leq \mathrm{Nr} - 1$. SB, SR, MC, and AK stand for SubBytes, ShiftRows, MixColumns, and AddRoundKey, respectively
Fig. 3.7 Key schedule for AES-128
Fig. 3.8 An illustration of PRESENT encryption algorithm
Fig. 3.9 Two rounds of PRESENT
Fig. 3.10 PRESENT-80 key schedule
Fig. 4.1 Side-channel measurement setup used for the


experiments: a laptop, the ChipWhisperer-Lite
measurement board (black), and the CW308 UFO
board (red) with the mounted ARM Cortex-M4 target
board (blue). Note that the benchtop oscilloscope in
the back was only used for the initial analysis—all the
measurements were done by the ChipWhisperer . . . . . . . . . . . . . . . . . . . 207
Fig. 4.2 Power trace of the first five rounds of PRESENT
encryption. A sequence of nop instructions was executed
before and after the cipher computation to clearly
distinguish the operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
Fig. 4.3 The averaged trace for 5000 traces from the Fixed dataset
A (see Sect. 4.1). The blue, pink, and green parts of
the trace correspond to addRoundKey, sBoxLayer, and
pLayer, respectively . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
Fig. 4.4 The averaged trace for 1000 plaintexts with the 0th bit
equal to 0. The computation corresponds to one round of
PRESENT with a fixed round key . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
Fig. 4.5 The averaged trace for 1000 plaintexts with the 0th bit
equal to 1. The computation corresponds to one round of
PRESENT with a fixed round key . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
Fig. 4.6 The difference between traces from Figs. 4.4 and 4.5 . . . . . . . . . . . . . . 211
Fig. 4.7 Part of five random traces from the Fixed dataset A (see
Sect. 4.1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
Fig. 4.8 Histogram of leakages at time sample $t = 3520$ across
5000 traces from the Fixed dataset A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
Fig. 4.9 Histogram of leakages at time sample $t = 2368$ across
5000 traces from the Fixed dataset A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
Fig. 4.10 Histogram of leakages at time sample $t = 392$ across
5000 traces from the Random plaintext dataset . . . . . . . . . . . . . . . . . . . . . 226
Fig. 4.11 Histogram of leakages at time sample $t = 392$ across
10,000 traces from the Random dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
Fig. 4.12 t-Values (Eq. 4.17) for all time samples $1, 2, \ldots, 3600$
computed with Fixed dataset A and Fixed dataset B. The
signal is given by the plaintext value, and the fixed versus
fixed setting is chosen. Blue dashed lines correspond to
the threshold $4.5$ and $-4.5$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
Fig. 4.13 t-Values (Eq. 4.17) for all time samples $1, 2, \ldots, 3600$
computed with 50 traces from Fixed dataset A and 50
traces from Fixed dataset B. The signal is given by the
plaintext value, and the fixed versus fixed setting is
chosen. Blue dashed lines correspond to the threshold $4.5$
and $-4.5$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
Fig. 4.14 t-Values (Eq. 4.18) for all time samples $1, 2, \ldots, 3600$
computed with Fixed dataset A and Random plaintext
dataset. The signal is given by the plaintext value, and the
fixed versus random setting is chosen. Blue dashed lines
correspond to the threshold $4.5$ and $-4.5$ . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
Fig. 4.15 t-Values (Eq. 4.18) for all time samples $1, 2, \ldots, 3600$
computed with 50 traces from Fixed dataset A and 50
traces from Random plaintext dataset. The signal is given
by the plaintext value, and the fixed versus random setting
is chosen. Blue dashed lines correspond to the threshold
$4.5$ and $-4.5$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
Fig. 4.16 t-Values (Eq. 4.17) for all time samples $1, 2, \ldots, 3600$
computed with traces from Random dataset. $T_1$ contains
$M_1 = 634$ traces and $T_2$ contains $M_2 = 651$ traces.
The signal is given by the 0th Sbox output, and the
fixed versus fixed setting is chosen. Blue dashed lines
correspond to the threshold $4.5$ and $-4.5$ . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
Fig. 4.17 t-Values (Eq. 4.18) for all time samples $1, 2, \ldots, 3600$
computed with traces from Random dataset. $T_1$ contains
$M_1 = 634$ traces and $T_2$ contains $M_2 = 10{,}000$ traces.
The signal is given by the 0th Sbox output, and the
fixed versus random setting is chosen. Blue dashed lines
correspond to the threshold $4.5$ and $-4.5$ . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
Fig. 4.18 t-Values (Eq. 4.17) for all time samples $1, 2, \ldots, 3600$
computed with traces from Random dataset. Both $T_1$ and
$T_2$ contain 50 traces (i.e., $M_1 = M_2 = 50$). The signal is
given by the 0th Sbox output, and the fixed versus fixed
setting is chosen. Blue dashed lines correspond to the
threshold $4.5$ and $-4.5$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
Fig. 4.19 t-Values (Eq. 4.17) for all time samples $1, 2, \ldots, 3600$
computed with traces from Random dataset. Both $T_1$ and
$T_2$ contain 50 traces (i.e., $M_1 = M_2 = 50$). The signal is
given by the 0th Sbox output, and the fixed versus random
setting is chosen. Blue dashed lines correspond to the
threshold $4.5$ and $-4.5$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
Fig. 4.20 Sample variance of the signal for each time sample,
computed using Random dataset. The signal is given by
the exact value of the 0th Sbox output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
Fig. 4.21 Sample variance of the noise for each time sample,
computed using Random dataset. The signal is given by
the exact value of the 0th Sbox output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
Fig. 4.22 SNR for each time sample, computed using Random
dataset. The signal is given by the exact value of the 0th
Sbox output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
Fig. 4.23 Sample variance of the signal for each time sample,
computed using Random dataset. The signal is given by
the Hamming weight of the 0th Sbox output . . . . . . . . . . . . . . . . . . . . . . . 240
Fig. 4.24 SNR for each time sample, computed using Random
dataset. The signal is given by the Hamming weight of
the 0th Sbox output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
Fig. 4.25 Sample variance of the noise for each time sample,
computed using Random dataset. The signal is given by
the Hamming weight of the 0th Sbox output . . . . . . . . . . . . . . . . . . . . . . . 240
Fig. 4.26 Sample variance of the signal for each time sample,
computed using Random dataset. The signal is given by
the 0th bit of the 0th Sbox output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Fig. 4.27 SNR for each time sample, computed using Random
dataset. The signal is given by the 0th bit of the 0th Sbox
output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Fig. 4.28 Sample variance of the noise for each time sample,
computed using Random dataset. The signal is given by
the 0th bit of the 0th Sbox output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Fig. 4.29 Sample correlation coefficients $r_{i,t}$ ($i = 1, 2, \ldots, 16$) for
all time samples $t = 1, 2, \ldots, 3600$. Computed following
Eq. 4.21 with the identity leakage model and the Random
plaintext dataset. The blue line corresponds to the correct
key hypothesis $\hat{k}_{10} = 9$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
Fig. 4.30 Sample correlation coefficients $r_{10,t}$ (corresponds to
the correct key hypothesis 9) for all time samples
$t = 1, 2, \ldots, 3600$. Computed following Eq. 4.21 with
the identity leakage model and the Random plaintext dataset . . . . . . 248
Fig. 4.31 Sample correlation coefficients $r_{1,t}$ (corresponds
to a wrong key hypothesis 0) for all time samples
$t = 1, 2, \ldots, 3600$. Computed following Eq. 4.21 with
the identity leakage model and the Random plaintext dataset . . . . . . 249
Fig. 4.32 Sample correlation coefficients $r_{5,t}$ (corresponds
to a wrong key hypothesis 4) for all time samples
$t = 1, 2, \ldots, 3600$. Computed following Eq. 4.21 with
the identity leakage model and the Random plaintext dataset . . . . . . 249
Fig. 4.33 Sample correlation coefficients $r_{14,t}$ (corresponds
to a wrong key hypothesis D) for all time samples
$t = 1, 2, \ldots, 3600$. Computed following Eq. 4.21 with
the identity leakage model and the Random plaintext dataset . . . . . . 249
Fig. 4.34 Sample correlation coefficients $r_{i,t}$ ($i = 1, 2, \ldots, 16$)
for all time samples $t = 1, 2, \ldots, 3600$. Computed
following Eq. 4.21 with the Hamming leakage model and
the Random plaintext dataset. The blue line corresponds
to the correct key hypothesis $\hat{k}_{10} = 9$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
Fig. 4.35 Sample correlation coefficients $r_{10,t}$ (corresponds to
the correct key hypothesis 9) for all time samples
$t = 1, 2, \ldots, 3600$. Computed following Eq. 4.21 with
the Hamming leakage model and the Random plaintext dataset . . . 251
Fig. 4.36 Sample correlation coefficients $r_{1,t}$ (corresponds
to a wrong key hypothesis 0) for all time samples
$t = 1, 2, \ldots, 3600$. Computed following Eq. 4.21 with
the Hamming leakage model and the Random plaintext dataset . . . 251
Fig. 4.37 Sample correlation coefficients $r_{5,t}$ (corresponds
to a wrong key hypothesis 4) for all time samples
$t = 1, 2, \ldots, 3600$. Computed following Eq. 4.21 with
the Hamming leakage model and the Random plaintext dataset . . . 252
Fig. 4.38 Sample correlation coefficients $r_{14,t}$ (corresponds
to a wrong key hypothesis D) for all time samples
$t = 1, 2, \ldots, 3600$. Computed following Eq. 4.21 with
the Hamming leakage model and the Random plaintext dataset . . . 252
Fig. 4.39 Sample correlation coefficients $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \ldots, 16$)
for $\mathrm{POI} = 392$. Computed following Eq. 4.23 with the
identity leakage model and the Random plaintext dataset.
The blue line corresponds to the correct key hypothesis
$\hat{k}_{10} = 9$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257

Fig. 4.40 Sample correlation coefficients $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \ldots, 16$)
for $\mathrm{POI} = 392$. Computed following Eq. 4.23 with
the Hamming weight leakage model and the Random
plaintext dataset. The blue line corresponds to the correct
key hypothesis $\hat{k}_{10} = 9$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257

Fig. 4.41 Sample correlation coefficients $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \ldots, 16$)
for $\mathrm{POI} = 392$. Computed following Eq. 4.23 with the
stochastic leakage model and the Random plaintext
dataset. The blue line corresponds to the correct key
hypothesis $\hat{k}_{10} = 9$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
Fig. 4.42 Probability scores (Eq. 4.33) for each key hypothesis
computed with different numbers of traces from Random
plaintext dataset. The target signal is given by the exact
value of $v$, the 0th Sbox output. Three POIs (time samples
$392, 218, 1328$) were chosen. The blue line corresponds
to the correct key hypothesis $\hat{k}_{10} = 9$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266


Fig. 4.43 Probability scores (Eq. 4.33) for each key hypothesis
computed with different numbers of traces from Random
plaintext dataset. The target signal is given by $\mathrm{wt}(v)$, the
Hamming weight of the 0th Sbox output. Three POIs
(time samples $392, 1309, 1304$) were chosen. The blue
line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$ . . . . . . . . . . . . 266
Fig. 4.44 SNR for each time sample, computed using Random
dataset. The signal is given by the exact value of the 1st
Sbox output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Fig. 4.45 Probability scores (Eq. 4.33) for each key hypothesis
computed with different numbers of traces from Random
plaintext dataset. The target signal is given by the exact
value of the 1st Sbox output. One POI (time sample 404)
was chosen. The blue line corresponds to the correct key
hypothesis 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
Fig. 4.46 Probability scores (Eq. 4.33) for each key hypothesis
computed with different numbers of traces from Random
plaintext dataset. The target signal is given by the exact
value of the 1st Sbox output. One POI (time sample 464)
was chosen. The blue line corresponds to the correct key
hypothesis 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
Fig. 4.47 Sample correlation coefficients $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \ldots, 16$)
for $\mathrm{POI} = 392$. Computed following Eq. 4.23 with the
identity leakage model and the Random plaintext dataset
arranged in reverse order. The blue line corresponds to the
correct key hypothesis $\hat{k}_{10} = 9$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269

Fig. 4.48 Sample correlation coefficients $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \ldots, 16$)
for $\mathrm{POI} = 392$. Computed following Eq. 4.23 with
the Hamming weight leakage model and the Random
plaintext dataset arranged in reverse order. The blue line
corresponds to the correct key hypothesis $\hat{k}_{10} = 9$ . . . . . . . . . . . . . . . . . 269
Fig. 4.49 Estimations of success rate computed following
Algorithm 4.1 for profiled DPA attacks based on the
stochastic leakage model, the identity leakage model, and
the Hamming weight leakage model using the Random
plaintext dataset as attack traces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
Fig. 4.50 Estimations of guessing entropy computed following
Algorithm 4.1 for profiled DPA attacks based on the
stochastic leakage model, the identity leakage model, and
the Hamming weight leakage model using the Random
plaintext dataset as attack traces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
Fig. 4.51 Estimations of success rate computed following
Algorithm 4.1 for template-based DPA attacks using the
Random plaintext dataset as attack traces and the Random
dataset as profiling traces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
Fig. 4.52 Estimations of guessing entropy computed following
Algorithm 4.1 for template-based DPA attacks using the
Random plaintext dataset as attack traces and the Random
dataset as profiling traces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
Fig. 4.53 Estimations of success rate computed following
Algorithm 4.1 for leakage model-based and
template-based DPA attacks with the Random plaintext
dataset as attack traces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
Fig. 4.54 Estimations of guessing entropy computed following
Algorithm 4.1 for leakage model-based and template-based
DPA attacks with the Random plaintext dataset as attack
traces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
Fig. 4.55 A possible sequence of XOR differences between the
cipher states of two encryptions, where colored squares
correspond to active bytes. AK, SB, SR, and MC stand for
AddRoundKey, SubBytes, ShiftRows, and MixColumns,
respectively . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
Fig. 4.56 An example of how the XOR differences between the
cipher states can change after each round operation of
PRESENT. The output differences of the four active
Sboxes in round 1 are 1. The output difference of the
single active Sbox in round 2 is also 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
Fig. 4.57 Illustration of how active bytes change for
all four differential patterns that start with
$\Delta S_0 = 1000010000100001$ and converge in round 1.
Blue squares correspond to active bytes. AK, SB, SR, and
MC stand for AddRoundKey, SubBytes, ShiftRows, and
MixColumns, respectively . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
Fig. 4.58 An illustration of how the XOR differences between the
cipher states can change after each round operation for
PRESENT such that the pair of plaintexts achieves a
differential pattern starting with $\Delta S_0$ given in Eq. 4.46
and converging in round 2. The output differences of the
four active Sboxes in round 1 are 1. The output difference
of the single active Sbox in round 2 is 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
Fig. 4.59 An illustration of how the XOR differences between the
cipher states can change after each round operation for
PRESENT such that the pair of plaintexts achieves a
differential pattern starting with $\Delta S_0$ given in Eq. 4.46
and converging in round 2. The output differences of the
four active Sboxes in round 1 are 4. The output difference
of the single active Sbox in round 2 is 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
Fig. 4.60 An illustration of how the XOR differences between the
cipher states can change after each round operation for
PRESENT such that the pair of plaintexts achieves a
differential pattern starting with $\Delta S_0$ given in Eq. 4.46
and converging in round 2. The output differences of the
four active Sboxes in round 1 are 4. The output difference
of the single active Sbox in round 2 is 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
Fig. 4.61 Illustration of how active bytes change from round 1 to
round 3 of AES computation, for differential patterns that
start with $\Delta S_0 = 1000010000100001$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
Fig. 4.62 The difference between the averaged traces of plaintext
pairs from Eqs. 4.47, 4.48, 4.49, and 4.50 is in red, blue,
green, and yellow, respectively. The averaged trace for the
first plaintext in Eq. 4.47 is in gray. With this gray plot,
similar to Fig. 4.3 we can find the rough time interval for
the SubBytes operation in round 3, which is colored in
pink . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
Fig. 4.63 Zoom in to the SubBytes computation (pink area) in
Fig. 4.62. The difference between the averaged traces
of plaintext pairs from Eqs. 4.47, 4.48, 4.49, and 4.50
is in red, blue, green, and yellow, respectively. They
correspond to a single active column at the first, second,
third, and fourth positions, respectively, during the
SubBytes operation in round 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
Fig. 4.64 The difference between the averaged traces of $S_0$ and
$S_0'$ from Eq. 4.51 (in red), plaintext pair from Eq. 4.52
(in blue), and plaintext pair from Eq. 4.53 (in green).
The averaged trace for $S_0$ is in gray. With this gray plot,
similar to Fig. 4.3 we can find the rough time interval for
the sBoxLayer operation in round 3, which is colored in
pink . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292
Fig. 4.65 Zoom in to the sBoxLayer computation (pink area) in
Fig. 4.64. The difference between the averaged traces
of $S_0$ and $S_0'$ from Eq. 4.51 (in red), plaintext pair from
Eq. 4.52 (in blue), and plaintext pair from Eq. 4.53 (in
green). They correspond to active Sboxes $\mathrm{SB}^3_0$; $\mathrm{SB}^3_8$;
$\mathrm{SB}^3_4, \mathrm{SB}^3_8, \mathrm{SB}^3_{12}$ before pLayer of round 3 . . . . . . . . . . . . . . . . . . . . . . . . . . 293
Fig. 4.66 An illustration of differential values for the
differential pattern $\Delta S_0 = 1000010000100001$ and
$\Delta S_1 = 1000000000000000$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
Fig. 4.67 The possible differential patterns for
AES encryption with $\Delta S_0$ equal to
$1000010000100001$, $0100001000011000$, $0010000110000100$,
$0001100001000010$, respectively. Each figure represents
four different differential patterns starting with the same
$\Delta S_0$. The blue-colored squares represent active bytes and
only one of those 4 colored bytes is active in the last two
cipher states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
Fig. 4.68 The possible differential patterns for PRESENT
encryption that start with $\Delta S_0 = \mathrm{00000000FFFF0000}$
and converge in round 2. There are in total four
patterns—the single active bit at the end of round 2 can
be the 4th, 6th, 32nd, or 34th bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
Fig. 4.69 The possible differential patterns for PRESENT
encryption that start with $\Delta S_0 = \mathrm{0000FFFF00000000}$
and converge in round 2. There are in total four
patterns—the single active bit at the end of round 2 can
be the 8th, 10th, 36th, or 38th bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
Fig. 4.70 The possible differential patterns for PRESENT
encryption that start with $\Delta S_0 = \mathrm{FFFF000000000000}$ and
converge in round 2. There are in total four patterns—the
single active bit at the end of round 2 can be the 12th,
14th, 40th, or 42nd bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
Fig. 4.71 One trace corresponding to the computation of
Algorithm 4.2. We can see ten similar patterns . . . . . . . . . . . . . . . . . . . . . 304
Fig. 4.72 Highlighted two types of patterns from Fig. 4.71. One
pattern with a single cluster of peaks (colored in green)
and one with more than one cluster of peaks (colored in
blue) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
Fig. 4.73 One trace corresponding to the computation of
Algorithm 4.4. We can see 18 similar patterns . . . . . . . . . . . . . . . . . . . . . 307
Fig. 4.74 Sample correlation coefficients .rt (Eq. 4.63) for time
samples .t = 1, 2, . . . , 9500. We can see a sequence of 18
patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
Fig. 4.75 Sample correlation coefficients from Fig. 4.74 (in red)
with one power trace from Fig. 4.73 in gray. We can see
that the 18 patterns corresponding to sample correlation
coefficients and those corresponding to leakages coincide . . . . . . . . . 310
Fig. 4.76 There are mainly two types of patterns in Fig. 4.74: one
with a lower peak and one with a higher peak and a small
high peak at the end of the pattern. In this figure, they are
highlighted in green and blue, respectively . . . . . . . . . . . . . . . . . . . . . . . . . 310
Fig. 4.77 An example of a trace from dataset $T_1$, obtained in
Code-SCA Step 3, which corresponds to MOV instruction
surrounded by NOPs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
Fig. 4.78 SNR values for each time sample computed with dataset
$T_1$ obtained in Code-SCA Step 3. The highest point is our
$\mathrm{POI} = 430$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314


Fig. 4.79 Estimations of success rate computed following
Algorithm 4.1 for template-based DPA attack on the MOV
instruction taking the PRESENT Sbox output as an input.
The black line corresponds to unprotected intermediate
values. The blue line corresponds to encoded intermediate
values with the binary code $C_{(8,16)}$ (Eq. 4.71), where all
codewords have Hamming weight 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
Fig. 4.80 Estimations of guessing entropy computed following
Algorithm 4.1 for template-based DPA attack on the MOV
instruction taking the PRESENT Sbox output as an input.
The black line corresponds to unprotected intermediate
values. The blue line corresponds to encoded intermediate
values with the binary code $C_{(8,16)}$ (Eq. 4.71), where all
codewords have Hamming weight 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
Fig. 4.81 Estimations of success rate computed following
Algorithm 4.1 for template-based DPA attack on the
MOV instruction taking the PRESENT Sbox output as
an input. The black line corresponds to unprotected
intermediate values. The other lines correspond to
encoded intermediate values with $(8, 16)$-binary codes
obtained following Code-SCA Step 1–Code-SCA Step 7,
where we have set $w_H = 2, 3, 4, 5, 6$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
Fig. 4.82 Estimations of guessing entropy computed following
Algorithm 4.1 for template-based DPA attack on the
MOV instruction taking the PRESENT Sbox output as
an input. The black line corresponds to unprotected
intermediate values. The other lines correspond to
encoded intermediate values with $(8, 16)$-binary codes
obtained following Code-SCA Step 1–Code-SCA Step 7,
where we have set $w_H = 2, 3, 4, 5, 6$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
Fig. 4.83 One trace corresponding to the computation of
Algorithm 4.8. We can see ten similar patterns . . . . . . . . . . . . . . . . . . . . . 324
Fig. 4.84 One trace corresponding to the computation of
Algorithm 4.9. We can see 21 similar patterns. Each of
them corresponds to one execution of MonPro . . . . . . . . . . . . . . . . . . . . . 325
Fig. 4.85 Sample correlation coefficients computed following
attack steps from Sect. 4.4.2 with 10,000 traces for the
computation of Algorithm 4.9. The trace from Fig. 4.84
is gray in the background. We can see that there are 21
patterns in the sample correlation coefficient plot that
coincide with those from Fig. 4.84—each corresponds to
one execution of MonPro . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
Fig. 4.86 There are mainly two types of patterns in the sample
correlation coefficient plot from Fig. 4.85: one with a
higher peak cluster (colored in blue) and one with a lower
peak cluster (colored in green). Among the blue-colored
patterns, we further divide them into two types—one with
a high peak at the end (in lighter blue) and one without
this peak (in darker blue) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
Fig. 4.87 An illustration of the relation between Sbox outputs in
a Quotient group to Sbox inputs in the corresponding
Remainder group. Sboxes in Quotient groups $Q^i_0$, $Q^i_1$,
$Q^i_2$, $Q^i_3$ and their corresponding Remainder groups
$R^{i+1}_0$, $R^{i+1}_1$, $R^{i+1}_2$, $R^{i+1}_3$ are in orange, blue, green,
and red colors, respectively . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
Fig. 4.88 t-Values (Eq. 4.17) for all time samples $1, 2, \ldots, 3600$
computed with 50 traces from Masked fixed dataset A and
50 traces from Masked fixed dataset B. The signal is given
by the plaintext value, and the fixed versus fixed setting is
chosen. Blue dashed lines correspond to the threshold $4.5$
and $-4.5$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
Fig. 4.89 SNR computed with Masked random dataset. The signal
is given by the exact value of the 0th Sbox output . . . . . . . . . . . . . . . . . 338
Fig. 4.90 Estimations of guessing entropy computed following
Algorithm 4.1 for template-based DPA attacks on the
Masked random plaintext dataset (in black) and on the
Random plaintext dataset (in red) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
Fig. 4.91 Estimations of guessing entropy computed following
Algorithm 4.1 for template-based DPA attacks on the
Masked random plaintext dataset (in black) and on the
Random plaintext dataset (in red) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
Fig. 5.1 An illustration of DFA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
Fig. 5.2 Visual illustration of how the fault propagates when a
fault is injected at the beginning of one AES round (not
the last round) in byte $s_{00}$. Blue squares correspond to
bytes that can be affected by the fault . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
Fig. 5.3 Visual illustration of how the fault propagates when a fault
is injected at the beginning of one AES round in bytes:
(a) $s_{00}, s_{11}$, (b) $s_{00}, s_{11}, s_{22}$, and (c) $s_{00}, s_{11}, s_{22}, s_{33}$. Blue
squares correspond to bytes that can be affected by the fault. . . . . . . 362
Fig. 5.4 Visual illustration of fault propagation in the 9th round
of AES when the fault was injected in the diagonal
$s_{00}, s_{11}, s_{22}, s_{33}$ of the AES cipher state at the end of round 7 . . . . . 362
Fig. 5.5 Fault propagation for random byte fault injected in the
“diagonals” of the cipher state at the end of round 7. . . . . . . . . . . . . . . . 368
Fig. 5.6 Illustration of fault propagation for a fault injected in the
first byte of $S_8$ (the cipher state at the end of round 8) . . . . . . . . . . . . . . 374
Fig. 6.1 Power consumption types in CMOS circuits. The main
type considered for SCA is the switching power . . . . . . . . . . . . . . . . . . . 434
Fig. 6.2 Switching of the CMOS circuit, showing (a) the charging
path from $V_{DD}$ to $C_L$ and (b) the discharging path from $C_L$ to
GND of the capacitive load . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 434
Fig. 6.3 Digital sampling of a continuous signal with ten samples
of (a) low-frequency signal and (b) high-frequency signal . . . . . . . . 436
Fig. 6.4 Depiction of a voltage glitch on a smart card . . . . . . . . . . . . . . . . . . . . . . . 438
Fig. 6.5 Depiction of (a) laser fault injection on an AVR
microcontroller mounted on Arduino UNO board and (b)
zoomed infrared image of the chip . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
Fig. 6.6 Depiction of a chemical decapsulation by using fuming
nitric acid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440
Fig. 6.7 Absorption depth in silicon. The most common
laser wavelengths for testing integrated circuits are
highlighted—532 nm (green), 808 nm (near-infrared),
and 1064 nm (near-infrared) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
Fig. 6.8 Depiction of a pulsed electromagnetic fault injection on
an AVR microcontroller mounted on Arduino UNO board . . . . . . . . 442
Fig. 6.9 A depiction of a generic design of an electromagnetic
fault injection probe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443
Fig. 6.10 Different ways of spatial arrangement of aggressor rows
(black) and target/victim rows (red/pink) in DRAM. (a)
Single-sided. (b) Double-sided. (c) 4-sided . . . . . . . . . . . . . . . . . . . . . . . . . 444
List of Tables

Table 1.1 Correspondence between decimal and hexadecimal
(base $b = 16$) numerals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Table 1.2 Addition and multiplication in $\mathbb{F}_2[x]/(f(x))$, where
$f(x) = x^2 + x + 1$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Table 1.3 Addition and multiplication in $\mathbb{F}_2[x]/(g(x))$, where $g(x) = x^2$ . . 53
Table 1.4 Values of $z_\alpha$ (see Eq. 1.43) with corresponding $\alpha$ . . . . . . . . . . . . . . . . . 82
Table 2.1 Converting English letters to elements in $\mathbb{Z}_{26}$ . . . . . . . . . . . . . . . . . . . . 105
Table 2.2 Examples of methods for converting message symbols
to bytes. The second column in each table is the binary
representation of the byte value, and the third column is
the corresponding hexadecimal representation . . . . . . . . . . . . . . . . . . . 106
Table 2.3 Shift cipher with k = 5. The second row represents the
ciphertexts for the letters in the first row . . . . . . . . . . . . . . . . . . . . . . . . . . 109
Table 2.4 Definition of .σ , a key for substitution cipher . . . . . . . . . . . . . . . . . . . . . 112
Table 2.5 Definition of .σ −1 , where .σ ∈ S26 is a key for
substitution cipher shown in Table 2.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Table 2.6 Probabilities of each letter in a standard English
text [BP82] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Table 3.1 Initial permutation (IP) and final permutation (IP.−1 ) in
DES algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
Table 3.2 Expansion function $E_{\text{DES}} : \mathbb{F}_2^{32} \to \mathbb{F}_2^{48}$ in DES round
function. The 1st bit of the output is given by the 32nd
bit of the input. The 2nd bit of the output is given by the
1st bit of the input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Table 3.3 $\text{SB}^1_{\text{DES}}$ in DES round function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Table 3.4 Permutation function $P_{\text{DES}} : \mathbb{F}_2^{32} \to \mathbb{F}_2^{32}$ in DES round
function. The 1st bit of the output is given by the 16th
bit of the input. The 2nd bit of the output comes from
the 7th bit of the input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Table 3.5 Left and right part of the intermediate values in DES key
schedule after PC1. The 1st bit of the left part comes
from the 57th bit of the master key (input to PC1) . . . . . . . . . . . . . . . . 138
Table 3.6 Number of key bits rotated per round in DES key schedule . . . . . . 138
Table 3.7 PC2 in DES key schedule. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
Table 3.8 Specifications of Rijndael design, where blue-colored
values are adopted by AES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
Table 3.9 AES Sbox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Table 3.10 Inverse of AES Sbox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Table 3.11 PRESENT Sbox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Table 3.12 PRESENT pLayer. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Table 3.13 The Boolean function .ϕ0 takes input .x and outputs the
0th bit of SB.PRESENT (x). The second last row lists the
output of .ϕ0 for different input values. The last row lists
the coefficients (Eq. 3.10) for the algebraic normal form
of .ϕ0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
Table 4.1 Difference distribution table for PRESENT Sbox
(Table 3.11). The columns correspond to input
difference .δ, and the rows correspond to output
difference .Δ. The row for .Δ = 0 is omitted since it is empty . . . . 277
Table 4.2 In the first column, we list the possible
values of $\alpha$ such that the following
entries of AES Sbox DDT are nonempty:
$(\texttt{0E}\cdot\alpha, \texttt{4F})$, $(\texttt{09}\cdot\alpha, \texttt{8F})$, $(\texttt{0D}\cdot\alpha, \texttt{21})$, $(\texttt{0B}\cdot\alpha, \texttt{9F})$.
The corresponding hypotheses for
$k_{00} \oplus \texttt{4C}$, $k_{11} \oplus \texttt{AA}$, $k_{22} \oplus \texttt{10}$, $k_{33} \oplus \texttt{90}$ are
listed in the second, third, and fourth columns,
respectively. The correct value of $\alpha$ is marked in blue. A
detailed analysis is shown in Example 4.3.15 . . . . . . . . . . . . . . . . . . . . . 296
Table 4.3 Possible values of .α and the corresponding key
hypotheses for .k00 , k11 , k22 , k33 , the main diagonal of
the AES master key. The correct key bytes are marked
in blue. A detailed analysis is shown in Example 4.3.15 . . . . . . . . . 296
Table 4.4 Relation between the output bits of Sboxes from
the Quotient group $Q_j^i$ and the input bits of Sboxes
from the corresponding Remainder group $R_j^{i+1}$. For
example, the 0th input bit of $\text{SB}^{i+1}_{j+4}$ in $R_j^{i+1}$ comes
from the first output bit of $\text{SB}^{i}_{4j}$ in $Q_j^i$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Table 4.5 An example of T2, which specifies the output mask
$m_{\text{out},\text{SB}}$ for each input mask $m_{\text{in},\text{SB}}$ of PRESENT
Sbox [SBM18] such that all possible values of
$m_{\text{in}} \oplus m_{\text{out}}$ appear . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
Table 5.1 Part of the difference distribution table for $\text{SB}^1_{\text{DES}}$ (Table 3.3) . . . 355
Table 5.2 Part of the difference distribution table for AES Sbox
(Table 3.9) corresponding to output differences 0C, 69,
8C, and ED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
Table 5.3 Fault distribution tables for fault models: (a) stuck-at-0,
(b) bit flip, and (c) random fault . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
Table 5.4 Fault distribution tables for fault models: (a) stuck-at-0
with probability .0.5 and (b) random-AND with .δ, where
.δ follows a uniform distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
Table 5.5 Lookup table for carrying out XOR between .a, b
(.a, b ∈ F2 ) using 01 as the codeword for 0 and 10 as the
codeword for 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
Table 5.6 Lookup table for error-correcting code based
computation of AND between .a, b (.a, b ∈ F2 ), using the
3-repetition code .{000, 111}. 000 is the codeword for 0,
and 111 is the codeword for 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386
Table C.1 Sboxes in DES (Sect. 3.1.1) round function . . . . . . . . . . . . . . . . . . . . . . . 455
Table D.1 The Boolean function .ϕ1 takes input .x and outputs the
1st bit of SB.PRESENT (x). The second last row lists the
output of .ϕ1 for different input values. The last row lists
the coefficients (Eq. 3.10) for the algebraic normal form
of .ϕ1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Table D.2 The Boolean function .ϕ2 takes input .x and outputs the
2nd bit of SB.PRESENT (x). The second last row lists the
output of .ϕ2 for different input values. The last row lists
the coefficients (Eq. 3.10) for the algebraic normal form
of .ϕ2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Table D.3 The Boolean function .ϕ3 takes input .x and outputs the
3rd bit of SB.PRESENT (x). The second last row lists the
output of .ϕ3 for different input values. The last row lists
the coefficients (Eq. 3.10) for the algebraic normal form
of .ϕ3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Table E.1 Table .TSG , estimated signals for each integer between
00 and FF with Hamming weight 6, computed with the
stochastic leakage model obtained in Code-SCA Step
6 from Sect. 4.5.1.1. The first (resp. second) column
contains the hexadecimal (resp. binary) representations
of the integers. The last column lists the corresponding
estimated signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
Table E.2 Sorted version of .TSG from Table E.1 such that the
estimated signals (values in the last column) are
in ascending order. The hexadecimal (resp. binary)
representations of the corresponding integers are in the
first (resp. second) column. Words highlighted in blue
constitute the chosen binary code with Algorithm 4.5 . . . . . . . . . . . 463
List of Algorithms

1.1 Euclidean algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9


1.2 Extended Euclidean algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.1 KeyExpansion—AES-128 key schedule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
3.2 A lookup table implementation of PRESENT Sbox in pseudocode . . . 152
3.3 A more efficient lookup table implementation of PRESENT
Sbox in pseudocode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
3.4 A lookup table implementation combining two PRESENT
Sboxes in parallel in pseudocode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
3.5 An implementation that combines sBoxLayer and pLayer for
PRESENT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
3.6 Bitsliced implementation of round i of PRESENT, .1 ≤ i ≤ 31 . . . . . . . 164
3.7 Right-to-left square and multiply algorithm for computing
modular exponentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
3.8 Left-to-right square and multiply algorithm for computing
modular exponentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
3.9 Montgomery powering ladder for computing modular exponentiation 175
3.10 Standard multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
3.11 Blakely’s method for computing modular multiplication . . . . . . . . . . . . . . 183
3.12 Blakely’s method for computing modular multiplication by
taking .ω = 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
3.13 Right-to-left square and multiply algorithm with Blakely’s
method for modular multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
3.14 Left-to-right square and multiply algorithm with Blakely’s
method for modular multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
3.15 Montgomery powering ladder with Blakely’s method for
computing modular multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
3.16 MonPro, Montgomery product algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
3.17 MonPro, Montgomery product algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
3.18 Montgomery’s method for computing modular multiplication . . . . . . . . 197
3.19 Montgomery right-to-left square and multiply algorithm . . . . . . . . . . . . . . 199

3.20 Montgomery left-to-right square and multiply algorithm . . . . . . . . . . . . . . 199


3.21 Montgomery powering ladder with Montgomery’s method
for modular multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
4.1 Computation of estimations for guessing entropy and success rate. . . . 272
4.2 Left-to-right square and multiply algorithm for computing
modular exponentiation (see Algorithm 3.8) with parameters
from Eq. 4.59 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
4.3 MonPro, Montgomery product algorithm with parameters
from Eq. 4.59 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
4.4 Montgomery left-to-right square and multiply algorithm with
parameters from Eq. 4.59. MonPro is given by Algorithm 4.3 . . . . . . . . . 306
4.5 Finding the optimal code for encoding countermeasure
against SCA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
4.6 Right-to-left square and multiply always algorithm for
computing modular exponentiation. A hiding-based
countermeasure against SCA attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
4.7 Left-to-right square and multiply always algorithm for
computing modular exponentiation. A hiding-based
countermeasure against SCA attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
4.8 Protected implementation of Algorithm 4.2. Left-to-right
square and multiply always algorithm for computing modular
exponentiation (see Algorithm 3.8) with parameters from Eq. 4.59 . . . 324
4.9 Montgomery left-to-right square and multiply always
algorithm with parameters from Eq. 4.59. MonPro is given by
Algorithm 4.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
4.10 Masked implementation of PRESENT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
5.1 Part of an implementation for PRESENT encryption
that combines sBoxLayer and pLayer in AVR
assembly [PV13, AV13]. A pseudocode can be found in
Algorithm 3.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
5.2 A simple program to demonstrate protection against single
instruction skip attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
5.3 Infective Countermeasure for AES-128 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
5.4 Computation of AES round in the infective Countermeasure
for AES-128 from Algorithm 5.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
5.5 Computation of redundant AES round in the infective
Countermeasure for AES-128 from Algorithm 5.3 . . . . . . . . . . . . . . . . . . . . . 390
5.6 Computation of the dummy round in the infective
Countermeasure for AES-128 from Algorithm 5.3 . . . . . . . . . . . . . . . . . . . . . 390
5.7 Computing RSA signature with the right-to-left square and
multiply algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397
5.8 RSA signature computation with Montgomery powering
ladder and Blakely’s method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404
5.9 An algorithm involving computing modular multiplication
with Blakely’s method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
5.10 RSA signature signing computation with the right-to-left
square and multiply algorithm and Blakely’s method . . . . . . . . . . . . . . . . . . 408
5.11 Modified Algorithm 5.9 to counter the safe error attack . . . . . . . . . . . . . . . 425
5.12 RSA signature computation with Montgomery powering
ladder and Blakely’s method (Algorithm 5.8), protected
against the safe error attack from Sect. 5.3.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
5.13 RSA signature signing computation with the right-to-left
square and multiply algorithm and Blakely’s method
(Algorithm 5.10), protected against the safe error attack from
Sect. 5.3.4.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
Chapter 1
Mathematical and Statistical Background

To study attacks on cryptographic algorithms, we need to first understand the
computations that are carried out in each step of those algorithms. To achieve this,
we need knowledge of certain math concepts. In this chapter, we will introduce
the necessary mathematical background for the rest of the book, including abstract
algebra, linear algebra, coding theory, and probability theory. In Sect. 1.8, we will
also provide statistical tools that will be useful for Chap. 4.

1.1 Preliminaries

Before we start with the mathematics, let us first introduce some basic notation.

1.1.1 Sets

By a set, we refer to a collection of objects without repetition. We will normally
use a capital letter to denote a set. For example, $A = \{0, 1, 2\}$ is a set consisting of
three numbers, and $B = \{\circ, \triangle, \square\}$ is a set consisting of three shapes. The objects
in a set $S$ are called elements of $S$. If an element $a$ is in a set $S$, we write $a \in S$. If
an element $a$ is not in $S$, we write $a \notin S$. When there is no element in a set, we call
it an empty set and denote it by $\emptyset$. The total number of elements in a set $S$ is called
the cardinality of $S$, denoted by $|S|$.
Now let us look at two sets, $S$ and $T$. We say $S$ is a subset of $T$, denoted by
$S \subseteq T$, if any element of $S$ is also an element of $T$. Namely, $S \subseteq T$ if for any $s \in S$,
$s \in T$. Two sets are said to be equal if they contain the same elements. In other
words, $S = T$ if $S \subseteq T$ and $T \subseteq S$. The power set of a set $S$, denoted by $2^S$, is the
set of all subsets of $S$. We note that by definition, $S \in 2^S$, $\emptyset \in 2^S$, and $\emptyset \subseteq S$ for
any set $S$.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
X. Hou, J. Breier, Cryptography and Embedded Systems Security,
https://doi.org/10.1007/978-3-031-62205-2_1

Example 1.1.1 Let $T = \{0, 1, 2, 3\}$ and $S = \{2, 3\}$, then
• $S \subseteq T$ and $T \nsubseteq S$.
• $2 \in S$, $0 \notin S$.
• $|S| = 2$, $|T| = 4$.
• $2^S = \{\emptyset, S, \{2\}, \{3\}\}$.

The union of two sets $A$ and $B$, denoted $A \cup B$, is the set that contains all elements
from $A$ or $B$.

\[ A \cup B := \{\, x \mid x \in A \text{ or } x \in B \,\}. \]

The intersection of $A$ and $B$, denoted $A \cap B$, is the set that contains elements in both
$A$ and $B$.

\[ A \cap B := \{\, x \mid x \in A \text{ and } x \in B \,\}. \]

Example 1.1.2 Let $A = \{0, 1, 2\}$ and $B = \{2, 3, 4\}$, then $A \cup B = \{0, 1, 2, 3, 4\}$
and $A \cap B = \{2\}$.
Similarly, the union and the intersection of $n$ sets $A_1, A_2, \dots, A_n$ are defined as
follows:

\[ \bigcup_{i=1}^{n} A_i := \{\, a \mid a \in A_i \text{ for some } i \,\}, \qquad \bigcap_{i=1}^{n} A_i := \{\, a \mid a \in A_i \text{ for all } i \,\}. \]

The difference between set $A$ and set $B$ is the set of all elements of $A$ that are
not in $B$:

\[ A - B := \{\, a \mid a \in A,\ a \notin B \,\}. \tag{1.1} \]

The complement of a set $A$ in a set $S$ is the difference between $S$ and $A$,

\[ A^c := S - A = \{\, s \mid s \in S,\ s \notin A \,\}. \]

The Cartesian product of $A$ and $B$ is the set of ordered pairs $(a, b)$ such that
$a \in A$ and $b \in B$,

\[ A \times B := \{\, (a, b) \mid a \in A,\ b \in B \,\}. \]

The Cartesian product of $n$ sets can be defined similarly,

\[ \prod_{i=1}^{n} A_i := \{\, (a_1, a_2, \dots, a_n) \mid a_i \in A_i \text{ for all } i \,\}. \]

Example 1.1.3 Let $A = \{2, 4, 6\}$, $B = \{1, 3, 5\}$, and $S = A \cup B$. Then $A - B = A$;
the complement of $A$ in $S$ is $B$, and

\[ A \times B = \{ (2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5), (6, 1), (6, 3), (6, 5) \}. \]

We note that, in general, $A \times B \neq B \times A$. In Example 1.1.3,

\[ B \times A = \{ (1, 2), (3, 2), (5, 2), (1, 4), (3, 4), (5, 4), (1, 6), (3, 6), (5, 6) \} \neq A \times B. \]
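
The set operations above map directly onto Python's built-in `set` type. The following is a minimal sketch that replays Example 1.1.3 and the power set from Example 1.1.1; the helper name `power_set` is our own choice for illustration and does not come from the text.

```python
from itertools import chain, combinations, product

A = {2, 4, 6}
B = {1, 3, 5}
S = A | B                        # union A ∪ B

def power_set(s):
    """Return the set of all subsets of s, each stored as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return {frozenset(c) for c in subsets}

intersection = A & B             # A ∩ B, empty because A and B share no element
difference = A - B               # A − B, equal to A for the same reason
complement = S - A               # complement of A in S, equal to B
cartesian = set(product(A, B))   # A × B as a set of ordered pairs (a, b)
```

Because tuples are ordered, `set(product(B, A))` differs from `cartesian`, which matches the remark that $A \times B \neq B \times A$ in general.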

1.1.2 Functions

Functions (also called maps) are used extensively in the rest of the book. Here we
provide the formal definition.
Definition 1.1.1 A function/map $f : S \to T$ is a rule that assigns to each element
$s \in S$ a unique element $t \in T$.
• $S$ is called the domain of $f$.
• $T$ is called the codomain of $f$.
• If $f(s) = t$, then $t$ is called the image of $s$, and $s$ is called a preimage of $t$.
• For any $A \subseteq T$,

\[ f^{-1}(A) := \{\, s \in S \mid f(s) \in A \,\} \]

is called the preimage of $A$ under $f$.


Example 1.1.4 Define

\[ f : \mathbb{R} \to \mathbb{R}, \qquad x \mapsto x^2, \]

where $\mathbb{R}$ is the set of real numbers. Then $f$ has domain $\mathbb{R}$ and codomain $\mathbb{R}$.
Let $A = \{1\} \subseteq \mathbb{R}$; the preimage of $A$ under $f$ is given by

\[ f^{-1}(A) = \{-1, 1\}. \]

Here $1$ is the image of $-1$, so $-1$ is a preimage of $1$. Since $f(1) = 1$, the element $1$
is another preimage of $1$.
Let $B = \{-1\} \subseteq \mathbb{R}$, then $f^{-1}(B) = \emptyset$.
We note that the image of an element $s \in S$ is unique, and preimages of $t \in T$ may
not exist. Even if a preimage of $t \in T$ exists, it may not be unique. If every
$t \in T$ has a preimage, we say that $f$ is surjective. If such a preimage is also
unique, we say that $f$ is bijective.


Definition 1.1.2
• A function $f : S \to T$ is called onto or surjective, if given any $t \in T$, there
exists $s \in S$, such that $t = f(s)$.
• A function $f : S \to T$ is said to be one-to-one (written 1-1) or injective if for
any $s_1, s_2 \in S$ such that $s_1 \neq s_2$, we have $f(s_1) \neq f(s_2)$.
• $f$ is called 1-1 correspondence or bijective if $f$ is 1-1 and onto.
Example 1.1.5
• Define $f$

\[ f : \mathbb{R} \to \mathbb{R}_{\geq 0}, \qquad x \mapsto x^2, \]

then $f$ is surjective as for any $y \in \mathbb{R}_{\geq 0}$, we can find a preimage $x$ of $y$ by
calculating $x = \sqrt{y}$. But $f$ is not injective, since $f(-1) = f(1) = 1$.
• Define $g$

\[ g : \mathbb{R} \to \mathbb{R}, \qquad x \mapsto x. \]

It can be easily seen that $g$ is bijective.
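
For functions on finite sets, the properties in Definition 1.1.2 can be checked exhaustively. The sketch below is only an illustration under that finiteness assumption (it cannot test functions on all of $\mathbb{R}$); the names `is_injective`, `is_surjective`, and `preimage` are hypothetical helpers, and the squaring map is restricted to a small domain of integers.

```python
def is_injective(f, domain):
    """True if distinct elements of the domain always have distinct images."""
    images = [f(s) for s in domain]
    return len(set(images)) == len(images)

def is_surjective(f, domain, codomain):
    """True if every element of the codomain has at least one preimage.

    Assumes f maps the domain into the codomain.
    """
    return {f(s) for s in domain} == set(codomain)

def preimage(f, domain, subset):
    """Return the preimage of `subset` under f, i.e. {s in domain | f(s) in subset}."""
    return {s for s in domain if f(s) in subset}

def square(x):
    return x * x

def identity(x):
    return x

domain = {-2, -1, 0, 1, 2}
squares = {0, 1, 4}

square_is_surjective = is_surjective(square, domain, squares)   # every square is hit
square_is_injective = is_injective(square, domain)              # fails: (-1)^2 == 1^2
identity_is_bijective = (
    is_injective(identity, domain) and is_surjective(identity, domain, domain)
)
```

As in Example 1.1.4, `preimage(square, domain, {1})` contains both $-1$ and $1$, while `preimage(square, domain, {-1})` is empty.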


As mentioned above, if $f : S \to T$ is not surjective, there exists $t \in T$ such that
$f^{-1}(\{t\}) = \emptyset$. If $f$ is not injective, there are at least two $s_1, s_2 \in S$ such that $s_1 \neq s_2$
and $f(s_1) = f(s_2) = t$, which means the preimage of $t$ is not unique. However,
when $f$ is bijective, $f^{-1} : T \to S$ is a function: it assigns to each $t \in T$ a unique
element $s \in S$. In such a case, $f^{-1}$ is called the inverse of $f$.
Example 1.1.6 Define $f$

\[ f : \mathbb{R} \to \mathbb{R}, \qquad x \mapsto x^3. \]

Then the inverse of $f$ exists and is given by

\[ f^{-1} : \mathbb{R} \to \mathbb{R}, \qquad x \mapsto \sqrt[3]{x}. \]

When the domain of one function is the codomain of another function, we can define
the composition of those two functions.
Definition 1.1.3 For two functions $f : T \to U$, $g : S \to T$, the composition of $f$
and $g$, denoted by $f \circ g$, is the function

\[ f \circ g : S \to U, \qquad s \mapsto f(g(s)). \]

Example 1.1.7 Suppose we have $f$

\[ f : \mathbb{R} \to \mathbb{R}, \qquad x \mapsto x^2 \]

and $g$

\[ g : \mathbb{R} \to \mathbb{R}, \qquad x \mapsto x^3. \]

Then the composition of $f$ and $g$ is given by

\[ f \circ g : \mathbb{R} \to \mathbb{R}, \qquad x \mapsto (x^3)^2 = x^6. \]

For a function whose domain and codomain are the same, say $f : S \to S$, we can
define $f \circ f \circ \cdots \circ f$ in a similar way. For simplicity, we write $f^n$ for the composition
of $n$ copies of $f$. When $f : S \to S$ is bijective, $f^{-1}$ is a function, and we write
$f^{-n}$ for the composition of $n$ copies of $f^{-1}$.

Example 1.1.8 Define

\[ f : \mathbb{R} \to \mathbb{R}, \qquad x \mapsto x^2, \]

then

\[ f^n : \mathbb{R} \to \mathbb{R}, \qquad x \mapsto x^{2^n}. \]
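
Composition and repeated composition can be sketched with higher-order functions. The helpers `compose` and `iterate` below are hypothetical names used only for illustration; they replay Examples 1.1.7 and 1.1.8 on integer inputs.

```python
def compose(f, g):
    """Return the composition f ∘ g, i.e. the map s -> f(g(s))."""
    return lambda s: f(g(s))

def iterate(f, n):
    """Return f^n, the composition of n copies of f (n >= 0)."""
    def f_n(x):
        for _ in range(n):
            x = f(x)
        return x
    return f_n

def square(x):
    return x ** 2

def cube(x):
    return x ** 3

square_after_cube = compose(square, cube)   # x -> (x^3)^2 = x^6
square_three_times = iterate(square, 3)     # x -> x^(2^3) = x^8
```

For instance, `square_after_cube(2)` evaluates $2^6$ and `square_three_times(2)` evaluates $2^{2^3} = 2^8$.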

1.1.3 Integers

We deal with integers every day. We would write one hundred and twenty-three as
123 because

\[ 123 = 1 \times 100 + 2 \times 10 + 3 \times 1. \]

Such a representation of an integer is called a base-10 representation. In general,
for any integer $b \geq 2$, we can have a base-$b$ representation for a positive integer.
Theorem 1.1.1 Let $b \geq 2$ be an integer. Then any $n \in \mathbb{Z}$, $n > 0$, can be expressed
uniquely in the form

\[ n = \sum_{i=0}^{\ell-1} a_i b^i, \tag{1.2} \]

where $0 \leq a_i < b$ ($0 \leq i < \ell$), $a_{\ell-1} \neq 0$, and $\ell \geq 1$. $a_{\ell-1} a_{\ell-2} \dots a_1 a_0$ is called a
base-$b$ representation for $n$. $\ell$ is called the length of $n$ in base-$b$ representation.
The proof can be found in, e.g., [Kos02, page 81]. To emphasize the base $b$, we
sometimes put $b$ as a subscript for the representation. When $b = 2$, a base-2
representation is also called a binary representation, $\ell$ is also called the bit length
of $n$, $a_0$ is said to be the least significant bit (LSB) of $n$, and $a_{\ell-1}$ is said to be the
most significant bit (MSB) of $n$. When $b = 16$, a base-16 representation is also called a
hexadecimal representation.
The correspondence between decimal numerals and hexadecimal (base $b = 16$)
numerals is listed in Table 1.1.
Example 1.1.9

\begin{align*}
3_{10} &= 11_2 = 3_{16}, \\
4_{10} &= 100_2 = 4_{16}, \\
60_{10} &= 111100_2 = \mathrm{3C}_{16}.
\end{align*}
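
The digits $a_i$ in Eq. (1.2) can be obtained by repeatedly dividing by $b$ and collecting remainders, least significant digit first. The following is a minimal sketch of that idea; the function name `to_base` and the restriction to $b \leq 16$ (so that the digits of Table 1.1 suffice) are our own assumptions.

```python
DIGITS = "0123456789ABCDEF"   # digit symbols for bases up to 16, as in Table 1.1

def to_base(n, b):
    """Return the base-b representation of a positive integer n as a string."""
    if n <= 0 or not 2 <= b <= 16:
        raise ValueError("need n > 0 and 2 <= b <= 16")
    digits = []
    while n > 0:
        n, a = divmod(n, b)       # a is the next digit a_i, starting from a_0
        digits.append(DIGITS[a])
    return "".join(reversed(digits))   # most significant digit first
```

With this sketch, `to_base(60, 2)` and `to_base(60, 16)` reproduce the last line of Example 1.1.9, and Python's built-in `int("3C", 16)` converts back to the decimal value.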

We have learned in primary school that when we divide 6 by 4 we get quotient 1
and remainder 2, that is, $6 = 1 \times 4 + 2$. Such a computation can be done thanks to
the following theorem. The proof involves the well-ordering principle of integers,
which will not be covered in this book. Interested readers are referred to, e.g.,
[Her96, page 22].
Theorem 1.1.2 If $m, n \in \mathbb{Z}$, $m > 0$, then there exist $q, r \in \mathbb{Z}$, such that $0 \leq r < m$
and $n = qm + r$.
$q$ is called the quotient and $r$ is called the remainder of $n$ divided by $m$.
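
As a quick illustration of the quotient and remainder, Python's built-in `divmod` returns a remainder in the range $0 \leq r < m$ whenever the divisor $m$ is positive, including for negative $n$. The wrapper name `divide` below is hypothetical and only serves to make the assumption $m > 0$ explicit.

```python
def divide(n, m):
    """Return (q, r) with n = q*m + r and 0 <= r < m, assuming m > 0."""
    if m <= 0:
        raise ValueError("this sketch assumes a positive divisor m")
    q, r = divmod(n, m)   # floor division keeps r nonnegative for m > 0
    return q, r

q, r = divide(6, 4)       # 6 = 1 * 4 + 2
```

Note that other languages may truncate toward zero instead, in which case the remainder of a negative $n$ can be negative and needs adjusting.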
Definition 1.1.4 Given $m, n \in \mathbb{Z}$, if $m \neq 0$ and $n = am$ for some integer $a$, we say
that $m$ divides $n$, written $m \mid n$. We call $m$ a divisor of $n$ and $n$ a multiple of $m$. If $m$
does not divide $n$, we write $m \nmid n$.

Table 1.1 Correspondence between decimal and hexadecimal (base $b = 16$) numerals
Base 10   0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15
Base 16   0  1  2  3  4  5  6  7  8  9  A   B   C   D   E   F

Example 1.1.10
• $3 \mid 6$, $-2 \mid 4$, $1 \mid 8$, $5 \mid 5$.
• $7 \nmid 9$, $4 \nmid 6$.
• All the positive divisors of 4 are $1, 2, 4$.
• All the positive divisors of 6 are $1, 2, 3, 6$.
We can see that there are some common divisors between 4 and 6. The largest of
them will be of importance to us. Formally, we can define the greatest common
divisor between two integers that are not both zero.
Definition 1.1.5 Take $m, n \in \mathbb{Z}$, $m \neq 0$ or $n \neq 0$. The greatest common divisor
of $m$ and $n$, denoted $\gcd(m, n)$, is given by $d \in \mathbb{Z}$ such that
• $d > 0$.
• $d \mid m$, $d \mid n$.
• If $c \mid m$ and $c \mid n$, then $c \mid d$.
Example 1.1.11
• Continuing Example 1.1.10, the positive common divisors of 4 and 6 are 1 and 2. So
$\gcd(4, 6) = 2$.
• All the positive divisors of 2 are 1 and 2. All the positive divisors of 3 are 1 and
3. So $\gcd(2, 3) = 1$.
It can be proven that the greatest common divisor of two integers (not both zero)
always exists and it is unique. The proof of the theorem can be found in, e.g.,
[Her96, page 23].
Theorem 1.1.3 (Bézout’s identity) For any $m, n \in \mathbb{Z}$, such that $m \neq 0$ or $n \neq 0$,
$\gcd(m, n)$ exists and is unique. Moreover, there exist $s, t \in \mathbb{Z}$ such that
$\gcd(m, n) = sm + tn$.
The equation $\gcd(m, n) = sm + tn$ is usually called Bézout’s identity. We note that
the choices of $s, t$ are not unique. Indeed, if $\gcd(m, n) = sm + tn$, then
$\gcd(m, n) = (s + n)m + (t - m)n$.
Example 1.1.12

\begin{align*}
\gcd(4, 6) &= 2 = (-1) \times 4 + 1 \times 6, \\
\gcd(2, 3) &= 1 = (-4) \times 2 + 3 \times 3.
\end{align*}

Next, we prove some simple but useful results.

Lemma 1.1.1 For any $m, n, a \in \mathbb{Z}$, we have
(1) $1 \mid n$ for all $n$.
(2) If $m \neq 0$, then $m \mid 0$.
(3) If $m \mid n$ and $n \mid a$, then $m \mid a$.
(4) If $m \mid 1$, then $m = \pm 1$.
(5) If $m \mid n$ and $n \mid m$, then $m = \pm n$.
(6) If $m \mid n$ and $m \mid a$, then $m \mid (un + va)$, $\forall u, v \in \mathbb{Z}$.$^1$
(7) If $a \mid mn$ and $\gcd(a, m) = 1$, then $a \mid n$.
(8) If $m \mid a$, $n \mid a$ and $\gcd(m, n) = 1$, then $mn \mid a$.
Proof Proofs of (1)–(4) easily follow from the definitions.
To prove (5), as $m \mid n$ and $n \mid m$, by Definition 1.1.4, there are integers $c_1, c_2$ such
that $n = mc_1$ and $m = c_2 n$. This gives $n = nc_1c_2$, and since $n \neq 0$, we have $c_1c_2 = 1$.
Since all the divisors of 1 are $\pm 1$, we have $c_1 = c_2 = 1$ or $c_1 = c_2 = -1$, and hence
$m = \pm n$.
To prove (6), since $m \mid n$ and $m \mid a$, there are integers $c_1, c_2$ such that $n = mc_1$ and
$a = mc_2$. Then

\[ un + va = uc_1 m + vc_2 m = (uc_1 + vc_2)m \]

is a multiple of $m$.
To prove (7), we note that by Bézout’s identity, there exist $s, t \in \mathbb{Z}$ such that
$as + mt = 1$. Multiplying both sides by $n$, we get $asn + mnt = n$. Since $a \mid asn$ and
$a \mid mnt$, we have $a \mid n$.
Finally, we prove (8). Since $m \mid a$, $a = mk$ for some $k \in \mathbb{Z}$. We have $n \mid mk$. Now
because $\gcd(m, n) = 1$, by (7), $n \mid k$ and so $k = nk'$ for some $k' \in \mathbb{Z}$. Thus $a = mnk'$
is divisible by $mn$. □

In general, to find $\gcd(m, n)$, it would be too time-consuming to list all the divisors
of $m$ and $n$. The following theorem allows us to simplify the computation.
Theorem 1.1.4 (Euclid’s division) Given $m, n \in \mathbb{Z}$, take $q, r$ such that $n = qm + r$.
Then $\gcd(m, n) = \gcd(m, r)$.
Proof We first note that we can find $q, r$ by Theorem 1.1.2. By Lemma 1.1.1 (6),
$\gcd(m, n) \mid n - qm$, i.e., $\gcd(m, n) \mid r$. Similarly we have $\gcd(m, r) \mid qm + r$, i.e.,
$\gcd(m, r) \mid n$.
By Definition 1.1.5, $\gcd(m, n) \mid \gcd(m, r)$ and $\gcd(m, r) \mid \gcd(m, n)$. By
Lemma 1.1.1 (5), $\gcd(m, r) = \pm \gcd(m, n)$. By Definition 1.1.5, $\gcd(m, r) > 0$
and $\gcd(m, n) > 0$. We have $\gcd(m, n) = \gcd(m, r)$. □

Thus, to find $\gcd(m, n)$, we can compute Euclid’s division repeatedly until we get
$r = 0$.

Example 1.1.13 We can calculate $\gcd(120, 35)$ as follows:

\begin{align*}
120 &= 35 \times 3 + 15, & \gcd(120, 35) &= \gcd(35, 15), \\
35 &= 15 \times 2 + 5, & \gcd(35, 15) &= \gcd(15, 5), \\
15 &= 5 \times 3, & \gcd(15, 5) &= 5 \implies \gcd(120, 35) = 5.
\end{align*}

$^1$ The notation $\forall$ stands for “for all.”

The procedure is called the Euclidean algorithm, and the details are provided in
Algorithm 1.1. By Theorem 1.1.4, $\gcd(m, n) = \gcd(m, r)$ after each loop from
line 1. In the end, we get $\gcd(m, n)$.

Algorithm 1.1: Euclidean algorithm

Input: m, n // m, n ∈ Z, m ≠ 0
Output: gcd(m, n)
1 while m ≠ 0 do
2     r = n % m // remainder of n divided by m
3     n = m
4     m = r
5 return n // here m = 0, and n holds the last nonzero remainder
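
The loop of Algorithm 1.1 can be sketched in Python as follows. This is an illustrative version rather than the book's reference code: the function name `euclid_gcd` is ours, the value returned is the last nonzero remainder (the loop exits with `m == 0`, so that value sits in `n`), and the final `abs` is an added assumption so that negative inputs still give a positive result, as Definition 1.1.5 requires.

```python
def euclid_gcd(m, n):
    """Greatest common divisor of m and n (not both zero) by repeated division."""
    if m == 0 and n == 0:
        raise ValueError("gcd(0, 0) is not defined")
    while m != 0:
        # Replace (m, n) by (n % m, m); Theorem 1.1.4 says the gcd is unchanged.
        n, m = m, n % m
    return abs(n)   # m is now 0 and n holds the last nonzero remainder
```

Running it on the inputs of Examples 1.1.13 and 1.1.15 reproduces $\gcd(120, 35) = 5$ and $\gcd(160, 21) = 1$.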

Furthermore, with the intermediate results we have from the Euclidean algorithm,
we can also find a pair of $s, t$ such that $\gcd(m, n) = sm + tn$ (Bézout’s identity).
Example 1.1.14 Continuing Example 1.1.13, we can find integers $s, t$ such that
$\gcd(120, 35) = 120s + 35t$ as follows:

\begin{align*}
& 5 = 35 - 15 \times 2, \qquad 15 = 120 - 35 \times 3, \\
\implies\ & 5 = 35 - (120 - 35 \times 3) \times 2 = 120 \times (-2) + 35 \times 7.
\end{align*}

Such a procedure is called the extended Euclidean algorithm.


Example 1.1.15 We can calculate $\gcd(160, 21)$ using the Euclidean algorithm

\begin{align*}
160 &= 21 \times 7 + 13, & \gcd(160, 21) &= \gcd(21, 13), \\
21 &= 13 \times 1 + 8, & \gcd(21, 13) &= \gcd(13, 8), \\
13 &= 8 \times 1 + 5, & \gcd(13, 8) &= \gcd(8, 5), \\
8 &= 5 \times 1 + 3, & \gcd(8, 5) &= \gcd(5, 3), \\
5 &= 3 \times 1 + 2, & \gcd(5, 3) &= \gcd(3, 2), \\
3 &= 2 \times 1 + 1, & \gcd(3, 2) &= \gcd(2, 1), \\
2 &= 1 \times 2, & \gcd(2, 1) &= 1 \implies \gcd(160, 21) = 1.
\end{align*}

By the extended Euclidean algorithm, we can also find integers $s, t$ such that
$\gcd(160, 21) = 160s + 21t$:

\begin{align*}
1 &= 3 - 2, & 2 &= 5 - 3, \\
3 &= 8 - 5, & 5 &= 13 - 8, \\
8 &= 21 - 13, & 13 &= 160 - 21 \times 7.
\end{align*}

We have

\begin{align*}
1 &= 3 - (5 - 3) = 3 \times 2 - 5 = (8 - 5) \times 2 - 5 = 8 \times 2 - 5 \times 3 \\
  &= 8 \times 2 - (13 - 8) \times 3 = 8 \times 5 - 13 \times 3 = (21 - 13) \times 5 - 13 \times 3 \\
  &= 21 \times 5 - 13 \times 8 = 21 \times 5 - (160 - 21 \times 7) \times 8 \\
  &= (-8) \times 160 + 61 \times 21.
\end{align*}

An algorithmic description of the extended Euclidean algorithm is shown in
Algorithm 1.2. By Definition 1.1.5, $m \neq 0$ or $n \neq 0$. If $m = 0$, $\gcd(m, n) = |n|$.
If $n = 0$, $\gcd(m, n) = |m|$. Both cases are trivial; hence in the algorithm, we assume
$n \neq 0$ and $m \neq 0$. We also note that we can just compute the coefficient $s$ and then
compute $t$ using $s$.

Algorithm 1.2: Extended Euclidean algorithm

Input: m, n // m, n ∈ Z, n ≠ 0, m ≠ 0
Output: s, t such that gcd(m, n) = sm + tn
1 s = 0, ss = 1, r = n, rr = m
2 while r ≠ 0 do
      // quotient of rr divided by r
3     q = rr/r
4     tmp = r
      // remainder of rr divided by r
5     r = rr % r
6     rr = tmp
7     tmp = s
8     s = ss − q ∗ s
9     ss = tmp
   // rr = gcd(m, n), and ss is the coefficient of m
10 t = (rr − ss ∗ m)/n
11 return ss, t
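
A Python sketch in the spirit of Algorithm 1.2 is given below. It is an illustration under our own conventions rather than the book's code: the name `extended_gcd` is hypothetical, the function tracks only the coefficient of $m$ during the loop and recovers the coefficient of $n$ at the end, it returns the gcd together with the coefficients, and the sign normalization for negative inputs is an added assumption.

```python
def extended_gcd(m, n):
    """Return (d, s, t) with d = gcd(m, n) = s*m + t*n, for nonzero m and n."""
    if m == 0 or n == 0:
        raise ValueError("this sketch assumes m != 0 and n != 0")
    rr, r = m, n      # consecutive remainders of the Euclidean algorithm
    ss, s = 1, 0      # coefficients of m in rr and r, respectively
    while r != 0:
        q = rr // r                      # quotient of rr divided by r
        rr, r = r, rr - q * r            # next pair of remainders
        ss, s = s, ss - q * s            # same update applied to the coefficients
    t = (rr - ss * m) // n               # exact division: rr - ss*m is a multiple of n
    if rr < 0:                           # make the gcd positive for negative inputs
        rr, ss, t = -rr, -ss, -t
    return rr, ss, t
```

On the inputs of Examples 1.1.14 and 1.1.15 this sketch returns the same coefficients as the hand computation, namely $(-2, 7)$ for $(120, 35)$ and $(-8, 61)$ for $(160, 21)$. Since Bézout coefficients are not unique, other correct implementations may return different pairs.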

Definition 1.1.6
• For $m, n \in \mathbb{Z}$ such that $m \neq 0$ or $n \neq 0$, $m$ and $n$ are said to be relatively
prime/coprime if $\gcd(m, n) = 1$.
• Given $p \in \mathbb{Z}$, $p > 1$. $p$ is said to be prime (or a prime number) if for any
$m \in \mathbb{Z}$, either $m$ is a multiple of $p$ (i.e., $p \mid m$) or $m$ and $p$ are coprime (i.e.,
$\gcd(p, m) = 1$).
• Given $n \in \mathbb{Z}$, $n > 1$. If $n$ is not prime, it is said to be composite (or a composite
number).
Example 1.1.16
• 4 and 9 are relatively prime.
• 8 and 6 are not coprime.
• $2, 3, 5, 7$ are prime numbers.
• $6, 9, 21$ are not prime numbers.
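
Definition 1.1.6 can be checked by brute force for small integers. The sketch below tests the definition of a prime literally; it relies on the observation (from Theorem 1.1.4) that $\gcd(p, m) = \gcd(p, m \bmod p)$, so only the residues $0, 1, \dots, p - 1$ need to be examined. The function names are hypothetical, and the approach is meant to illustrate the definition, not to be an efficient primality test.

```python
from math import gcd

def are_coprime(m, n):
    """m and n (not both zero) are coprime when gcd(m, n) = 1."""
    return gcd(m, n) == 1

def is_prime(p):
    """Check Definition 1.1.6 directly for an integer p.

    p > 1 is prime if every integer m is a multiple of p or coprime to p.
    Testing m = 0, 1, ..., p - 1 is enough because gcd(p, m) = gcd(p, m % p).
    """
    if p <= 1:
        return False
    return all(m % p == 0 or gcd(p, m) == 1 for m in range(p))

small_primes = [p for p in range(2, 10) if is_prime(p)]
```

This reproduces Example 1.1.16: `are_coprime(4, 9)` holds, `are_coprime(8, 6)` does not, and `small_primes` lists the primes below 10.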


We have the following lemma concerning prime numbers.

Lemma 1.1.2 For $p \in \mathbb{Z}$ a prime number, if $p \mid \prod_{i=1}^{n} a_i$, where $a_i \in \mathbb{Z}$, then $p \mid a_i$
for some $i$ ($1 \leq i \leq n$).

Proof If $p \mid a_1$, then we are done. Otherwise, $\gcd(p, a_1) = 1$, and by Lemma 1.1.1
(7), we have $p \mid \prod_{i=2}^{n} a_i$. We can repeat the argument and conclude that $p \mid a_i$ for
some $i$. □

It can be proven that an integer .n > 1 is either a prime number or a product of prime
numbers (see, e.g., [Her96, page 26]). Then, we have the Fundamental Theorem of
Arithmetic which says that this product is unique up to permutation.
Theorem 1.1.5 (The Fundamental Theorem of Arithmetic) For any $n \in \mathbb{Z}$,
$n > 1$, $n$ can be written in the form

$$n = \prod_{i=1}^{k} p_i^{e_i},$$

where the exponents $e_i$ are positive integers, and $p_1, p_2, \ldots, p_k$ are prime numbers
that are pairwise distinct and unique up to permutation.
Proof We prove by contradiction. Assume the theorem is false. Let $n \in \mathbb{Z}$ ($n > 1$)
be the smallest integer with two distinct factorizations. We can write

$$n = \prod_{i=1}^{k} p_i^{e_i} = \prod_{j=1}^{\ell} q_j^{d_j}.$$

Since $p_1 \mid \prod_{j=1}^{\ell} q_j^{d_j}$, by Lemma 1.1.2, $p_1 \mid q_j$ for some $j$. Without loss of generality,
we assume $p_1 \mid q_1$. Since $p_1$ and $q_1$ are prime numbers, we have $p_1 = q_1$. Then
the integer $n' = \prod_{i=2}^{k} p_i^{e_i} = \prod_{j=2}^{\ell} q_j^{d_j}$ has two distinct factorizations and $n' < n$,
which contradicts the minimality of $n$. □

Example 1.1.17 $20 = 2^2 \times 5$, $135 = 3^3 \times 5$.
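To make the factorization in Theorem 1.1.5 concrete, here is a small sketch that computes the primes $p_i$ and exponents $e_i$ by trial division. The function name `factorize` and the dictionary return format are illustrative choices, and trial division is only practical for small integers.

```python
def factorize(n: int) -> dict[int, int]:
    """Return {prime: exponent} for an integer n > 1, using trial division."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:          # divide out every copy of p
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                      # whatever remains is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors


print(factorize(20))   # {2: 2, 5: 1}
print(factorize(135))  # {3: 3, 5: 1}
```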

1.2 Abstract Algebra

In this section, we discuss the basics of abstract algebra and get to know a few
abstract structures. Most of us are already familiar with examples of such structures,
probably just not by the name. Those structures will become useful when we discuss
modern cryptographic algorithms.
1.2.1 Groups

First, we define a group.


Definition 1.2.1 A group $(G, \cdot)$ is a nonempty set $G$ with a binary operation $\cdot$
satisfying the following conditions:
• $G$ is closed under $\cdot$ (closure property): $\forall g_1, g_2 \in G$, $g_1 \cdot g_2 \in G$.
• $\cdot$ is associative: $\forall g_1, g_2, g_3 \in G$, $g_1 \cdot (g_2 \cdot g_3) = (g_1 \cdot g_2) \cdot g_3$.
• $\exists$² $e \in G$, an identity element, such that $\forall g \in G$, $e \cdot g = g \cdot e = g$.
• Every $g \in G$ has an inverse $g^{-1} \in G$ such that $g \cdot g^{-1} = g^{-1} \cdot g = e$.

When it is clear from the context, we omit .· and say that G is a group.
Example 1.2.1 There are many examples of groups that we are familiar with.
• $(\mathbb{Z}, +)$, the set of integers with addition, is a group. The identity element is 0.
• Similarly, $(\mathbb{Q}, +)$ and $(\mathbb{C}, +)$ are groups.
• $(\mathbb{Q}, \times)$ is not a group because $0 \in \mathbb{Q}$ does not have an inverse with respect to
multiplication.
• But $(\mathbb{Q} \setminus \{0\}, \times)$ is a group. The identity element is 1.
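Next, we give an example of formally proving that a set with a binary operation is a
group. Let $G = \mathbb{R}^+$ be the set of positive real numbers, and let $\cdot$ be the multiplication
of real numbers, denoted $\times$. We will show that $(\mathbb{R}^+, \times)$ is a group.
1. $\mathbb{R}^+$ is closed under $\times$: for any $a_1, a_2 \in \mathbb{R}^+$, $a_1 \times a_2 \in \mathbb{R}$ and $a_1 \times a_2 > 0$, hence
$a_1 \times a_2 \in \mathbb{R}^+$.
2. $\times$ is associative: $\forall a_1, a_2, a_3 \in \mathbb{R}^+$, $a_1 \times (a_2 \times a_3) = (a_1 \times a_2) \times a_3$.
3. 1 is the identity element in $\mathbb{R}^+$: $\forall a \in \mathbb{R}^+$, $1 \times a = a \times 1 = a$.
4. Take any $a \in \mathbb{R}^+$. Then $\frac{1}{a} \in \mathbb{R}$ and $\frac{1}{a} > 0$, so $\frac{1}{a} \in \mathbb{R}^+$. Moreover,

$$a \times \frac{1}{a} = \frac{1}{a} \times a = 1,$$

hence $a^{-1} = \frac{1}{a} \in \mathbb{R}^+$.
By definition, we have proved that $(\mathbb{R}^+, \times)$ is a group.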
Definition 1.2.2 Let $(G, \cdot)$ be a group. If $\cdot$ is commutative, i.e.,

$$\forall g_1, g_2 \in G, \quad g_1 \cdot g_2 = g_2 \cdot g_1,$$

then the group is called abelian.

² The notation $\exists$ stands for “there exists.”


The name abelian is in honor of the great mathematician Niels Henrik Abel (1802–
1829).
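Example 1.2.2 The groups we have seen so far, $(\mathbb{Z}, +)$, $(\mathbb{R}^+, \times)$, $(\mathbb{Q} \setminus \{0\}, \times)$,
$(\mathbb{Q}, +)$, and $(\mathbb{C}, +)$, are all abelian groups.

Example 1.2.3 Let us consider the set of $2 \times 2$ matrices with coefficients in $\mathbb{R}$. We
denote this set by $M_{2 \times 2}(\mathbb{R})$. Recall that matrix addition, denoted by $+$, is defined
componentwise. For any $\begin{pmatrix} a_{00} & a_{10} \\ a_{01} & a_{11} \end{pmatrix}$, $\begin{pmatrix} b_{00} & b_{10} \\ b_{01} & b_{11} \end{pmatrix}$ in $M_{2 \times 2}(\mathbb{R})$,

$$\begin{pmatrix} a_{00} & a_{10} \\ a_{01} & a_{11} \end{pmatrix} + \begin{pmatrix} b_{00} & b_{10} \\ b_{01} & b_{11} \end{pmatrix} = \begin{pmatrix} a_{00} + b_{00} & a_{10} + b_{10} \\ a_{01} + b_{01} & a_{11} + b_{11} \end{pmatrix}.$$

$(M_{2 \times 2}(\mathbb{R}), +)$ is an abelian group: Closure, associativity, and commutativity of $+$
are easy to show. The identity element is the zero matrix $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$. The inverse of
any matrix $\begin{pmatrix} a_{00} & a_{10} \\ a_{01} & a_{11} \end{pmatrix}$ is $\begin{pmatrix} -a_{00} & -a_{10} \\ -a_{01} & -a_{11} \end{pmatrix}$, which is also in $M_{2 \times 2}(\mathbb{R})$. Section 1.3.1
presents a more general discussion on matrices.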
Example 1.2.4 Let $\mathbb{F}_2 := \{0, 1\}$. We define logical XOR, denoted $\oplus$, in $\mathbb{F}_2$ as
follows:

$$0 \oplus 0 = 0, \quad 0 \oplus 1 = 1 \oplus 0 = 1, \quad 1 \oplus 1 = 0.$$

Closure, associativity, and commutativity can be directly seen from the definition.
The identity element is 0, and the inverse of 1 is 1. Hence $(\mathbb{F}_2, \oplus)$ is an abelian
group.
Example 1.2.5 Let $E = \{a, b\}$, $a \neq b$. Define addition in $E$ as follows:

$$a + a = a, \quad a + b = b + a = b, \quad b + b = a.$$

Closure, associativity, and commutativity can be directly seen from the definition.
The identity element is $a$, and the inverse of $b$ is $b$. Hence $(E, +)$ is an abelian
group.
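For finite sets such as $\mathbb{F}_2$ and $E$, the group axioms of Definition 1.2.1 and the commutativity condition of Definition 1.2.2 can be checked exhaustively. The sketch below does so; the helper name `is_abelian_group` and the table `E_ADD` are hypothetical names introduced only for this illustration.

```python
from itertools import product


def is_abelian_group(elements, op) -> bool:
    """Brute-force check of the abelian group axioms on a finite set."""
    elements = list(elements)
    # Closure
    if any(op(x, y) not in elements for x, y in product(elements, repeat=2)):
        return False
    # Associativity
    if any(op(x, op(y, z)) != op(op(x, y), z)
           for x, y, z in product(elements, repeat=3)):
        return False
    # Identity element
    identities = [e for e in elements
                  if all(op(e, x) == x and op(x, e) == x for x in elements)]
    if not identities:
        return False
    e = identities[0]
    # Every element needs a two-sided inverse
    for x in elements:
        if not any(op(x, y) == e and op(y, x) == e for y in elements):
            return False
    # Commutativity
    return all(op(x, y) == op(y, x) for x, y in product(elements, repeat=2))


# Addition table of E from Example 1.2.5
E_ADD = {("a", "a"): "a", ("a", "b"): "b", ("b", "a"): "b", ("b", "b"): "a"}

print(is_abelian_group([0, 1], lambda x, y: x ^ y))               # True
print(is_abelian_group(["a", "b"], lambda x, y: E_ADD[(x, y)]))   # True
print(is_abelian_group([0, 1], lambda x, y: x & y))               # False
```

The last call fails because 0 has no inverse with respect to logical AND.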
Next, we will see a group that is not abelian. To introduce this group, we start by
defining permutations.
Definition 1.2.3 A permutation of a set S is a bijective function .σ : S → S.
Example 1.2.6
• Let $S = \{0, 1, 2\}$. Define $\sigma : S \to S$ as follows:

$$0 \mapsto 1, \quad 1 \mapsto 2, \quad 2 \mapsto 0.$$

Then $\sigma$ is a permutation of $S$.
• Let $S = \{\circ, \Delta, \square\}$. Define $\tau : S \to S$ as follows:

$$\circ \mapsto \Delta, \quad \Delta \mapsto \square, \quad \square \mapsto \circ.$$

Then $\tau$ is a permutation of $S$.
We note that what matters for a permutation is how many objects we have, not the
objects’ nature. We can label a set of $n$ objects with $1, 2, \ldots, n$. In Example 1.2.6,
we can label $\circ$ as 0, $\Delta$ as 1, and $\square$ as 2. Then $\sigma$ and $\tau$ are the same permutation.
Now, we take a set $S$ of $n$ elements. Labeling the elements allows us to consider
$S = \{1, \ldots, n\}$. Let $S_n$ denote the set of all permutations of $S$. And let $\circ$ denote the
composition of functions (see Definition 1.1.3). Then we have the following lemma:
Lemma 1.2.1 $(S_n, \circ)$ is a group.
The proof is easy. We leave it as an exercise for the readers.
We note that the identity element in the group is the identity function $\sigma : S \to S$,
$\sigma(s) = s$ $\forall s \in S$. Any $\sigma \in S_n$ is bijective (see Definition 1.1.2), and the inverse of
$\sigma$ in $S_n$ is then given by $\sigma^{-1}$.

Definition 1.2.4 .(Sn , ◦) is called the symmetric group of degree n.


Example 1.2.7 Let $n = 2$ and $S = \{1, 2\}$. There are only two ways to permute
two elements. So $S_2 = \{\sigma_1, \sigma_2\}$, where $\sigma_1 : S \to S$, $1 \mapsto 1$, $2 \mapsto 2$ is the identity,
and $\sigma_2 : S \to S$, $1 \mapsto 2$, $2 \mapsto 1$.
Example 1.2.8 (A group that is not abelian) Let $n = 3$ and $S = \{1, 2, 3\}$. There
are $3! = 6$ ways of permuting three elements. In particular, we have the following
two permutations:

$$\sigma_1 : S \to S, \quad 1 \mapsto 2, \ 2 \mapsto 3, \ 3 \mapsto 1;$$
$$\sigma_2 : S \to S, \quad 1 \mapsto 3, \ 2 \mapsto 2, \ 3 \mapsto 1.$$

We note that $\sigma_1 \circ \sigma_2 \neq \sigma_2 \circ \sigma_1$ since

$$\sigma_1 \circ \sigma_2 (1) = 1, \quad \text{but} \quad \sigma_2 \circ \sigma_1 (1) = 2.$$

Hence, $S_3$ is not abelian.


We can extend .σ1 and .σ2 in Example 1.2.8 to permuting n elements by keeping the
other .n − 3 elements unchanged. Thus .Sn is not abelian for any .n ≥ 3.
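The computation in Example 1.2.8 can be reproduced with a few lines of code. In this sketch a permutation is stored as a Python dictionary, which is just one convenient representation; the helper name `compose` is an assumption for illustration.

```python
from itertools import permutations


def compose(f: dict, g: dict) -> dict:
    """Return the composition f ∘ g, i.e. the map x -> f(g(x))."""
    return {x: f[g[x]] for x in g}


sigma1 = {1: 2, 2: 3, 3: 1}
sigma2 = {1: 3, 2: 2, 3: 1}

print(compose(sigma1, sigma2)[1])  # 1
print(compose(sigma2, sigma1)[1])  # 2
print(compose(sigma1, sigma2) == compose(sigma2, sigma1))  # False

# Enumerate all of S_3: one dictionary per ordering of (1, 2, 3)
S3 = [dict(zip((1, 2, 3), image)) for image in permutations((1, 2, 3))]
print(len(S3))  # 6
```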
Definition 1.2.5 The order of a group .(G, ·) is the number of elements in G or the
cardinality of the set G, .|G|. A group G is said to be finite if .|G| < ∞ and infinite
if .|G| = ∞.
Example 1.2.9
• We have seen a few infinite groups, for example, .(Z, +) and .(R+ , ×).
• We have also seen two finite groups, .|S2 | = 2 and .|S3 | = 6.
• Let $S = \{1, 2, \ldots, n\}$. To permute the elements in $S$, there are $n$ choices for the
image of 1, $n - 1$ choices for the image of 2, etc. Thus $|S_n| = n!$, and $S_n$ is a
finite group.
Definition 1.2.6 Let $(G, \cdot)$ be a group with identity element $e$. The order of an
element $g \in G$, denoted $\operatorname{ord}(g)$, is the smallest positive integer $k$ such that

$$\underbrace{g \cdot g \cdots g}_{k \text{ times}} = g^k = e.$$

When such a $k$ does not exist, we define $\operatorname{ord}(g) = \infty$.


Example 1.2.10
• In $(\mathbb{Z}, +)$, the identity element is 0, $\operatorname{ord}(1) = \infty$.
• Continuing Example 1.2.7, $\sigma_1$ is the identity. And $\sigma_2^2 : S \to S$, $1 \mapsto 1$, $2 \mapsto 2$.
Hence $\operatorname{ord}(\sigma_2) = 2$.
Definition 1.2.7 A group $G$ is called cyclic if it is generated by one element, i.e., if
there exists an element $g \in G$ such that

$$G = \{ g^k \mid k \in \mathbb{Z} \}.$$

Example 1.2.11 We have seen in Example 1.2.7, $S_2 = \{\sigma_1, \sigma_2\}$, where $\sigma_1$ is
the identity element. In Example 1.2.10, we discussed that $\sigma_2^2 = \sigma_1$. Hence
$S_2 = \{\sigma_2, \sigma_2^2\}$ is a cyclic group.
We now state a very useful theorem about the order of a group and the order
of an element in the group. The proof follows from a famous theorem (Lagrange
Theorem) named after Joseph-Louis Lagrange (1736–1813). Details can be found
in, e.g., [Her96, page 59].
Theorem 1.2.1 Let $(G, \cdot)$ be a finite group with identity element $e$. For any $g \in G$,
$\operatorname{ord}(g)$ divides $|G|$; in particular, $g^{|G|} = e$.

A direct corollary is as follows.


Corollary 1.2.1 Let $G$ be a group. If $|G|$ is a prime number, then $G$ is cyclic.
Proof Let $e$ denote the identity element in $G$. Take any element $g \in G$ such that
$g \neq e$. By Theorem 1.2.1, $\operatorname{ord}(g)$ divides $|G|$. Since $|G|$ is prime and $g$ is not the
identity element, $\operatorname{ord}(g) = |G|$.
We claim that

$$G = \{ g, g^2, g^3, \ldots, g^{|G|} \}.$$

Otherwise, we would have $g^i = g^j$ for some $1 \le i, j \le |G|$, where $i \neq j$.
Without loss of generality, we assume $i \ge j$. Multiplying both sides of $g^i = g^j$
by $g^{-j}$, we get $g^{i-j} = e$.
By Definition 1.2.6, since $0 \le i - j < \operatorname{ord}(g)$, we must have $i = j$, a
contradiction. Hence $G = \{ g, g^2, g^3, \ldots, g^{|G|} \}$. □
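Theorem 1.2.1 can be checked by brute force on a small group. The sketch below computes the order of every element of $S_3$, with permutations of $\{0, 1, 2\}$ stored as tuples, and confirms that each order divides $|S_3| = 6$. The helper names `compose` and `order` are illustrative assumptions.

```python
from itertools import permutations


def compose(f: tuple, g: tuple) -> tuple:
    """Return f ∘ g for permutations of {0, ..., n-1} stored as tuples."""
    return tuple(f[g[i]] for i in range(len(g)))


def order(g: tuple) -> int:
    """Smallest positive k such that g composed with itself k times is the identity."""
    identity = tuple(range(len(g)))
    power, k = g, 1
    while power != identity:
        power = compose(power, g)
        k += 1
    return k


S3 = list(permutations(range(3)))
orders = sorted(order(g) for g in S3)
print(orders)  # [1, 2, 2, 2, 3, 3]
print(all(len(S3) % order(g) == 0 for g in S3))  # True
```

No element has order 6, so $S_3$ is not cyclic; this does not conflict with Corollary 1.2.1, since 6 is not prime.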

1.2.2 Rings

Next, we move to another abstract structure, rings.


Definition 1.2.8 A set $R$ together with two binary operations $+$ and $\cdot$, $(R, +, \cdot)$, is a
ring if $(R, +)$ is an abelian group, and for any $a, b, c \in R$, the following conditions
are satisfied:
• $R$ is closed under $\cdot$ (closure): $a \cdot b \in R$.
• $\cdot$ is associative: $(a \cdot b) \cdot c = a \cdot (b \cdot c)$.
• The distributive laws hold: $a \cdot (b + c) = a \cdot b + a \cdot c$ and $(b + c) \cdot a = b \cdot a + c \cdot a$.
• The identity element for $\cdot$ exists, and it is different from the identity element for
$+$.

Definition 1.2.9 If .a · b = b · a for all .a, b ∈ R, R is a commutative ring.


Remark 1.2.1
• For most cases, we will denote the identity element for .+ as 0 and the identity
element for .· as 1.
• We normally refer to the operation .+ as addition and 0 as the additive identity.
Similarly, we refer to the operation .· as multiplication and 1 as the multiplicative
identity.
• The inverse of an element .a ∈ R with respect to .+ is called the additive inverse
of a, usually denoted by .−a.
• The last condition in Definition 1.2.8 implies that a set consisting of only 0 is not
a ring.
• For simplicity, we sometimes write ab instead of .a · b.
• When the operations in .(R, +, ·) are clear from the context, we omit them and
write R.
Example 1.2.12 We have seen that .(Z, +) is an abelian group, and the identity
element is 0. It can be easily shown that .(Z, +, ×) is a commutative ring. The
identity element for .× is 1.
Similarly .(Q, +, ×), .(R, +, ×), and .(C, +, ×) are all commutative rings with 0
as the additive identity and 1 as the multiplicative identity.
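Example 1.2.13 In Example 1.2.3, we have shown that $(M_{2 \times 2}(\mathbb{R}), +)$ is an abelian
group. We recall matrix multiplication, denoted by $\times$, for $2 \times 2$ matrices: For any
$\begin{pmatrix} a_{00} & a_{10} \\ a_{01} & a_{11} \end{pmatrix}$, $\begin{pmatrix} b_{00} & b_{10} \\ b_{01} & b_{11} \end{pmatrix}$ in $M_{2 \times 2}(\mathbb{R})$,

$$\begin{pmatrix} a_{00} & a_{10} \\ a_{01} & a_{11} \end{pmatrix} \times \begin{pmatrix} b_{00} & b_{10} \\ b_{01} & b_{11} \end{pmatrix} = \begin{pmatrix} a_{00} b_{00} + a_{10} b_{01} & a_{00} b_{10} + a_{10} b_{11} \\ a_{01} b_{00} + a_{11} b_{01} & a_{01} b_{10} + a_{11} b_{11} \end{pmatrix}.$$

$(M_{2 \times 2}(\mathbb{R}), +, \times)$ is a ring: Associativity and distributive laws are easy to show.
The identity element for $\times$ is the $2 \times 2$ identity matrix $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. We note that
$(M_{2 \times 2}(\mathbb{R}), +, \times)$ is not a commutative ring. For example,

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, \quad \text{but} \quad \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}.$$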

Example 1.2.14 In Example 1.2.4 we have shown that $(\mathbb{F}_2, \oplus)$ is an abelian group.
Let us define logical AND, denoted $\&$, in $\mathbb{F}_2$ as follows:

$$0 \ \&\ 0 = 0, \quad 1 \ \&\ 0 = 0 \ \&\ 1 = 0, \quad 1 \ \&\ 1 = 1.$$

Closure of $\mathbb{F}_2$ with respect to $\&$, associativity and commutativity of $\&$, and the
distributive laws are easy to see from the definitions. The identity element for $\&$ is
1. $(\mathbb{F}_2, \oplus, \&)$ is a commutative ring.
Example 1.2.15 In Example 1.2.5 we showed that $(E, +)$ is an abelian group.
Define multiplication in $E$ as follows:

$$a \cdot a = a, \quad a \cdot b = b \cdot a = a, \quad b \cdot b = b.$$

Closure of $E$ with respect to $\cdot$, associativity of $\cdot$, commutativity of $\cdot$, and the
distributive laws are easy to see from the definitions. The identity element for $\cdot$
is $b$. Thus $(E, +, \cdot)$ is a commutative ring.
Definition 1.2.10 Let .(R, +, ·) be a ring with additive identity 0 and multiplicative
identity 1. Let .a, b ∈ R. If .a /= 0 and .b /= 0 but .a · b = 0, then a and b are called
zero divisors. If .a · b = b · a = 1, a (also b) is said to be invertible, and it is called
a unit.
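Example 1.2.16
• There are no zero divisors in $(\mathbb{Z}, +, \times)$, $(\mathbb{Q}, +, \times)$, $(\mathbb{R}, +, \times)$, or $(\mathbb{C}, +, \times)$.
• Any nonzero element in $(\mathbb{Q}, +, \times)$, $(\mathbb{R}, +, \times)$, or $(\mathbb{C}, +, \times)$ is a unit. In
$(\mathbb{Z}, +, \times)$, the only units are 1 and $-1$.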
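Example 1.2.17 As shown in Examples 1.2.3 and 1.2.13, $(M_{2 \times 2}(\mathbb{R}), +, \times)$ is a
ring. The additive identity is $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$. Since

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},$$

by Definition 1.2.10, $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$ are zero divisors.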
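The zero-divisor pair above, and the fact that the two matrices do not commute, can be confirmed numerically. This is a small sketch with matrices stored as lists of rows; the helper name `matmul2` is hypothetical.

```python
def matmul2(A, B):
    """Multiply two 2x2 matrices given as nested lists [[row 0], [row 1]]."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


A = [[1, 0], [0, 0]]
B = [[0, 0], [1, 0]]

print(matmul2(A, B))  # [[0, 0], [0, 0]]  -> A and B are zero divisors
print(matmul2(B, A))  # [[0, 0], [1, 0]]  -> AB != BA
```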
Definition 1.2.11 An integral domain is a commutative ring with no zero divisors.
Example 1.2.18 .(Z, +, ×), .(Q, +, ×), .(R, +, ×), and .(C, +, ×) are all integral
domains.

1.2.3 Fields

Definition 1.2.12 A field is a commutative ring in which every nonzero element is
invertible.
Let $F$ be a field. By definition, for any nonzero $a \in F$, there exists $b \in F$ such that
$a \cdot b = b \cdot a = 1$. Then $b$ is called the multiplicative inverse of $a$. It is easy to show
that the multiplicative inverse of an element $a$ is unique: Let $b, c \in F$ be such that

$$ab = ac = 1.$$

Multiplying $ab = ac$ by $b$ on the left, we get

$$bab = bac \implies (ba)b = (ba)c \implies b = c.$$

We will denote the multiplicative inverse of a nonzero element $a \in F$ by $a^{-1}$.


Lemma 1.2.2 A field is an integral domain.
Proof Let $F$ be a field. Suppose there are zero divisors in $F$. By Definition 1.2.10,
there exist $a, b \in F$ such that $a \neq 0$, $b \neq 0$, and $a \cdot b = 0$. Since $F$ is a field, by the
above discussion, $a^{-1} \in F$. Multiplying both sides of $a \cdot b = 0$ by $a^{-1}$, we get

$$a^{-1} \cdot a \cdot b = 1 \cdot b = 0 \implies b = 0,$$

a contradiction. □

Example 1.2.19
• (Q, +, ×), (R, +, ×), and (C, +, ×) are all fields.
• (Z, +, ×) is not a field. For example, 2 ∈ Z is not invertible and 2 /= 0.
For the rest of this subsection, let F be a field with addition + and multiplication
·.
Definition 1.2.13 A field with finitely many elements is called a finite field.
Example 1.2.20 In Example 1.2.14 we have shown that $(\mathbb{F}_2, \oplus, \&)$ is a commutative
ring. The only nonzero element is 1, which has inverse 1 with respect to $\&$.
Thus $(\mathbb{F}_2, \oplus, \&)$ is a finite field.
Example 1.2.21 In Example 1.2.15 we have shown that $(E, +, \cdot)$ is a commutative
ring with additive identity $a$ and multiplicative identity $b$. The only nonzero element,
i.e., the element not equal to the additive identity, is $b$. $b$ has multiplicative inverse
$b$ since $b \cdot b = b$. Hence $(E, +, \cdot)$ is a finite field.
For an element $a \in F$ and a positive integer $p$, we define

$$p \odot a = \sum_{i=1}^{p} a.$$

Definition 1.2.14 The characteristic of a field $F$ is the smallest positive integer $p$
such that $p \odot 1 = 0$, where 1 is the multiplicative identity of $F$. If no such $p$ exists,
we define the characteristic of the field to be 0.
Example 1.2.22
• The characteristics of $\mathbb{R}$, $\mathbb{Q}$, and $\mathbb{C}$ are 0.
• The characteristic of the field $\mathbb{F}_2$ in Example 1.2.20 is 2 since

$$2 \odot 1 = 1 \oplus 1 = 0.$$

• The characteristic of the field $E$ in Example 1.2.21 is 2 since

$$2 \odot b = b + b = a.$$

Theorem 1.2.2 The characteristic of a field is either 0 or a prime number.

Proof First, we note that the characteristic of a field is not equal to 1 since
$1 \odot 1 = 1 \neq 0$.
Suppose the characteristic $p = mn$ is not a prime, where $m, n \in \mathbb{Z}$ and $1 < m, n < p$.
Let $a = n \odot 1$, $b = m \odot 1$. Then

$$a \cdot b = (n \odot 1) \cdot (m \odot 1) = \left( \sum_{i=1}^{n} 1 \right) \cdot \left( \sum_{j=1}^{m} 1 \right) = (mn) \odot 1 = 0 \implies n \odot 1 = 0 \text{ or } m \odot 1 = 0,$$

where the last part follows from Lemma 1.2.2. As $n$, $m$ are both strictly smaller than
$p$, we have a contradiction. □

Definition 1.2.15 Let E, F be two fields with F ⊂ E. F is called a subfield of E
if the addition and multiplication of E, when restricted to F , are the same as those
in F .
Example 1.2.23 Q is a subfield of R, and R is a subfield of C.
Definition 1.2.16 Let $(F, +_F, \cdot_F)$, $(E, +_E, \cdot_E)$ be two fields. $F$ is said to be
isomorphic to $E$, written $F \cong E$, if there is a bijective function $f : F \to E$
such that for any $a, b \in F$,
(1) $f(a +_F b) = f(a) +_E f(b)$.
(2) $f(a \cdot_F b) = f(a) \cdot_E f(b)$.
The function $f$ is called a field isomorphism.
A function $f : F \to E$ that satisfies condition (1) in Definition 1.2.16 is said to
preserve the addition. Similarly, a function $g : F \to E$ that satisfies condition (2)
in Definition 1.2.16 is said to preserve the multiplication.
Example 1.2.24 Let us consider the fields $(\mathbb{F}_2, \oplus, \&)$ from Example 1.2.20 and
$(E, +, \cdot)$ from Example 1.2.21. Define $f : \mathbb{F}_2 \to E$, such that

$$f(0) = a, \quad f(1) = b.$$

It is easy to see that $f$ is bijective. Also, it can be shown that $f$ preserves both
addition and multiplication. For example,

$$f(1 \oplus 0) = f(1) = b, \quad f(1) + f(0) = b + a = b \implies f(1 \oplus 0) = f(1) + f(0).$$

Thus $f$ is a field isomorphism and $\mathbb{F}_2 \cong E$.
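Because both fields have only two elements, the two conditions of Definition 1.2.16 can be verified for every pair of inputs. The sketch below does this for the map $f$ of Example 1.2.24; the tables `E_ADD` and `E_MUL` and the function name `is_field_isomorphism` are hypothetical names used only here.

```python
from itertools import product

# Operation tables of E from Examples 1.2.5 and 1.2.15
E_ADD = {("a", "a"): "a", ("a", "b"): "b", ("b", "a"): "b", ("b", "b"): "a"}
E_MUL = {("a", "a"): "a", ("a", "b"): "a", ("b", "a"): "a", ("b", "b"): "b"}


def is_field_isomorphism(f: dict) -> bool:
    """Check that f: F_2 -> E is bijective and preserves both operations."""
    bijective = sorted(f.values()) == ["a", "b"]
    preserves_add = all(f[x ^ y] == E_ADD[(f[x], f[y])]
                        for x, y in product(f, repeat=2))
    preserves_mul = all(f[x & y] == E_MUL[(f[x], f[y])]
                        for x, y in product(f, repeat=2))
    return bijective and preserves_add and preserves_mul


print(is_field_isomorphism({0: "a", 1: "b"}))  # True
print(is_field_isomorphism({0: "b", 1: "a"}))  # False
```

The second map is bijective but sends the additive identity 0 to $b$, so it does not preserve addition.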
In fact, it can be shown that any finite field with two elements is always isomorphic
to $\mathbb{F}_2$. The next theorem says that, in general, there is only one finite field of a
given size up to isomorphism. The proof can be found in, e.g., [Her96, page 224].
Theorem 1.2.3
• Let $K$ be a finite field of characteristic $p$. Then $K$ contains $p^n$ elements for some
positive integer $n$.
• For any prime $p$ and any positive integer $n$, there exists, up to isomorphism, a
unique field with $p^n$ elements.

Remark 1.2.2
• We will use $\mathbb{F}_{p^n}$ to denote the unique finite field with $p^n$ elements.
• Let $K$ be a finite field with characteristic $p$ and multiplicative identity 1. Then
$K$ contains $1, 2, \ldots, p - 1, 0$, the $p$ multiples of 1. Thus, $K$ contains a subfield
isomorphic to $\mathbb{F}_p$.
Furthermore, we define the notion of bit formally.
Definition 1.2.17
• Variables that range over F2 are called Boolean variables or bits.
• Addition of two bits is defined to be logical XOR , also called exclusive OR.
• Multiplication of two bits is defined to be logical AND.
• When the value of a bit is changed, we say the bit is flipped.
1.3 Linear Algebra

Most readers are probably very familiar with linear algebra. However, when we
learned about matrices in high school, we focused on the case when the underlying
abstract structure is a field. In Sect. 1.3.1 we will see the general case when the
underlying abstract structure is a commutative ring. Then in Sect. 1.3.2 we recap
concepts for vector spaces.

1.3.1 Matrices

Let $R$ be a commutative ring with additive identity 0 and multiplicative identity 1
throughout this subsection.
Definition 1.3.1 A matrix with coefficients in $R$ is a rectangular array where each
entry is an element of $R$.
Matrix $A$ as shown in Eq. 1.3 is said to have $m$ rows and $n$ columns and is of size
$m \times n$. The transpose of $A$, denoted $A^T$, is the $n \times m$ matrix obtained by interchanging
the rows and columns of $A$.

$$A = \begin{pmatrix} a_{00} & \dots & a_{0(n-1)} \\ a_{10} & \dots & a_{1(n-1)} \\ \vdots & & \vdots \\ a_{(m-1)0} & \dots & a_{(m-1)(n-1)} \end{pmatrix}, \quad A^T = \begin{pmatrix} a_{00} & \dots & a_{(m-1)0} \\ a_{01} & \dots & a_{(m-1)1} \\ \vdots & & \vdots \\ a_{0(n-1)} & \dots & a_{(m-1)(n-1)} \end{pmatrix}. \tag{1.3}$$

The $i$th row of $A$ is

$$\begin{pmatrix} a_{i0} & a_{i1} & \dots & a_{i(n-1)} \end{pmatrix},$$

and the $j$th column of $A$ is

$$\begin{pmatrix} a_{0j} \\ a_{1j} \\ \vdots \\ a_{(m-1)j} \end{pmatrix},$$

where $a_{ij}$ denotes the entry in the $i$th row and $j$th column. If $a_{ij} = 0$ for $i \neq j$, $A$
is said to be a diagonal matrix. An $n$-dimensional identity matrix, denoted $I_n$, is a
diagonal matrix whose diagonal entries are 1, i.e., $a_{ii} = 1$ for $i = 0, 1, \ldots, n - 1$.
A $1 \times n$ matrix is called a row vector. An $n \times 1$ matrix is called a column vector. An
$n \times n$ matrix is called a square matrix (i.e., a matrix with the same number of rows
and columns).
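Example 1.3.1 Let $R = \mathbb{Z}$.
• $A = \begin{pmatrix} 9 & 1 \\ 0 & -2 \end{pmatrix}$ is a $2 \times 2$ matrix with coefficients in $\mathbb{Z}$. $a_{00} = 9$ and $a_{01} = 1$.
• $I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$.
• $\begin{pmatrix} 5 & 0 \\ 0 & -1 \end{pmatrix}$ is a diagonal matrix.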
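We define the addition of two $m \times n$ matrices componentwise:

$$\begin{pmatrix} a_{00} & \dots & a_{0(n-1)} \\ a_{10} & \dots & a_{1(n-1)} \\ \vdots & & \vdots \\ a_{(m-1)0} & \dots & a_{(m-1)(n-1)} \end{pmatrix} + \begin{pmatrix} b_{00} & \dots & b_{0(n-1)} \\ b_{10} & \dots & b_{1(n-1)} \\ \vdots & & \vdots \\ b_{(m-1)0} & \dots & b_{(m-1)(n-1)} \end{pmatrix} = \begin{pmatrix} a_{00} + b_{00} & \dots & a_{0(n-1)} + b_{0(n-1)} \\ a_{10} + b_{10} & \dots & a_{1(n-1)} + b_{1(n-1)} \\ \vdots & & \vdots \\ a_{(m-1)0} + b_{(m-1)0} & \dots & a_{(m-1)(n-1)} + b_{(m-1)(n-1)} \end{pmatrix}. \tag{1.4}$$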

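Example 1.3.2 Let $R = \mathbb{Z}$. Below is an example of addition between two $2 \times 2$
matrices with coefficients in $\mathbb{Z}$:

$$\begin{pmatrix} 2 & 3 \\ 1 & -1 \end{pmatrix} + \begin{pmatrix} 4 & -2 \\ 0 & -5 \end{pmatrix} = \begin{pmatrix} 6 & 1 \\ 1 & -6 \end{pmatrix}.$$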

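Definition 1.3.2 The scalar product of a $1 \times n$ row vector $v = (v_0, v_1, \ldots, v_{n-1})$
with an $n \times 1$ column vector $w = (w_0, w_1, \ldots, w_{n-1})^T$ is given by

$$v \cdot w = \begin{pmatrix} v_0 & v_1 & \dots & v_{n-1} \end{pmatrix} \begin{pmatrix} w_0 \\ w_1 \\ \vdots \\ w_{n-1} \end{pmatrix} = \sum_{i=0}^{n-1} v_i w_i.$$

Example 1.3.3 Let $R = \mathbb{Z}$. The scalar product of $\begin{pmatrix} 2 & 3 \end{pmatrix}$ and $\begin{pmatrix} 4 & 0 \end{pmatrix}^T$ is

$$\begin{pmatrix} 2 & 3 \end{pmatrix} \begin{pmatrix} 4 \\ 0 \end{pmatrix} = 2 \times 4 + 3 \times 0 = 8 + 0 = 8.$$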

We define the multiplication of an $m \times n$ matrix $A$ with an $n \times r$ matrix $B$ as follows:

$$AB = \begin{pmatrix} a_{00} & \dots & a_{0(n-1)} \\ a_{10} & \dots & a_{1(n-1)} \\ \vdots & & \vdots \\ a_{(m-1)0} & \dots & a_{(m-1)(n-1)} \end{pmatrix} \begin{pmatrix} b_{00} & \dots & b_{0(r-1)} \\ b_{10} & \dots & b_{1(r-1)} \\ \vdots & & \vdots \\ b_{(n-1)0} & \dots & b_{(n-1)(r-1)} \end{pmatrix} = \begin{pmatrix} c_{00} & \dots & c_{0(r-1)} \\ c_{10} & \dots & c_{1(r-1)} \\ \vdots & & \vdots \\ c_{(m-1)0} & \dots & c_{(m-1)(r-1)} \end{pmatrix}, \tag{1.5}$$

where $c_{ij}$ is the scalar product of the $i$th row of $A$ and the $j$th column of $B$:

$$c_{ij} = \sum_{k=0}^{n-1} a_{ik} b_{kj}, \quad i = 0, 1, \ldots, m - 1, \quad j = 0, 1, \ldots, r - 1.$$

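Example 1.3.4 Let $R = \mathbb{Z}$. Below is an example for multiplication of two $2 \times 2$
matrices with coefficients in $\mathbb{Z}$:

$$\begin{pmatrix} 2 & 3 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 4 & -2 \\ 0 & -5 \end{pmatrix} = \begin{pmatrix} 8 & -19 \\ 4 & 3 \end{pmatrix}.$$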
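The definition in Eq. 1.5 translates almost directly into code. The following sketch stores a matrix as a list of rows and computes each entry $c_{ij}$ as the scalar product of row $i$ of $A$ with column $j$ of $B$; the function name `matmul` is an illustrative choice, and the entries are assumed to be Python integers.

```python
def matmul(A, B):
    """Return AB for an m x n matrix A and an n x r matrix B (lists of rows)."""
    m, n, r = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    # c_ij = sum over k of a_ik * b_kj
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(r)]
            for i in range(m)]


print(matmul([[2, 3], [1, -1]], [[4, -2], [0, -5]]))  # [[8, -19], [4, 3]]
print(matmul([[2, 3]], [[4], [0]]))                   # [[8]]
```

The second call reproduces the scalar product of Example 1.3.3 as a $1 \times 1$ matrix.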

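Definition 1.3.3 An $n \times n$ square matrix $A$ is said to be invertible if there exists an
$n \times n$ matrix $B$ such that

$$AB = BA = I_n.$$

$B$ is called the inverse of $A$. We will use $A^{-1}$ to denote this matrix.

Example 1.3.5 Let $R = \mathbb{Z}$. We have

$$\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

Hence, the $2 \times 2$ matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$ is invertible and its inverse $A^{-1} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}$.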
Theorem 1.3.1 Let $n$ be a positive integer. We define $M_{n \times n}(R)$ to be the set of $n \times n$
square matrices with coefficients in $R$. Then $M_{n \times n}(R)$ together with addition and
multiplication defined in Eqs. 1.4 and 1.5 is a ring. It is not a commutative ring when
$n \ge 2$.

Proof In Examples 1.2.3 and 1.2.13 we have shown that $M_{2 \times 2}(\mathbb{R})$ is a ring. The
proof for the general case is similar.
The closure of $M_{n \times n}(R)$ with respect to both operations is easy to see.
Associativity and distributive laws for addition and multiplication follow from the
corresponding properties of $R$.
The additive identity is the zero matrix of size $n \times n$. The additive inverse of a
matrix $A$ with coefficients $a_{ij}$ ($0 \le i, j \le n - 1$) is given by $-A$ with coefficients
$-a_{ij}$ ($0 \le i, j \le n - 1$). The multiplicative identity is $I_n$.
When $n = 1$, $M_{1 \times 1}(R)$ is a commutative ring because $R$ is commutative.
When $n \ge 2$, let

$$A = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 0 & \dots & 0 \\ 0 & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 0 & \dots & 0 \end{pmatrix}.$$

Then

$$AB = \begin{pmatrix} 0 & 0 & \dots & 0 \\ 0 & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 0 \end{pmatrix}, \quad BA = \begin{pmatrix} 0 & 0 & \dots & 0 \\ 0 & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 0 & \dots & 0 \end{pmatrix}.$$

Hence $AB \neq BA$ and $M_{n \times n}(R)$ is not commutative for $n \ge 2$. □



In general, not every matrix is invertible. To find the inverse of an invertible matrix,
we will need the following definition.
Definition 1.3.4 Let $n$ be a positive integer. For any $A \in M_{n \times n}(R)$, the determinant
of $A$, denoted $\det(A)$, is defined as follows:
• If $n = 1$, $A = (a)$, $\det(A) := a$.
• If $n > 1$, let $A_{ij}$ denote the matrix obtained from $A$ by deleting the $i$th row and
the $j$th column. Fix an $i_0$,

$$\det(A) := \sum_{j=0}^{n-1} (-1)^{i_0 + j} a_{i_0 j} \det(A_{i_0 j}). \tag{1.6}$$

We note that the value of $\det(A)$ is independent of the choice of $i_0$ in Eq. 1.6 (see
Appendix A.1). Similarly, $\det(A)$ can also be found by fixing a $j_0$ and computing

$$\det(A) = \sum_{i=0}^{n-1} (-1)^{i + j_0} a_{i j_0} \det(A_{i j_0}).$$

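Example 1.3.6 Let $n = 2$; for any $A \in M_{2 \times 2}(R)$, we can write $A = \begin{pmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{pmatrix}$.
Take $i_0 = 0$,

$$\det(A) = \sum_{j=0}^{n-1} (-1)^{i_0 + j} a_{i_0 j} \det(A_{i_0 j}) = \sum_{j=0}^{1} (-1)^{j} a_{0j} \det(A_{0j}) = a_{00} a_{11} - a_{01} a_{10}.$$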
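The recursive definition in Eq. 1.6 can be sketched as follows, with matrices stored as lists of rows over the integers. The helper names `minor` and `det` and the sample matrix `M` are assumptions for illustration; cofactor expansion takes time exponential in $n$, so this is meant for small examples only.

```python
def minor(A, i, j):
    """Matrix obtained from A by deleting row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A, i0=0):
    """Determinant by cofactor expansion along row i0 (Eq. 1.6)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** (i0 + j) * A[i0][j] * det(minor(A, i0, j))
               for j in range(n))


print(det([[2, 3], [4, 7]]))  # 2

M = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
print(det(M, 0), det(M, 1), det(M, 2))  # -3 -3 -3
```

The last line illustrates, on one example, that the result does not depend on the row chosen for the expansion.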

Theorem 1.3.2 A matrix $A \in M_{n \times n}(R)$ is invertible in $M_{n \times n}(R)$ if and only if
$\det(A)$ is a unit in $R$.
When $\det(A)$ is a unit in $R$, if $n = 1$ and $A = (a)$, then $A^{-1} = (a^{-1})$. If $n > 1$,
we define the adjoint matrix of $A$ as follows:

$$\operatorname{adj} A := \begin{pmatrix} (-1)^{0+0} \det(A_{00}) & (-1)^{0+1} \det(A_{10}) & \dots & (-1)^{0+(n-1)} \det(A_{(n-1)0}) \\ \vdots & \vdots & \ddots & \vdots \\ (-1)^{(n-1)+0} \det(A_{0(n-1)}) & (-1)^{(n-1)+1} \det(A_{1(n-1)}) & \dots & (-1)^{(n-1)+(n-1)} \det(A_{(n-1)(n-1)}) \end{pmatrix},$$

where the $(i, j)$-entry of $\operatorname{adj} A$ is given by $(-1)^{i+j} \det(A_{ji})$. Then

$$A^{-1} = (\det(A))^{-1} \operatorname{adj} A.$$

The proof can be found in, e.g., [Hun12, page 353].


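Example 1.3.7 Let $n = 2$; by Example 1.3.6 and Theorem 1.3.2, a matrix
$A = \begin{pmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{pmatrix}$ from $M_{2 \times 2}(R)$ is invertible if and only if $a_{00} a_{11} - a_{01} a_{10}$ is a unit in $R$.
When $a_{00} a_{11} - a_{01} a_{10}$ is a unit in $R$, the adjoint matrix of $A$ is given by

$$\operatorname{adj} A = \begin{pmatrix} a_{11} & -a_{01} \\ -a_{10} & a_{00} \end{pmatrix}.$$

And the inverse of matrix $A$ is given by

$$A^{-1} = (a_{00} a_{11} - a_{01} a_{10})^{-1} \begin{pmatrix} a_{11} & -a_{01} \\ -a_{10} & a_{00} \end{pmatrix}. \tag{1.7}$$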
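Example 1.3.8 Let $R = \mathbb{Z}$. By Example 1.3.6, $A = \begin{pmatrix} 2 & 3 \\ 4 & 7 \end{pmatrix}$ has determinant
$14 - 12 = 2$. 2 is not a unit in $\mathbb{Z}$. By Theorem 1.3.2, $A$ is not invertible in $M_{2 \times 2}(\mathbb{Z})$.
However, if we consider $R = \mathbb{Q}$, 2 is a unit in $\mathbb{Q}$. By Theorem 1.3.2, $A$ is invertible
in $M_{2 \times 2}(\mathbb{Q})$, and we can compute $A^{-1}$ using Eq. 1.7:

$$A^{-1} = \frac{1}{2} \begin{pmatrix} 7 & -3 \\ -4 & 2 \end{pmatrix} = \begin{pmatrix} 3.5 & -1.5 \\ -2 & 1 \end{pmatrix}.$$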
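Example 1.3.8 shows that invertibility depends on the ring of coefficients. The sketch below applies Eq. 1.7 with exact rational arithmetic from Python's `fractions` module and separately tests whether the determinant is a unit in $\mathbb{Z}$ (that is, equal to 1 or $-1$). The function names are hypothetical, and integer entries are assumed.

```python
from fractions import Fraction


def inverse_2x2_over_Q(A):
    """Inverse of a 2x2 integer matrix over Q via Eq. 1.7, or None if det(A) = 0."""
    (a00, a01), (a10, a11) = A
    d = a00 * a11 - a01 * a10
    if d == 0:
        return None                      # 0 is not a unit in Q
    inv_d = Fraction(1, d)
    return [[inv_d * a11, -inv_d * a01],
            [-inv_d * a10, inv_d * a00]]


def is_invertible_over_Z(A) -> bool:
    """A is invertible in M_2x2(Z) iff det(A) is a unit in Z, i.e. 1 or -1."""
    (a00, a01), (a10, a11) = A
    return a00 * a11 - a01 * a10 in (1, -1)


A = [[2, 3], [4, 7]]
print(is_invertible_over_Z(A))   # False
print(inverse_2x2_over_Q(A))     # entries 7/2, -3/2, -2, 1
print(is_invertible_over_Z([[2, 1], [1, 1]]))  # True
```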
1.3.2 Vector Spaces

Let $F$ be a field with additive identity 0 and multiplicative identity 1.

Definition 1.3.5 (Vector space) A nonempty set $V$, together with two binary
operations, vector addition (denoted by $+$) and scalar multiplication by elements
of $F$, which is a map with domain $V \times F$ and codomain $V$, is called a vector space
over $F$ if $(V, +)$ is an abelian group, and for any $v, w \in V$ and any $a, b \in F$, we
have:
1. $a(v + w) = av + aw$.
2. $(a + b)v = av + bv$.
3. $a(bv) = (ab)v$.
4. $1v = v$, where 1 is the multiplicative identity of $F$.

Elements of $V$ are called vectors, and elements of $F$ are called scalars.


Remark 1.3.1 It is easy to see that if 0 is the additive identity in $F$ and $v$ any vector
in $V$, then $0v = 0$ is the additive identity in $V$ (or the identity for vector addition).
Example 1.3.9 The set of complex numbers $\mathbb{C} = \{ x + iy \mid x, y \in \mathbb{R} \}$ is a vector
space over $\mathbb{R}$. Note that for any $a_1 + b_1 i, a_2 + b_2 i \in \mathbb{C}$, vector addition is defined as

$$(a_1 + b_1 i) + (a_2 + b_2 i) = (a_1 + a_2) + (b_1 + b_2) i.$$

And for any $a \in \mathbb{R}$, scalar multiplication by elements of $\mathbb{R}$ is defined as

$$a(a_1 + b_1 i) = a a_1 + a b_1 i.$$

The identity element for vector addition is 0. Furthermore, for any $a + bi \in \mathbb{C}$, its
inverse with respect to vector addition is given by $-a - bi$.
Let $F^n = \{ (v_0, v_1, \ldots, v_{n-1}) \mid v_i \in F \ \forall i \}$ be the set of $n$-tuples over $F$. Define
vector addition and scalar multiplication by elements of $F$ componentwise: For any
$v = (v_0, v_1, \ldots, v_{n-1}) \in F^n$, $w = (w_0, w_1, \ldots, w_{n-1}) \in F^n$, and any $a \in F$,

$$v + w := (v_0 + w_0, v_1 + w_1, \ldots, v_{n-1} + w_{n-1}), \tag{1.8}$$

$$av := (a v_0, a v_1, \ldots, a v_{n-1}). \tag{1.9}$$

Theorem 1.3.3 Together with vector addition and scalar multiplication by elements
of $F$ defined in Eqs. 1.8 and 1.9, respectively, $F^n = \{ (v_0, v_1, \ldots, v_{n-1}) \mid v_i \in F \ \forall i \}$
is a vector space over $F$.
Proof Take any $v = (v_0, v_1, \ldots, v_{n-1})$, $w = (w_0, w_1, \ldots, w_{n-1})$ from $F^n$ and
any $a, b \in F$.
By Eq. 1.8, it is easy to see that $F^n$ is closed under vector addition. The
associativity and commutativity of vector addition follow from that for addition in
$F$. The identity element for vector addition is $(0, 0, \ldots, 0)$, where 0 is the additive
identity in $F$. The inverse of $v \in F^n$ is $(-v_0, -v_1, \ldots, -v_{n-1})$, where $-v_i$ is the
additive inverse of $v_i$ in $F$. Thus $F^n$ with vector addition is an abelian group.
By the definition of scalar multiplication by elements of $F$ (Eq. 1.9), $av \in F^n$.
Properties 1 and 2 in Definition 1.3.5 follow from the distributive law in $F$. Property 3
follows from the associativity of multiplication in $F$. Property 4 follows from the
definition of multiplicative identity in $F$. □

Example 1.3.10 Let $F = \mathbb{F}_2$, the unique finite field with two elements (see
Example 1.2.20 and Theorem 1.2.3). Let $n$ be a positive integer, and it follows from
Theorem 1.3.3 that $\mathbb{F}_2^n$ is a vector space over $\mathbb{F}_2$.
The identity element for vector addition is $(0, 0, \ldots, 0)$. For any
$v = (v_0, v_1, \ldots, v_{n-1}) \in \mathbb{F}_2^n$, the inverse of $v$ with respect to vector addition is
$(-v_0, -v_1, \ldots, -v_{n-1}) = v$.

Recall that variables ranging over $\mathbb{F}_2$ are called bits (see Definition 1.2.17). We have
shown that $(\mathbb{F}_2, \oplus, \&)$ is a finite field (see Example 1.2.20), where $\oplus$ is logical XOR
(see Example 1.2.4), and $\&$ is logical AND (see Example 1.2.14).
Definition 1.3.6 Vector addition in $\mathbb{F}_2^n$ is called bitwise XOR, also denoted $\oplus$.
Similarly, we define bitwise AND between any two vectors $v = (v_0, v_1, \ldots, v_{n-1})$,
$w = (w_0, w_1, \ldots, w_{n-1})$ from $\mathbb{F}_2^n$ as follows:

$$v \ \&\ w := (v_0 \ \&\ w_0, v_1 \ \&\ w_1, \ldots, v_{n-1} \ \&\ w_{n-1}).$$

Remark 1.3.2 Another useful binary operation, logical OR, denoted $\vee$, on $\mathbb{F}_2$ is
defined as follows:

$$0 \vee 0 = 0, \quad 1 \vee 0 = 1, \quad 0 \vee 1 = 1, \quad 1 \vee 1 = 1.$$

It can also be extended to $\mathbb{F}_2^n$ in a bitwise manner, and we get bitwise OR.

For simplicity, we sometimes write $v_0 v_1 \ldots v_{n-1}$ instead of $(v_0, v_1, \ldots, v_{n-1})$.
Example 1.3.11 Let $n = 3$, and take $111, 101 \in \mathbb{F}_2^3$. Then $111 \oplus 101 = 010$,
$111 \ \&\ 101 = 101$, and $111 \vee 101 = 111$.

Definition 1.3.7 A vector in $\mathbb{F}_2^n$ is called an $n$-bit binary string. A 4-bit binary string
is called a nibble. An 8-bit binary string is called a byte.
Example 1.3.12
• $1010, 0011 \in \mathbb{F}_2^4$ are two nibbles. Furthermore,

$$1010 \oplus 0011 = 1001, \quad 1010 \ \&\ 0011 = 0010.$$

• 00101100 is a byte.
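Most programming languages expose bitwise XOR, AND, and OR directly on machine integers. The sketch below replays Examples 1.3.11 and 1.3.12 in Python; the helper name `bits` is hypothetical, and reading the leftmost bit of a string as the most significant bit of the integer is an assumption made for display purposes (the bitwise operations themselves act position by position, so the results do not depend on it).

```python
def bits(value: int, n: int) -> str:
    """Format a nonnegative integer as an n-bit binary string."""
    return format(value, f"0{n}b")


v, w = 0b111, 0b101
print(bits(v ^ w, 3), bits(v & w, 3), bits(v | w, 3))        # 010 101 111

x, y = 0b1010, 0b0011
print(bits(x ^ y, 4), bits(x & y, 4))                        # 1001 0010

byte = 0b00101100
print(byte, hex(byte))                                       # 44 0x2c
```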
Remark 1.3.3 By Theorem 1.1.1, a byte can be considered as the base-2
representation/binary representation of an integer. By Eq. 1.2, the
value of this integer is between 0 and 255, or between $00_{16}$ and $\text{FF}_{16}$ in base-16
representation/hexadecimal representation.
For the rest of this section, let V be a vector space over F .
Definition 1.3.8 A nonempty subset $U \subseteq V$ is called a subspace of $V$ if $U$
is a vector space over $F$ under the same operations (vector addition and scalar
multiplication by elements of $F$) in $V$.
Remark 1.3.4 To show $U \subseteq V$ is a subspace of $V$, by Definitions 1.3.5, 1.2.1,
and 1.2.2, we need to prove the following:
1. $(U, +)$ is an abelian group.
(a) $U$ is closed under $+$ (closure property): $\forall u, v \in U$, $u + v \in U$.
(b) $+$ is associative: $\forall u, v, w \in U$, $u + (v + w) = (u + v) + w$.
(c) The identity element for vector addition in $V$ is also in $U$.
(d) For $v \in U$, its additive inverse in $V$ is also in $U$.
2. Scalar multiplication by elements of $F$ is a function with domain $U \times F$ and
codomain $U$.
3. For any $v, w \in U$ and any $a, b \in F$, we have
(a) $a(v + w) = av + aw$.
(b) $(a + b)v = av + bv$.
(c) $a(bv) = (ab)v$.
(d) $1v = v$, where 1 is the multiplicative identity in $F$.

We note that 1-(b) and 3 follow from the corresponding properties of $V$. Thus, to
prove $U$ is a subspace of $V$, we need to prove 1-(a), 1-(c), 1-(d), and 2.
In case $F = \mathbb{F}_2$, by Example 1.3.10, 1-(d) is true by default. Furthermore, 2 is
also true as there are only two elements in $\mathbb{F}_2$: 0 and 1. To show $U$ is a subspace
when $F = \mathbb{F}_2$, it suffices to prove 1-(a) and 1-(c).
Definition 1.3.9 A linear combination of $v_1, v_2, \ldots, v_r \in V$ is a vector of the form
$a_1 v_1 + a_2 v_2 + \cdots + a_r v_r$, where $a_i \in F$ $\forall i$.

Lemma 1.3.1 For any $v_1, v_2, \ldots, v_r \in V$ ($r \ge 1$),
$U := \{ a_1 v_1 + a_2 v_2 + \cdots + a_r v_r \mid a_i \in F \}$ is a subspace of $V$.
Proof By Remark 1.3.4, we will prove 1-(a), 1-(c), 1-(d), and 2.
Take any $v = \sum_{i=1}^{r} a_i v_i \in U$.
1-(a). For any $u = \sum_{i=1}^{r} b_i v_i \in U$,

$$v + u = \sum_{i=1}^{r} a_i v_i + \sum_{i=1}^{r} b_i v_i = \sum_{i=1}^{r} (a_i + b_i) v_i \in U.$$

1-(c). Let $a_i = 0 \in F$, then (see Remark 1.3.1)

$$0 = \sum_{i=1}^{r} a_i v_i \in U.$$

1-(d). The inverse of $v$ with respect to vector addition is given by

$$u := \sum_{i=1}^{r} (-a_i) v_i$$

because $v + u = 0$. Furthermore, since $-a_i \in F$, we have $u \in U$.
2. For any $\alpha \in F$,

$$\alpha \sum_{i=1}^{r} a_i v_i = \sum_{i=1}^{r} (\alpha a_i) v_i \in U.$$

□

Definition 1.3.10 Let $S = \{ \mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_r \} \subseteq V$. The set
\[ \langle S \rangle := \{ a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_r\mathbf{v}_r \mid a_i \in F \} \]
is called the (linear) span of S over F. For any subspace U ⊆ V and a subset S of U, if U = 〈S〉, S is called a generating set for U.
We note that if S is a subspace of V, then 〈S〉 = S.
Example 1.3.13 Let $V = \mathbb{F}_2^3$ and $S = \{ 001, 100 \}$, then $\langle S \rangle = \{ 000, 001, 100, 101 \}$.
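The span in Example 1.3.13 can be enumerated mechanically. The following is a minimal sketch, assuming vectors of $\mathbb{F}_2^n$ are written as bit strings and that vector addition is bitwise XOR; the function name `span_f2` is illustrative and not part of the text.

```python
from itertools import product


def span_f2(vectors):
    """Return the span over F_2 of bit-string vectors such as "001"."""
    n = len(vectors[0])
    ints = [int(v, 2) for v in vectors]
    result = set()
    # Each coefficient a_i is either 0 or 1, so try every combination.
    for coeffs in product((0, 1), repeat=len(ints)):
        acc = 0
        for a, v in zip(coeffs, ints):
            if a:
                acc ^= v  # addition in F_2^n is bitwise XOR
        result.add(format(acc, f"0{n}b"))
    return sorted(result)


print(span_f2(["001", "100"]))  # ['000', '001', '100', '101']
```

Since every coefficient has two possible values, the enumeration visits $2^r$ combinations for r input vectors, so it is only practical for small examples.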
Definition 1.3.11 The vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_r \in V$ are said to be linearly independent over F if
\[ \sum_{i=1}^{r} a_i\mathbf{v}_i = \mathbf{0} \implies a_i = 0 \ \ \forall i. \]
Otherwise, they are said to be linearly dependent over F.

Example 1.3.14
• Let $F = \mathbb{F}_2$, $V = \mathbb{F}_2^3$. 001 and 100 are linearly independent.
• For any S ⊆ V, if $\mathbf{0} \in S$, then the vectors in S are linearly dependent.
• Let $F = \mathbb{R}$, $V = \mathbb{R}^3$. The vectors (0, 1, 0) and (0, 0, 1) are linearly independent, whereas (0, 1, 0), (2, 3, 0), (1, 0, 0) are linearly dependent since, for example, we have
\[ 3 \cdot (0, 1, 0) + (-1) \cdot (2, 3, 0) + 2 \cdot (1, 0, 0) = (0, 0, 0). \]

Definition 1.3.12 Let B be a nonempty subset of V. If V = 〈B〉 and vectors in B are linearly independent, then B is called a basis for V over F.
Remark 1.3.5 Suppose B is a basis for V and $B = \{ \mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_r \}$. Then any element $\mathbf{v} \in V$ has a unique representation as a linear combination of vectors in B:
\[ \mathbf{v} = \sum_{i=1}^{r} a_i\mathbf{v}_i = \sum_{i=1}^{r} b_i\mathbf{v}_i \implies \sum_{i=1}^{r} (a_i - b_i)\mathbf{v}_i = \mathbf{0} \implies a_i = b_i \ \ \forall i. \]

Example 1.3.15
• Let $F = \mathbb{R}$, $V = \mathbb{R}^3$, and $B = \{ (1, 0, 0), (0, 1, 0), (0, 0, 1) \}$. It is easy to see that vectors in B are linearly independent. For any $\mathbf{v} = (v_0, v_1, v_2) \in \mathbb{R}^3$, we have
\[ \mathbf{v} = v_0(1, 0, 0) + v_1(0, 1, 0) + v_2(0, 0, 1). \]
Thus, B is a generating set of V. By definition, B is a basis for V over $\mathbb{R}$.
• Let $F = \mathbb{F}_2$ and $V = \mathbb{F}_2^3$; similarly, we can show $\{ (1, 0, 0), (0, 1, 0), (0, 0, 1) \}$ is a basis for V over $\mathbb{F}_2$.
Example 1.3.16 Let $V = F^n$ and $B = \{ \mathbf{v}_0, \mathbf{v}_1, \dots, \mathbf{v}_{n-1} \}$, where
\[ \mathbf{v}_i = (v_{i0}, v_{i1}, \dots, v_{i(n-1)}), \qquad v_{ii} = 1 \text{ and } v_{ij} = 0 \text{ for } i \ne j. \]
It is easy to see that vectors in B are linearly independent. For any $\mathbf{u} = (u_0, u_1, \dots, u_{n-1}) \in V$, we can write
\[ \mathbf{u} = \sum_{\ell=0}^{n-1} u_\ell \mathbf{v}_\ell. \]
Thus, B is a generating set of V. By definition, B is a basis for V over F.


Lemma 1.3.2 Let $B_1, B_2$ be subsets of V. If $V = \langle B_1 \rangle$ and vectors in $B_2$ are linearly independent, then $|B_1| \ge |B_2|$.
Proof Suppose $B_1 = \{ \mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_{r_1} \}$ and $B_2 = \{ \mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_{r_2} \}$. Since $V = \langle B_1 \rangle$,
\[ \mathbf{w}_1 = \sum_{j=1}^{r_1} a_j\mathbf{v}_j \]
for some $a_j \in F$. Moreover, at least one $a_j \ne 0$, as vectors in $B_2$ are linearly independent. Without loss of generality, let us assume $a_1 \ne 0$, then
\[ \mathbf{v}_1 = -\sum_{j=2}^{r_1} \frac{a_j}{a_1}\mathbf{v}_j + \frac{1}{a_1}\mathbf{w}_1, \]
and we have that $\{ \mathbf{w}_1, \mathbf{v}_2, \dots, \mathbf{v}_{r_1} \}$ spans V. Then, we can write
\[ \mathbf{w}_2 = b_1\mathbf{w}_1 + \sum_{j=2}^{r_1} b_j\mathbf{v}_j, \]
where $b_j \in F$, and at least one $b_j \ne 0$ for $2 \le j \le r_1$; otherwise $\mathbf{w}_2$ is a linear combination of $\mathbf{w}_1$. Suppose $b_2 \ne 0$. We have
\[ \mathbf{v}_2 = -\frac{b_1}{b_2}\mathbf{w}_1 - \sum_{j=3}^{r_1} \frac{b_j}{b_2}\mathbf{v}_j + \frac{1}{b_2}\mathbf{w}_2, \]
which means $\{ \mathbf{w}_1, \mathbf{w}_2, \mathbf{v}_3, \dots, \mathbf{v}_{r_1} \}$ spans V.
We can continue in this manner. If $r_1 < r_2$, we will deduce that $\{ \mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_{r_1} \}$ spans V, and $\mathbf{w}_{r_1+1}$ can be written as a linear combination of $\{ \mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_{r_1} \}$, a contradiction. ⨆

We have the following direct corollary.
Corollary 1.3.1 If .B1 and .B2 are bases of V , then .|B1 | = |B2 |.
Proof By Lemma 1.3.2, .|B1 | ≤ |B2 | and .|B2 | ≤ |B1 |. ⨆

Definition 1.3.13 The dimension of V over F, denoted $\dim(V)_F$, is given by the cardinality of B, |B|, where B is a basis of V over F.
Example 1.3.17 Continuing Example 1.3.16, $\dim(F^n)_F = n$.
Lemma 1.3.3 Let $F = \mathbb{F}_2$; if $\dim(V)_{\mathbb{F}_2} = k$, then $|V| = 2^k$.
Proof Let $B = \{ \mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k \}$ be a basis for V. We have discussed in Remark 1.3.5 that every $\mathbf{w} \in V$ has a unique representation as a linear combination of vectors in B. In other words,
\[ V = \left\{ \sum_{i=1}^{k} a_i\mathbf{v}_i \;\middle|\; a_i \in \mathbb{F}_2,\ 1 \le i \le k \right\}, \]
where there are two choices for each $a_i$. ⨆



Example 1.3.18 Let $F = \mathbb{F}_2$, $S = \{ 0010, 1000 \}$, and $V = \langle S \rangle$. It is easy to see that vectors in S are linearly independent. By Definition 1.3.13, $\dim(V)_{\mathbb{F}_2} = 2$. By Lemma 1.3.3, |V| = 4. We can verify that $V = \{ 0000, 0010, 1000, 1010 \}$.
For any $\mathbf{v} = (v_0, v_1, \dots, v_{n-1}) \in \mathbb{F}_2^n$ and $\mathbf{w} = (w_0, w_1, \dots, w_{n-1}) \in \mathbb{F}_2^n$, we can consider $\mathbf{v}$ as a row vector and $\mathbf{w}$ as a column vector and compute the scalar product (see Definition 1.3.2) between $\mathbf{v}$ and $\mathbf{w}$:
\[ \mathbf{v} \cdot \mathbf{w} = \sum_{i=0}^{n-1} v_i w_i. \]
We note that for any $\mathbf{u} = (u_0, u_1, \dots, u_{n-1}) \in \mathbb{F}_2^n$,
\[ (\mathbf{v} + \mathbf{w}) \cdot \mathbf{u} = \sum_{i=0}^{n-1} (v_i + w_i)u_i = \sum_{i=0}^{n-1} v_i u_i + \sum_{i=0}^{n-1} w_i u_i = \mathbf{v} \cdot \mathbf{u} + \mathbf{w} \cdot \mathbf{u}. \tag{1.10} \]

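Definition 1.3.14
• For any $\mathbf{v}, \mathbf{w} \in \mathbb{F}_2^n$, $\mathbf{v}$ and $\mathbf{w}$ are said to be orthogonal if $\mathbf{v} \cdot \mathbf{w} = 0$.
• Let $S \subseteq \mathbb{F}_2^n$ be nonempty. The orthogonal complement, denoted $S^\perp$, of S is given by
\[ S^\perp = \{ \mathbf{v} \mid \mathbf{v} \in \mathbb{F}_2^n,\ \mathbf{v} \cdot \mathbf{s} = 0 \ \ \forall \mathbf{s} \in S \}. \]
• If S = ∅, we define $S^\perp = \mathbb{F}_2^n$.
By definition, it is easy to see that $\langle S \rangle^\perp = S^\perp$.
Lemma 1.3.4 For any $S \subseteq \mathbb{F}_2^n$, $S^\perp$ is a subspace of $\mathbb{F}_2^n$.
Proof By Remark 1.3.4, we will prove 1-(a) and 1-(c).
1-(a). Take any $\mathbf{v}, \mathbf{u} \in S^\perp$ and any $\mathbf{s} \in S$. By Eq. 1.10, we have
\[ (\mathbf{v} + \mathbf{u}) \cdot \mathbf{s} = \mathbf{v} \cdot \mathbf{s} + \mathbf{u} \cdot \mathbf{s} = 0, \]
hence $\mathbf{v} + \mathbf{u} \in S^\perp$.
1-(c). $\mathbf{0} \cdot \mathbf{s} = 0$ for any $\mathbf{s} \in S$. Hence $\mathbf{0} \in S^\perp$. ⨆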
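As a small illustration of Definition 1.3.14 and Lemma 1.3.4, the orthogonal complement can be found by exhaustive search for small n. This is only a sketch: vectors are modelled as tuples of 0/1 entries, and the helper names `dot_f2` and `orthogonal_complement` are illustrative.

```python
from itertools import product


def dot_f2(v, w):
    """Scalar product of two vectors over F_2."""
    return sum(a * b for a, b in zip(v, w)) % 2


def orthogonal_complement(S, n):
    """All v in F_2^n with v . s = 0 for every s in S (whole space if S is empty)."""
    return [v for v in product((0, 1), repeat=n)
            if all(dot_f2(v, s) == 0 for s in S)]


S = [(0, 0, 1), (1, 0, 0)]
comp = orthogonal_complement(S, 3)
print(comp)  # [(0, 0, 0), (0, 1, 0)]

# Closure under addition (property 1-(a)): the sum of two members is a member.
closed = all(tuple((a + b) % 2 for a, b in zip(v, u)) in comp
             for v in comp for u in comp)
print(closed)  # True
```

Because `all` over an empty collection is true, the function returns all of $\mathbb{F}_2^n$ when S is empty, which matches the convention in the definition.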

1.4 Modular Arithmetic

In this section, let n > 1 be an integer.

We are interested in the set { 0, 1, 2, . . . , n − 1 }. It can be considered as the set of possible remainders when dividing by n (see Theorem 1.1.2). We will also associate each integer with one element in the set, namely the remainder of this integer divided by n. Here we would like to provide a rigorous definition for this association. First, we introduce the notion of equivalence relations.
Definition 1.4.1 A relation ∼ on a set S is called an equivalence relation if, for all a, b, c ∈ S, the following conditions are satisfied.
• a ∼ a (reflexivity).
• If a ∼ b, then b ∼ a (symmetry).
• If a ∼ b and b ∼ c, then a ∼ c (transitivity).
Let us define a relation ∼ on the set $\mathbb{Z}$ as follows:
\[ a \sim b \quad \text{if and only if} \quad n \mid (b - a). \tag{1.11} \]
We can see that this is an equivalence relation on $\mathbb{Z}$.
• For all $a \in \mathbb{Z}$, 0 = a − a and n|0; hence a ∼ a (reflexivity).
• If n|(b − a), then n|(a − b), and we have a ∼ b implies b ∼ a (symmetry).
• If n|(b − a) and n|(c − b), then
\[ n \mid ((b - a) + (c - b)) \implies n \mid (c - a). \]
Thus a ∼ b and b ∼ c imply a ∼ c (transitivity).


Definition 1.4.2 Take $a, b \in \mathbb{Z}$. If a ∼ b, i.e., n|(b − a), then we say that a is congruent to b modulo n, written a ≡ b mod n. The integer n is called the modulus.
By the above definitions, saying a is congruent to b modulo n is equivalent to saying that the remainder of a divided by n is the same as the remainder of b divided by n.
Definition 1.4.3 If ∼ is an equivalence relation on a set S, then the equivalence class of an element a ∈ S, denoted $\overline{a}$, is defined by
\[ \overline{a} := \{ b \mid b \in S,\ b \sim a \}. \]

Theorem 1.4.1 If ∼ is an equivalence relation on a set S, then ∼ partitions S into disjoint equivalence classes. That is,
\[ S = \bigcup_{a \in S} \overline{a}, \quad \text{and} \quad \overline{a} \cap \overline{b} = \emptyset \ \text{ if } \ \overline{a} \ne \overline{b}. \]
Proof It is easy to see that $S = \bigcup_{a \in S} \overline{a}$.
To prove the second part, we show that the following equivalent claim is true:
\[ \text{if } \ \overline{a} \cap \overline{b} \ne \emptyset, \ \text{ then } \ \overline{a} = \overline{b}. \]
Let c be an element of $\overline{a} \cap \overline{b}$. By Definition 1.4.3, c ∼ a and c ∼ b. By symmetry (Definition 1.4.1), a ∼ c. By transitivity (Definition 1.4.1), a ∼ b. Hence $a \in \overline{b}$.
Now for any $d \in \overline{a}$, d ∼ a. By transitivity (Definition 1.4.1), d ∼ b. Then by Definition 1.4.3, $d \in \overline{b}$. We have $\overline{a} \subset \overline{b}$.
Similarly, we can prove $\overline{b} \subset \overline{a}$. Hence $\overline{a} = \overline{b}$. ⨅

Definition 1.4.4 For any $a \in \mathbb{Z}$, the congruence class of a modulo n, denoted $\overline{a}$, is defined to be the equivalence class of a with respect to the equivalence relation ∼ defined in Eq. 1.11.
We note that the set $\overline{a}$ consists of all integers of the form a + nk for some $k \in \mathbb{Z}$.
Lemma 1.4.1 Let $\mathbb{Z}_n$ denote the set of all congruence classes of $a \in \mathbb{Z}$ modulo n. Then
\[ \mathbb{Z}_n = \{ \overline{0}, \overline{1}, \dots, \overline{n-1} \}. \]
Proof By Theorem 1.1.2, given any $b \in \mathbb{Z}$, we can find $q, r \in \mathbb{Z}$ such that
\[ 0 \le r < n \ \text{ and } \ b = qn + r \implies b \sim r. \]
By Theorem 1.4.1, we have $\overline{b} = \overline{r}$. Hence the set $\{ \overline{0}, \overline{1}, \dots, \overline{n-1} \}$ contains all the congruence classes of integers modulo n, possibly with some repetitions.
If $\overline{r_1} = \overline{r_2}$ for some $0 \le r_1, r_2 < n$, then $n \mid (r_1 - r_2)$. Since $0 \le r_1, r_2 < n$, we have $r_1 = r_2$. Thus $\overline{0}, \overline{1}, \dots, \overline{n-1}$ are all distinct. ⨆

Remark 1.4.1 $\overline{a} = \overline{b}$ if and only if a ≡ b mod n.
Example 1.4.1 Let n = 5. We have $\overline{1} = \overline{6} = \overline{-4}$. By Lemma 1.4.1, $\mathbb{Z}_5 = \{ \overline{0}, \overline{1}, \overline{2}, \overline{3}, \overline{4} \}$.
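We define the addition operation on the set $\mathbb{Z}_n$ as follows:
\[ \overline{a} + \overline{b} = \overline{a + b}. \tag{1.12} \]
If $\overline{a} = \overline{a'}$ and $\overline{b} = \overline{b'}$, we have $n \mid (a' - a)$ and $n \mid (b' - b)$, therefore
\[ n \mid ((a' - a) + (b' - b)) \implies n \mid ((a' + b') - (a + b)) \implies (a + b) \sim (a' + b') \implies \overline{a + b} = \overline{a' + b'}. \]
Thus the addition in Eq. 1.12 is well-defined.

Example 1.4.2
• Let n = 7, $\overline{3} + \overline{2} = \overline{5}$.
• Let n = 4, $\overline{2} + \overline{2} = \overline{4} = \overline{0}$.
Proposition 1.4.1 $(\mathbb{Z}_n, +)$, the set $\mathbb{Z}_n$ together with addition defined in Eq. 1.12, is an abelian group.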
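Proof For any $\overline{a}, \overline{b} \in \mathbb{Z}_n$, $\overline{a} + \overline{b} = \overline{a + b} \in \mathbb{Z}_n$. Hence $\mathbb{Z}_n$ is closed under +. The associativity follows from the associativity of the addition of integers. The identity element is $\overline{0}$, and the inverse of $\overline{a}$ is $\overline{n - a}$:
\[ \overline{a} + \overline{n - a} = \overline{n - a} + \overline{a} = \overline{n} = \overline{0}. \]
The commutative property follows from that for integer addition. ⨆

Remark 1.4.2 The proof also shows that the additive inverse of an element $\overline{a} \in \mathbb{Z}_n$ is $\overline{n - a} = \overline{-a}$, and the identity element is $\overline{0}$.
Example 1.4.3
• Let n = 5; the inverse of $\overline{1}$ in $(\mathbb{Z}_5, +)$ is $\overline{5 - 1} = \overline{4}$.
• Let n = 8; the inverse of $\overline{2}$ in $(\mathbb{Z}_8, +)$ is $\overline{8 - 2} = \overline{6}$.
Lemma 1.4.2 $(\mathbb{Z}_n, +)$ is a cyclic group.
Proof Recall that the identity element in $(\mathbb{Z}_n, +)$ is $\overline{0}$. It is easy to see that $\overline{1}$ has order n (see Definition 1.2.6):
\[
\begin{aligned}
\overline{1} + \overline{1} &= \overline{2} \\
\overline{1} + \overline{1} + \overline{1} &= \overline{3} \\
&\ \ \vdots \\
\underbrace{\overline{1} + \overline{1} + \cdots + \overline{1}}_{n-1 \text{ times}} &= \overline{n - 1} \\
\underbrace{\overline{1} + \overline{1} + \cdots + \overline{1}}_{n \text{ times}} &= \overline{n} = \overline{0}.
\end{aligned}
\]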



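We define the multiplication on $\mathbb{Z}_n$ as follows:
\[ \overline{a} \cdot \overline{b} = \overline{ab}. \tag{1.13} \]
If $\overline{a'} = \overline{a}$ and $\overline{b'} = \overline{b}$, then we can write $a' = a + sn$, $b' = b + tn$ for some integers s, t. We have
\[ a'b' = ab + n(at + sb + stn) \implies a'b' \sim ab. \]
Hence $\overline{a'b'} = \overline{ab}$, and the multiplication in Eq. 1.13 is well-defined.
Example 1.4.4 Let n = 5,
\[ \overline{-2} \cdot \overline{13} = \overline{3} \cdot \overline{3} = \overline{9} = \overline{4}. \]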
Theorem 1.4.2 $(\mathbb{Z}_n, +, \cdot)$, the set $\mathbb{Z}_n$ together with addition defined in Eq. 1.12 and multiplication defined in Eq. 1.13, is a commutative ring. It is an integral domain if and only if n is prime.
Proof In Proposition 1.4.1 we have shown that $(\mathbb{Z}_n, +)$ is an abelian group.
Take any $\overline{a}, \overline{b} \in \mathbb{Z}_n$, $\overline{a} \cdot \overline{b} = \overline{ab} \in \mathbb{Z}_n$. Hence $\mathbb{Z}_n$ is closed under ·. Associativity, commutativity of multiplication, and distributive laws follow from that for the integers. The identity element for multiplication is $\overline{1}$. We have proved that $(\mathbb{Z}_n, +, \cdot)$ is a commutative ring.
If n is not a prime, let m be a prime that divides n. Then d = n/m is an integer and $\overline{d} \ne \overline{0}$. We have
\[ \overline{m} \cdot \overline{d} = \overline{n} = \overline{0}. \]
By Definition 1.2.10, $\overline{d}, \overline{m}$ are zero divisors in $\mathbb{Z}_n$. By Definition 1.2.11, $\mathbb{Z}_n$ is not an integral domain.
Let n be a prime. Suppose there are $\overline{a}, \overline{b} \in \mathbb{Z}_n$, such that $\overline{a} \ne \overline{0}$, $\overline{b} \ne \overline{0}$, and $\overline{a} \cdot \overline{b} = \overline{0}$. By definition, we have n|ab. By Lemma 1.1.2, n|a or n|b, which gives $\overline{a} = \overline{0}$ or $\overline{b} = \overline{0}$, a contradiction. ⨆
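The dichotomy in Theorem 1.4.2 is easy to observe numerically. Below is a brute-force sketch, assuming classes in $\mathbb{Z}_n$ are represented by the integers 0, ..., n − 1; the function name `zero_divisors` is illustrative.

```python
def zero_divisors(n):
    """Nonzero a in Z_n for which some nonzero b satisfies a*b = 0 in Z_n."""
    return sorted({a for a in range(1, n)
                   for b in range(1, n) if (a * b) % n == 0})


print(zero_divisors(6))  # [2, 3, 4]  (6 is composite)
print(zero_divisors(7))  # []         (7 is prime, so Z_7 is an integral domain)
```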

For simplicity, we write a instead of $\overline{a}$, and to make sure there is no confusion with $a \in \mathbb{Z}$, we would specify that $a \in \mathbb{Z}_n$. In particular, $\mathbb{Z}_n = \{ 0, 1, 2, \dots, n - 1 \}$. Furthermore, to emphasize that multiplication or addition is done in $\mathbb{Z}_n$, we write ab mod n or a + b mod n.

Example 1.4.5 Let n = 5, and we write
\[ 4 \times 2 \bmod 5 = 8 \bmod 5 = 3, \quad \text{or} \quad 4 \times 2 \equiv 8 \equiv 3 \bmod 5. \]

Lemma 1.4.3 For any $a \in \mathbb{Z}_n$, a ≠ 0, a has a multiplicative inverse, denoted $a^{-1} \bmod n$, if and only if gcd(a, n) = 1.

Proof By Bézout's identity (Theorem 1.1.3), gcd(a, n) = sa + tn for some $s, t \in \mathbb{Z}$.
⇐= If gcd(a, n) = 1, then sa + tn = 1, i.e., n|(1 − sa). By definition, sa ≡ 1 mod n; thus $a^{-1} \bmod n = s \bmod n$.
=⇒ On the other hand, if a has a multiplicative inverse, then there exists $s \in \mathbb{Z}_n$ such that as mod n = 1, which gives n|(as − 1). Hence there is some $t \in \mathbb{Z}$ such that 1 = as + tn. By Lemma 1.1.1 (6), gcd(a, n)|1. As gcd(a, n) > 0, we have gcd(a, n) = 1. ⨆

Remark 1.4.3 Recall that by the extended Euclidean algorithm (Algorithm 1.2), we can find integers s, t such that gcd(a, n) = sa + tn for any $a, n \in \mathbb{Z}$. In particular, when gcd(a, n) = 1, we can find s, t such that 1 = as + tn, which gives as mod n = 1. Thus, we can find $a^{-1} \bmod n = s \bmod n$ by the extended Euclidean algorithm.
Example 1.4.6 We calculated in Example 1.1.15 that gcd(160, 21) = 1 and 1 = (−8) × 160 + 61 × 21. We have $21^{-1} \bmod 160 = 61$.
Example 1.4.7 Let p = 5, q = 7. By the extended Euclidean algorithm,
\[ 7 = 5 \times 1 + 2, \qquad 5 = 2 \times 2 + 1, \]
\[ 1 = 5 - 2 \times 2 = 5 - (7 - 5) \times 2 = 5 \times 3 - 7 \times 2. \]
We have
\[ p^{-1} \bmod q = 5^{-1} \bmod 7 = 3, \qquad q^{-1} \bmod p = 7^{-1} \bmod 5 = -2 \bmod 5 = 3. \]

Example 1.4.8 Let p = 7, q = 47. By the extended Euclidean algorithm,
\[ 47 = 7 \times 6 + 5, \qquad 7 = 5 \times 1 + 2, \qquad 5 = 2 \times 2 + 1, \]
\[
\begin{aligned}
1 &= 5 - 2 \times 2 = 5 - (7 - 5) \times 2 = 5 \times 3 - 7 \times 2 = (47 - 7 \times 6) \times 3 - 7 \times 2 \\
  &= 47 \times 3 - 7 \times 20.
\end{aligned}
\]
We have
\[ p^{-1} \bmod q = 7^{-1} \bmod 47 = -20 \bmod 47 = 27, \qquad q^{-1} \bmod p = 47^{-1} \bmod 7 = 3. \]
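The inverses in Examples 1.4.6 to 1.4.8 can be reproduced with a short program. The following is one possible sketch of the approach described in Remark 1.4.3, not a transcription of the book's Algorithm 1.2; the names `extended_gcd` and `mod_inverse` are illustrative, and nonnegative inputs are assumed.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) = s*a + t*b, for a, b >= 0."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def mod_inverse(a, n):
    """Return a^{-1} mod n, or raise ValueError if gcd(a, n) != 1."""
    g, s, _ = extended_gcd(a % n, n)
    if g != 1:
        raise ValueError(f"{a} has no inverse modulo {n}")
    return s % n  # reduce the Bezout coefficient into Z_n


print(mod_inverse(21, 160))  # 61
print(mod_inverse(5, 7), mod_inverse(7, 5))    # 3 3
print(mod_inverse(7, 47), mod_inverse(47, 7))  # 27 3
```

In Python 3.8 and later, the built-in `pow(a, -1, n)` computes the same value and can serve as a cross-check.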

Corollary 1.4.1 $\mathbb{Z}_n$ is a field if and only if n is prime.
Proof By Theorem 1.4.2, $\mathbb{Z}_n$ is a commutative ring. By Definition 1.2.12 and Lemma 1.4.3, $\mathbb{Z}_n$ is a field if and only if for any $a \in \mathbb{Z}_n$ such that a ≠ 0, we have gcd(a, n) = 1, which is true if and only if n is a prime. ⨆

Corollary 1.4.2 For any $a \in \mathbb{Z}_n$, if gcd(a, n) = 1, then the set $\{ ab \mid b \in \mathbb{Z}_n \} = \mathbb{Z}_n$.
Proof It is clear from the definition that $\{ ab \mid b \in \mathbb{Z}_n \} \subseteq \mathbb{Z}_n$. As there are n distinct values for b, it suffices to prove that $ab_1 \not\equiv ab_2 \bmod n$ for $b_1, b_2 \in \mathbb{Z}_n$ with $b_1 \ne b_2$. We will prove the claim by contradiction.
Assume
\[ ab_1 \equiv ab_2 \bmod n \tag{1.14} \]
and $b_1 \ne b_2$. By Lemma 1.4.3, $a^{-1}$ exists. Multiplying both sides of Eq. 1.14 by $a^{-1}$, we get $b_1 \equiv b_2 \bmod n$, a contradiction. ⨆

We note that when p is prime, $\mathbb{Z}_p$ is the unique finite field $\mathbb{F}_p$ up to isomorphism (see Theorem 1.2.3 and Remark 1.2.2).
Lemma 1.4.3 leads us to the following definition.
Definition 1.4.5 Let $\mathbb{Z}_n^*$ denote the set of congruence classes in $\mathbb{Z}_n$ which have multiplicative inverses:
\[ \mathbb{Z}_n^* := \{ a \mid a \in \mathbb{Z}_n,\ \gcd(a, n) = 1 \}. \]
Euler's totient function, ϕ, is a function defined on the set of integers bigger than 1 such that ϕ(n) gives the cardinality of $\mathbb{Z}_n^*$:
\[ \phi(n) = |\mathbb{Z}_n^*|. \]
Example 1.4.9
• Let n = 3, $\mathbb{Z}_3^* = \{1, 2\}$, ϕ(3) = 2.
• Let n = 4, $\mathbb{Z}_4^* = \{1, 3\}$, ϕ(4) = 2.
• Let n = p be a prime number, $\mathbb{Z}_p^* = \mathbb{Z}_p - \{0\} = \{1, 2, \dots, p - 1\}$,$^3$ ϕ(p) = p − 1.
Lemma 1.4.4 $(\mathbb{Z}_n^*, \cdot)$, the set $\mathbb{Z}_n^*$ together with the multiplication defined in $\mathbb{Z}_n$ (Eq. 1.13), is an abelian group.
Proof For any $a, b \in \mathbb{Z}_n^*$, $a^{-1}, b^{-1} \in \mathbb{Z}_n^*$. We note that $(ab)(b^{-1}a^{-1}) = 1$; hence ab has an inverse in $\mathbb{Z}_n^*$ and $ab \in \mathbb{Z}_n^*$ (closure). The associativity follows from that for multiplications in $\mathbb{Z}$. The identity element is 1, and Lemma 1.4.3 proves that every element has an inverse in $\mathbb{Z}_n^*$. ⨆

Recall by the Fundamental Theorem of Arithmetic (Theorem 1.1.5), every integer n > 1 is either a prime or can be written as a product of primes in a unique way. We have the following result concerning Euler's totient function. The proof can be found in, e.g., [Sie88, page 247].
Theorem 1.4.3 For any $n \in \mathbb{Z}$, n > 1,
\[ \text{if} \quad n = \prod_{i=1}^{k} p_i^{e_i}, \quad \text{then} \quad \phi(n) = n \prod_{i=1}^{k} \left( 1 - \frac{1}{p_i} \right), \tag{1.15} \]
where $p_i$ are distinct primes.

3 Recall the difference between sets defined in Eq. 1.1.



Example 1.4.10
• Let n = 10. 10 = 2 × 5. We can count the elements in $\mathbb{Z}_{10} = \{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 \}$ that are coprime to 10: they are 1, 3, 7, and 9. There are four of them. By Eq. 1.15, we also have
\[ \phi(10) = 10 \times \left( 1 - \frac{1}{2} \right) \times \left( 1 - \frac{1}{5} \right) = 4. \]
• Let n = 120. $120 = 2^3 \times 3 \times 5$. We have
\[ \phi(120) = 120 \times \left( 1 - \frac{1}{2} \right) \times \left( 1 - \frac{1}{3} \right) \times \left( 1 - \frac{1}{5} \right) = 32. \]
• Let n = pq, where p and q are two distinct primes. Then
\[ \phi(n) = pq \left( 1 - \frac{1}{p} \right) \left( 1 - \frac{1}{q} \right) = (p - 1)(q - 1). \]
• Let $n = p^k$, where p is a prime and $k \in \mathbb{Z}$, k ≥ 1. Then
\[ \phi(p^k) = p^k \left( 1 - \frac{1}{p} \right) = p^{k-1}(p - 1). \]
• In particular, if p = 2,
\[ \phi(2^k) = 2^{k-1}. \]
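The two ways of computing ϕ(n) used in Example 1.4.10, counting elements coprime to n and applying Eq. 1.15, can be compared in code. This is a minimal sketch; both function names are illustrative, and the factorization uses plain trial division, which is only suitable for small n.

```python
from math import gcd


def phi_by_counting(n):
    """|Z_n^*|: count a in {0, ..., n-1} with gcd(a, n) = 1."""
    return sum(1 for a in range(n) if gcd(a, n) == 1)


def phi_by_formula(n):
    """Eq. 1.15: n * prod(1 - 1/p) over the distinct primes p dividing n."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p  # multiply result by (1 - 1/p)
        p += 1
    if m > 1:  # one prime factor larger than sqrt(n) may remain
        result -= result // m
    return result


print(phi_by_counting(10), phi_by_formula(10))    # 4 4
print(phi_by_counting(120), phi_by_formula(120))  # 32 32
```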

Theorem 1.4.4 (Euler's Theorem) For any $a \in \mathbb{Z}$, $a^{\phi(n)} \equiv 1 \bmod n$ if gcd(a, n) = 1.
Proof By definition, $|\mathbb{Z}_n^*| = \phi(n)$. If gcd(a, n) = 1, then $a \in \mathbb{Z}_n^*$. The result follows from Theorem 1.2.1. ⨆

Example 1.4.11 Let n = 4. We have calculated that ϕ(4) = 2 in Example 1.4.9. And
\[ 3^2 = 9 \equiv 1 \bmod 4. \]
Let n = 10. We have calculated that ϕ(10) = 4 in Example 1.4.10. And
\[ 3^4 = 81 \equiv 1 \bmod 10. \]
Since ϕ(p) = p − 1 (Example 1.4.9), a direct corollary of Euler's Theorem is Fermat's Little Theorem.
Theorem 1.4.5 (Fermat's Little Theorem) Let p be a prime. For any $a \in \mathbb{Z}$, if p ∤ a, then $a^{p-1} \equiv 1 \bmod p$.
Example 1.4.12
• Let p = 3. $2^2 = 4 \equiv 1 \bmod 3$.
• Let p = 5. $2^4 = 16 \equiv 1 \bmod 5$.
Corollary 1.4.3 Let p be a prime. Then for any $a, b, c \in \mathbb{Z}$ such that b ≡ c mod (p − 1), we have
\[ a^b \equiv a^c \bmod p. \]
In particular,
\[ a^b \equiv a^{b \bmod (p-1)} \bmod p. \]
Proof By Fermat's Little Theorem (Theorem 1.4.5),
\[ a^{p-1} \equiv \begin{cases} 1 \bmod p & \text{if } p \nmid a \\ 0 \bmod p & \text{otherwise.} \end{cases} \]
Since b ≡ c mod (p − 1), b − c = (p − 1)k for some $k \in \mathbb{Z}$. And
\[ a^b \equiv a^{c + (p-1)k} \equiv a^c a^{(p-1)k} \equiv \begin{cases} a^c \bmod p & \text{if } p \nmid a \\ 0 \bmod p & \text{otherwise} \end{cases} \ \equiv a^c \bmod p. \]
⨆

Example 1.4.13 Let p = 5, a = 2, b = 6. Then
\[ 2^6 \equiv 2^{6 \bmod 4} \equiv 2^2 \equiv 4 \bmod 5. \]
We can verify that indeed
\[ 2^6 \equiv 64 \equiv 4 \bmod 5. \]
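The exponent reduction in Example 1.4.13 can be checked with Python's built-in three-argument `pow`. The sketch below restricts itself to the case p ∤ a and b ≥ 0, which is the case covered directly by Fermat's Little Theorem; the function name `pow_mod_prime` is illustrative.

```python
def pow_mod_prime(a, b, p):
    """Compute a^b mod p for a prime p, reducing the exponent modulo p - 1.

    Assumes b >= 0 and that p does not divide a.
    """
    if a % p == 0:
        raise ValueError("this sketch assumes p does not divide a")
    return pow(a, b % (p - 1), p)


print(pow_mod_prime(2, 6, 5))  # 4, as in Example 1.4.13
print(pow(2, 6, 5))            # 4, direct computation for comparison
```

Reducing the exponent first matters when b is very large, since the modular exponentiation then works with an exponent smaller than p − 1.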

Corollary 1.4.4 Let p be a prime and b be an integer coprime to ϕ(p). For any $a_1, a_2 \in \mathbb{Z}_p$, if $a_1 \ne a_2$, then $a_1^b \not\equiv a_2^b \bmod p$.
Proof Suppose $a_1 \ne a_2$ and $a_1^b \equiv a_2^b \bmod p$. Let $c = b^{-1} \bmod \phi(p)$, then
\[ a_1^{bc} \equiv a_2^{bc} \bmod p, \quad \text{and} \quad bc \equiv 1 \bmod \phi(p). \]
By Corollary 1.4.3, $a_1 \equiv a_2 \bmod p$. Since $a_1, a_2 \in \mathbb{Z}_p$, we have $a_1 = a_2$, a contradiction. ⨅

Example 1.4.14 Let p = 7, then ϕ(p) = 6. Let $a_1 = 3$, $a_2 = 4$, b = 5. Then
\[ a_1^b \equiv 3^5 \equiv 243 \equiv 5 \bmod 7, \qquad a_2^b \equiv 4^5 \equiv 1024 \equiv 2 \bmod 7. \]
This agrees with Corollary 1.4.4. On the other hand, if we let b = 2, which is not coprime to ϕ(p), we have
\[ a_1^b \equiv 3^2 \equiv 9 \equiv 2 \bmod 7, \qquad a_2^b \equiv 4^2 \equiv 16 \equiv 2 \bmod 7. \]
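Corollary 1.4.4 says that the map $a \mapsto a^b \bmod p$ is injective on $\mathbb{Z}_p$ when b is coprime to p − 1. For a small prime this can be observed by listing all images; the helper name `power_map` below is illustrative.

```python
def power_map(b, p):
    """Images of 0, 1, ..., p-1 under a -> a^b mod p."""
    return [pow(a, b, p) for a in range(p)]


print(power_map(5, 7))  # [0, 1, 4, 5, 2, 3, 6]: all distinct, gcd(5, 6) = 1
print(power_map(2, 7))  # [0, 1, 4, 2, 2, 4, 1]: collisions, gcd(2, 6) = 2
```

The entries at positions 3 and 4 of the two lists match the values computed in Example 1.4.14.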

1.4.1 Solving Linear Congruences

In this part, we will discuss how to solve a system of linear congruences in $\mathbb{Z}_n$. We first consider one linear congruence equation.
Lemma 1.4.5 For any $a, b \in \mathbb{Z}$, the linear congruence
\[ ax \equiv b \bmod n \]
has at least one solution in $\mathbb{Z}$ if and only if gcd(a, n)|b.
Proof By Definition 1.4.2, the linear congruence is equivalent to the following equation for some $k \in \mathbb{Z}$:
\[ ax + kn = b. \tag{1.16} \]
=⇒ By Lemma 1.1.1 (6), gcd(a, n)|b.
⇐= Assume gcd(a, n)|b, then $\frac{b}{\gcd(a, n)}$ is an integer. By Bézout's identity (Theorem 1.1.3), we can find integers s, t such that as + tn = gcd(a, n). Multiplying both sides by $\frac{b}{\gcd(a, n)}$, we have
\[ a \frac{sb}{\gcd(a, n)} + n \frac{tb}{\gcd(a, n)} = b. \]
Thus $x = \frac{sb}{\gcd(a, n)}$ is a solution for Eq. 1.16. ⨆

Example 1.4.15 Let n = 10, a = 4. Then gcd(a, n) = 2. By Lemma 1.4.5, the linear congruence 4x ≡ 1 mod 10 has no solution. Indeed, if we try to multiply any integer by 4 and divide by 10, we will not get an odd remainder.
On the other hand, the linear congruence 4x ≡ 2 mod 10 has at least one solution. For example, x = 3 is a solution (4 × 3 ≡ 12 ≡ 2 mod 10).
Theorem 1.4.6 For any $a, b \in \mathbb{Z}$, the linear congruence
\[ ax \equiv b \bmod n \]
has a unique solution $x \in \mathbb{Z}_n$ if and only if gcd(a, n) = 1.
Proof =⇒ Suppose gcd(a, n) > 1, and $x_0 \in \mathbb{Z}_n$ is a solution for the linear congruence. Let $x_1 = x_0 + \frac{n}{\gcd(a, n)}$, then
\[ ax_1 \equiv ax_0 + \frac{a}{\gcd(a, n)} n \equiv ax_0 \bmod n. \]
Since gcd(a, n) > 1, $\frac{n}{\gcd(a, n)} \not\equiv 0 \bmod n$, and we have $x_1 \not\equiv x_0 \bmod n$. Thus $x_1 \bmod n$ is another solution in $\mathbb{Z}_n$.
⇐= Suppose gcd(a, n) = 1. Take any two solutions $x_0, x_1 \in \mathbb{Z}_n$, and we have $ax_1 \equiv ax_0 \bmod n$. Then
\[ a(x_0 - x_1) \equiv 0 \bmod n \implies n \mid a(x_0 - x_1). \]
Since gcd(n, a) = 1, n ∤ a. By Lemma 1.1.1 (7), $n \mid (x_0 - x_1)$. As $x_0, x_1 \in \mathbb{Z}_n$, $0 \le x_0, x_1 < n$, we must have $x_0 - x_1 = 0$. ⨅

Example 1.4.16
• Let n = 10, a = 3. 3x ≡ 4 mod 10 has a unique solution $x = 8 \in \mathbb{Z}_{10}$.
• Let n = 10, a = 4. 4x ≡ 4 mod 10 has two solutions in $\mathbb{Z}_{10}$: x = 1, 6.
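Lemma 1.4.5 and Theorem 1.4.6 together suggest a simple solver. The sketch below goes slightly beyond what the text states: it uses the standard fact that, when g = gcd(a, n) divides b, the congruence has exactly g solutions in $\mathbb{Z}_n$, spaced n/g apart. The function name is illustrative, and `pow(x, -1, m)` requires Python 3.8 or later.

```python
from math import gcd


def solve_linear_congruence(a, b, n):
    """Return all x in Z_n with a*x ≡ b (mod n), in increasing order."""
    g = gcd(a, n)
    if b % g != 0:
        return []  # Lemma 1.4.5: no solution unless gcd(a, n) divides b
    a_red, b_red, n_red = a // g, b // g, n // g
    if n_red == 1:
        x0 = 0  # every integer is a solution modulo 1
    else:
        # a_red is coprime to n_red, so its inverse exists (Lemma 1.4.3).
        x0 = (pow(a_red % n_red, -1, n_red) * b_red) % n_red
    return [x0 + k * n_red for k in range(g)]


print(solve_linear_congruence(4, 1, 10))  # []
print(solve_linear_congruence(4, 2, 10))  # [3, 8]
print(solve_linear_congruence(3, 4, 10))  # [8]
print(solve_linear_congruence(4, 4, 10))  # [1, 6]
```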
We now know when there are solutions for a linear congruence and when the
solution is unique in .Zn . Next, we will discuss the formulas to find the solution
when it is unique. Also, instead of only looking at one equation, the method can
find the solution for a few equations, which are called a system of simultaneous
congruences, at the same time.
Such a problem was mentioned in an ancient Chinese math book called “Sun Zi
Suan Jing.” The question in the book asks: “There is something whose amount is
unknown. If we count by threes, 2 are remaining; by fives, 3 are remaining; and by
sevens, 2 are remaining. How many things are there?” Translating to our notations,
the question is

\[
\begin{aligned}
x &\equiv 2 \bmod 3 \\
x &\equiv 3 \bmod 5 \\
x &\equiv 2 \bmod 7 \\
x &\equiv\ ?
\end{aligned}
\tag{1.17}
\]

Before answering the question, we provide the solution for a more general case.
Let us consider a system of simultaneous linear congruences
\[
\begin{aligned}
x &\equiv a_1 \bmod m_1 \\
x &\equiv a_2 \bmod m_2 \\
&\ \ \vdots \\
x &\equiv a_k \bmod m_k,
\end{aligned}
\tag{1.18}
\]
where $m_i$ are pairwise coprime positive integers, i.e., $\gcd(m_i, m_j) = 1$ for $i \ne j$. Define
\[ m = \prod_{i=1}^{k} m_i, \qquad M_i = \frac{m}{m_i}, \quad 1 \le i \le k. \tag{1.19} \]
Since $m_i$ are pairwise coprime, $m_i$ and $M_i$ are coprime. By Lemma 1.4.3, $y_i := M_i^{-1} \bmod m_i$ exists. It can be computed by the extended Euclidean algorithm (see Remark 1.4.3). Let
\[ x = \sum_{i=1}^{k} a_i y_i M_i \bmod m. \tag{1.20} \]
Since $y_i = M_i^{-1} \bmod m_i$ and $m_j \mid M_i$ for $j \ne i$, we have
\[ a_i y_i M_i \equiv a_i \bmod m_i, \quad \text{and} \quad a_j y_j M_j \equiv 0 \bmod m_i \ \text{ if } j \ne i. \]
Then,
\[ x \equiv a_i y_i M_i + \sum_{1 \le j \le k,\ j \ne i} a_j y_j M_j \equiv a_i \bmod m_i \quad \text{for all } i. \]
Thus, x is a solution to the system of simultaneous linear congruences in Eq. 1.18.


Now, we can compute a solution to Eq. 1.17. We have
\[ m_1 = 3, \quad m_2 = 5, \quad m_3 = 7, \quad a_1 = 2, \quad a_2 = 3, \quad a_3 = 2, \]
and
\[ m = 3 \times 5 \times 7 = 105, \quad M_1 = 35, \quad M_2 = 21, \quad M_3 = 15. \]
By the extended Euclidean algorithm, we get
\[
\begin{aligned}
y_1 &= M_1^{-1} \bmod 3 = 2^{-1} \bmod 3 = 2, \\
y_2 &= M_2^{-1} \bmod 5 = 1^{-1} \bmod 5 = 1, \\
y_3 &= M_3^{-1} \bmod 7 = 1^{-1} \bmod 7 = 1.
\end{aligned}
\]
And a solution to Eq. 1.17 is given by
\[ x = \sum_{i=1}^{3} a_i y_i M_i \bmod m = (2 \times 2 \times 35 + 3 \times 1 \times 21 + 2 \times 1 \times 15) \bmod 105 = 233 \bmod 105 = 23. \]

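Example 1.4.17 Let us solve the following system of simultaneous linear congruences:
\[
\begin{aligned}
x &\equiv 2 \bmod 5 \\
x &\equiv 1 \bmod 7 \\
x &\equiv 5 \bmod 11 \\
x &\equiv\ ? \bmod 385.
\end{aligned}
\]
Following the above procedures, we have
\[ m_1 = 5, \quad m_2 = 7, \quad m_3 = 11, \quad a_1 = 2, \quad a_2 = 1, \quad a_3 = 5, \]
\[ m = 5 \times 7 \times 11 = 385, \quad M_1 = 77, \quad M_2 = 55, \quad M_3 = 35. \]
Then
\[ M_1 \equiv 77 \equiv 2 \bmod 5, \quad M_2 \equiv 55 \equiv 6 \bmod 7, \quad M_3 \equiv 35 \equiv 2 \bmod 11. \]
With the extended Euclidean algorithm, we have found
\[ y_1 = M_1^{-1} \bmod 5 = 3, \quad y_2 = M_2^{-1} \bmod 7 = 6, \quad y_3 = M_3^{-1} \bmod 11 = 6. \]
And
\[ x = \sum_{i=1}^{3} a_i y_i M_i \bmod m = (2 \times 3 \times 77 + 1 \times 6 \times 55 + 5 \times 6 \times 35) \bmod 385 = 1842 \bmod 385 = 302. \]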
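The construction in Eqs. 1.19 and 1.20 translates almost line by line into code. The following is a minimal sketch, assuming the moduli are pairwise coprime as required; the function name `crt` is illustrative, and the inverse $y_i$ is obtained with Python's built-in `pow(M_i, -1, m_i)` (Python 3.8 or later) instead of a hand-written extended Euclidean algorithm.

```python
from math import prod


def crt(residues, moduli):
    """Solve x ≡ a_i (mod m_i) for pairwise coprime m_i; return x in Z_m."""
    m = prod(moduli)
    x = 0
    for a_i, m_i in zip(residues, moduli):
        M_i = m // m_i             # Eq. 1.19
        y_i = pow(M_i, -1, m_i)    # y_i = M_i^{-1} mod m_i
        x += a_i * y_i * M_i       # one summand of Eq. 1.20
    return x % m


print(crt([2, 3, 2], [3, 5, 7]))   # 23, the solution to Eq. 1.17
print(crt([2, 1, 5], [5, 7, 11]))  # 302, as in Example 1.4.17
```

If the moduli are not pairwise coprime, some `pow(M_i, -1, m_i)` call raises `ValueError`, which reflects the fact that the corresponding inverse does not exist.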

We have shown how to find a solution to a system of simultaneous linear congruences. The following theorem says that our solution is unique in $\mathbb{Z}_m$.
Theorem 1.4.7 (Chinese Remainder Theorem) Let $m_1, m_2, \dots, m_k$ be pairwise coprime integers. For any $a_1, a_2, \dots, a_k \in \mathbb{Z}$, the system of simultaneous congruences
\[ x \equiv a_1 \bmod m_1, \quad x \equiv a_2 \bmod m_2, \quad \dots \quad x \equiv a_k \bmod m_k \]
has a unique solution modulo $m = \prod_{i=1}^{k} m_i$.
Proof The discussions above have shown the existence of such a solution. To prove the uniqueness, let $x_1, x_2 \in \mathbb{Z}_m$ be two solutions for the system of simultaneous congruences. Then
\[ x_1 \equiv x_2 \bmod m_1, \quad x_1 \equiv x_2 \bmod m_2, \quad \dots \quad x_1 \equiv x_2 \bmod m_k. \]
By definition, we have
\[ m_1 \mid (x_1 - x_2), \quad m_2 \mid (x_1 - x_2), \quad \dots \quad m_k \mid (x_1 - x_2). \]
Since the $m_i$ are pairwise coprime, by Lemma 1.1.1 (8), we can conclude that $m = \prod_{i=1}^{k} m_i$ divides $x_1 - x_2$. As $x_1$ and $x_2$ are from $\mathbb{Z}_m$, we must have $x_1 = x_2$. ⨆

Example 1.4.18 Let p = 3, q = 5, n = 15, a = 10. We would like to find the unique solution $x \in \mathbb{Z}_{15}$ such that
\[ x \equiv 10 \bmod 3, \qquad x \equiv 10 \bmod 5. \]
We have
\[ m_1 = p = 3, \quad m_2 = q = 5, \quad a_1 = a_2 = a = 10. \]
Hence
\[ m = n = 15, \quad M_1 = 5, \quad M_2 = 3, \quad y_1 = 5^{-1} \bmod 3 = 2, \quad y_2 = 3^{-1} \bmod 5 = 2. \]
And
\[ x = a_1 y_1 M_1 + a_2 y_2 M_2 \bmod n = (10 \times 2 \times 5 + 10 \times 2 \times 3) \bmod 15 = 160 \bmod 15 = 10. \]

Example 1.4.19 Take two distinct primes p, q, and let n = pq. By Theorem 1.4.7, for any $a \in \mathbb{Z}_n$, there is a unique solution $x \in \mathbb{Z}_n$ such that
\[ x \equiv a \bmod p, \qquad x \equiv a \bmod q. \tag{1.21} \]
Since a ≡ a mod p and a ≡ a mod q, the unique solution is given by $x = a \in \mathbb{Z}_n$. In other words, there is no other element $b \in \mathbb{Z}_n$ different from a that satisfies Eq. 1.21.
On the other hand, following the above procedures for finding the solution, we have
\[ m_1 = p, \quad m_2 = q, \quad a_1 = a_2 = a. \]
And
\[ m = n = pq, \quad M_1 = q, \quad M_2 = p, \quad y_1 = q^{-1} \bmod p, \quad y_2 = p^{-1} \bmod q. \]
Then
\[
\begin{aligned}
x = a_1 y_1 M_1 + a_2 y_2 M_2 \bmod n &= \left( a(q^{-1} \bmod p)q + a(p^{-1} \bmod q)p \right) \bmod n \\
&= \left( a \left( (q^{-1} \bmod p)q + (p^{-1} \bmod q)p \right) \right) \bmod n.
\end{aligned}
\]
By definition,
\[ (q^{-1} \bmod p)q = pk_1 + 1, \qquad (p^{-1} \bmod q)p = qk_2 + 1, \]
for some integers $k_1, k_2$. Thus
\[ p \mid \left( (q^{-1} \bmod p)q + (p^{-1} \bmod q)p - 1 \right), \]
and
\[ q \mid \left( (q^{-1} \bmod p)q + (p^{-1} \bmod q)p - 1 \right). \]
By Lemma 1.1.1 (8), we have
\[ n \mid \left( (q^{-1} \bmod p)q + (p^{-1} \bmod q)p - 1 \right) \implies (q^{-1} \bmod p)q + (p^{-1} \bmod q)p \equiv 1 \bmod n. \]
Thus
\[ x = \left( a \left( (q^{-1} \bmod p)q + (p^{-1} \bmod q)p \right) \right) \bmod n = a \bmod n. \]

Corollary 1.4.5 Let p and q be two distinct primes and n = pq. For any $a, b \in \mathbb{Z}$, we have
\[ a^b \equiv a^{b \bmod \phi(n)} \bmod n. \]
Proof Since ϕ(n) = (p − 1)(q − 1),
\[ b \bmod \phi(n) \equiv b \bmod (p - 1), \qquad b \bmod \phi(n) \equiv b \bmod (q - 1). \]
By Corollary 1.4.3,
\[ a^b \equiv a^{b \bmod \phi(n)} \bmod p, \qquad a^b \equiv a^{b \bmod \phi(n)} \bmod q. \]
By Example 1.4.19,
\[ a^b \equiv a^{b \bmod \phi(n)} \bmod n. \]
⨆

Example 1.4.20 Let p = 3, q = 5, a = 2, b = 9. Then n = 15 and ϕ(n) = 2 × 4 = 8. And
\[ 2^9 \equiv 2^{9 \bmod 8} \equiv 2 \bmod 15. \]
We can check that
\[ 2^9 \equiv 512 \equiv 2 \bmod 15. \]

Corollary 1.4.6 Let p and q be two distinct primes and n = pq. For any $a_1, a_2 \in \mathbb{Z}_n$ and $b \in \mathbb{Z}_{\phi(n)}^*$, if $a_1 \ne a_2$, then $a_1^b \not\equiv a_2^b \bmod n$.
Proof Suppose $a_1^b \equiv a_2^b \bmod n$. Let $c = b^{-1} \bmod \phi(n)$, then
\[ a_1^{bc} \equiv a_2^{bc} \bmod n, \quad \text{and} \quad bc \equiv 1 \bmod \phi(n). \]
By Corollary 1.4.5, $a_1 \equiv a_2 \bmod n$. Since $a_1, a_2 \in \mathbb{Z}_n$, we have $a_1 = a_2$, a contradiction. ⨅

Example 1.4.21 Let p = 5, q = 7, $a_1 = 4$, $a_2 = 6$. Then n = 35 and ϕ(n) = 4 × 6 = 24. Choose b = 5, and we have
\[ a_1^b \equiv 4^5 \equiv 9 \bmod 35, \qquad a_2^b \equiv 6^5 \equiv 6 \bmod 35. \]
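The proof of Corollary 1.4.6 also tells us how to undo the map $a \mapsto a^b \bmod n$: raise the result to the power $c = b^{-1} \bmod \phi(n)$. The sketch below checks this on the parameters of Example 1.4.21; the helper name `power_map_mod_n` is illustrative, and `pow(b, -1, phi)` requires Python 3.8 or later.

```python
def power_map_mod_n(b, n):
    """Images of 0, 1, ..., n-1 under a -> a^b mod n."""
    return [pow(a, b, n) for a in range(n)]


p, q, b = 5, 7, 5
n, phi = p * q, (p - 1) * (q - 1)  # n = 35, phi(n) = 24
c = pow(b, -1, phi)                # c = b^{-1} mod phi(n)

images = power_map_mod_n(b, n)
is_bijection = sorted(images) == list(range(n))
# Raising each image to the power c recovers the original element.
undone = all(pow(y, c, n) == a for a, y in enumerate(images))

print(c, is_bijection, undone)  # 5 True True
print(images[4], images[6])     # 9 6, as in Example 1.4.21
```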

1.5 Polynomial Rings

In this section, we introduce another example of commutative rings: polynomial rings. Throughout this section, let (F, +, ·) be a field with additive identity 0 and multiplicative identity 1.
Definition 1.5.1
• Define
\[ F[x] := \left\{ \sum_{i=0}^{n} a_i x^i \;\middle|\; a_i \in F,\ n \ge 0 \right\}. \]
An element $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \in F[x]$ is called a polynomial over F.
• If $a_n \ne 0$, we define the degree of f(x), denoted deg(f(x)), to be n. Following the convention, we define deg(0) = −∞.
Example 1.5.1 Let $F = \mathbb{R}$, then $f(x) = x + 1 \in \mathbb{R}[x]$ is a polynomial over $\mathbb{R}$ and deg(f(x)) = 1.
Take $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$, $g(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_0$ from F[x]. Without loss of generality, let us assume n ≥ m. Then we can write $g(x) = b_n x^n + b_{n-1} x^{n-1} + \cdots + b_0$, where $b_i = 0$ for i > m. We define addition $+_{F[x]}$ and multiplication $\times_{F[x]}$ as follows:
\[ f(x) +_{F[x]} g(x) := c_n x^n + c_{n-1} x^{n-1} + \cdots + c_0, \quad \text{where } c_i = a_i + b_i \tag{1.22} \]
and
\[ f(x) \times_{F[x]} g(x) := d_{n+m} x^{n+m} + d_{n+m-1} x^{n+m-1} + \cdots + d_0, \quad \text{where } d_i = \sum_{j=0}^{i} a_j b_{i-j}, \tag{1.23} \]
with the convention that $a_j = 0$ for j > n and $b_j = 0$ for j > m.

It is easy to show the following proposition.
Proposition 1.5.1 With the addition $+_{F[x]}$ and multiplication $\times_{F[x]}$ defined in Eqs. 1.22 and 1.23, $(F[x], +_{F[x]}, \times_{F[x]})$ is a commutative ring. It is called the polynomial ring over F.
The identity element for $+_{F[x]}$ is 0, the additive identity in F. The identity element for $\times_{F[x]}$ is 1, the multiplicative identity in F. The additive inverse of a polynomial
\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0 \]
is given by
\[ -f(x) = -a_n x^n - a_{n-1} x^{n-1} - \cdots - a_0, \]
where $-a_i$ is the additive inverse of $a_i$ in F. For simplicity, we will write f(x)g(x) and f(x) + g(x) instead of $f(x) \times_{F[x]} g(x)$ and $f(x) +_{F[x]} g(x)$.
Example 1.5.2 Let $F = \mathbb{R}$, and $\mathbb{R}[x]$ is a ring. The identity element for multiplication is 1. The identity element for addition is 0. Take f(x) = x + 1, g(x) = x in $\mathbb{R}[x]$,
\[ f(x) + g(x) = 2x + 1, \qquad f(x)g(x) = x^2 + x. \]
The additive inverse of f(x) is
\[ -x - 1. \]
Lemma 1.5.1 For any $f(x), g(x) \in F[x]$, such that f(x) ≠ 0, g(x) ≠ 0, we have
\[ \deg(f(x)g(x)) = \deg(f(x)) + \deg(g(x)). \]
Proof Let m = deg(f(x)) and n = deg(g(x)). Then we can write
\[ f(x) = \sum_{i=0}^{m} a_i x^i, \quad g(x) = \sum_{j=0}^{n} b_j x^j, \quad \text{where } a_m \ne 0,\ b_n \ne 0. \]
By Eq. 1.23, f(x)g(x) = d(x), where the highest power of x in d(x) is m + n, and its coefficient is $a_m b_n \ne 0$. We have deg(d(x)) = m + n. ⨆

Lemma 1.5.2 F[x] is an integral domain.
Proof For any $f(x), g(x) \in F[x]$, such that f(x) ≠ 0, g(x) ≠ 0, we have deg(f(x)) ≥ 0, deg(g(x)) ≥ 0. By Lemma 1.5.1, deg(f(x)g(x)) ≥ 0, and hence f(x)g(x) ≠ 0. ⨆

Similar to Euclid's algorithm (Theorem 1.1.2), we have the following theorem. The proof can be found in, e.g., [Her96, page 155].
Theorem 1.5.1 (Division Algorithm) For any $f(x), g(x) \in F[x]$, if deg(f(x)) ≥ 1, there exist $s(x), r(x) \in F[x]$ such that deg(r(x)) < deg(f(x)) and
\[ g(x) = s(x)f(x) + r(x). \]
r(x) is called the remainder, and s(x) is called the quotient.
Definition 1.5.2 Let $f(x), g(x) \in F[x]$; if f(x) ≠ 0 and g(x) = s(x)f(x) for some $s(x) \in F[x]$, then we say f(x) divides g(x), written f(x)|g(x).
Example 1.5.3 Let $F = \mathbb{F}_5$. Take $g(x) = 4x^5 + x^3$, $f(x) = x^3 \in \mathbb{F}_5[x]$, then
\[ g(x) = f(x)(4x^2 + 1), \]
and f(x)|g(x).
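For $F = \mathbb{F}_p$, the addition and multiplication of Eqs. 1.22 and 1.23 can be sketched with coefficient lists. The representation below is an assumption made for illustration: a polynomial is a list of coefficients with the constant term first, the zero polynomial is the empty list, and the names `trim`, `poly_add`, and `poly_mul` are not from the text.

```python
def trim(f):
    """Drop leading zero coefficients so the last entry is the leading term."""
    while f and f[-1] == 0:
        f = f[:-1]
    return f


def poly_add(f, g, p):
    """Coefficient-wise sum in F_p[x] (Eq. 1.22)."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return trim([(a + b) % p for a, b in zip(f, g)])


def poly_mul(f, g, p):
    """Product in F_p[x]: d_i is the sum of a_j * b_{i-j} (Eq. 1.23)."""
    if not f or not g:
        return []
    d = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            d[i + j] = (d[i + j] + a * b) % p
    return trim(d)


# Example 1.5.3 over F_5: x^3 * (4x^2 + 1) = 4x^5 + x^3
print(poly_mul([0, 0, 0, 1], [1, 0, 4], 5))  # [0, 0, 0, 1, 0, 4]
# (x + 1) + x = 2x + 1 and (x + 1) * x = x^2 + x, here computed over F_5
print(poly_add([1, 1], [0, 1], 5), poly_mul([1, 1], [0, 1], 5))
```

With this representation the degree of a nonzero polynomial is `len(f) - 1`, so Lemma 1.5.1 corresponds to the product of two trimmed lists having length `len(f) + len(g) - 1`.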
Definition 1.5.3 A polynomial $f(x) \in F[x]$ of positive degree is said to be reducible (over F) if there exist $g(x), h(x) \in F[x]$ such that
\[ \deg(g(x)) < \deg(f(x)), \quad \deg(h(x)) < \deg(f(x)), \quad \text{and} \quad f(x) = g(x)h(x). \]
Otherwise, it is said to be irreducible (over F).
It is easy to show the following lemma from the above definitions.
Lemma 1.5.3 A polynomial $f(x) \in F[x]$ of degree n is reducible over F if and only if it is divisible by an irreducible polynomial of degree at most $\lfloor n/2 \rfloor$.

Remark 1.5.1
• $f(x) \in F[x]$ of degree 2 or 3 is reducible over F if and only if it has a root in F.$^4$
• Let $f(x) = \sum_{i=0}^{n} a_i x^i \in F[x]$. Then $f(0) = a_0$. Thus f(x) of degree at least 2 is reducible if $a_0 = 0$, since it is then divisible by x.
• Let $f(x) = \sum_{i=0}^{n} a_i x^i \in \mathbb{F}_2[x]$. Then $f(1) = \sum_{i=0}^{n} a_i$. If $|\{ a_i \mid a_i \ne 0 \}|$ is even, then f(1) = 0 and f(x) is divisible by x + 1. In other words, any $f(x) \in \mathbb{F}_2[x]$ of degree at least 2 with an even number of nonzero terms is reducible over $\mathbb{F}_2$.
Example 1.5.4
• .h(x) = 4x 5 +x 3 ∈ F3 [x] has degree 5, and it is reducible since .h(x) = x 3 (4x 2 +
1).
• .g(x) = x 2 ∈ F2 [x] has degree 2, and it is reducible, .g(x) = x · x.
Example 1.5.5 Let $F = \mathbb{F}_2$.
• All the polynomials of degree 2 are $x^2$, $x^2 + 1$, $x^2 + x + 1$, $x^2 + x$. By Remark 1.5.1, the only irreducible polynomial of degree 2 is $x^2 + x + 1$.
• All the degree 3 polynomials with an odd number of nonzero terms are $x^3$, $x^3 + x + 1$, $x^3 + x^2 + 1$, $x^3 + x^2 + x$. Among those, the polynomials with $a_0 \ne 0$ are the irreducible polynomials of degree 3:
\[ x^3 + x + 1, \quad x^3 + x^2 + 1. \]
• Degree 4 polynomials with $a_0 \ne 0$ and an odd number of nonzero terms are
\[ x^4 + x + 1, \quad x^4 + x^2 + 1, \quad x^4 + x^3 + 1, \quad x^4 + x^3 + x^2 + x + 1. \]
By our choice, they are not divisible by degree 1 polynomials. By Lemma 1.5.3, any of them is reducible if and only if it is divisible by $x^2 + x + 1$, which can be verified using the Division Algorithm (Theorem 1.5.1). For example,
\[ x^4 + x + 1 = (x^2 + x)(x^2 + x + 1) + 1 \]
leaves remainder 1, so it is not divisible by $x^2 + x + 1$. And
\[ x^4 + x^2 + 1 = (x^2 + x + 1)(x^2 + x + 1) \]
is divisible by $x^2 + x + 1$.
Finally, we have all the degree 4 irreducible polynomials over $\mathbb{F}_2$:
\[ x^4 + x + 1, \quad x^4 + x^3 + 1, \quad x^4 + x^3 + x^2 + x + 1. \]
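The search in Example 1.5.5 can be sketched in code. The following is a minimal illustration, not part of the original text: it assumes a polynomial over $\mathbb{F}_2$ is stored as a Python integer whose bit $i$ is the coefficient of $x^i$, and the function names `poly_mod` and `irreducibles` are hypothetical. It uses trial division by all polynomials of degree between 1 and $\lfloor n/2 \rfloor$, which is sufficient by Lemma 1.5.3.

```python
def poly_mod(a: int, b: int) -> int:
    """Remainder of a divided by b in F_2[x] (bit i = coefficient of x^i)."""
    deg_b = b.bit_length() - 1
    while a and a.bit_length() - 1 >= deg_b:
        # Cancel the leading term of a with a shifted copy of b.
        a ^= b << (a.bit_length() - 1 - deg_b)
    return a


def irreducibles(n: int) -> list[int]:
    """Irreducible polynomials of degree n over F_2, by trial division."""
    found = []
    for f in range(1 << n, 1 << (n + 1)):        # all polynomials of degree n
        # Candidate divisors: every polynomial of degree 1 .. floor(n/2).
        divisors = range(2, 1 << (n // 2 + 1))
        if all(poly_mod(f, g) != 0 for g in divisors):
            found.append(f)
    return found


for degree in (2, 3, 4):
    print(degree, [bin(f) for f in irreducibles(degree)])
```

Under this encoding, `0b10011` stands for $x^4 + x + 1$, so the printed lists can be compared directly with the polynomials found in the example.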

Footnote 4: An element $a \in F$ is a root of $f(x)$ if $f(a) = 0$.

We note that there are many analogies between a polynomial ring .F [x] and the
ring of integers .Z. For example, a polynomial .f (x) corresponds to an integer n. An
irreducible polynomial .p(x) corresponds to a prime p.
For the rest of the section, let us fix a polynomial $f(x) \in F[x]$ such that $f(x) \ne 0$. As in Eq. 1.11, we define a relation $\sim$ on $F[x]$ as follows:
\[ g(x) \sim h(x) \quad \text{if} \quad f(x) \mid (g(x) - h(x)). \]

We have shown that the relation in Eq. 1.11 is an equivalence relation on .Z, and a
similar proof shows that .∼ is an equivalence relation on .F [x]. We can also define
congruence in .F [x] (cf. Definition 1.4.2).
Definition 1.5.4 For any $g(x), h(x) \in F[x]$, if $g(x) \sim h(x)$, i.e., $f(x) \mid (g(x) - h(x))$, we say $h(x)$ is congruent to $g(x)$ modulo $f(x)$, written $g(x) \equiv h(x) \bmod f(x)$.
The congruence class of $g(x)$ modulo $f(x)$ is given by
\[ \{\, h(x) \mid h(x) \equiv g(x) \bmod f(x) \,\}. \]

A proof similar to that of Lemma 1.4.1 can be applied to prove the following lemma.
Lemma 1.5.4 Suppose $f(x)$ has degree $n$, where $n \ge 1$. Let $F[x]/(f(x))$ denote the set of all congruence classes of $g(x) \in F[x]$ modulo $f(x)$. Then
\[ F[x]/(f(x)) = \left\{ \sum_{i=0}^{n-1} a_i x^i \;\middle|\; a_i \in F \text{ for } 0 \le i < n \right\} \]
can be identified with the set of all polynomials of degree less than $n$.
Example 1.5.6 Let $f(x) = x^2 + x + 1 \in \mathbb{F}_2[x]$. By Lemma 1.5.4,
\[ \mathbb{F}_2[x]/(f(x)) = \{ 0, 1, x, x + 1 \}. \]
Similarly, let $g(x) = x^2 \in \mathbb{F}_2[x]$. Then
\[ \mathbb{F}_2[x]/(g(x)) = \{ 0, 1, x, x + 1 \}. \]
We can see that $\mathbb{F}_2[x]/(f(x))$ and $\mathbb{F}_2[x]/(g(x))$ contain congruence classes represented by the same polynomials.
Naturally, for any .g(x), h(x) ∈ F [x]/(f (x)), the same as in Eqs. 1.12 and 1.13,
addition and multiplication in .F [x]/(f (x)) are computed modulo .f (x).
Example 1.5.7 Let $f(x) \in \mathbb{F}_2[x]$ be a polynomial of degree $n$. For any
\[ g(x) = \sum_{i=0}^{n-1} a_i x^i, \quad h(x) = \sum_{i=0}^{n-1} b_i x^i \]
from $\mathbb{F}_2[x]/(f(x))$, we have
\[ g(x) + h(x) \bmod f(x) = \sum_{i=0}^{n-1} c_i x^i, \quad \text{where } c_i = a_i + b_i \bmod 2. \]
Thus the addition computations in $\mathbb{F}_2[x]/(f(x))$ are the same for all $f(x)$ of the same degree.
Example 1.5.8 Let $F = \mathbb{F}_2$, $f(x) = x^2 + x + 1 \in \mathbb{F}_2[x]$, $g(x) = x \in \mathbb{F}_2[x]/(f(x))$, and $h(x) = x \in \mathbb{F}_2[x]/(f(x))$. We have
\[ g(x) + h(x) \bmod f(x) = x + x \bmod f(x) = 0, \]
\[ g(x)h(x) \bmod f(x) = x^2 \bmod f(x) = x + 1. \]

Example 1.5.9 Let $f(x) = x^2 + x + 1$, $g(x) = x^2 \in \mathbb{F}_2[x]$. The addition and multiplication computations in $\mathbb{F}_2[x]/(f(x))$ and $\mathbb{F}_2[x]/(g(x))$ are shown in Tables 1.2 and 1.3, respectively. We note that the addition computations for $\mathbb{F}_2[x]/(f(x))$ and $\mathbb{F}_2[x]/(g(x))$ are the same, as discussed in Example 1.5.7.
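The multiplication tables for the two quotient rings of Example 1.5.9 can be regenerated with a short script. This is a sketch under stated assumptions, not the authors' code: polynomials over $\mathbb{F}_2$ are encoded as integers (bit $i$ is the coefficient of $x^i$, so `2` is $x$ and `3` is $x + 1$), and `poly_mul`, `poly_mod`, and `mul_table` are illustrative names.

```python
def poly_mul(a: int, b: int) -> int:
    """Product of a and b in F_2[x] (carry-less multiplication)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def poly_mod(a: int, b: int) -> int:
    """Remainder of a divided by b in F_2[x]."""
    deg_b = b.bit_length() - 1
    while a and a.bit_length() - 1 >= deg_b:
        a ^= b << (a.bit_length() - 1 - deg_b)
    return a


def mul_table(f: int) -> list[list[int]]:
    """Multiplication table of F_2[x]/(f), rows and columns ordered 0, 1, x, ..."""
    size = 1 << (f.bit_length() - 1)     # number of polynomials of degree < deg f
    return [[poly_mod(poly_mul(a, b), f) for b in range(size)]
            for a in range(size)]


print(mul_table(0b111))   # f(x) = x^2 + x + 1
print(mul_table(0b100))   # g(x) = x^2
```

Addition needs no table generator in this encoding: it is the bitwise XOR of the two integers, independent of the modulus, which mirrors Example 1.5.7.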

We also have the notion of the greatest common divisors between two nonzero
polynomials in .F [x] (cf. Definition 1.1.5). Then, for any .g(x) ∈ F [x], the modified
version of the Euclidean algorithm (Algorithm 1.1) can be applied to find the
greatest common divisor for .g(x) and .f (x), denoted .gcd(g(x), f (x)). Similarly the
extended Euclidean algorithm (Algorithm 1.2) can be applied to find the inverse
of .g(x) modulo .f (x) when .gcd(f (x), g(x)) = 1. More details are presented
in [LX04, Section 3.2].
Table 1.2 Addition and multiplication in $\mathbb{F}_2[x]/(f(x))$, where $f(x) = x^2 + x + 1$

  +    | 0    1    x    x+1          ×    | 0    1    x    x+1
  0    | 0    1    x    x+1          0    | 0    0    0    0
  1    | 1    0    x+1  x            1    | 0    1    x    x+1
  x    | x    x+1  0    1            x    | 0    x    x+1  1
  x+1  | x+1  x    1    0            x+1  | 0    x+1  1    x

Table 1.3 Addition and multiplication in $\mathbb{F}_2[x]/(g(x))$, where $g(x) = x^2$

  +    | 0    1    x    x+1          ×    | 0    1    x    x+1
  0    | 0    1    x    x+1          0    | 0    0    0    0
  1    | 1    0    x+1  x            1    | 0    1    x    x+1
  x    | x    x+1  0    1            x    | 0    x    0    x
  x+1  | x+1  x    1    0            x+1  | 0    x+1  x    1

Example 1.5.10 Let $F = \mathbb{F}_2$ and $f(x) = x^2 + x + 1$, $g(x) = x \in \mathbb{F}_2[x]$. By the Euclidean algorithm, we have
\[ f(x) = (x + 1)g(x) + 1, \quad \gcd(g(x), f(x)) = \gcd(g(x), 1) = 1. \]
By the extended Euclidean algorithm,
\[ 1 = g(x)(x + 1) + f(x). \]
We have $g(x)^{-1} \bmod f(x) = x + 1$.


Example 1.5.11 Let $F = \mathbb{F}_2$ and $f(x) = x^2 + x + 1$, $g(x) = x^2 \in \mathbb{F}_2[x]$. By the Euclidean algorithm, we have
\[ f(x) = g(x) + (x + 1), \quad \gcd(g(x), f(x)) = \gcd(g(x), x + 1), \]
\[ g(x) = (x + 1)(x + 1) + 1, \quad \gcd(g(x), x + 1) = 1. \]
By the extended Euclidean algorithm,
\[ 1 = g(x) + (x + 1)(x + 1) = g(x) + (x + 1)(f(x) + g(x)) = g(x)\,x + (x + 1)f(x). \]
And $g(x)^{-1} \bmod f(x) = x$.


Proofs similar to those of Theorem 1.4.2 and Corollary 1.4.1 can be applied to show the following theorem.
Theorem 1.5.2 Together with addition and multiplication modulo $f(x)$, $F[x]/(f(x))$ is a commutative ring. It is a field if and only if $f(x)$ is irreducible.

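Example 1.5.12 Let $F = \mathbb{R}$. By Remark 1.5.1, $f(x) = x^2 + 1$ is irreducible over $\mathbb{R}$, since it has no root in $\mathbb{R}$. By Theorem 1.5.2, $\mathbb{R}[x]/(f(x))$ is a field. By Lemma 1.5.4,
\[ \mathbb{R}[x]/(f(x)) = \{ a + bx \mid a, b \in \mathbb{R} \}. \]
Recall that
\[ \mathbb{C} = \{ a + bi \mid a, b \in \mathbb{R} \}. \]
It is easy to see that $\mathbb{R}[x]/(f(x)) \cong \mathbb{C}$ by mapping $x$ to $i$ (see Definition 1.2.16).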
Example 1.5.13 In Examples 1.5.4 and 1.5.5, we have shown that $g(x) = x^2$ is reducible and $f(x) = x^2 + x + 1$ is irreducible over $\mathbb{F}_2$.
By Theorem 1.5.2, $\mathbb{F}_2[x]/(g(x))$ is not a field and $\mathbb{F}_2[x]/(f(x))$ is a field. Indeed, in Examples 1.5.6 and 1.5.9, we have seen that even though $\mathbb{F}_2[x]/(f(x))$ and $\mathbb{F}_2[x]/(g(x))$ contain congruence classes represented by the same elements, the multiplication computations are different in those two rings. In particular, $x$ is a zero divisor in $\mathbb{F}_2[x]/(g(x))$ (Table 1.3) but has inverse $x + 1$ in $\mathbb{F}_2[x]/(f(x))$ (Table 1.2).
We have discussed that, for each prime power, there is only one finite field of that order up to isomorphism (Theorem 1.2.3). The following theorem specifies the field structures for $F[x]/(f(x))$ when $F = \mathbb{F}_p$, where $p$ is a prime.
Theorem 1.5.3 Let $p$ be a prime, and let $f(x) \in \mathbb{F}_p[x]$ be an irreducible polynomial with $\deg(f(x)) = n$. Then $\mathbb{F}_p[x]/(f(x)) \cong \mathbb{F}_{p^n}$.
Proof By Lemma 1.5.4,
\[ \mathbb{F}_p[x]/(f(x)) = \left\{ \sum_{i=0}^{n-1} a_i x^i \;\middle|\; a_i \in \mathbb{F}_p \text{ for } 0 \le i < n \right\}. \]
There are $p$ choices for each of the $n$ coefficients $a_i$. Hence the cardinality of $\mathbb{F}_p[x]/(f(x))$ is $p^n$. The result follows from Theorem 1.2.3. □

Example 1.5.14 Let $f(x) = x^2 + x + 1 \in \mathbb{F}_2[x]$; by Theorem 1.5.3, $\mathbb{F}_2[x]/(f(x)) \cong \mathbb{F}_{2^2}$.

1.5.1 Bytes

Throughout this subsection, let $f(x) = x^8 + x^4 + x^3 + x + 1 \in \mathbb{F}_2[x]$.
It can be shown that $f(x)$ is irreducible over $\mathbb{F}_2$ using Lemma 1.5.3 and Example 1.5.5. Then by Lemma 1.5.4,
\[ \mathbb{F}_2[x]/(f(x)) = \left\{ \sum_{i=0}^{7} b_i x^i \;\middle|\; b_i \in \mathbb{F}_2 \ \forall i \right\}. \]
By Theorem 1.5.3, $\mathbb{F}_2[x]/(f(x)) \cong \mathbb{F}_{2^8}$.
We note that any
\[ b_7 x^7 + b_6 x^6 + b_5 x^5 + b_4 x^4 + b_3 x^3 + b_2 x^2 + b_1 x + b_0 \in \mathbb{F}_2[x]/(f(x)) \]
can be stored as a byte $b_7 b_6 b_5 b_4 b_3 b_2 b_1 b_0 \in \mathbb{F}_2^8$ (see Definition 1.3.7), which represents an integer between 0 ($00_{16}$) and 255 ($FF_{16}$) (see Remark 1.3.3). There are 256 different values for a byte, and $|\mathbb{F}_{2^8}| = 2^8 = 256$. Then $\varphi$ defined as follows
\[ \varphi : \mathbb{F}_2[x]/(f(x)) \to \mathbb{F}_2^8, \]
\[ b_7 x^7 + b_6 x^6 + b_5 x^5 + b_4 x^4 + b_3 x^3 + b_2 x^2 + b_1 x + b_0 \mapsto b_7 b_6 b_5 b_4 b_3 b_2 b_1 b_0 \]
is a bijective function. Thus, with addition and multiplication modulo $f(x)$ in $\mathbb{F}_2[x]/(f(x))$, we can define the corresponding addition and multiplication between bytes.
Definition 1.5.5 For any two bytes $v = v_7 v_6 \dots v_1 v_0$ and $w = w_7 w_6 \dots w_1 w_0$, let $g_v(x) = v_7 x^7 + v_6 x^6 + \dots + v_1 x + v_0$ and $g_w(x) = w_7 x^7 + w_6 x^6 + \dots + w_1 x + w_0$ be the corresponding polynomials in $\mathbb{F}_2[x]/(f(x))$. We define
\[ v + w = g_v(x) + g_w(x) \bmod f(x), \]
\[ v \times w = g_v(x)\,g_w(x) \bmod f(x). \]
In particular, by Example 1.5.7,
\[ v + w = c_7 c_6 \dots c_1 c_0, \quad \text{where } c_i = v_i + w_i \bmod 2. \]

Remark 1.5.2 Recall that a byte is also a vector in $\mathbb{F}_2^8$. We have defined vector addition as bitwise XOR (see Definition 1.3.6):
\[ v +_{\mathbb{F}_2^8} w = u_7 u_6 \dots u_1 u_0, \quad \text{where } u_i = v_i \oplus w_i. \]
We note that $a + b \bmod 2 = a \oplus b$ for $a, b \in \mathbb{F}_2$. Thus, our definition of addition between two bytes (Definition 1.5.5) agrees with the vector addition between two vectors in $\mathbb{F}_2^8$.
Example 1.5.15 Take $x^6 + x^4 + x^2 + x + 1 \in \mathbb{F}_2[x]/(f(x))$, which corresponds to $01010111_2 = 57_{16}$, and $x^7 + x + 1 \in \mathbb{F}_2[x]/(f(x))$, which corresponds to $10000011_2 = 83_{16}$. We have
\[ 57_{16} + 83_{16} = (x^6 + x^4 + x^2 + x + 1) + (x^7 + x + 1) \bmod f(x) = x^7 + x^6 + x^4 + x^2 \bmod f(x) = 11010100_2 = D4_{16}. \]
We note that
\[ 01010111_2 \oplus 10000011_2 = 11010100_2. \]
For multiplication, we compute
\[ (x^6 + x^4 + x^2 + x + 1)(x^7 + x + 1) = x^{13} + x^{11} + x^9 + x^8 + x^6 + x^5 + x^4 + x^3 + 1, \]
and
\begin{align*}
x^8 &= x^4 + x^3 + x + 1 \bmod f(x), \\
x^9 &= x^5 + x^4 + x^2 + x \bmod f(x), \\
x^{11} &= x^7 + x^6 + x^4 + x^3 \bmod f(x), \\
x^{13} &= x^9 + x^8 + x^6 + x^5 \bmod f(x).
\end{align*}
Thus
\[ x^{13} + x^{11} + x^9 + x^8 + x^6 + x^5 + x^4 + x^3 + 1 = x^{11} + x^4 + x^3 + 1 = x^7 + x^6 + 1 \bmod f(x), \]
which gives
\[ 57_{16} \times 83_{16} = 11000001_2 = C1_{16}. \]
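Byte addition and multiplication as in Definition 1.5.5 can be sketched as follows. This is an illustrative implementation rather than the book's: the names `AES_POLY`, `gf256_add`, and `gf256_mul` are assumptions, a byte is a Python integer in the range 0 to 255, and the reduction modulo $f(x)$ is done by cancelling the coefficients of $x^{14}$ down to $x^8$ one at a time. It is written for clarity, not for constant-time execution.

```python
AES_POLY = 0x11B  # f(x) = x^8 + x^4 + x^3 + x + 1 as a bit pattern


def gf256_add(v: int, w: int) -> int:
    """Byte addition: coefficient-wise addition mod 2, i.e. bitwise XOR."""
    return v ^ w


def gf256_mul(v: int, w: int) -> int:
    """Byte multiplication: polynomial product followed by reduction mod f(x)."""
    product = 0
    for i in range(8):
        if (w >> i) & 1:
            product ^= v << i            # add v(x) * x^i
    for degree in range(14, 7, -1):      # product has degree at most 14
        if (product >> degree) & 1:
            product ^= AES_POLY << (degree - 8)   # cancel the x^degree term
    return product


print(hex(gf256_add(0x57, 0x83)))
print(hex(gf256_mul(0x57, 0x83)))
```

The two printed values can be compared with the sum and product computed by hand in Example 1.5.15.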

Example 1.5.16 In this example, we would like to compute the formula for a byte multiplied by $02_{16} = x$. Take any $g(x) = b_7 x^7 + b_6 x^6 + \dots + b_1 x + b_0 \in \mathbb{F}_2[x]/(f(x))$. Then
\begin{align*}
g(x)\,x \bmod f(x) &= (b_7 x^7 + b_6 x^6 + b_5 x^5 + b_4 x^4 + b_3 x^3 + b_2 x^2 + b_1 x + b_0)\,x \bmod f(x) \\
&= b_7 x^8 + b_6 x^7 + b_5 x^6 + b_4 x^5 + b_3 x^4 + b_2 x^3 + b_1 x^2 + b_0 x \bmod f(x) \\
&= b_6 x^7 + b_5 x^6 + b_4 x^5 + b_3 x^4 + b_2 x^3 + b_1 x^2 + b_0 x + b_7 x^4 + b_7 x^3 + b_7 x + b_7 \bmod f(x) \\
&= b_6 x^7 + b_5 x^6 + b_4 x^5 + (b_3 + b_7) x^4 + (b_2 + b_7) x^3 + b_1 x^2 + (b_0 + b_7) x + b_7 \bmod f(x).
\end{align*}
Thus, for any byte $b_7 b_6 \dots b_1 b_0$, multiplication by $02_{16}$ is equivalent to a left shift by 1 (discarding $b_7$), followed by XOR with $00011011_2 = 1B_{16}$ if $b_7 = 1$.
Example 1.5.17
• $57_{16} = 01010111_2$, $02_{16} \times 57_{16} = 10101110_2 = AE_{16}$.
• $83_{16} = 10000011_2$, $02_{16} \times 83_{16} = 00000110_2 \oplus 00011011_2 = 00011101_2 = 1D_{16}$.
• $D4_{16} = 11010100_2$, $02_{16} \times D4_{16} = 10101000_2 \oplus 00011011_2 = 10110011_2 = B3_{16}$.
Example 1.5.18 Now, let us compute the multiplication of a byte by $03_{16} = x + 1$. Take any $h(x) = b_7 x^7 + b_6 x^6 + \dots + b_1 x + b_0 \in \mathbb{F}_2[x]/(f(x))$. Then
\[ h(x)(x + 1) \bmod f(x) = h(x)\,x + h(x) \bmod f(x). \]
Thus, for any byte $b_7 b_6 \dots b_1 b_0$, multiplication by $03_{16}$ is equivalent to first multiplying by $02_{16}$ (left shift by 1 and XOR with $00011011_2 = 1B_{16}$ if $b_7 = 1$) and then XOR with the byte itself ($b_7 b_6 \dots b_1 b_0$).
Example 1.5.19 Continuing Example 1.5.17,
• $03_{16} \times 57_{16} = AE_{16} \oplus 57_{16} = F9_{16}$.
• $03_{16} \times 83_{16} = 1D_{16} \oplus 83_{16} = 9E_{16}$.
• $03_{16} \times D4_{16} = B3_{16} \oplus D4_{16} = 67_{16}$.
Example 1.5.20 $03_{16} \times BF_{16} = 01111110_2 \oplus 00011011_2 \oplus 10111111_2 = 11011010_2 = DA_{16}$.
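The shift-and-XOR rules of Examples 1.5.16 and 1.5.18 translate directly into code. The sketch below is illustrative only; the names `mul02` and `mul03` are assumptions, bytes are Python integers in the range 0 to 255, and the branch on the top bit is kept for readability even though it would not be acceptable in a side-channel-hardened implementation.

```python
def mul02(b: int) -> int:
    """Multiply a byte by 02 (i.e. by x) modulo x^8 + x^4 + x^3 + x + 1."""
    shifted = (b << 1) & 0xFF        # left shift by 1, dropping b7
    if b & 0x80:                     # b7 = 1: x^8 is replaced by x^4 + x^3 + x + 1
        shifted ^= 0x1B
    return shifted


def mul03(b: int) -> int:
    """Multiply a byte by 03 (i.e. by x + 1): (02 * b) XOR b."""
    return mul02(b) ^ b


for byte in (0x57, 0x83, 0xD4, 0xBF):
    print(f"{byte:02X}: 02 * b = {mul02(byte):02X}, 03 * b = {mul03(byte):02X}")
```

The printed values can be checked against Examples 1.5.17, 1.5.19, and 1.5.20.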
We can also compute the inverse of elements in $\mathbb{F}_2[x]/(f(x))$ using the extended Euclidean algorithm (Algorithm 1.2) as in Example 1.5.10, thus enabling us to find the inverse of a byte as an element in $\mathbb{F}_2[x]/(f(x))$.
Example 1.5.21 $03_{16} = 00000011_2 = x + 1$. By the Euclidean algorithm (Algorithm 1.1),
\[ f(x) = (x + 1)(x^7 + x^6 + x^5 + x^4 + x^2 + x) + 1 \implies \gcd(f(x), x + 1) = 1. \]
See also Appendix B for the computation.
By the extended Euclidean algorithm,
\[ 1 = f(x) - (x + 1)(x^7 + x^6 + x^5 + x^4 + x^2 + x). \]
We have
\[ 03_{16}^{-1} = (x + 1)^{-1} \bmod f(x) = x^7 + x^6 + x^5 + x^4 + x^2 + x = 11110110_2 = F6_{16}. \]
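A possible way to automate this inverse computation is sketched below. It is not a transcription of Algorithm 1.2 from the book (which is not reproduced in this section) but a generic extended Euclidean loop over $\mathbb{F}_2[x]$; polynomials are encoded as integers with bit $i$ holding the coefficient of $x^i$, and `poly_divmod`, `poly_mul`, and `poly_inverse` are hypothetical helper names.

```python
def poly_divmod(a: int, b: int) -> tuple[int, int]:
    """Quotient and remainder of a divided by b in F_2[x]."""
    quotient, deg_b = 0, b.bit_length() - 1
    while a and a.bit_length() - 1 >= deg_b:
        shift = a.bit_length() - 1 - deg_b
        quotient ^= 1 << shift
        a ^= b << shift
    return quotient, a


def poly_mul(a: int, b: int) -> int:
    """Product of a and b in F_2[x]."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def poly_inverse(g: int, f: int) -> int:
    """Inverse of g modulo f in F_2[x]; raises ValueError if gcd(g, f) != 1."""
    r0, r1 = f, g
    s0, s1 = 0, 1                     # invariant: r_j = s_j * g (mod f)
    while r1:
        q, r = poly_divmod(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, s0 ^ poly_mul(q, s1)   # subtraction is XOR over F_2
    if r0 != 1:
        raise ValueError("element is not invertible modulo f")
    return poly_divmod(s0, f)[1]


print(hex(poly_inverse(0x03, 0x11B)))   # inverse of the byte 03
print(poly_inverse(0b100, 0b111))       # inverse of x^2 modulo x^2 + x + 1
```

The same routine covers Examples 1.5.10 and 1.5.11 when called with the modulus `0b111`, and it raises an error for zero divisors such as $x$ modulo $x^2$.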

1.6 Coding Theory

In this section, we give a brief discussion on binary codes, which will be useful for
the design of countermeasures against side-channel attacks (Sect. 4.5.1.1) and fault
attacks (Sect. 5.2.1).
Let n be a positive integer in the rest of this section. To study binary codes, we
look at the vector space .Fn2 , and we refer to vectors in .Fn2 as words of length n.
Definition 1.6.1
• $w = w_0 w_1 \dots w_{n-1} \in \mathbb{F}_2^n$ is called a binary word of length $n$.
• A nonempty set $C \subseteq \mathbb{F}_2^n$ is called a binary code of length $n$.
• An element of a binary code $C$ is called a codeword of $C$.
• The cardinality of $C$ is called the size of $C$.
• A code of length $n$ and size $M$ is called a binary $(n, M)$-code.
Example 1.6.1
• $C = \{ 00, 11 \}$ is a binary $(2, 2)$-code.
• $C = \{ 010, 001, 110, 111 \}$ is a binary $(3, 4)$-code.

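Definition 1.6.2 For any $v, u \in \mathbb{F}_2^n$, the Hamming distance between $v$ and $u$, denoted $\mathrm{dis}(v, u)$, is defined as follows:
\[ \mathrm{dis}(v, u) = \sum_{i=0}^{n-1} \mathrm{dis}(v_i, u_i), \quad \text{where } \mathrm{dis}(v_i, u_i) = \begin{cases} 1 & \text{if } v_i \ne u_i, \\ 0 & \text{if } v_i = u_i. \end{cases} \tag{1.24} \]
Example 1.6.2 $\mathrm{dis}(001, 111) = 2$. $\mathrm{dis}(00000, 10101) = 3$.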


Lemma 1.6.1 For any $v, u, w \in \mathbb{F}_2^n$, we have
1. $0 \le \mathrm{dis}(v, u) \le n$.
2. $\mathrm{dis}(v, u) = 0$ if and only if $v = u$.
3. $\mathrm{dis}(v, u) = \mathrm{dis}(u, v)$.
4. $\mathrm{dis}(v, w) \le \mathrm{dis}(v, u) + \mathrm{dis}(u, w)$ (triangle inequality).

Proof (1)–(3) are easy to see. We provide the proof for (4). By Eq. 1.24, it suffices to consider $n = 1$. Take any $v, w, u \in \mathbb{F}_2$.
If $v = w$,
\[ \mathrm{dis}(v, w) = 0 \le \mathrm{dis}(v, u) + \mathrm{dis}(u, w). \]
If $v \ne w$, then $\mathrm{dis}(v, w) = 1$, and since $u$ cannot be equal to both $v$ and $w$, we have $\mathrm{dis}(v, u) = 1$ or $\mathrm{dis}(u, w) = 1$. □



Definition 1.6.3 Let $C \subseteq \mathbb{F}_2^n$ be a binary code containing at least two codewords. The (minimum) distance of $C$, denoted $\mathrm{dis}(C)$, is given by
\[ \mathrm{dis}(C) = \min \{ \mathrm{dis}(c_1, c_2) \mid c_1, c_2 \in C,\ c_1 \ne c_2 \}. \]
Definition 1.6.4 A binary code of length $n$, size $M$, and distance $d$ is called a binary $(n, M, d)$-code.
Example 1.6.3 Let $C = \{ 0011, 1101, 1000 \}$; we can calculate that
\[ \mathrm{dis}(0011, 1101) = 3, \quad \mathrm{dis}(0011, 1000) = 3, \quad \mathrm{dis}(1101, 1000) = 2. \]
Thus $C$ is a binary $(4, 3, 2)$-code.
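The distance computations above are easy to mechanize. In the sketch below, binary words are represented as Python strings of `0` and `1` characters; this representation and the function names `dis` and `code_distance` are choices made for illustration, not notation from the text.

```python
from itertools import combinations


def dis(v: str, u: str) -> int:
    """Hamming distance between two binary words of the same length."""
    if len(v) != len(u):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(v, u))


def code_distance(code: list[str]) -> int:
    """Minimum distance over all pairs of distinct codewords."""
    return min(dis(c1, c2) for c1, c2 in combinations(code, 2))


code = ["0011", "1101", "1000"]
print(dis("001", "111"), dis("00000", "10101"))
print((len(code[0]), len(code), code_distance(code)))   # (n, M, d)
```

The pairwise search is quadratic in the size of the code, which is fine for small examples like Example 1.6.3.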


Recall that when the value of a bit is changed we say that the bit is flipped
(Definition 1.2.17).
Definition 1.6.5 A binary code $C$ is said to be $k$-error detecting for a positive integer $k$ if for any $c \in C$, whenever at least 1 but at most $k$ bits of $c$ are flipped, the resulting word is not a codeword in $C$. If $C$ is $k$-error detecting but not $(k + 1)$-error detecting, then we say $C$ is exactly $k$-error detecting.
Example 1.6.4 Let $C = \{ 0011, 1101, 1000 \}$. Since
\[ \mathrm{dis}(0011, 1101) = \mathrm{dis}(0011, 1000) = 3, \quad \mathrm{dis}(1101, 1000) = 2, \]
with a 1-bit flip from any codeword, we cannot get another codeword. But with 2-bit flips, we can change 1101 to 1000. Thus $C$ is exactly 1-error detecting.
Theorem 1.6.1 A binary $(n, M, d)$-code $C$ is $k$-error detecting if and only if $d \ge k + 1$, i.e., $C$ is an exactly $(d - 1)$-error detecting code.
Proof ($\Leftarrow$) If $d \ge k + 1$, take $c \in C$ and $x \in \mathbb{F}_2^n$ such that $1 \le \mathrm{dis}(c, x) \le k$. Then $x \notin C$, and $C$ is $k$-error detecting.
($\Rightarrow$) If $d < k + 1$, take $c_1, c_2 \in C$ such that $\mathrm{dis}(c_1, c_2) = d$. Flipping $d$ bits of $c_1$ we can get $c_2 \in C$. Hence $C$ is not $k$-error detecting. □



Let us consider binary $(n, M, d)$-codes with $M = 2^k$ for some positive integer $k$. When a binary code is used for transmitting information, every information word $u \in \mathbb{F}_2^k$ is assigned a unique codeword $c(u) \in C$. We say that $u$ is encoded as $c(u)$. Suppose Alice would like to send information $u$ to Bob using $C$. Alice sends codeword $c(u)$ to Bob. Due to transmission noise, Bob might receive a word $x \in \mathbb{F}_2^n$ not equal to $c(u)$. Thus we need to define a decoding rule for Bob that allows him to find $u$ given $x$.
We are interested in a minimum distance decoding rule, which specifies that after receiving $x$, Bob computes
\[ c_x = \arg\min_{c} \{ \mathrm{dis}(x, c) \mid c \in C \}, \quad \text{i.e.,} \quad \mathrm{dis}(c_x, x) = \min_{c} \{ \mathrm{dis}(x, c) \mid c \in C \}. \]

If more than one codeword is identified as .cx , there are two options. An incomplete
decoding rule says that Bob should request Alice for another transmission. Follow-
ing a complete decoding rule, Bob would then randomly select one codeword.
Example 1.6.5 Let $C = \{ 0000, 0111, 1110, 1111 \}$. We use $C$ to encode information words $u \in \mathbb{F}_2^2$ with encoding designed as follows:
\[ c(00) = 0000, \quad c(01) = 0111, \quad c(10) = 1110, \quad c(11) = 1111. \]
Suppose Alice was sending information 00 with codeword 0000 to Bob. Due to an error during the transmission, Bob received 0001. By the minimum distance decoding rule, Bob computes the distances between 0001 and codewords in $C$:
\[ \mathrm{dis}(0001, 0000) = 1, \quad \mathrm{dis}(0001, 0111) = 2, \quad \mathrm{dis}(0001, 1110) = 4, \quad \mathrm{dis}(0001, 1111) = 3. \]
Thus $c_{0001} = 0000$, and Bob gets the correct information 00.
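Minimum distance decoding with the incomplete rule can be sketched as below. The encoding table is taken from Example 1.6.5; the string representation of words, the function names, and the choice to return `None` on a tie (standing in for "request a retransmission") are assumptions made for this illustration.

```python
from typing import Optional


def dis(v: str, u: str) -> int:
    """Hamming distance between two binary words of equal length."""
    return sum(a != b for a, b in zip(v, u))


def decode(x: str, code: list[str]) -> Optional[str]:
    """Incomplete minimum distance decoding: unique nearest codeword or None."""
    best = min(dis(x, c) for c in code)
    nearest = [c for c in code if dis(x, c) == best]
    return nearest[0] if len(nearest) == 1 else None


# Encoding table of Example 1.6.5, inverted to recover the information word.
ENCODING = {"00": "0000", "01": "0111", "10": "1110", "11": "1111"}
DECODING = {codeword: info for info, codeword in ENCODING.items()}

received = "0001"
codeword = decode(received, list(ENCODING.values()))
print(codeword, DECODING.get(codeword))
```

A complete decoding rule would replace the `None` branch with a random choice among the tied codewords.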
Definition 1.6.6 A binary code $C$ is said to be $k$-error correcting if the minimum distance decoding outputs the correct codeword when $k$ or fewer bits are flipped. If $C$ is $k$-error correcting but not $(k + 1)$-error correcting, then we say that $C$ is exactly $k$-error correcting.
Example 1.6.6 Let $C = \{ 000, 111 \}$.
• If 000 was sent and 1 bit flip occurred, the received word (one of 001, 010, 100) will be decoded to 000.
• If 111 was sent and 1 bit flip occurred, the received word (one of 110, 011, 101) will be decoded to 111.
• If 000 was sent and 011 was received (2 bit flips), the decoding result will be 111.
Thus $C$ is exactly 1-error correcting.
Theorem 1.6.2 A binary $(n, M, d)$-code $C$ is $k$-error correcting if and only if $d \ge 2k + 1$, i.e., $C$ is an exactly $\lfloor (d - 1)/2 \rfloor$-error correcting code.
Proof ($\Leftarrow$) We assume $d \ge 2k + 1$. Suppose $c$ was sent, $v$ was received, and $k$ or fewer bit flips occurred, i.e., $\mathrm{dis}(c, v) \le k$. For any codeword $c' \in C$ different from $c$, the triangle inequality (Lemma 1.6.1) gives
\[ \mathrm{dis}(v, c') \ge \mathrm{dis}(c, c') - \mathrm{dis}(v, c) \ge 2k + 1 - k = k + 1 > \mathrm{dis}(v, c). \]
Thus $C$ is $k$-error correcting.
($\Rightarrow$) Now suppose $C$ is $k$-error correcting and $d < 2k + 1$. Take $c, c' \in C$ such that $\mathrm{dis}(c, c') = d$. By definition, $C$ is also $k$-error detecting. By Theorem 1.6.1, $\mathrm{dis}(c, c') = d \ge k + 1$. Without loss of generality, assume $c$ and $c'$ differ in the first $d$ bits.
Define $v \in \mathbb{F}_2^n$ as
\[ v_i = \begin{cases} c'_i & 0 \le i < k, \\ c_i & k \le i < d, \\ c_i = c'_i & i \ge d. \end{cases} \]
Then $v$ differs from $c$ in exactly the first $k$ bits and from $c'$ in the $d - k$ bits with $k \le i < d$, so
\[ \mathrm{dis}(v, c') = d - k \le k = \mathrm{dis}(v, c). \]
If $c$ is sent and $v$ is received, only $k$ bits have been flipped, yet the minimum distance decoding cannot uniquely decode $v$ to $c$, a contradiction. □

Definition 1.6.7 Let $C \subseteq \mathbb{F}_2^n$ be a binary code. $C$ is said to be linear if it is a vector space over $\mathbb{F}_2$. Otherwise, it is said to be nonlinear.
In other words, a binary linear code $C$ is a subspace of $\mathbb{F}_2^n$ (see Definitions 1.3.5 and 1.3.8).
Remark 1.6.1 By Remark 1.3.4, to show a binary code $C$ is linear, we need to prove that $0 \in C$ and for any $c, c' \in C$, $c + c' \in C$.
We have defined dimensions for vector spaces in Definition 1.3.13.
Definition 1.6.8 The dimension of a binary linear code $C$ is given by $\dim(C)_{\mathbb{F}_2}$, the dimension of $C$ as a vector space over $\mathbb{F}_2$. A binary linear code $C$ of length $n$ and dimension $k$ is called a binary $[n, k]$-linear code. If $C$ has distance $d$, then it is called a binary $[n, k, d]$-linear code.
By Lemma 1.3.3, we can calculate the size of a linear code $C$ using its dimension, $|C| = 2^{\dim(C)_{\mathbb{F}_2}}$. Thus a binary $[n, k]$-linear code is also a binary $(n, 2^k)$-code (see Definition 1.6.1).
Example 1.6.7
• Let $C = \{ 00, 11, 01, 10 \} = \mathbb{F}_2^2$, then $\dim(C)_{\mathbb{F}_2} = 2$, and $C$ is a binary $[2, 2, 1]$-linear code.
• Let $C = \langle 111 \rangle = \{ 000, 111 \}$, then $\{ 111 \}$ is a basis for $C$, and $\dim(C)_{\mathbb{F}_2} = 1$. $C$ is a binary $[3, 1, 3]$-linear code.
Example 1.6.8 (Repetition code) Let
\[ C = \langle 11 \dots 11 \rangle = \{ 00 \dots 00, 11 \dots 11 \} \subseteq \mathbb{F}_2^n. \]
Then $\{ 11 \dots 11 \}$ is a basis for $C$, and $C$ is a binary $[n, 1, n]$-linear code. $C$ is called the binary $n$-repetition code. By Theorems 1.6.1 and 1.6.2, $C$ is exactly $(n - 1)$-error detecting and exactly $\lfloor (n - 1)/2 \rfloor$-error correcting.
Example 1.6.9 (Parity-check code) Suppose we would like to encode information words
\[ u = (u_0, u_1, \dots, u_{n-2}) \in \mathbb{F}_2^{n-1}. \]
We add one parity-check bit and encode $u$ using
\[ c = (u_0, u_1, \dots, u_{n-2}, c_{n-1}), \quad \text{where } c_{n-1} = \sum_{i=0}^{n-2} u_i. \]
The corresponding code $C$ consists of codewords that have an even number of 1s:
\[ C = \left\{ (c_0, c_1, \dots, c_{n-2}, c_{n-1}) \;\middle|\; c_{n-1} = \sum_{i=0}^{n-2} c_i \right\} \subseteq \mathbb{F}_2^n. \tag{1.25} \]
It is easy to see that $0 \in C$. Take $v = (v_0, v_1, \dots, v_{n-1})$, $w = (w_0, w_1, \dots, w_{n-1})$ from $C$, then
\[ v + w = (v_0 + w_0, v_1 + w_1, \dots, v_{n-1} + w_{n-1}), \quad v_{n-1} + w_{n-1} = \sum_{i=0}^{n-2} v_i + \sum_{i=0}^{n-2} w_i = \sum_{i=0}^{n-2} (v_i + w_i). \]
We have $v + w \in C$. By Remark 1.6.1, $C$ is a linear code.
$C$ is called the binary parity-check code of length $n$. By Eq. 1.25, the vectors $v_i$ ($0 \le i \le n - 2$), where $v_{ii} = 1$, $v_{i,n-1} = 1$, and $v_{ij} = 0$ for all other $j$, form a basis for $C$. Thus, $\dim(C) = n - 1$. Furthermore, we note that the minimum distance between the first $n - 1$ bits of codewords in $C$ is 1. The parity-check bits for two codewords will be different if they differ only at one position in the first $n - 1$ bits. Thus, the minimum distance of $C$ is 2, and $C$ is a binary $[n, n - 1, 2]$-linear code. By Theorems 1.6.1 and 1.6.2, $C$ is exactly 1-error detecting and cannot correct errors.
Definition 1.6.9 The dual code of a binary linear code $C$ is the orthogonal complement of $C$, $C^{\perp}$.
By Lemma 1.3.4, $C^{\perp}$ is a binary linear code. It is easy to see that $(C^{\perp})^{\perp} = C$.
Example 1.6.10 Let $C$ be a binary parity-check code of length $n$ (see Example 1.6.9). Then $c \in C^{\perp}$ if and only if $c \cdot v = 0$ for all $v \in C$, i.e.,
\[ \sum_{i=0}^{n-1} c_i v_i = 0 \iff \sum_{i=0}^{n-2} c_i v_i + c_{n-1} \sum_{i=0}^{n-2} v_i = 0 \iff \sum_{i=0}^{n-2} (c_i + c_{n-1}) v_i = 0 \]
for all $v_i \in \{0, 1\}$ ($0 \le i \le n - 2$), which is equivalent to $c_i = c_{n-1}$ for all $0 \le i \le n - 2$. Thus $C^{\perp} = \{ 00 \dots 00, 11 \dots 11 \}$ is the $n$-repetition code (see Example 1.6.8).
Example 1.6.11 Let $C = \{ 000, 111 \}$ be the binary 3-repetition code, then
\[ C^{\perp} = \{ 000, 011, 101, 110 \} \]
is the binary parity-check code of length 3.


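Definition 1.6.10 Let $v \in \mathbb{F}_2^n$ be a word. The Hamming weight of $v$, denoted by $\mathrm{wt}(v)$, is given by the number of nonzero bits in $v$, or equivalently,
\[ \mathrm{wt}(v) = \mathrm{dis}(v, 0). \]
We note that when $n = 1$, $\mathrm{wt}(v) = 1$ if $v = 1$ and $\mathrm{wt}(v) = 0$ if $v = 0$. Then, for any $v = (v_0, v_1, \dots, v_{n-1})$ from $\mathbb{F}_2^n$,
\[ \mathrm{wt}(v) = \sum_{i=0}^{n-1} \mathrm{wt}(v_i). \tag{1.26} \]
Lemma 1.6.2 For any $u, v \in \mathbb{F}_2^n$, $\mathrm{dis}(u, v) = \mathrm{wt}(u + v)$.
Proof Take any $u, v \in \mathbb{F}_2$,
\[ \mathrm{dis}(u, v) = \begin{cases} 0 & \text{if } u = v, \text{ i.e., } u + v = 0, \\ 1 & \text{if } u \ne v, \text{ i.e., } u + v = 1. \end{cases} \]
Hence $\mathrm{dis}(u, v) = \mathrm{wt}(u + v)$ when $n = 1$, and the lemma follows from Eqs. 1.24 and 1.26. □

Example 1.6.12 Let $u = (1, 0, 0, 1)$, $v = (0, 1, 1, 1)$, then $\mathrm{dis}(u, v) = 3$ and
\[ \mathrm{wt}(u + v) = \mathrm{wt}((1, 1, 1, 0)) = 3. \]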
.

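Theorem 1.6.3 Let $C$ be a binary linear code with at least two codewords, and define
\[ \mathrm{wt}(C) := \min \{ \mathrm{wt}(c) \mid c \in C,\ c \ne 0 \}. \]
Then $\mathrm{dis}(C) = \mathrm{wt}(C)$.
Proof Take $v, u \in C$ such that $\mathrm{dis}(v, u) = \mathrm{dis}(C)$. By Lemma 1.6.2, $\mathrm{wt}(v + u) = \mathrm{dis}(C)$. Since $C$ is a vector space, $v + u \in C$, and $v + u \ne 0$ because $v \ne u$. We have $\mathrm{dis}(C) \ge \mathrm{wt}(C)$.
Now, take $w \in C$ such that $\mathrm{wt}(C) = \mathrm{wt}(w)$. We have
\[ \mathrm{wt}(C) = \mathrm{wt}(w) = \mathrm{dis}(w, 0) \ge \mathrm{dis}(C). \] □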


.



Definition 1.6.11 Let $C$ be a binary linear code. A generator matrix for $C$ is a matrix whose rows form a basis for $C$. A parity-check matrix for $C$ is a generator matrix for $C^{\perp}$.
Example 1.6.13 Let $C = \{ 000, 111 \}$, and we know that $C^{\perp} = \{ 000, 011, 101, 110 \}$ (see Example 1.6.11). Let
\[ G = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix}, \quad H = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}. \]
Then $G$ is a generator matrix for $C$ and a parity-check matrix for $C^{\perp}$. $H$ is a generator matrix for $C^{\perp}$ and a parity-check matrix for $C$.
Let $C$ be a binary $[n, k, d]$-linear code. If $G$ is a generator matrix for $C$ and $H$ is a parity-check matrix for $C$, then $H G^{\top} = O$, where $O$ denotes a matrix with all entries equal to zero. Also, the size of $G$ is $k \times n$.
Let $\{ v_0, \dots, v_{k-1} \}$ be the rows of $G$. Then for any $u = (u_0, u_1, \dots, u_{k-1})$ in $\mathbb{F}_2^k$,
\[ uG = \sum_{i=0}^{k-1} u_i v_i \in C. \]
On the other hand, by Remark 1.3.5, any $c \in C$ has a unique representation of the form
\[ c = \sum_{i=0}^{k-1} u_i v_i, \quad \text{where } u_i \in \mathbb{F}_2. \]
Thus, each $u \in \mathbb{F}_2^k$ can be encoded as $uG$.
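Encoding by a generator matrix is a vector-matrix product over $\mathbb{F}_2$. The sketch below illustrates it for the length-3 parity-check code, using the generator matrix $(I_2 \mid \mathbf{1})$ and the parity-check matrix $(1\ 1\ 1)$; matrices are plain lists of rows, and the helper names `encode` and `syndrome` are assumptions introduced here, not terms from the text.

```python
from itertools import product


def encode(u: list[int], G: list[list[int]]) -> list[int]:
    """Compute uG over F_2 for an information word u and generator matrix G."""
    n = len(G[0])
    return [sum(u[i] * G[i][j] for i in range(len(G))) % 2 for j in range(n)]


def syndrome(H: list[list[int]], c: list[int]) -> list[int]:
    """Compute H c^T over F_2; it is the zero vector exactly for codewords."""
    return [sum(h * x for h, x in zip(row, c)) % 2 for row in H]


G = [[1, 0, 1],
     [0, 1, 1]]          # (I_2 | 1): parity-check code of length 3
H = [[1, 1, 1]]          # generator matrix of the dual (3-repetition) code

for u in product([0, 1], repeat=2):
    c = encode(list(u), G)
    print(u, c, syndrome(H, c))
```

Every information word in $\mathbb{F}_2^2$ maps to a distinct codeword with an even number of 1s, and each of them has zero syndrome, consistent with $H G^{\top} = O$.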


Example 1.6.14 It follows from Example 1.6.9 that the binary parity-check code of length $n$ has generator matrix $(I_{n-1} \mid \mathbf{1})$, where $\mathbf{1}$ represents a column vector of length $n - 1$ with each entry equal to 1.
• The binary parity-check code of length 2 is given by $\{ 00, 11 \}$. It has a generator matrix $\begin{pmatrix} 1 & 1 \end{pmatrix}$.
• The binary parity-check code of length 3 is given by $\{ 000, 011, 101, 110 \}$. It has a generator matrix
\[ \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}. \]

Theorem 1.6.4 Let $C$ be a binary linear code with at least two codewords, and let $H$ be a parity-check matrix for $C$. Then $\mathrm{dis}(C)$ is given by $d$ such that any $d - 1$ columns of $H$ are linearly independent and $H$ has $d$ columns that are linearly dependent.
Proof Take $v \in C$ such that $v \ne 0$. By definition,
\[ H v^{\top} = \sum_{i,\, v_i \ne 0} v_i h_i = 0, \]
where $h_i$ denotes the $i$th column of $H$. We can see that the columns $h_i$, where $v_i \ne 0$, are linearly dependent. Note that $\mathrm{wt}(v) = |\{ i \mid v_i \ne 0 \}|$.
Thus, there exists $v \in C$ such that $\mathrm{wt}(v) = d$ (i.e., $\mathrm{dis}(C) \le d$ by Theorem 1.6.3) if and only if there are $d$ columns of $H$ that are linearly dependent.
$\mathrm{dis}(C) \ge d$ if and only if there is no nonzero $v \in C$ such that $\mathrm{wt}(v) < d$, which is equivalent to the statement that any $d - 1$ columns of $H$ are linearly independent. □



Example 1.6.15 Let $C$ be the binary parity-check code of length $n$ (see Example 1.6.9). We have discussed that $C^{\perp}$ is the $n$-repetition code (see Example 1.6.10). Since $C^{\perp} = \langle 11 \dots 11 \rangle$, it has generator matrix
\[ H = \begin{pmatrix} 1 & 1 & \dots & 1 \end{pmatrix}. \]
By definition, $H$ is a parity-check matrix for $C$.
Any single column of $H$ is linearly independent. $H$ has two columns that are linearly dependent, e.g., the first two columns. In fact, any two columns of $H$ are linearly dependent. By Theorem 1.6.4, $\mathrm{dis}(C) = 2$, which agrees with our observation in Example 1.6.9.
Definition 1.6.12 Let $C$ be a binary $(n, M, d)$-code. We define the maximum distance of $C$ to be
\[ \mathrm{maxdis}(C) := \max \{ \mathrm{dis}(c_1, c_2) \mid c_1, c_2 \in C \}. \]
If $\mathrm{maxdis}(C) = \delta$, $C$ is called a binary $(n, M, d, \delta)$-anticode.


The notion of anticode was first defined in [Far70], where an anticode refers to a
two-dimensional array of bits such that the maximum Hamming distance between
any pair of rows is at most .δ, for some integer .δ > 0. In this original definition,
repeated rows are allowed. In Definition 1.6.12, an .(n, M, d, δ)-anticode does not
have repeated codewords.
We note that .δ ≥ d. And any binary code is a binary anticode. However, the
notion of binary anticode captures the maximum distance of a code.
Example 1.6.16
• $C = \{ 01, 10 \}$ is a binary $(2, 2, 2, 2)$-anticode.
• $C = \{ 001, 011, 111 \}$ is a binary $(3, 3, 1, 2)$-anticode.
• An $n$-repetition code is a binary $(n, 2, n, n)$-anticode.
• A binary parity-check code of length $n$ is a binary $(n, 2^{n-1}, 2, n)$-anticode if $n$ is even. And it is a binary $(n, 2^{n-1}, 2, n - 1)$-anticode if $n$ is odd.
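The four anticode parameters can be computed by brute force for small codes. The sketch below assumes codewords are given as equal-length strings of `0` and `1` and that the code has at least two codewords; the function name `anticode_parameters` is hypothetical.

```python
from itertools import combinations, product


def anticode_parameters(code: list[str]) -> tuple[int, int, int, int]:
    """Return (n, M, d, delta) for a binary code with at least two codewords."""
    distances = [sum(a != b for a, b in zip(c1, c2))
                 for c1, c2 in combinations(code, 2)]
    return len(code[0]), len(code), min(distances), max(distances)


def parity_check_code(n: int) -> list[str]:
    """All binary words of length n with an even number of 1s."""
    return ["".join(map(str, w)) for w in product([0, 1], repeat=n)
            if sum(w) % 2 == 0]


print(anticode_parameters(["01", "10"]))
print(anticode_parameters(["001", "011", "111"]))
print(anticode_parameters(parity_check_code(3)))
print(anticode_parameters(parity_check_code(4)))
```

Running it on parity-check codes of odd and even length shows the two cases of the last bullet in Example 1.6.16.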

1.7 Probability Theory

This section aims to provide a rigorous introduction to probabilities, random variables, and distributions.
Probability theory studies the mathematical theory behind random experiments.
A random experiment is an experiment whose output cannot be predicted with
certainty in advance. However, if the experiment is repeated many times, we can
see “regularity” in the average output. For example, if we roll a die, we cannot
predict the output of one roll. But if we roll it many times, we would expect to see
the number 1 in .1/6 of the outcomes assuming the die is fair.
For a given random experiment, we define sample space, denoted by .Ω, to be the
set of all possible outcomes. A subset A of .Ω is called an event. If the outcome of
the experiment is contained in A, then we say that A has occurred. The empty set .∅
denotes the event that consists of no outcomes. .∅ is also called the impossible event.
Example 1.7.1
• When the random experiment is rolling a die, the sample space is
\[ \Omega = \{ 1, 2, 3, 4, 5, 6 \}. \]
$A = \{ 1, 2, 3 \} \subseteq \Omega$ is an event.
• When the random experiment is rolling two dice,
\[ \Omega = \{ (i, j) \mid 1 \le i, j \le 6 \}. \]
One possible event is $A = \{ (1, 2), (1, 1) \}$.
Recall that we have defined complement, unions, and intersections of sets in Sect. 1.1.1. Fix a sample space $\Omega$. Take two events, $A$ and $B$. We say that $A \cup B$ occurs if either $A$ or $B$ occurs. Similarly, $\bigcup_{i=1}^{n} A_i$ occurs when at least one $A_i$ occurs. $A \cap B$ occurs if both $A$ and $B$ occur, and $\bigcap_{i=1}^{n} A_i$ occurs if all of the events $A_i$ occur. If $A \cap B = \emptyset$, then $A$ and $B$ cannot both occur, and they are called mutually exclusive. The complement of $A$, $A^c$, contains the outcomes in $\Omega$ that are not in $A$.

1.7.1 σ -Algebras

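Let $\Omega$ be a set, and let $\mathcal{A}$ denote a set of subsets of $\Omega$. $\mathcal{A}$ is called a $\sigma$-algebra if it has the following properties:
• $\Omega \in \mathcal{A}$.
• If $A \in \mathcal{A}$, then $A^c \in \mathcal{A}$.
• $\mathcal{A}$ is closed under finite unions and intersections: If $A_1, A_2, \dots, A_n \in \mathcal{A}$, then $\bigcup_{i=1}^{n} A_i \in \mathcal{A}$ and $\bigcap_{i=1}^{n} A_i \in \mathcal{A}$.
• $\mathcal{A}$ is closed under countable unions and intersections: If $A_1, A_2, \dots \in \mathcal{A}$, then $\bigcup_{i=1}^{\infty} A_i \in \mathcal{A}$ and $\bigcap_{i=1}^{\infty} A_i \in \mathcal{A}$.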
The pair .(Ω, A) is called a measurable space, meaning that it is a space on which
we can put a measure.
Example 1.7.2
• For any set $\Omega$, $\mathcal{A} = \{ \emptyset, \Omega \}$ is a $\sigma$-algebra.
• For any set $\Omega$, the power set $\mathcal{A} = 2^{\Omega}$ is a $\sigma$-algebra.
• Let us consider the random experiment to roll a die. We know $\Omega = \{ 1, 2, 3, 4, 5, 6 \}$. Then,
\[ \mathcal{A} = \{ \emptyset, \Omega, \{ 1 \}, \{ 2, 3, 4, 5, 6 \} \} \]
is a $\sigma$-algebra.
• If we toss a coin, $\Omega = \{ H, T \}$. And
\[ \mathcal{A} = 2^{\Omega} = \{ \emptyset, \Omega, \{ H \}, \{ T \} \} \]
is a $\sigma$-algebra.
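For a finite sample space, the defining properties of a $\sigma$-algebra can be checked mechanically. The sketch below is an illustration under that finiteness assumption: a family of subsets is finite, so closure under pairwise unions already gives closure under finite and countable unions, and closure under intersections then follows from closure under complements (De Morgan's laws). The function name `is_sigma_algebra` is hypothetical.

```python
from itertools import combinations


def is_sigma_algebra(omega: set, family: list[set]) -> bool:
    """Check the sigma-algebra axioms for a family of subsets of a finite omega."""
    space = frozenset(omega)
    sets = {frozenset(s) for s in family}
    if space not in sets:
        return False                                   # omega must belong
    if any(space - a not in sets for a in sets):
        return False                                   # closed under complement
    return all(a | b in sets for a, b in combinations(sets, 2))  # pairwise unions


omega = {1, 2, 3, 4, 5, 6}
print(is_sigma_algebra(omega, [set(), omega, {1}, {2, 3, 4, 5, 6}]))
print(is_sigma_algebra(omega, [set(), omega, {1}]))
print(is_sigma_algebra({"H", "T"}, [set(), {"H", "T"}, {"H"}, {"T"}]))
```

The second family fails because the complement of $\{1\}$ is missing, which shows why the complement axiom cannot be dropped.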

Definition 1.7.1 Let $d$ be a positive integer and $\Omega = \mathbb{R}^d$. $\Omega$ consists of vectors $(x_0, x_1, \dots, x_{d-1})$, where $x_i \in \mathbb{R}$ (see Theorem 1.3.3). The smallest $\sigma$-algebra (see footnote 5) containing open sets in $\Omega$ is called the Borel $\sigma$-algebra, denoted $\mathcal{R}^d$. When $d = 1$, we write $\mathcal{R}$. Any set $B \in \mathcal{R}^d$ is called a Borel set.


Example 1.7.3 Here we list some examples of Borel sets. Take any $a, b, c \in \mathbb{R}$ such that $a < c < b$.
• By definition, open intervals $(a, b)$ are Borel sets.
• Since a $\sigma$-algebra contains the complement of each of its sets, closed intervals $[a, b]$, whose complements are open, are also Borel sets.
• As $(a, b] = (a, c) \cup [c, b]$ and $(a, c), [c, b] \in \mathcal{R}$, we have $(a, b] \in \mathcal{R}$.
• Take a singleton set $\{ a \}$, and we have
\[ \{ a \} = \bigcap_{n=1}^{\infty} \left( a - \frac{1}{n},\ a + \frac{1}{n} \right). \]
Thus $\{ a \}$ is a Borel set.
• By definition, $\mathcal{R}$ is closed under countable unions, and it follows that any set of integers, being a countable union of singletons, is a Borel set.

1.7.2 Probabilities

Let $\Omega$ be a sample space, and let $(\Omega, \mathcal{A})$ be a measurable space in this subsection.
Definition 1.7.2 A probability measure defined on a measurable space $(\Omega, \mathcal{A})$ is a function $P : \mathcal{A} \to [0, 1]$ such that
• $P(\Omega) = 1$, $P(\emptyset) = 0$.
• For any $A_1, A_2, \dots \in \mathcal{A}$ that are pairwise disjoint, i.e., $A_{i_1} \cap A_{i_2} = \emptyset$ for $i_1 \ne i_2$,
\[ P\left( \bigcup_{i=1}^{\infty} A_i \right) = \sum_{i=1}^{\infty} P(A_i). \]
This property is also called countable additivity.
$P(A)$ is called the probability of $A$. $(\Omega, \mathcal{A}, P)$ is called a probability space.

Example 1.7.4 Let us consider the random experiment of tossing a coin, with sample space $\Omega = \{ H, T \}$. Let $\mathcal{A} = 2^{\Omega} = \{ \emptyset, \Omega, \{ H \}, \{ T \} \}$. Define
\[ P(\emptyset) = 0, \quad P(\Omega) = 1, \quad P(\{ H \}) = \frac{1}{2}, \quad P(\{ T \}) = \frac{1}{2}. \]
It is easy to see that $P$ is a probability measure on $(\Omega, \mathcal{A})$.

Footnote 5: It is easy to show that the intersection of $\sigma$-algebras is again a $\sigma$-algebra. Since $2^{\Omega}$ is a $\sigma$-algebra, it follows that the smallest $\sigma$-algebra containing open sets exists.


Example 1.7.5 Let $\Omega$ be a countable set (finite or countably infinite). Let $\mathcal{A} = 2^{\Omega}$. Then, any probability measure on $(\Omega, \mathcal{A})$ is a function such that for any $A \in \mathcal{A}$,
\[ P(A) = \sum_{\omega \in A} P(\{ \omega \}), \quad \text{where } P(\{ \omega \}) \ge 0 \text{ and } \sum_{\omega \in \Omega} P(\{ \omega \}) = 1. \]
For the rest of this section, let $(\Omega, \mathcal{A}, P)$ be a probability space.


Lemma 1.7.1
• For any $A_i \in \mathcal{A}$, $1 \le i \le m$, pairwise disjoint, we have
\[ P\left( \bigcup_{i=1}^{m} A_i \right) = \sum_{i=1}^{m} P(A_i). \]
This property is also called finite additivity.
• For any $A, B \in \mathcal{A}$ such that $A \subseteq B$, we have $P(A) \le P(B)$.
Proof Take $A_i = \emptyset$ for $i > m$, and by countable additivity we have finite additivity.
Let $C = B - A$ be the difference between $B$ and $A$. Since $A$ and $C$ are disjoint, finite additivity gives
\[ P(B) = P(A \cup C) = P(A) + P(C). \]
By Definition 1.7.2, $P(C) \ge 0$, and hence $P(A) \le P(B)$. □



Definition 1.7.3 Let $\Omega$ be a finite set. Let $\mathcal{A} = 2^{\Omega}$, the power set of $\Omega$. A probability measure $P$ on $(\Omega, \mathcal{A})$ is called uniform if
\[ P(\{ \omega \}) = \frac{1}{|\Omega|}, \quad \forall \omega \in \Omega. \]
We note that if $P$ is a uniform probability measure on $(\Omega, \mathcal{A})$, then for any $A \in \mathcal{A}$, $P(A) = \frac{|A|}{|\Omega|}$.
Example 1.7.6 Let $\Omega = \{ 1, 2, 3, 4, 5, 6 \}$ and $\mathcal{A} = 2^{\Omega}$. The uniform probability measure on $(\Omega, \mathcal{A})$ is given by $P$ such that
\[ P(\{ i \}) = \frac{1}{6}, \quad \text{for } i \in \Omega. \]
Let $A = \{ 1, 2, 3 \}$, $B = \{ 2, 4 \}$, then
\[ P(A) = \frac{3}{6} = \frac{1}{2}, \quad P(B) = \frac{2}{6} = \frac{1}{3}. \]

Take any $A, B \in \mathcal{A}$ such that $P(B) > 0$. We would like to compute the probability of $A$ occurring given the knowledge that $B$ has occurred. We do not need to consider $A \cap B^c$ since $B$ has already occurred. Instead, we look at $A \cap B$, which occurs when both $A$ and $B$ occur. This leads to the definition of the conditional probability of $A$ given $B$:

$$P(A|B) := \frac{P(A \cap B)}{P(B)}, \quad \text{where } P(B) > 0. \tag{1.27}$$

Example 1.7.7 Continuing Example 1.7.6,

$$A \cap B = \{2\}, \quad P(A \cap B) = \frac{1}{6}.$$

By Eq. 1.27,

$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{1/6}{1/3} = \frac{1}{2}.$$

Definition 1.7.4 Two events $A$ and $B$ are said to be independent if $P(A \cap B) = P(A)P(B)$. Otherwise, we say that they are dependent.
By Eq. 1.27, when $P(B) > 0$, the condition $P(A \cap B) = P(A)P(B)$ is equivalent to

$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)P(B)}{P(B)} = P(A). \tag{1.28}$$

That is, the probability of A occurring given the knowledge that B has occurred
is the same as the probability of A occurring without the knowledge that B has
occurred.
Example 1.7.8 Continuing Example 1.7.7,

$$P(A \cap B) = \frac{1}{6}, \quad P(A)P(B) = \frac{1}{2} \times \frac{1}{3} = \frac{1}{6}.$$

By Definition 1.7.4, $A$ and $B$ are independent. We also note that

$$P(A|B) = P(A) = \frac{1}{2}.$$
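The computations in Examples 1.7.6 to 1.7.8 can be reproduced mechanically. The following is a minimal sketch, assuming the fair-die probability space of Example 1.7.6; the helper names `prob`, `cond_prob`, and `independent` are introduced here for illustration only and do not come from the text.

```python
from fractions import Fraction

# Sample space of the fair die from Example 1.7.6; the measure is uniform.
OMEGA = frozenset({1, 2, 3, 4, 5, 6})

def prob(event):
    """Uniform probability P(A) = |A| / |Omega| (Definition 1.7.3)."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def cond_prob(a, b):
    """Conditional probability P(A|B) from Eq. 1.27; requires P(B) > 0."""
    if prob(b) == 0:
        raise ValueError("P(B) must be positive")
    return prob(a & b) / prob(b)

def independent(a, b):
    """Independence check from Definition 1.7.4: P(A and B) == P(A) P(B)."""
    return prob(a & b) == prob(a) * prob(b)

A = frozenset({1, 2, 3})
B = frozenset({2, 4})
print(prob(A), prob(B))    # 1/2 1/3
print(cond_prob(A, B))     # 1/2
print(independent(A, B))   # True
```

Exact rational arithmetic is used so that the equality in Definition 1.7.4 can be tested without floating-point tolerance.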
Next, we state a very useful theorem.

Theorem 1.7.1 (Bayes' Theorem) If $P(A) > 0$ and $P(B) > 0$, then

$$P(B)P(A|B) = P(A)P(B|A).$$

Proof By Eq. 1.27, we have

$$P(B)P(A|B) = P(A \cap B), \quad P(A)P(B|A) = P(A \cap B).$$



Definition 1.7.5 A set of events $\{E_1, E_2, \ldots \mid E_i \in \mathcal{A}\}$ is called a partition of $\Omega$ if:
• They are pairwise disjoint.
• $P(E_i) > 0$ for all $i$.
• $\bigcup_i E_i = \Omega$.
If the set of events is finite, it is called a finite partition of .Ω; otherwise, it is called
a countable partition of .Ω.
Example 1.7.9 Let $\Omega = \{1, 2, 3, 4, 5, 6\}$, $\mathcal{A} = 2^{\Omega}$, and $P$ be the uniform probability measure on $(\Omega, \mathcal{A})$ (see Example 1.7.6). Let

$$E_1 = \{1, 2, 3\}, \quad E_2 = \{4, 5\}, \quad E_3 = \{6\}.$$

Then, $\{E_1, E_2, E_3\}$ is a finite partition of $\Omega$. We can also calculate that

$$P(E_1) = \frac{1}{2}, \quad P(E_2) = \frac{1}{3}, \quad P(E_3) = \frac{1}{6}.$$

Lemma 1.7.2 Let $\{E_1, E_2, \ldots \mid E_i \in \mathcal{A}\}$ be a finite or countable partition of $\Omega$. Then, for any $A \in \mathcal{A}$, we have

$$P(A) = \sum_i P(A|E_i)P(E_i).$$

Proof First, we note that

$$A = A \cap \Omega = A \cap \left(\bigcup_i E_i\right) = \bigcup_i \left(A \cap E_i\right).$$

Since $E_i$ are pairwise disjoint, $E_i \cap A$ are also pairwise disjoint. We have

$$P(A) = P\left(\bigcup_i \left(A \cap E_i\right)\right) = \sum_i P\left(A \cap E_i\right) = \sum_i P(A|E_i)P(E_i).$$



Example 1.7.10 Continuing Example 1.7.9, let $A = \{2, 4\}$, then

$$P(A) = \frac{1}{3}, \quad A \cap E_1 = \{2\}, \quad A \cap E_2 = \{4\}, \quad A \cap E_3 = \emptyset.$$

By Eq. 1.27,

$$P(A|E_1) = \frac{1/6}{1/2} = \frac{1}{3}, \quad P(A|E_2) = \frac{1/6}{1/3} = \frac{1}{2}, \quad P(A|E_3) = 0.$$

Furthermore,

$$\sum_{i=1}^{3} P(A|E_i)P(E_i) = \frac{1}{3} \times \frac{1}{2} + \frac{1}{2} \times \frac{1}{3} = \frac{1}{3} = P(A).$$

Now we can state a generalized version of Bayes’ Theorem (Theorem 1.7.1).


Theorem 1.7.2 Let $\{E_1, E_2, \ldots \mid E_i \in \mathcal{A}\}$ be a finite or countable partition of $\Omega$. For any $A \in \mathcal{A}$ with $P(A) > 0$ and any $m \geq 1$, we have

$$P(E_m|A) = \frac{P(A|E_m)P(E_m)}{\sum_i P(A|E_i)P(E_i)}.$$

Proof By Bayes' Theorem (Theorem 1.7.1),

$$P(E_m|A) = \frac{P(A|E_m)P(E_m)}{P(A)}.$$

The result then follows from Lemma 1.7.2. $\square$
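As an illustration of Lemma 1.7.2 and Theorem 1.7.2, the sketch below recomputes $P(A)$ for the partition of Example 1.7.9 and then the posterior probabilities $P(E_m|A)$. It assumes the uniform die measure of Example 1.7.6; `prob` and `cond_prob` are illustrative helper names, not notation from the text.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # fair die, uniform measure (Example 1.7.6)

def prob(event):
    """P(A) = |A| / |Omega| for the uniform measure."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def cond_prob(a, b):
    """P(A|B) as in Eq. 1.27; assumes P(B) > 0."""
    return prob(a & b) / prob(b)

# Partition from Example 1.7.9 and event A from Example 1.7.10.
partition = [frozenset({1, 2, 3}), frozenset({4, 5}), frozenset({6})]
A = frozenset({2, 4})

# Lemma 1.7.2: P(A) = sum_i P(A|E_i) P(E_i).
total = sum(cond_prob(A, e) * prob(e) for e in partition)

# Theorem 1.7.2: P(E_m|A) = P(A|E_m) P(E_m) / sum_i P(A|E_i) P(E_i).
posterior = [cond_prob(A, e) * prob(e) / total for e in partition]

print(total)      # 1/3
print(posterior)  # [Fraction(1, 2), Fraction(1, 2), Fraction(0, 1)]
```

The posterior probabilities sum to 1, as expected for a probability measure conditioned on $A$.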


1.7.3 Random Variables

Let .(Ω, A, P ) be a probability space. A random variable X represents an unknown


quantity that varies with the outcome of a random experiment. Before the random
experiment, we know all the possible values X can take, but we do not know which
one it will take until we see the outcome of the experiment.
Definition 1.7.6 A random variable $X$ is a function $X : \Omega \to \mathbb{R}$, such that

$$X^{-1}(B) = \{\omega : X(\omega) \in B\} \in \mathcal{A}, \quad \forall B \in \mathcal{R},$$

where $\mathcal{R}$ is the Borel $\sigma$-algebra (see Definition 1.7.1).


Example 1.7.11
• Fix $A \in \mathcal{A}$. The indicator function for $A$, denoted $1_A$, is defined as follows:

$$1_A : \Omega \to \mathbb{R}, \quad 1_A(\omega) = \begin{cases} 1 & \omega \in A \\ 0 & \omega \notin A. \end{cases}$$

$1_A$ is a random variable.
• Consider the probability space from Example 1.7.5. Then any function $X : \Omega \to \mathbb{R}$ is a random variable. In such a case, $X$ is called a discrete random variable.
• Let us consider the probability space discussed in Example 1.7.4. Define $X : \Omega \to \mathbb{R}$ such that $X(H) = 0$, $X(T) = 1$. For any $B \in \mathcal{R}$, $X^{-1}(B)$ is always a subset of $\Omega$, which is contained in $\mathcal{A}$. Hence $X$ is a discrete random variable.
Let $X$ be a random variable, and define $P^X$ as follows:

$$P^X : \mathcal{R} \to [0, 1], \quad B \mapsto P(X^{-1}(B)). \tag{1.29}$$

It is easy to see that $P^X(\mathbb{R}) = 1$ and $P^X(\emptyset) = 0$. Take any $B_i \in \mathcal{R}$ that are pairwise disjoint. Then $X^{-1}(B_i)$ are also pairwise disjoint since $X$ is a function. The countable additivity of $P^X$ follows from the countable additivity of $P$. Thus, $P^X$ is a probability measure on $(\mathbb{R}, \mathcal{R})$. We say that $P^X$ is induced by $X$, and it is called the distribution of $X$. The cumulative distribution function (CDF) of $X$, denoted $F$, is defined as

$$F : \mathbb{R} \to [0, 1], \quad x \mapsto P^X((-\infty, x]) = P(X^{-1}((-\infty, x])). \tag{1.30}$$

For simplicity, we will write $P(X \in B)$ instead of $P(X^{-1}(B))$ in Eq. 1.29 and $P(X \leq x)$ instead of $P(X^{-1}((-\infty, x]))$ in Eq. 1.30.

On the other hand, the next lemma says if we start from a function F with certain
properties, there always exists a random variable that has F as its CDF. The proof
can be found in, e.g., [Dur19, page 9].
Lemma 1.7.3 If a function $F$ satisfies the following conditions, then it is the distribution function of some random variable.
• $F$ is nondecreasing.
• $\lim_{x \to \infty} F(x) = 1$, $\lim_{x \to -\infty} F(x) = 0$.
• $F$ is right continuous, i.e., $\lim_{y \downarrow x} F(y) = F(x)$.

When $X$ is a discrete random variable (see Example 1.7.11), the distribution of $X$ is completely determined by the following numbers:

$$P(X = j) = \sum_{\omega : X(\omega) = j} P(\{\omega\}).$$

Let $T := X(\Omega)$ be the image of $\Omega$ in $\mathbb{R}$. The probability mass function (PMF) of $X$ is defined to be the function

$$p_X : T \to [0, 1], \quad x \mapsto P(X = x).$$

We have the following relation between the PMF of $X$ and the CDF of $X$:

$$F(a) = \sum_{x \leq a,\, x \in T} p_X(x).$$

Example 1.7.12 Let us consider the probability space defined in Example 1.7.4. We have discussed in Example 1.7.11 that

$$X : \Omega \to \mathbb{R}, \quad X(H) = 0, \quad X(T) = 1$$

is a discrete random variable. The image of $X$ in $\mathbb{R}$ is $T = \{0, 1\}$. And the PMF of $X$ is given by

$$p_X(0) = P(X = 0) = P(\{H\}) = \frac{1}{2}, \quad p_X(1) = P(X = 1) = P(\{T\}) = \frac{1}{2}.$$
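The relation $F(a) = \sum_{x \leq a,\, x \in T} p_X(x)$ between the PMF and the CDF can be sketched in a few lines. The dictionary representation of the PMF and the function name `cdf` are assumptions made for this illustration.

```python
from fractions import Fraction

# PMF of X from Example 1.7.12: X(H) = 0, X(T) = 1 for a fair coin.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 2)}

def cdf(a, pmf):
    """F(a) = sum of p_X(x) over all x in the image T with x <= a."""
    return sum((p for x, p in pmf.items() if x <= a), Fraction(0))

print(cdf(-1, pmf), cdf(0, pmf), cdf(0.5, pmf), cdf(1, pmf))  # 0 1/2 1/2 1
```

The resulting $F$ is a nondecreasing, right-continuous step function with jumps of size $p_X(x)$ at each $x \in T$, in line with the conditions of Lemma 1.7.3.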
When the distribution function $F(x) = P(X \leq x)$ has the form

$$F(x) = \int_{-\infty}^{x} f(y)\,dy,$$

we say that $X$ has probability density function (PDF) $f$ and $X$ is called a continuous random variable.
Example 1.7.13 Define $f(x) = 1$ for $x \in (0, 1)$ and 0 otherwise. Then

$$F(x) = \int_{-\infty}^{x} f(y)\,dy$$

is given by

$$F(x) = \begin{cases} 0 & x \leq 0 \\ x & 0 \leq x \leq 1 \\ 1 & x > 1. \end{cases}$$

It is easy to show that $F$ satisfies the conditions in Lemma 1.7.3. If $X$ is a random variable that has $F$ as its CDF, then we say that $X$ induces a uniform distribution on $(0, 1)$.

Example 1.7.14 A random variable $Z$ that induces a standard normal distribution has the probability density function

$$f(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right)$$

and the cumulative distribution function

$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} \exp\left(-\frac{y^2}{2}\right) dy.$$

The standard normal distribution will be very useful in later parts of the book, and we use $\Phi(z)$ instead of $F(z)$ to denote its CDF. Moreover, we say that $Z$ is a standard normal random variable. Figure 1.1 shows that $f(z)$ is a bell-shaped curve that is symmetric about 0. The symmetry is also apparent from the formula for $f(z)$.
Next, we would like to define expectations and variances for random variables. The
exact formulas for discrete and continuous random variables are different, but the
information carried by those notions is the same. In particular, the expectation/mean
of a random variable X is the expected average value of X. And the variance of
X is the average squared distance from the mean. By squaring the distances, the
small deviations from the mean are reduced, and the big ones are enlarged. Thus the
variance measures how the values of X vary from the mean or how “spread out” the
values of X are.
When $X$ is a discrete random variable $X : \Omega \to \mathbb{R}$ with PMF $p_X$ and $T = X(\Omega)$ (the image of $\Omega$ in $\mathbb{R}$), its expectation/mean is defined as

$$E[X] = \sum_{x \in T} x\, p_X(x), \tag{1.31}$$

provided the sum exists.6

Fig. 1.1 Probability density function of the standard normal random variable


Example 1.7.15 Let us consider the discrete random variable discussed in Examples 1.7.4 and 1.7.12. By Eq. 1.31,

$$E[X] = 0 \times p_X(0) + 1 \times p_X(1) = 0 \times \frac{1}{2} + 1 \times \frac{1}{2} = \frac{1}{2}.$$
When $X$ is a continuous random variable with PDF $f$, its expectation/mean is defined as

$$E[X] = \int_{-\infty}^{\infty} x f(x)\,dx, \tag{1.32}$$

provided the integral exists.


Example 1.7.16 Let $X$ be a random variable that induces a uniform distribution on $(0, 1)$ (see Example 1.7.13). By Eq. 1.32,

$$E[X] = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{0}^{1} x\,dx = \left.\frac{x^2}{2}\right|_{0}^{1} = \frac{1}{2}.$$

Example 1.7.17 Let $Z$ be a random variable that induces the standard normal distribution (see Example 1.7.14). By Eq. 1.32,

$$E[Z] = \int_{-\infty}^{\infty} z f(z)\,dz = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z \exp\left(-\frac{z^2}{2}\right) dz = 0.$$

As shown in Fig. 1.1, $f(z)$ is symmetric about 0, so it is not surprising that the expected average value of $Z$ is 0.
Let $g$ be a function $g : \mathbb{R} \to \mathbb{R}$. Then $g(X)$ is also a random variable.7 It can be proven that if $X$ is a discrete random variable with PMF $p_X$, then

$$E[g(X)] = \sum_{x} g(x) p_X(x).$$

If $X$ is a continuous random variable with PDF $f$, then

$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx.$$

The proof can be found in, e.g., [Ros20, page 113].

6 For example, if $\Omega$ is finite or if $\Omega$ is countable and the series converges absolutely, the sum exists.
7 To be more precise, $g$ should be a measurable function for $g(X)$ to be a random variable. For the definition of measurable functions, we refer the readers to [Yeh14, page 72].


Example 1.7.18 Define

$$g : \mathbb{R} \to \mathbb{R}, \quad x \mapsto x^2.$$

Then the expectation of $g(X)$ is given by

$$E\left[X^2\right] = \sum_{x} x^2 p_X(x)$$

when $X$ is a discrete random variable with PMF $p_X$. And

$$E\left[X^2\right] = \int_{-\infty}^{\infty} x^2 f(x)\,dx$$

when $X$ is a continuous random variable with PDF $f(x)$.


Furthermore, given two random variables $X, Y$ such that $E[|X|] < \infty$ and $E[|Y|] < \infty$, for any $a, b \in \mathbb{R}$,

$$E[X + Y] = E[X] + E[Y], \quad E[aX + b] = aE[X] + b, \quad E[b] = b. \tag{1.33}$$

The proof can be found in, e.g., [Dur19, page 24].


Example 1.7.19 Let $X$ be a random variable, and let $\mu := E[X]$. By Eq. 1.33, we have

$$\begin{aligned} E\left[(X - \mu)^2\right] &= E\left[X^2\right] + \mu^2 - 2E[X\mu] = E\left[X^2\right] + \mu^2 - 2\mu E[X] \\ &= E\left[X^2\right] + \mu^2 - 2\mu^2 \\ &= E\left[X^2\right] - \mu^2. \end{aligned} \tag{1.34}$$

Equation 1.34 provides the formula for computing the variance of a random variable $X$. More specifically, let $X$ be a random variable with mean $E[X] = \mu$. If $E\left[X^2\right] < \infty$, then the variance of $X$ is given by

$$\mathrm{Var}(X) = E\left[(X - \mu)^2\right] = E\left[X^2\right] - \mu^2. \tag{1.35}$$

Example 1.7.20 Let us consider the discrete random variable discussed in Examples 1.7.4 and 1.7.12. By Eq. 1.35 and Examples 1.7.15 and 1.7.18,

$$\mathrm{Var}(X) = E[X^2] - \frac{1}{2^2} = \sum_{x} x^2 p_X(x) - \frac{1}{4} = 0 \times p_X(0) + 1 \times p_X(1) - \frac{1}{4} = \frac{1}{2} - \frac{1}{4} = \frac{1}{4}.$$

Example 1.7.21 Let $X$ be a continuous random variable that induces the uniform distribution on $(0, 1)$ (see Example 1.7.13). By Eq. 1.35 and Examples 1.7.16 and 1.7.18,

$$\mathrm{Var}(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - E[X]^2 = \int_{0}^{1} x^2\,dx - \frac{1}{2^2} = \left.\frac{x^3}{3}\right|_{0}^{1} - \frac{1}{4} = \frac{1}{12}.$$

Example 1.7.22 Let $Z$ be a random variable that induces the standard normal distribution (see Example 1.7.14). By Eq. 1.35 and Examples 1.7.17 and 1.7.18,

$$\mathrm{Var}(Z) = E\left[Z^2\right] - E[Z]^2 = \int_{-\infty}^{\infty} z^2 f(z)\,dz - 0 = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z^2 \exp\left(-\frac{z^2}{2}\right) dz = 1.$$

We write $Z \sim N(0, 1)$ to indicate that $Z$ induces the standard normal distribution with mean 0 and variance 1.
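The variances obtained in Examples 1.7.20 to 1.7.22 can be checked numerically. The sketch below evaluates Eq. 1.35 for the coin, the uniform distribution on $(0, 1)$, and the standard normal distribution. It is an approximation under stated assumptions: integrals are replaced by a midpoint rule, the normal density is truncated to $[-10, 10]$, and all function names are illustrative.

```python
import math

def expect_discrete(g, pmf):
    """E[g(X)] = sum_x g(x) p_X(x) for a PMF given as a dict."""
    return sum(g(x) * p for x, p in pmf.items())

def expect_continuous(g, pdf, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of g(x) f(x) over [lo, hi]."""
    h = (hi - lo) / steps
    mids = (lo + (k + 0.5) * h for k in range(steps))
    return sum(g(x) * pdf(x) for x in mids) * h

def variance(expect):
    """Var(X) = E[X^2] - E[X]^2 (Eq. 1.35); `expect` maps g to E[g(X)]."""
    mean = expect(lambda x: x)
    return expect(lambda x: x * x) - mean ** 2

coin = {0: 0.5, 1: 0.5}                                   # Example 1.7.12
uniform_pdf = lambda x: 1.0 if 0 < x < 1 else 0.0         # Example 1.7.13
normal_pdf = lambda z: math.exp(-z * z / 2) / math.sqrt(2 * math.pi)  # Example 1.7.14

var_coin = variance(lambda g: expect_discrete(g, coin))
var_unif = variance(lambda g: expect_continuous(g, uniform_pdf, 0.0, 1.0))
var_norm = variance(lambda g: expect_continuous(g, normal_pdf, -10.0, 10.0))
print(var_coin, round(var_unif, 6), round(var_norm, 6))
```

The three printed values should agree with $1/4$, $1/12$, and $1$ up to the discretization error of the midpoint rule.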
Given two random variables $X, Y$ such that $E[|X|] < \infty$ and $E[|Y|] < \infty$, take any $a, b \in \mathbb{R}$. It follows from Eq. 1.33 that

$$\begin{aligned} \mathrm{Var}(aX + b) &= E\left[(aX + b - E[aX + b])^2\right] = E\left[(aX + b - aE[X] - b)^2\right] \\ &= a^2 E\left[(X - E[X])^2\right] = a^2 \mathrm{Var}(X). \end{aligned} \tag{1.36}$$

In particular, we have

$$\mathrm{Var}(b) = 0, \quad \mathrm{Var}(X + b) = \mathrm{Var}(X), \quad \mathrm{Var}(aX) = a^2 \mathrm{Var}(X).$$

Example 1.7.23 Let $Z \sim N(0, 1)$ be a standard normal random variable. Take any $\sigma, \mu \in \mathbb{R}$ with $\sigma > 0$. Define $Y = \sigma Z + \mu$. Then by Eqs. 1.33 and 1.36,

$$E[Y] = \mu, \quad \mathrm{Var}(Y) = \sigma^2.$$

It can be shown (see, e.g., [Dur19, page 28]) that $Y$ has PDF

$$f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(y - \mu)^2}{2\sigma^2}\right). \tag{1.37}$$

We say that $Y$ induces a normal distribution with mean $\mu$ and variance $\sigma^2$, written $Y \sim N(\mu, \sigma^2)$. $Y$ is also called normal/a normal random variable. We note that the mean and variance fully define a normal distribution.


$f(y)$ is a bell-shaped curve symmetric about $\mu$ and obtains its maximum value of

$$\frac{1}{\sigma\sqrt{2\pi}} \approx \frac{0.399}{\sigma}$$

at $y = \mu$ (see Fig. 1.2).

Fig. 1.2 Probability density function of a normal random variable


Remark 1.7.1 On the other hand, if we let $Y \sim N(\mu, \sigma^2)$ be a normal random variable, then

$$Z := \frac{Y - \mu}{\sigma}$$

is a standard normal random variable (for proof, see [Dur19, exercise 1.2.5]).
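Standardization means that probabilities for any normal random variable can be read off the standard normal CDF: $P(Y \leq y) = \Phi((y - \mu)/\sigma)$. The sketch below checks this with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The parameter values $\mu = 3$ and $\sigma = 2$ are arbitrary choices for illustration.

```python
from statistics import NormalDist

mu, sigma = 3.0, 2.0          # hypothetical parameters, chosen for illustration
Y = NormalDist(mu, sigma)     # note: NormalDist takes the standard deviation, not the variance
Z = NormalDist(0.0, 1.0)      # standard normal

def cdf_via_standardization(y):
    """P(Y <= y) computed as Phi((y - mu) / sigma), using Remark 1.7.1."""
    return Z.cdf((y - mu) / sigma)

for y in (1.0, 3.0, 6.5):
    assert abs(Y.cdf(y) - cdf_via_standardization(y)) < 1e-12

# Maximum of the PDF at y = mu: 1 / (sigma * sqrt(2 * pi)), about 0.399 / sigma.
peak = Y.pdf(mu)
print(round(peak, 4))  # 0.1995
```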
Next, let us look at the relations between two random variables. First, similar to
Definition 1.7.4, we give the definition of independent random variables.
Definition 1.7.7 Given two random variables $X : \Omega \to \mathbb{R}$, $Y : \Omega \to \mathbb{R}$, they are said to be independent if for any $A, B \in \mathcal{R}$,

$$P(X \in A, Y \in B) = P(X \in A)P(Y \in B).$$

If two random variables $X : \Omega \to \mathbb{R}$, $Y : \Omega \to \mathbb{R}$ are independent, it can be proven that

$$E[XY] = E[X]E[Y] \quad \text{if } E[|X|] < \infty \text{ and } E[|Y|] < \infty. \tag{1.38}$$

The proof can be found in, e.g., [Dur19, page 41].


To further analyze the relation between two random variables $X$ and $Y$, we define the covariance of $X$ and $Y$ to be

$$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)], \tag{1.39}$$

where $\mu_X$ and $\mu_Y$ denote expectations for $X$ and $Y$, respectively. It is easy to see that

$$\mathrm{Cov}(X, Y) = \mathrm{Cov}(Y, X), \quad \mathrm{Cov}(X, X) = \mathrm{Var}(X).$$

In case $E[|X|] < \infty$ and $E[|Y|] < \infty$, by Eq. 1.33, we have

$$\begin{aligned} \mathrm{Cov}(X, Y) &= E[XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y] \\ &= E[XY] - \mu_X \mu_Y - \mu_Y \mu_X + \mu_X \mu_Y = E[XY] - E[X]E[Y]. \end{aligned} \tag{1.40}$$

Definition 1.7.8 Let $X$ and $Y$ be two random variables. If $\mathrm{Cov}(X, Y) = 0$, we say that $X$ and $Y$ are uncorrelated. Otherwise, we say that $X$ and $Y$ are correlated.
Remark 1.7.2 By Eqs. 1.38 and 1.40, if $X$ and $Y$ are two independent random variables such that $E[|X|] < \infty$ and $E[|Y|] < \infty$, then $\mathrm{Cov}(X, Y) = 0$, and they are uncorrelated.
Let $Z$ be another random variable such that $E[|Z|] < \infty$. By Eq. 1.33,

$$E[(X + Z)Y] - E[X + Z]E[Y] = E[XY] + E[ZY] - E[X]E[Y] - E[Z]E[Y].$$

Thus, by Eq. 1.40,

$$\mathrm{Cov}(X + Z, Y) = \mathrm{Cov}(X, Y) + \mathrm{Cov}(Z, Y).$$

It can be easily generalized to show that

$$\mathrm{Cov}\left(\sum_{i=1}^{n} X_i, Y\right) = \sum_{i=1}^{n} \mathrm{Cov}(X_i, Y),$$

where $X_1, X_2, \ldots, X_n$ are $n$ random variables. Furthermore, by the symmetry of covariance ($\mathrm{Cov}(X, Y) = \mathrm{Cov}(Y, X)$), we have

$$\mathrm{Cov}\left(\sum_{i=1}^{n} X_i, \sum_{j=1}^{m} Y_j\right) = \sum_{i=1}^{n} \sum_{j=1}^{m} \mathrm{Cov}\left(X_i, Y_j\right),$$

where $Y_1, Y_2, \ldots, Y_m$ are $m$ random variables. Set $m = n$ and $Y_j = X_j$. Since $\mathrm{Cov}(X_i, X_i) = \mathrm{Var}(X_i)$, we have

$$\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i) + \sum_{i=1}^{n} \sum_{j=1, j \neq i}^{n} \mathrm{Cov}\left(X_i, X_j\right).$$

If we further assume $X_i$ are independent with $E[|X_i|] < \infty$, by Remark 1.7.2,

$$\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i). \tag{1.41}$$

Recall that we have defined the Borel $\sigma$-algebra for $\mathbb{R}^n$ (Definition 1.7.1). Correspondingly, we can define a random vector similar to Definition 1.7.6.
Definition 1.7.9 A random vector $\mathbf{X}$ is a function $\mathbf{X} : \Omega \to \mathbb{R}^d$, such that

$$\mathbf{X}^{-1}(B) = \{\omega : \mathbf{X}(\omega) \in B\} \in \mathcal{A}, \quad \forall B \in \mathcal{R}^d.$$

Note that a random variable is a random vector for the case $d = 1$.


Definition 1.7.10 A random vector $\mathbf{X} = (X_1, X_2, \ldots, X_n)$ induces a Gaussian (or multivariate normal) distribution if every linear combination

$$\sum_{j=1}^{n} a_j X_j, \quad a_j \in \mathbb{R}$$

is a normal random variable. The mean vector of $\mathbf{X}$ is

$$\boldsymbol{\mu} = (\mu_{X_1}, \mu_{X_2}, \ldots, \mu_{X_n}),$$

where $\mu_{X_i}$ is the mean of $X_i$. The covariance matrix of $\mathbf{X}$ is given by the matrix $Q$ with entries $Q_{ij} = \mathrm{Cov}\left(X_i, X_j\right)$. When $\det Q \neq 0$, the probability density function for $\mathbf{X}$ is

$$f(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}\sqrt{\det Q}} \exp\left(-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^{T} Q^{-1} (\mathbf{x} - \boldsymbol{\mu})\right).$$

We write $\mathbf{X} \sim N(\boldsymbol{\mu}, Q)$, and we say that $\mathbf{X}$ is Gaussian/a Gaussian random vector.
Example 1.7.24 If $X_1, \ldots, X_n$ are independent random variables and each $X_i \sim N(\mu_i, \sigma_i^2)$ is normal, then $\mathbf{X} = (X_1, X_2, \ldots, X_n)$ induces a Gaussian distribution with mean $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_n)$ and covariance matrix $Q$ a diagonal matrix with $Q_{ii} = \sigma_i^2$ (see [JP04, page 127] for a proof).
When we look at Gaussian random vectors, we have the following nice property
for the components of the random vector. The proof can be found in, e.g., [JP04,
page 128].
Theorem 1.7.3 Let .X = (X1 , X2 , . . . , Xn ) be a Gaussian random vector. Then
the components .Xi are independent if and only if the covariance matrix Q of .X is
diagonal.
A direct corollary is as follows.
Corollary 1.7.1 Let $\mathbf{X} = (X_1, X_2, \ldots, X_n)$ be a Gaussian random vector. Two components $X_i$ and $X_j$ are independent if and only if they are uncorrelated.
Proof Let $X_i$ and $X_j$ be any two components of $\mathbf{X}$.
$\Longrightarrow$ If $X_i$ and $X_j$ are independent, by Theorem 1.7.3, $\mathrm{Cov}\left(X_i, X_j\right) = 0$.
$\Longleftarrow$ If $X_i, X_j$ are uncorrelated, the random vector $\mathbf{Y} := (X_i, X_j)$ is Gaussian with a diagonal covariance matrix (see Example 1.7.24). Again by Theorem 1.7.3, $X_i$ and $X_j$ are independent. $\square$

Corollary 1.7.2 Two normal random variables $X$ and $Y$ such that $(X, Y)$ is a Gaussian random vector are independent if and only if they are uncorrelated.
Definition 1.7.11 Let $X$ and $Y$ be two random variables with finite variances. The correlation coefficient of $X$ and $Y$ is given by

$$\rho = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}}. \tag{1.42}$$

It can be shown by the Cauchy–Schwarz inequality that $-1 \leq \rho \leq 1$ (see [JP04, p. 91]).
The correlation coefficient is typically used to answer whether large values of $X$ tend to be paired with large or small values of $Y$. Consider observed pairs $(X_i, Y_i)$ with averages $\overline{X}$ and $\overline{Y}$. If $Y$ tends to be large (or small) when $X$ is large (or small), then the signs of $X_i - \overline{X}$ and $Y_i - \overline{Y}$ will tend to be the same. If $Y$ tends to be small (or large) when $X$ is large (or small), then the signs of $X_i - \overline{X}$ and $Y_i - \overline{Y}$ will tend to be different. In both cases, the absolute value of $\rho$ will be large. On the other hand, in the special case when $X$ and $Y$ are uncorrelated (Definition 1.7.8), their correlation coefficient is $\rho = 0$. In particular, if $X$ and $Y$ are independent, then $\rho = 0$ (see Remark 1.7.2).
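As another example, suppose $X$ has finite expectation and variance. For $a, b \in \mathbb{R}$ and $a \neq 0$, if $Y = aX + b$, then by Eqs. 1.33 and 1.36,

$$\begin{aligned} \rho &= \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}} = \frac{E[XY] - E[X]E[Y]}{\sqrt{\mathrm{Var}(Y)\mathrm{Var}(X)}} \\ &= \frac{E[X(aX + b)] - E[X]E[aX + b]}{\sqrt{\mathrm{Var}(aX + b)\mathrm{Var}(X)}} \\ &= \frac{E\left[aX^2 + bX\right] - aE[X]^2 - bE[X]}{\sqrt{a^2 \mathrm{Var}(X)^2}} = \frac{aE\left[X^2\right] + bE[X] - aE[X]^2 - bE[X]}{|a|\mathrm{Var}(X)} \\ &= \frac{a\mathrm{Var}(X)}{|a|\mathrm{Var}(X)} = \frac{a}{|a|} = \begin{cases} 1 & a > 0 \\ -1 & a < 0. \end{cases} \end{aligned}$$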
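The derivation above can be verified on a concrete distribution. The following sketch computes $\rho$ for $Y = aX + b$ directly from Eqs. 1.35, 1.40, and 1.42. The fair die PMF and the particular values of $a$ and $b$ are assumptions chosen for illustration, and `correlation_linear` is a hypothetical helper name.

```python
import math

def expect(g, pmf):
    """E[g(X)] for a discrete random variable with the given PMF."""
    return sum(g(x) * p for x, p in pmf.items())

def correlation_linear(a, b, pmf):
    """rho for Y = aX + b, using Cov(X, Y) = E[XY] - E[X]E[Y] (Eq. 1.40)."""
    y = lambda x: a * x + b
    ex, ey = expect(lambda x: x, pmf), expect(y, pmf)
    cov = expect(lambda x: x * y(x), pmf) - ex * ey
    var_x = expect(lambda x: x * x, pmf) - ex ** 2
    var_y = expect(lambda x: y(x) ** 2, pmf) - ey ** 2
    return cov / math.sqrt(var_x * var_y)

die = {k: 1 / 6 for k in range(1, 7)}   # fair die, an assumed example distribution
print(round(correlation_linear(2.0, 5.0, die), 9))    # 1.0
print(round(correlation_linear(-0.5, 1.0, die), 9))   # -1.0
```

Only the sign of $a$ matters: the offset $b$ and the magnitude of $a$ cancel, as in the last line of the derivation.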

1.8 Statistics

In this section, we will first discuss a few important distributions (Sect. 1.8.1).
Then we will introduce statistical methods for estimating the mean and variance
of a normal distribution (Sect. 1.8.2) which utilize properties of those important
distributions. Those methods will provide more insights into our analysis of device
leakages in Sect. 4.2.3. Finally, we touch on some basics of hypothesis testing
(Sect. 1.8.3) which justifies leakage assessment methods that will be introduced in
Sect. 4.2.3.
We suggest the readers come back to this part later when they reach Chap. 4.

1.8.1 Important Distributions

Let $Z$ denote a random variable that induces a standard normal distribution. We have discussed in Example 1.7.14 that $Z$ has the probability density function

$$f(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right)$$

and the cumulative distribution function

$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} \exp\left(-\frac{y^2}{2}\right) dy.$$

Furthermore, $Z$ has expectation $E[Z] = 0$ (see Example 1.7.17) and variance $\mathrm{Var}(Z) = 1$ (see Example 1.7.22), and we write $Z \sim N(0, 1)$.

Given any $\alpha \in (0, 1)$, we define $z_\alpha$ such that

$$P(Z > z_\alpha) = 1 - \Phi(z_\alpha) = \alpha, \quad \text{i.e., } \Phi(z_\alpha) = 1 - \alpha. \tag{1.43}$$

Those .zα values are useful for many applications, and there are tables listing
the values of .Ф(z) for small values of z (e.g., [Ros20, Table A1]). Given .α, the
approximated value of .zα can be found by examining such a table. In Table 1.4, we
list a few values of .zα with corresponding .α, which will be used later in the book.
By definition, .Ф(z) is the integral of .f (z). As shown in Fig. 1.3, .α corresponds
to the area under .f (z) for .z > zα . Furthermore, since .f (z) is symmetric about 0,
we have

$$\Phi(-z_\alpha) = P(Z < -z_\alpha) = P(Z > z_\alpha) = \alpha,$$

and

$$\begin{aligned} P(-z_{\alpha/2} < Z < z_{\alpha/2}) &= P(Z > -z_{\alpha/2}) - P(Z > z_{\alpha/2}) \\ &= 1 - P(Z \leq -z_{\alpha/2}) - \frac{\alpha}{2} = 1 - \alpha. \end{aligned} \tag{1.44}$$

Table 1.4 Values of $z_\alpha$ (see Eq. 1.43) with corresponding $\alpha$

$\alpha$        0.1     0.05    0.01    0.005   0.001
$1 - \alpha$    0.900   0.950   0.990   0.995   0.999
$z_\alpha$      1.282   1.645   2.326   2.576   3.090

Fig. 1.3 Probability density function $f(z)$ for $Z \sim N(0, 1)$. $P(Z > z_\alpha) = \alpha$, $\alpha$ corresponds to the area under $f(z)$ for $z > z_\alpha$
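Instead of looking up a printed table, the values $z_\alpha$ can be obtained from the inverse of $\Phi$. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library (Python 3.8 or later) to recompute the entries of Table 1.4 and to check Eq. 1.44 for $\alpha = 0.05$; the function name `z_alpha` is introduced here for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def z_alpha(alpha):
    """Return z_alpha with P(Z > z_alpha) = alpha, i.e. Phi(z_alpha) = 1 - alpha (Eq. 1.43)."""
    return Z.inv_cdf(1.0 - alpha)

# Recompute the last row of Table 1.4.
for alpha in (0.1, 0.05, 0.01, 0.005, 0.001):
    print(alpha, round(z_alpha(alpha), 3))

# Eq. 1.44: P(-z_{alpha/2} < Z < z_{alpha/2}) = 1 - alpha.
alpha = 0.05
z = z_alpha(alpha / 2)
print(round(Z.cdf(z) - Z.cdf(-z), 6))  # 0.95
```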

Next, we look at $\chi^2$-distributions.

Definition 1.8.1 Let $Z_1, Z_2, \ldots, Z_n$ be independent standard normal random variables. Define

$$X = \sum_{i=1}^{n} Z_i^2.$$

The distribution induced by $X$ is called the $\chi^2$-distribution with $n$ degrees of freedom. We write $X \sim \chi_n^2$.
Remark 1.8.1 We note that if $X_1 \sim \chi_{n_1}^2$, $X_2 \sim \chi_{n_2}^2$ are independent, then $X_1 + X_2 \sim \chi_{n_1 + n_2}^2$.
For example, when $n = 8$, the probability density function for $X \sim \chi_8^2$ is shown in Fig. 1.4. Similarly to $z_\alpha$ (Eq. 1.43), for any $\alpha \in (0, 1)$, we define $\chi_{\alpha,n}^2$ to be the number such that

$$P(X \geq \chi_{\alpha,n}^2) = \alpha. \tag{1.45}$$

As shown in Fig. 1.4, $\alpha$ corresponds to the area under the PDF of $X$ for $X \geq \chi_{\alpha,n}^2$. For $\alpha < 1/2$, we also have

$$P(\chi_{1-\alpha,n}^2 < X < \chi_{\alpha,n}^2) = P(X \geq \chi_{1-\alpha,n}^2) - P(X \geq \chi_{\alpha,n}^2) = 1 - 2\alpha.$$

Finally, we provide some details on $t$-distributions.

Definition 1.8.2 Let $X \sim \chi_n^2$, $Z \sim N(0, 1)$ be two independent random variables. Define a random variable

$$T_n := \frac{Z}{\sqrt{X/n}}.$$

The distribution induced by $T_n$ is called a $t$-distribution with $n$ degrees of freedom. We write $T_n \sim t_n$.
It can be shown (see [Ros20, page 189]) that the PDF of $T_n$ is symmetric about 0. And when $n$ becomes larger, the PDF for $T_n$ becomes more and more like that for a standard normal random variable. Furthermore,

$$E[T_n] = 0 \text{ for } n > 1, \quad \mathrm{Var}(T_n) = \frac{n}{n-2} \text{ for } n > 2.$$

For example, in Fig. 1.5 we can see the PDF of $T_n$ for $n = 2, 5, 10$ and for $Z \sim N(0, 1)$.

Fig. 1.4 Probability density function for $X \sim \chi_8^2$. $P(X \geq \chi_{\alpha,8}^2) = \alpha$

Fig. 1.5 Probability density functions for $T_n \sim t_n$ ($n = 2, 5, 10$) and for the standard normal random variable $Z$
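Definitions 1.8.1 and 1.8.2 are constructive, so samples from $\chi_n^2$ and $t_n$ can be generated directly from standard normal draws. The following Monte Carlo sketch does this and compares the empirical mean and variance of $T_n$ with $0$ and $n/(n-2)$. The seed, the choice $n = 10$, and the number of draws are arbitrary assumptions; the printed values are only approximations.

```python
import random
import statistics

def chi2_sample(n, rng):
    """One draw from chi^2_n as a sum of n squared standard normals (Definition 1.8.1)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))

def t_sample(n, rng):
    """One draw from t_n as Z / sqrt(X / n) with independent Z and X (Definition 1.8.2)."""
    return rng.gauss(0.0, 1.0) / (chi2_sample(n, rng) / n) ** 0.5

rng = random.Random(2024)   # fixed seed, chosen arbitrarily for reproducibility
n, draws = 10, 50_000
ts = [t_sample(n, rng) for _ in range(draws)]

print(statistics.fmean(ts))       # close to E[T_n] = 0
print(statistics.pvariance(ts))   # close to n / (n - 2) = 1.25
```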
The same as for $z_\alpha$ (Eq. 1.43) and $\chi_{\alpha,n}^2$ (Eq. 1.45), given $\alpha \in (0, 1)$, we define $t_{\alpha,n}$ such that

$$P(T_n \geq t_{\alpha,n}) = \alpha. \tag{1.46}$$

By symmetry of the PDF for $T_n$, we have

$$P(T_n \leq -t_{\alpha,n}) = P(T_n \geq t_{\alpha,n}) = \alpha,$$

and

$$P(T_n \geq -t_{\alpha,n}) = 1 - P(T_n < -t_{\alpha,n}) = 1 - \alpha \implies t_{1-\alpha,n} = -t_{\alpha,n}.$$

An illustration is shown in Fig. 1.6. We have

$$P(-t_{\alpha,n} \leq T_n \leq t_{\alpha,n}) = P(T_n \geq -t_{\alpha,n}) - P(T_n > t_{\alpha,n}) = 1 - 2\alpha,$$

which gives

$$P(-t_{\alpha/2,n} \leq T_n \leq t_{\alpha/2,n}) = 1 - \alpha, \tag{1.47}$$

or

$$P(|T_n| > t_{\alpha/2,n}) = \alpha. \tag{1.48}$$

The table of values of $t_{\alpha,n}$ can be found in standard books for statistics, see, e.g., [Ros20, Table A3].
Remark 1.8.2 For large values of $n$ ($n \geq 30$), $t_{\alpha,n}$ can be approximated by $z_\alpha$ (see Table 1.4).

Fig. 1.6 Probability density function for $T_5$, $P(T_5 \geq t_{\alpha,5}) = \alpha$

1.8.2 Estimating Mean and Difference of Means of Normal Distributions

In Sect. 1.7 we have discussed that a random experiment is an experiment whose


output cannot be predicted with certainty in advance. However, if the experiment is
repeated many times, we can see “regularity” in the average output. For a given
random experiment, the sample space, denoted by .Ω, is the set of all possible
outcomes.
In this subsection, we are interested in a random variable $X : \Omega \to \mathbb{R}$ (see Definition 1.7.6) that induces a normal distribution. In particular, we assume $X \sim N(\mu_x, \sigma_x^2)$ has mean $\mu_x$ and variance $\sigma_x^2$.
We will first discuss how to estimate $\mu_x$. To do so, we repeat the random experiment $n$ times and record the outcomes. Then the possible outcomes $\{X_1, X_2, \ldots, X_n\}$ are $n$ independent identically distributed random variables. We refer to this set as a sample. An actual outcome for $X_i$, denoted $x_i$, is called a realization of $X_i$.

The sample mean (empirical mean), denoted $\overline{X}$, is given by

$$\overline{X} := \frac{1}{n} \sum_{i=1}^{n} X_i. \tag{1.49}$$

Remark 1.8.3 It can be shown that the sum of independent normal random variables induces a normal distribution with mean (respectively, variance) given by the sum of the means (respectively, variances) of each random variable (see, e.g., [JP04, page 120] and also Eqs. 1.33 and 1.41).
Since the $X_i \sim N(\mu_x, \sigma_x^2)$ are independent, by Eqs. 1.33, 1.36, and 1.41, we have

$$E\left[\overline{X}\right] = \frac{1}{n} \sum_{i=1}^{n} E[X_i] = \mu_x, \quad \mathrm{Var}\left(\overline{X}\right) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{\sigma_x^2}{n}.$$

By Remark 1.8.3,

$$\overline{X} \sim N\left(\mu_x, \frac{\sigma_x^2}{n}\right), \tag{1.50}$$

i.e., the sample mean is a normal random variable with mean $\mu_x$ and variance $\sigma_x^2/n$. It follows from Remark 1.7.1 that

$$\frac{\overline{X} - \mu_x}{\sigma_x/\sqrt{n}} \sim N(0, 1). \tag{1.51}$$

Similarly, for $i = 1, 2, \ldots, n$,

$$\frac{X_i - \mu_x}{\sigma_x} \sim N(0, 1). \tag{1.52}$$

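The sample variance (empirical variance), denoted $S_x^2$, is given by

$$S_x^2 := \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \overline{X})^2. \tag{1.53}$$

We note that

$$\begin{aligned} \sum_{i=1}^{n} (X_i - \mu_x)^2 &= \sum_{i=1}^{n} \left((X_i - \overline{X}) + (\overline{X} - \mu_x)\right)^2 \\ &= n(\overline{X} - \mu_x)^2 + \sum_{i=1}^{n} (X_i - \overline{X})^2 + 2\sum_{i=1}^{n} (X_i - \overline{X})(\overline{X} - \mu_x) \\ &= n(\overline{X} - \mu_x)^2 + \sum_{i=1}^{n} (X_i - \overline{X})^2, \end{aligned}$$

where the last term of the second line vanishes because

$$\sum_{i=1}^{n} (X_i - \overline{X}) = -n\overline{X} + \sum_{i=1}^{n} X_i = 0.$$

Dividing by $\sigma_x^2$, we get

$$\sum_{i=1}^{n} \left(\frac{X_i - \mu_x}{\sigma_x}\right)^2 = \left(\frac{\sqrt{n}(\overline{X} - \mu_x)}{\sigma_x}\right)^2 + \frac{\sum_{i=1}^{n} (X_i - \overline{X})^2}{\sigma_x^2}. \tag{1.54}$$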

Since $X_i$ are independent normal random variables, by Eq. 1.52 and Definition 1.8.1, the left-hand side of Eq. 1.54 induces a $\chi^2$-distribution with $n$ degrees of freedom. By Eq. 1.51 and Definition 1.8.1, the first term of the right-hand side of Eq. 1.54 induces a $\chi^2$-distribution with 1 degree of freedom. By Remark 1.8.1, it is tempting to conclude that the two terms on the right-hand side of Eq. 1.54 are independent and the second term induces a $\chi^2$-distribution with $n - 1$ degrees of freedom. Such a result has indeed been proven. In particular, the proof of the following theorem was demonstrated in [Ros20, page 216].8
Theorem 1.8.1 The sample mean $\overline{X}$ and sample variance $S_x^2$ are independent random variables. Furthermore,

$$\frac{(n-1)S_x^2}{\sigma_x^2} \sim \chi_{n-1}^2. \tag{1.55}$$

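The above discussions give us the following useful result.

Lemma 1.8.1

$$\sqrt{n}\,\frac{\overline{X} - \mu_x}{S_x} \sim t_{n-1}.$$

Proof By Eqs. 1.51 and 1.55,

$$\frac{\overline{X} - \mu_x}{\sigma_x/\sqrt{n}} \sim N(0, 1), \quad \frac{(n-1)S_x^2}{\sigma_x^2} \sim \chi_{n-1}^2,$$

and by Theorem 1.8.1 these two random variables are independent. By Definition 1.8.2,

$$\frac{\sqrt{n}(\overline{X} - \mu_x)/\sigma_x}{\sqrt{S_x^2/\sigma_x^2}} = \sqrt{n}\,\frac{\overline{X} - \mu_x}{S_x} \sim t_{n-1}. \qquad \square$$

8 The results are only valid for a normal random variable $X$.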
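On a concrete realization of a sample, the quantities in Eqs. 1.49, 1.53, and 1.54 and the statistic of Lemma 1.8.1 can be computed as follows. This is a minimal sketch: the data values, $\mu_x = 5.0$, and $\sigma_x = 0.3$ are hypothetical numbers chosen only to make the example runnable.

```python
import math
import statistics

# Hypothetical realizations x_1, ..., x_n; mu_x and sigma_x are assumed for this check.
xs = [4.9, 5.3, 5.1, 4.7, 5.6, 5.0, 4.8, 5.2]
mu_x, sigma_x = 5.0, 0.3
n = len(xs)

x_bar = statistics.fmean(xs)     # sample mean, Eq. 1.49
s2 = statistics.variance(xs)     # sample variance with divisor n - 1, Eq. 1.53

# Eq. 1.54: the sum of squared standardized values splits into two terms.
lhs = sum(((x - mu_x) / sigma_x) ** 2 for x in xs)
rhs = (math.sqrt(n) * (x_bar - mu_x) / sigma_x) ** 2 + (n - 1) * s2 / sigma_x ** 2
assert math.isclose(lhs, rhs)

# Statistic of Lemma 1.8.1; it does not require sigma_x.
t_stat = math.sqrt(n) * (x_bar - mu_x) / math.sqrt(s2)
print(x_bar, s2, t_stat)
```

Note that `statistics.variance` uses the divisor $n - 1$, matching Eq. 1.53, whereas `statistics.pvariance` uses the divisor $n$.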



Let $\Theta$ denote the subset of $\mathbb{R}$ that contains all possible values of $\mu_x$. A point estimator of $\mu_x$ is a function with domain $\mathbb{R}^n$ and codomain $\Theta$ that is used to estimate the value of $\mu_x$. We use a point in $\Theta$ for the estimation, hence the name point estimator.
Remark 1.8.4 For example, we can use the sample mean as a point estimator for $\mu_x$. Similarly, we can use the sample variance as a point estimator for $\sigma_x^2$ (see Example 4.2.1).
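Example 1.8.1 (Sample correlation coefficient) Suppose $U$ and $W$ are two random variables. Let $\{(U_1, W_1), (U_2, W_2), \ldots, (U_n, W_n)\}$ be a sample for this pair of random variables $(U, W)$. We further denote the sample mean and sample variance for $\{U_1, U_2, \ldots, U_n\}$ by $\overline{U}$ and $S_u^2$. Similarly, the sample mean and sample variance for $\{W_1, W_2, \ldots, W_n\}$ are denoted by $\overline{W}$ and $S_w^2$. Then, following Definition 1.7.11, we can define the sample correlation coefficient, denoted by $r$, as follows (see Eqs. 1.39 and 1.35):

$$\begin{aligned} r &= \frac{\overline{UW} - \overline{U}\,\overline{W}}{\sqrt{S_u^2 S_w^2}} = \frac{\frac{1}{n}\sum_{i=1}^{n} (U_i - \overline{U})(W_i - \overline{W})}{\sqrt{\left(\frac{1}{n}\sum_{i=1}^{n} (U_i - \overline{U})^2\right)\left(\frac{1}{n}\sum_{i=1}^{n} (W_i - \overline{W})^2\right)}} \\ &= \frac{\sum_{i=1}^{n} (U_i - \overline{U})(W_i - \overline{W})}{\sqrt{\sum_{i=1}^{n} (U_i - \overline{U})^2}\sqrt{\sum_{i=1}^{n} (W_i - \overline{W})^2}}. \end{aligned} \tag{1.56}$$

Here $\overline{UW}$ denotes the sample mean of the products $U_i W_i$. Provided the same normalization constant ($1/n$ or $1/(n-1)$) is used in the numerator and the denominator, it cancels, which gives the last expression. The sample correlation coefficient can be used as a point estimator for the correlation coefficient between $U$ and $W$. We note that since the correlation coefficient analyzes the relations between $U$ and $W$, we collect samples in pairs $(U_i, W_i)$.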
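The last expression in Eq. 1.56 translates directly into code. The sketch below is one possible implementation under stated assumptions: the function name `sample_correlation` and the paired data are hypothetical, and no claim is made that this is how the later chapters compute $r$.

```python
import math

def sample_correlation(us, ws):
    """Sample correlation coefficient r, using the last form of Eq. 1.56."""
    if len(us) != len(ws) or len(us) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    u_bar = sum(us) / len(us)
    w_bar = sum(ws) / len(ws)
    num = sum((u - u_bar) * (w - w_bar) for u, w in zip(us, ws))
    den = (math.sqrt(sum((u - u_bar) ** 2 for u in us))
           * math.sqrt(sum((w - w_bar) ** 2 for w in ws)))
    return num / den

us = [1.0, 2.0, 3.0, 4.0, 5.0]        # hypothetical paired data (u_i, w_i)
ws = [2.1, 3.9, 6.2, 7.8, 10.1]
print(sample_correlation(us, ws))                        # close to 1: nearly linear, increasing
print(sample_correlation(us, [3 - 2 * u for u in us]))   # close to -1: exactly linear, decreasing
```

The data must be supplied as pairs in matching order, since $r$ depends on which $w_i$ was observed together with which $u_i$.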

However, we do not expect $\mu_x$ to be exactly equal to the sample mean. Thus, we would like to specify an interval in which, with a certain degree of confidence, our parameter lies. We refer to such an estimator as an interval estimator.
For the rest of this part, let $\alpha \in (0, 1)$ be a real number. We recall the definitions of $z_\alpha$ and $t_{\alpha,n}$ from Eqs. 1.43 and 1.46, respectively.
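Interval estimator for $\mu_x$ with known variance We first consider $\sigma_x^2$ to be known. By Eqs. 1.44 and 1.51,

$$P\left(-z_{\alpha/2} < \frac{\overline{X} - \mu_x}{\sigma_x/\sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha,$$

which gives

$$P\left(\overline{X} - z_{\alpha/2}\frac{\sigma_x}{\sqrt{n}} < \mu_x < \overline{X} + z_{\alpha/2}\frac{\sigma_x}{\sqrt{n}}\right) = 1 - \alpha.$$

Thus, the probability that $\mu_x$ lies between $\overline{X} - z_{\alpha/2}\frac{\sigma_x}{\sqrt{n}}$ and $\overline{X} + z_{\alpha/2}\frac{\sigma_x}{\sqrt{n}}$ is $1 - \alpha$. We say that

$$\left(\overline{x} - z_{\alpha/2}\frac{\sigma_x}{\sqrt{n}},\ \overline{x} + z_{\alpha/2}\frac{\sigma_x}{\sqrt{n}}\right) \tag{1.57}$$

is a $100(1 - \alpha)$ percent confidence interval for $\mu_x$, where $\overline{x}$ is a realization of $\overline{X}$.
We define the precision of our estimate, denoted by $c$, to be

$$c := z_{\alpha/2}\frac{\sigma_x}{\sqrt{n}},$$

which is half the length of the confidence interval. It measures how close our estimate is to $\mu_x$. Consequently, to have an estimate with precision $c$ and $100(1 - \alpha)$ percent confidence, the number of data points in the sample should be at least (see Example 4.2.2)

$$n = \frac{\sigma_x^2}{c^2} z_{\alpha/2}^2. \tag{1.58}$$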
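The interval in Eq. 1.57 and the sample size in Eq. 1.58 can be computed as sketched below. The function names and the numerical inputs are hypothetical; Eq. 1.58 is rounded up to the next integer, since the number of data points must be a whole number.

```python
import math
from statistics import NormalDist

def z_alpha(alpha):
    """z_alpha with Phi(z_alpha) = 1 - alpha (Eq. 1.43)."""
    return NormalDist().inv_cdf(1.0 - alpha)

def mean_ci_known_sigma(x_bar, sigma, n, alpha):
    """100(1 - alpha) percent confidence interval for the mean, known sigma (Eq. 1.57)."""
    c = z_alpha(alpha / 2) * sigma / math.sqrt(n)   # precision: half the interval length
    return x_bar - c, x_bar + c

def sample_size(sigma, c, alpha):
    """Smallest integer n with z_{alpha/2} * sigma / sqrt(n) <= c (Eq. 1.58, rounded up)."""
    return math.ceil((sigma * z_alpha(alpha / 2) / c) ** 2)

# Hypothetical numbers: realized sample mean 5.0, sigma_x = 2.0, n = 100, alpha = 0.05.
lo, hi = mean_ci_known_sigma(x_bar=5.0, sigma=2.0, n=100, alpha=0.05)
print(round(lo, 3), round(hi, 3))                   # 4.608 5.392
print(sample_size(sigma=2.0, c=0.1, alpha=0.05))    # 1537
```

Halving the precision $c$ multiplies the required $n$ by four, which follows from the $1/c^2$ factor in Eq. 1.58.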

Interval estimator for $\mu_x$ with unknown variance In case the variance is unknown, by Lemma 1.8.1 and Eq. 1.47, we have

$$P\left(-t_{\alpha/2,n-1} \leq \sqrt{n}\,\frac{\overline{X} - \mu_x}{S_x} \leq t_{\alpha/2,n-1}\right) = 1 - \alpha,$$

which gives

$$P\left(\overline{X} - t_{\alpha/2,n-1}\frac{S_x}{\sqrt{n}} \leq \mu_x \leq \overline{X} + t_{\alpha/2,n-1}\frac{S_x}{\sqrt{n}}\right) = 1 - \alpha.$$

Thus a $100(1 - \alpha)$ percent confidence interval for $\mu_x$ is given by (see Example 4.2.2)

$$\left(\overline{x} - t_{\alpha/2,n-1}\frac{s_x}{\sqrt{n}},\ \overline{x} + t_{\alpha/2,n-1}\frac{s_x}{\sqrt{n}}\right), \tag{1.59}$$

where $s_x$ is a realization of $S_x$. Similarly, we can define the precision

$$c := t_{\alpha/2,n-1}\frac{s_x}{\sqrt{n}}.$$

Then to have an estimate with precision $c$ and $100(1 - \alpha)$ percent confidence, the number of data points required in the sample is given by

$$n = \frac{s_x^2}{c^2} t_{\alpha/2,n-1}^2.$$

By Remark 1.8.2, when $n$ is large ($n \geq 30$), $t_{\alpha,n}$ is close to $z_\alpha$, and $n$ can be estimated by (see Example 4.2.2)

$$n \approx \frac{s_x^2}{c^2} z_{\alpha/2}^2. \tag{1.60}$$
For the rest of this part, let $Y$ be a normal random variable with mean $\mu_y$ and variance $\sigma_y^2$ that is independent from $X$. Let $\{Y_1, Y_2, \ldots, Y_m\}$ be a sample for $Y$ with sample mean $\overline{Y}$ and sample variance $S_y^2$. We are interested in estimating $\mu_x - \mu_y$.
We note that since $\overline{X}$ and $\overline{Y}$ are point estimators for $\mu_x$ and $\mu_y$, respectively, $\overline{X} - \overline{Y}$ is a point estimator for $\mu_x - \mu_y$.

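Interval estimator for $\mu_x - \mu_y$ with known variances Suppose we know the values of $\sigma_x^2$ and $\sigma_y^2$. By Eq. 1.50,

$$\overline{X} \sim N\left(\mu_x, \frac{\sigma_x^2}{n}\right), \quad \overline{Y} \sim N\left(\mu_y, \frac{\sigma_y^2}{m}\right).$$

By Remark 1.8.3,

$$E\left[\overline{X} - \overline{Y}\right] = \mu_x - \mu_y, \quad \mathrm{Var}\left(\overline{X} - \overline{Y}\right) = \frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m},$$

and

$$\overline{X} - \overline{Y} \sim N\left(\mu_x - \mu_y, \frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}\right) \implies \frac{\overline{X} - \overline{Y} - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}} \sim N(0, 1). \tag{1.61}$$

By Eq. 1.44, we have

$$P\left(-z_{\alpha/2} < \frac{\overline{X} - \overline{Y} - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}} < z_{\alpha/2}\right) = 1 - \alpha.$$

A $100(1 - \alpha)$ percent confidence interval estimate for $\mu_x - \mu_y$ is then given by (see Example 4.2.3)

$$\left(\overline{x} - \overline{y} - z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}},\ \overline{x} - \overline{y} + z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}\right). \tag{1.62}$$

The precision $c$ is

$$c := z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}.$$

If $m = n$, to have an estimate with precision $c$ and $100(1 - \alpha)$ percent confidence, the number of data points required in each sample is at least (see Example 4.2.3)

$$n = \frac{z_{\alpha/2}^2 (\sigma_x^2 + \sigma_y^2)}{c^2}. \tag{1.63}$$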
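Equations 1.62 and 1.63 can be evaluated in the same way as in the single-sample case. The sketch below is illustrative only: the function names and all numerical inputs (sample means, variances, sample sizes, $\alpha$) are hypothetical.

```python
import math
from statistics import NormalDist

def diff_ci_known(x_bar, y_bar, var_x, var_y, n, m, alpha):
    """100(1 - alpha) percent confidence interval for mu_x - mu_y, known variances (Eq. 1.62)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2)
    c = z * math.sqrt(var_x / n + var_y / m)   # precision
    d = x_bar - y_bar
    return d - c, d + c

def equal_sample_size(var_x, var_y, c, alpha):
    """Smallest n = m reaching precision c (Eq. 1.63, rounded up to an integer)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2)
    return math.ceil(z ** 2 * (var_x + var_y) / c ** 2)

# Hypothetical inputs: sigma_x^2 = 4, sigma_y^2 = 9, n = 50, m = 60, alpha = 0.01.
lo, hi = diff_ci_known(10.2, 9.7, 4.0, 9.0, 50, 60, 0.01)
print(round(lo, 4), round(hi, 4))
print(equal_sample_size(4.0, 9.0, 0.5, 0.01))
```

In this example the interval contains 0, so the data would not rule out $\mu_x = \mu_y$ at this confidence level.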

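Interval estimator for $\mu_x - \mu_y$ with unknown equal variance. Suppose $\sigma_x = \sigma_y$ is unknown. Let $\sigma = \sigma_x = \sigma_y$. By Eq. 1.55,

\[ \frac{(n-1)S_x^2}{\sigma^2} \sim \chi^2_{n-1}, \quad \frac{(m-1)S_y^2}{\sigma^2} \sim \chi^2_{m-1}. \]

Since we assume the samples are independent, those two $\chi^2$ random variables are independent. By Remark 1.8.1, we have

\[ \frac{(n-1)S_x^2}{\sigma^2} + \frac{(m-1)S_y^2}{\sigma^2} \sim \chi^2_{m+n-2}. \tag{1.64} \]

Let

\[ S_p^2 := \frac{(n-1)S_x^2 + (m-1)S_y^2}{n+m-2}. \tag{1.65} \]

By Theorem 1.8.1, $\overline{X}, S_x^2, \overline{Y}, S_y^2$ are independent. By Definition 1.8.2 and Eqs. 1.61 and 1.64,

\[ \frac{\overline{X} - \overline{Y} - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma^2}{n} + \frac{\sigma^2}{m}}\,\sqrt{S_p^2/\sigma^2}} = \frac{\overline{X} - \overline{Y} - (\mu_x - \mu_y)}{S_p\sqrt{1/n + 1/m}} \sim t_{n+m-2}. \tag{1.66} \]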

Then according to Eq. 1.47,

\[ P\left(-t_{\alpha/2,n+m-2} \leq \frac{\overline{X} - \overline{Y} - (\mu_x - \mu_y)}{S_p\sqrt{1/n + 1/m}} \leq t_{\alpha/2,n+m-2}\right) = 1 - \alpha. \]

A $100(1-\alpha)\%$ confidence interval estimate for $\mu_x - \mu_y$ is then given by

\[ \left(\overline{x} - \overline{y} - t_{\alpha/2,n+m-2}\, s_p\sqrt{1/n + 1/m},\ \ \overline{x} - \overline{y} + t_{\alpha/2,n+m-2}\, s_p\sqrt{1/n + 1/m}\right). \]

If we assume $m = n$, to have an estimate with precision c and $100(1-\alpha)\%$ confidence, the number of data required in the sample is at least

\[ n = \frac{2\, t_{\alpha/2,2n-2}^2\, s_p^2}{c^2}. \]

For large n ($n \geq 30$), we can approximate n by (see Example 4.2.3)

\[ n \approx \frac{2\, z_{\alpha/2}^2\, s_p^2}{c^2}. \tag{1.67} \]

1.8.3 Hypothesis Testing

By statistical hypothesis, we refer to a statement about the unknown parameters of a distribution (see Example 4.2.5). We call such a statement a hypothesis because it is not known whether or not it is true. In this subsection, we will use samples from the distribution to draw certain conclusions regarding a given hypothesis about its unknown parameters. In particular, we will introduce a procedure for determining whether or not the values of a sample are consistent with the hypothesis. The decision will then be either to accept the hypothesis or to reject it. By accepting a hypothesis, we conclude that the resulting data from the sample appear to be consistent with it.
As in Sect. 1.8.2, we consider a normal random variable X with mean $\mu_x$ and variance $\sigma_x^2$. Furthermore, let $\{X_1, X_2, \dots, X_n\}$ denote a sample from the distribution induced by X with sample mean $\overline{X}$ and sample variance $S_x^2$. We would like to test hypotheses about $\mu_x$ using data from this sample.
The hypothesis that we want to test is called the null hypothesis, denoted by $H_0$. For example,

\[ H_0 : \mu_x = 1, \quad H_0 : \mu_x \geq 0. \]

We will test the null hypothesis against an alternative hypothesis, denoted by $H_1$. For example,

\[ H_1 : \mu_x \neq 1, \quad H_1 : \mu_x > 1. \]

To test the hypothesis, we define a region C such that

\[ \text{if a sample } \{x_1, x_2, \dots, x_n\} \notin C, \text{ we accept the null hypothesis } H_0, \]

and

\[ \text{if a sample } \{x_1, x_2, \dots, x_n\} \in C, \text{ we reject the null hypothesis } H_0. \]

C is called the critical region. We also define the level of significance of the test, denoted by $\alpha$, such that when $H_0$ is true, the probability of rejecting it is not bigger than $\alpha$, namely

\[ P(\{x_1, x_2, \dots, x_n\} \in C \mid H_0 \text{ is true}) \leq \alpha. \]

Thus, the main procedure in our hypothesis testing is to find the critical region C
given a level of significance .α.
Two-sided hypothesis testing concerning $\mu_x$. Let $\mu_0 \in \mathbb{R}$ be a constant. We set the null hypothesis and the alternative hypothesis as follows:

\[ H_0 : \mu_x = \mu_0, \quad H_1 : \mu_x \neq \mu_0. \tag{1.68} \]

Recall that the sample mean, $\overline{X}$, is a point estimator for $\mu_x$ (see Remark 1.8.4). Then it is reasonable to accept $H_0$ if $\overline{X}$ is not too far from $\mu_0$. Given $\alpha$, we choose the critical region to be

\[ C = \left\{(X_1, X_2, \dots, X_n) \mid |\overline{X} - \mu_0| > c\right\}, \tag{1.69} \]

where c is a number such that if $\mu_x = \mu_0$,

\[ P(|\overline{X} - \mu_0| > c) = \alpha. \tag{1.70} \]

Then our main task is to find c that satisfies the above equation.
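Suppose the variance $\sigma_x^2$ is known, and write $\sigma := \sigma_x$. If $\mu_x = \mu_0$, then by Eq. 1.50,

\[ \overline{X} \sim N\left(\mu_0, \frac{\sigma^2}{n}\right). \]

Define

\[ Z := \frac{\overline{X} - \mu_0}{\sigma/\sqrt{n}}, \tag{1.71} \]

by Remark 1.7.1, $Z \sim N(0, 1)$. According to Eq. 1.70, we can choose c such that

\[ P\left(|Z| > \frac{c\sqrt{n}}{\sigma}\right) = \alpha \implies 2P\left(Z > \frac{c\sqrt{n}}{\sigma}\right) = \alpha \implies P\left(Z > \frac{c\sqrt{n}}{\sigma}\right) = \frac{\alpha}{2}. \]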

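By the definition of $z_\alpha$ (Eq. 1.43),

\[ \frac{c\sqrt{n}}{\sigma} = z_{\alpha/2} \implies c = \frac{z_{\alpha/2}\,\sigma}{\sqrt{n}}. \tag{1.72} \]

And the critical region for significance level $\alpha$ is given by

\[ C = \left\{(X_1, X_2, \dots, X_n) \;\middle|\; |\overline{X} - \mu_0| > \frac{z_{\alpha/2}\,\sigma}{\sqrt{n}}\right\}. \tag{1.73} \]

Consequently, we reject the null hypothesis (Eq. 1.68) if

\[ |\overline{x} - \mu_0| > \frac{z_{\alpha/2}\,\sigma}{\sqrt{n}}, \quad \text{i.e.,} \quad \frac{\sqrt{n}}{\sigma}\,|\overline{x} - \mu_0| > z_{\alpha/2}, \]

and accept $H_0$ otherwise (see Example 4.2.6).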
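The decision rule derived from Eq. 1.73 can be sketched in a few lines of Python. This is a minimal illustration under the assumptions of this part (normal data, known standard deviation); the function name `two_sided_z_test` and the sample values are hypothetical.

```python
import math
from statistics import NormalDist, fmean


def two_sided_z_test(sample, mu_0: float, sigma: float, alpha: float) -> bool:
    """Return True if H0: mu_x = mu_0 is rejected at significance level alpha.

    Rejects when sqrt(n) * |x_bar - mu_0| / sigma > z_{alpha/2}.
    """
    n = len(sample)
    x_bar = fmean(sample)
    z_half = NormalDist().inv_cdf(1.0 - alpha / 2)  # z_{alpha/2}
    statistic = math.sqrt(n) * abs(x_bar - mu_0) / sigma
    return statistic > z_half


if __name__ == "__main__":
    # Hypothetical data with known sigma = 1.
    print(two_sided_z_test([0.1, -0.2, 0.05, 0.0, -0.1, 0.15], 0.0, 1.0, 0.05))
    print(two_sided_z_test([5.0, 5.0, 5.0, 5.0], 0.0, 1.0, 0.05))
```

In the first call the sample mean is essentially equal to $\mu_0$, so $H_0$ is accepted; in the second the statistic is $\sqrt{4} \times 5 = 10$, far above $z_{0.025} \approx 1.96$, so $H_0$ is rejected.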


Remark 1.8.5 We can see that when .μ0 = 0, the critical region corresponding to
the level of significance .α in Eq. 1.73 is the complement of the .100(1 − α) percent
confidence interval for .μx (Eq. 1.57).
Suppose we do not know the variance $\sigma_x^2$. Recall that sample variance $S_x^2$ (Eq. 1.53) is a point estimator for $\sigma_x^2$. Similar to Eq. 1.71, we are interested in the following random variable:

\[ T := \frac{\sqrt{n}(\overline{X} - \mu_0)}{S_x}. \tag{1.74} \]

We want to find c such that when $\mu_x = \mu_0$,

\[ P(|T| > c) = \alpha. \]

Note that when $\mu_x = \mu_0$, T induces a t-distribution with $n - 1$ degrees of freedom. By Eq. 1.48, we choose

\[ c = t_{\alpha/2,n-1}. \]

Hence to achieve the level of significance $\alpha$, we reject $H_0$ (Eq. 1.68) if

\[ \left|\frac{\sqrt{n}(\overline{x} - \mu_0)}{s_x}\right| > t_{\alpha/2,n-1} \]

and accept $H_0$ otherwise.


One-sided hypothesis testing concerning $\mu_x$. Now we consider the same null hypothesis with a different alternative hypothesis as follows:

\[ H_0 : \mu_x = \mu_0, \quad H_1 : \mu_x > \mu_0. \tag{1.75} \]

We refer to such a test as a one-sided test.
In this case, we will reject $H_0$ when $\overline{X}$ is much bigger than $\mu_0$, since when $\overline{X}$ is smaller, it is more likely for $H_0$ to be true than for $H_1$ to be true. In other words, the critical region is of the following form:

\[ C = \left\{(X_1, X_2, \dots, X_n) \mid \overline{X} - \mu_0 > c\right\}. \tag{1.76} \]

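To find the value of c, we assume $H_0$ is true. Then by definition, c should be chosen such that

\[ P(\overline{X} - \mu_0 > c) = \alpha. \]

In case the variance $\sigma_x^2$ is known, by the definition of Z (Eq. 1.71),

\[ P\left(Z > \frac{c\sqrt{n}}{\sigma}\right) = \alpha. \]

By the definition of $z_\alpha$ (Eq. 1.43),

\[ \frac{c\sqrt{n}}{\sigma} = z_\alpha \implies c = \frac{z_\alpha\,\sigma}{\sqrt{n}}. \tag{1.77} \]

The critical region for significance level $\alpha$ is then given by

\[ C = \left\{(X_1, X_2, \dots, X_n) \;\middle|\; \overline{X} - \mu_0 > \frac{z_\alpha\,\sigma}{\sqrt{n}}\right\}. \]

Thus, we reject the null hypothesis (Eq. 1.75) if

\[ \overline{x} - \mu_0 > \frac{z_\alpha\,\sigma}{\sqrt{n}}, \quad \text{i.e.,} \quad \frac{\sqrt{n}}{\sigma}(\overline{x} - \mu_0) > z_\alpha, \]

and accept $H_0$ otherwise (see Example 4.2.7).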


Suppose we know a good estimate of c for the critical region in Eq. 1.76. Let $\mu_0 = 0$. We have

\[ C = \left\{(X_1, X_2, \dots, X_n) \mid \overline{X} > c\right\}. \tag{1.78} \]

Then by Eq. 1.77, to test whether $\mu_x$ is bigger than 0 with significance level $\alpha$, the number of data required is at least (see Example 4.2.7)

\[ n = \frac{\sigma^2}{c^2}\, z_\alpha^2. \tag{1.79} \]

If we do not know the variance $\sigma_x^2$, then by the definition of T (Eq. 1.74), we have

\[ P\left(T > \frac{c\sqrt{n}}{S_x}\right) = \alpha. \]

Then according to Eq. 1.46,

\[ \frac{c\sqrt{n}}{S_x} = t_{\alpha,n-1} \implies c = \frac{t_{\alpha,n-1}\, S_x}{\sqrt{n}}. \]

Thus the significance level $\alpha$ test is to reject $H_0$ (Eq. 1.75) if

\[ \frac{\sqrt{n}(\overline{x} - \mu_0)}{s_x} > t_{\alpha,n-1} \]

and accept $H_0$ otherwise. When n is large ($n \geq 30$), we reject $H_0$ if (see Example 4.2.7)

\[ \frac{\sqrt{n}(\overline{x} - \mu_0)}{s_x} > z_\alpha. \tag{1.80} \]

Suppose we want to test if the mean $\mu_x$ is bigger than 0 with significance level $\alpha$, and we have a good estimate for c. Set $\mu_0 = 0$. The number of data required is at least

\[ n = \frac{s_x^2}{c^2}\, t_{\alpha,n-1}^2. \]

For large n ($n \geq 30$), we have (see Example 4.2.7)

\[ n = \frac{s_x^2}{c^2}\, z_\alpha^2. \tag{1.81} \]

Two-sided hypothesis testing about $\mu_x$ and $\mu_y$. For the rest of this part, let Y denote a normal random variable independent of X with mean $\mu_y$ and variance $\sigma_y^2$. Furthermore, let $\{Y_1, Y_2, \dots, Y_m\}$ denote a sample from the distribution induced by Y with sample mean $\overline{Y}$ and sample variance $S_y^2$.
We would like to test the following hypotheses:

\[ H_0 : \mu_x = \mu_y, \quad H_1 : \mu_x \neq \mu_y. \tag{1.82} \]

Since $\overline{X}$ and $\overline{Y}$ are point estimators for $\mu_x$ and $\mu_y$, respectively, $\overline{X} - \overline{Y}$ is a point estimator for $\mu_x - \mu_y$. Then it is reasonable to reject $H_0$ when $|\overline{X} - \overline{Y}|$ is far from zero. Given $\alpha$, our critical region is of the form

\[ C = \left\{(X_1, X_2, \dots, X_n, Y_1, Y_2, \dots, Y_m) \mid |\overline{X} - \overline{Y}| > c\right\}, \tag{1.83} \]

where c is chosen such that

\[ P(|\overline{X} - \overline{Y}| > c \mid H_0 \text{ is true}) = P(|\overline{X} - \overline{Y}| > c \mid \mu_x = \mu_y) = \alpha. \tag{1.84} \]

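Our task is to decide how to choose the value of c.
In case the variances $\sigma_x^2$ and $\sigma_y^2$ are known, by Eq. 1.61, when $\mu_x = \mu_y$ (i.e., when $H_0$ is true), we have

\[ \frac{\overline{X} - \overline{Y}}{\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}} \sim N(0, 1). \tag{1.85} \]

By Eq. 1.44,

\[ P\left(-z_{\alpha/2} < \frac{\overline{X} - \overline{Y}}{\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}} < z_{\alpha/2}\right) = 1 - \alpha \implies P\left(\frac{|\overline{X} - \overline{Y}|}{\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}} > z_{\alpha/2}\right) = \alpha. \]

Thus, we let

\[ c = z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}. \tag{1.86} \]

To achieve significance level $\alpha$, we reject $H_0$ if

\[ |\overline{x} - \overline{y}| > z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}} \]

and accept $H_0$ otherwise (see Example 4.2.8).
Furthermore, suppose $m = n$, and we have a good estimate for c. To test if $\mu_x \neq \mu_y$ with significance level $\alpha$, the number of data required is at least (see Example 4.2.8)

\[ n = \frac{z_{\alpha/2}^2 (\sigma_x^2 + \sigma_y^2)}{c^2}. \tag{1.87} \]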
In case the variances are unknown but we know that $\sigma_x = \sigma_y$, let $\sigma = \sigma_x = \sigma_y$. By Eq. 1.66, when $\mu_x = \mu_y$,

\[ \frac{\overline{X} - \overline{Y}}{\sqrt{S_p^2(1/n + 1/m)}} \sim t_{n+m-2}. \]

According to Eq. 1.48,

\[ P\left(\left|\frac{\overline{X} - \overline{Y}}{\sqrt{S_p^2(1/n + 1/m)}}\right| > t_{\alpha/2,n+m-2}\right) = \alpha. \]

Thus, we let

\[ c = t_{\alpha/2,n+m-2}\sqrt{S_p^2(1/n + 1/m)}. \]

For a test with significance level $\alpha$, we reject $H_0$ if

\[ |\overline{x} - \overline{y}| > t_{\alpha/2,n+m-2}\sqrt{s_p^2(1/n + 1/m)} \]

and accept $H_0$ otherwise. Such a test is called Student's t-test.
For large n and m, we reject $H_0$ if (see Example 4.2.11)

\[ |\overline{x} - \overline{y}| > z_{\alpha/2}\sqrt{s_p^2(1/n + 1/m)}, \quad \text{or equivalently,} \quad \frac{|\overline{x} - \overline{y}|}{\sqrt{s_p^2(1/n + 1/m)}} > z_{\alpha/2}. \tag{1.88} \]
Furthermore, when $n = m$, we have (see Eq. 1.65)

\[ S_p^2(1/n + 1/m) = \frac{(n-1)S_x^2 + (n-1)S_y^2}{2n-2} \times \frac{2}{n} = \frac{S_x^2 + S_y^2}{n}, \]

and we reject $H_0$ if (see Examples 4.2.8 and 4.2.9)

\[ \frac{|\overline{x} - \overline{y}|}{\sqrt{\frac{s_x^2 + s_y^2}{n}}} > z_{\alpha/2}. \tag{1.89} \]

In this case, suppose we have a good estimate for c. To have a Student's t-test with significance level $\alpha$, the number of data we need for both samples is given by (see Examples 4.2.8 and 4.2.9)

\[ n = z_{\alpha/2}^2\, \frac{S_x^2 + S_y^2}{c^2}. \tag{1.90} \]

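If we instead assume that the unknown variances $\sigma_x^2$ and $\sigma_y^2$ are not equal, it can be shown that [Wel47]

\[ \frac{\overline{X} - \overline{Y}}{\sqrt{\frac{S_x^2}{n} + \frac{S_y^2}{m}}} \sim t_v, \]

where

\[ v \approx \frac{(S_x^2/n + S_y^2/m)^2}{(S_x^2/n)^2/(n-1) + (S_y^2/m)^2/(m-1)}. \]

And a test with significance level $\alpha$ rejects $H_0$ if

\[ \frac{|\overline{x} - \overline{y}|}{\sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}}} > t_{\alpha/2,v}. \]

Such a test is called Welch's t-test.
When n and m are big ($\geq 30$), we test if (see Example 4.2.13)

\[ \frac{|\overline{x} - \overline{y}|}{\sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}}} > z_{\alpha/2}. \tag{1.91} \]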

Remark 1.8.6 Note that when $n = m$ is big ($\geq 30$), Welch's t-test and Student's t-test have the same formula (see Eqs. 1.89 and 1.91).
Both Student's t-test and Welch's t-test will be useful for leakage assessment in Sect. 4.2.3.
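To make the Welch statistic and its approximate degrees of freedom concrete, here is a small Python sketch using only the standard library. The names `welch_statistic` and `reject_equal_means` are hypothetical, and the decision function uses the large-sample rule of Eq. 1.91 (comparison with $z_{\alpha/2}$), so it is only appropriate when both samples have at least about 30 entries.

```python
import math
from statistics import NormalDist, fmean, variance


def welch_statistic(xs, ys):
    """Return (t, v): the Welch statistic and the approximate degrees of freedom."""
    n, m = len(xs), len(ys)
    vx = variance(xs) / n  # s_x^2 / n, with the (n - 1)-denominator sample variance
    vy = variance(ys) / m  # s_y^2 / m
    t = (fmean(xs) - fmean(ys)) / math.sqrt(vx + vy)
    v = (vx + vy) ** 2 / (vx ** 2 / (n - 1) + vy ** 2 / (m - 1))
    return t, v


def reject_equal_means(xs, ys, alpha: float = 0.05) -> bool:
    """Large-sample two-sided test of H0: mu_x = mu_y (Eq. 1.91)."""
    t, _ = welch_statistic(xs, ys)
    return abs(t) > NormalDist().inv_cdf(1.0 - alpha / 2)


if __name__ == "__main__":
    # Hypothetical samples: the second is the first shifted by 10.
    xs = [float(i % 5) for i in range(40)]
    ys = [x + 10.0 for x in xs]
    print(welch_statistic(xs, ys), reject_equal_means(xs, ys))
```

When both samples have the same size and the same sample variance, the formula for v reduces to $2(n-1)$, which matches the degrees of freedom $n + m - 2$ of Student's t-test.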
One-sided hypothesis testing about $\mu_x$ and $\mu_y$. For one-sided testing, we consider the following null and alternative hypotheses:

\[ H_0 : \mu_x = \mu_y, \quad H_1 : \mu_x > \mu_y. \]

Similar to Eq. 1.76, our critical region is given by

\[ C = \left\{(X_1, X_2, \dots, X_n, Y_1, Y_2, \dots, Y_m) \mid \overline{X} - \overline{Y} > c\right\}, \tag{1.92} \]

where c is chosen such that

\[ P\left(\overline{X} - \overline{Y} > c \mid \mu_x = \mu_y\right) = \alpha. \]

We will only discuss the case when $\sigma_x^2$ and $\sigma_y^2$ are known. For unknown variances, we refer the readers to [Wel47]. By Eqs. 1.85 and 1.43,

\[ P\left(\frac{\overline{X} - \overline{Y}}{\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}} > z_\alpha\right) = \alpha. \]

Thus, we choose

\[ c = z_\alpha\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}}. \tag{1.93} \]

To achieve the level of significance $\alpha$, we reject $H_0$ if

\[ \overline{x} - \overline{y} > z_\alpha\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}} \]

and accept $H_0$ otherwise (see Example 4.2.10).
Furthermore, suppose $m = n$, we have a good estimate for c, and we know that $\mu_x \geq \mu_y$. To test if $\mu_x \neq \mu_y$, the number of data required is at least (see Example 4.2.10)

\[ n = \frac{z_\alpha^2 (\sigma_x^2 + \sigma_y^2)}{c^2}. \tag{1.94} \]

1.9 Further Reading

For more detailed discussions on sets, functions, number theory, and abstract
algebra, we refer the readers to [Her96, Chapters 1–6] and a series of lecture notes
from Frédérique Oggier [Ogg].
[LX04] provides more in-depth studies for finite fields and coding theory.
For probability theory, we refer the readers to [Dur19] and [JP04] for a thorough
analysis and [Ros20] for practical examples. [Ros20] also provides more insights
on statistical methods presented in Sect. 1.8.
Chapter 2
Introduction to Cryptography

Before we dive into the modern cryptographic algorithms that are in use today
(Chap. 3), we give an introduction to cryptography in general (Sect. 2.1) and discuss
some classical ciphers that were designed a few centuries back (Sect. 2.2). In the
end, we will discuss how cryptographic algorithms are actually used with different
encryption modes (Sect. 2.3).
We start with a definition of cryptography.
Definition 2.0.1 Cryptography studies techniques that allow secure communica-
tion in the presence of adversarial behavior. These techniques are related to
information security attributes such as confidentiality, integrity, authentication, and
non-repudiation.
Below, we give more details on the information security attributes that can be
achieved by using cryptography:
1. Confidentiality aims at preventing unauthorized disclosure of information. There
are various technical, administrative, physical, and legal means to enforce
confidentiality. In the context of cryptography, we are mostly interested in
utilizing various encryption techniques to keep information private.
2. Integrity aims at preventing unauthorized alteration of data to keep them correct,
authentic, and reliable. Similarly to confidentiality, while there are many means
of ensuring data integrity, in cryptography we are looking at hash functions and
message authentication codes.
3. Authentication aims at determining whether something or someone is who they
claim they are. In communication, the entities should be able to identify each
other. Similarly, the properties of the exchanged information, such as origin,
content, and timestamp, should be authenticated. In cryptography, we are mostly
interested in two aspects: entity authentication and data origin authentication. For
these purposes, signature and identification primitives are used.
4. Non-repudiation aims at assuring that the sender of the information is provided
with proof of delivery, and the recipient is provided with proof of the sender’s
identity so that neither party can later deny the actions taken. Similarly to
authentication, signature and identification primitives are cryptographic means
of supporting non-repudiation.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
X. Hou, J. Breier, Cryptography and Embedded Systems Security,
https://doi.org/10.1007/978-3-031-62205-2_2

Note
The CIA triad is a widely utilized information security model, where the abbreviation
stands for confidentiality, integrity, and availability. Therefore, a curious
reader might be interested in knowing why we did not mention availability.
The answer is rather simple: there are no techniques within cryptography
that could contribute in one way or another to ensuring availability. The availability
attribute ensures that information is consistently and readily accessible to
authorized entities. One needs to look into other means of supporting this
attribute.

2.1 Cryptographic Primitives

Cryptographic primitives are the tools that can be used to achieve the goals listed
in Definition 2.0.1. The categorization of cryptographic primitives is depicted in
Fig. 2.1. We have highlighted the ones that will be discussed in more detail in this
book, especially regarding hardware attacks.
Fig. 2.1 Categorization of cryptographic primitives. The ones highlighted in blue color will be discussed in this book

Let us briefly explain each primitive.
• Hash functions: Hash functions map data of arbitrary length to a binary array of
some fixed length. We provide more details on hash functions in Sect. 2.1.1.
• Public key ciphers: Public key (or asymmetric) ciphers use a pair of related keys.
This pair consists of a private key and a public key. These keys are generated by
cryptographic algorithms that are based on mathematical problems called one-way
functions. A one-way function is a function that is easy to compute on every
input, but it is hard to compute its inverse.¹
• Signatures: Digital signatures provide means for an entity to bind its identity to
a message. This normally means that the sender uses their private key to sign
the (hashed) message. Whoever has access to the public key can then verify the
origin of the message.
• (Symmetric) block ciphers: Block ciphers are cryptographic algorithms operating
on blocks of data of a fixed size (generally multiples of bytes for modern cipher
designs). They use the same secret key for the encryption and decryption of
data. Block ciphers are detailed in Sect. 2.1.2. Three modern block ciphers are
discussed in Sect. 3.1.
• Stream ciphers: Stream ciphers are symmetric key ciphers that combine plaintext
digits (usually bits) with the keystream, which is a stream of pseudorandom digits
generated by the cipher. The combination is normally done by a bitwise XOR
operation. The idea of stream ciphers comes from the one-time pad (Sect. 2.2.7).
• Message authentication codes (MACs): A message authentication code is a
piece of information that is used to authenticate the origin of the message and
to protect its integrity. MAC algorithms are commonly constructed from other
cryptographic primitives, such as hash functions and block ciphers.

2.1.1 Hash Functions

A hash function is a computationally efficient function mapping data of arbitrary
length to a binary array of some fixed length, called a hash value or message digest.
The following are the properties that should be met in a properly designed
cryptographic hash function:
(a) It is quick to compute a hash value for any given input.
(b) It is computationally infeasible to generate an input that yields a given hash
value (a preimage).
(c) It is computationally infeasible to find a second input that maps to the same
hash value when one input is already known (a second preimage).
(d) It is computationally infeasible to find any pair of different messages that
produce the same hash value (a collision).
Cryptographic hash functions are mostly used for integrity and digital signatures.
The message integrity use case works as follows. The user creates a
message digest of the original message at some point in time. At a later time (e.g.,
after a transmission), the digest is calculated again to check whether there have
been any changes to the original message. In digital signatures, it is common to first
create a message digest that is afterward digitally signed, rather than signing the
entire message, which can be slow if the message is large (see Sect. 3.4).

¹ It is worth noting that the existence of one-way functions is an open conjecture; their existence would imply P ≠ NP.
The current NIST standard for hash functions was released in 2015 and is called
Secure Hash Algorithm 3 (SHA-3) [Dwo15]. It is based on Keccak permutation
[BDPA13] which uses a previously developed sponge construction [BDPVA07].
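The integrity use case described above can be sketched with the SHA-3 implementation in Python's standard `hashlib` module. The helper names `digest` and `unchanged` are hypothetical, and the sketch assumes the stored digest itself cannot be tampered with; otherwise an attacker could simply replace both the message and the digest.

```python
import hashlib
import hmac


def digest(message: bytes) -> str:
    """Return the SHA3-256 message digest as a hexadecimal string."""
    return hashlib.sha3_256(message).hexdigest()


def unchanged(message: bytes, stored_digest: str) -> bool:
    """Recompute the digest and compare it with the one stored earlier."""
    return hmac.compare_digest(digest(message), stored_digest)


if __name__ == "__main__":
    original = b"transfer 100 EUR to Bob"
    stored = digest(original)  # computed before transmission
    print(unchanged(original, stored))                    # same message
    print(unchanged(b"transfer 900 EUR to Bob", stored))  # altered message
```

A SHA3-256 digest always has 256 bits (64 hexadecimal characters), regardless of the length of the input.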

2.1.2 Cryptosystems

We have mentioned three types of ciphers: public key ciphers, block ciphers, and
stream ciphers. In this subsection, we will provide more discussions on ciphers,
which are also called cryptosystems.
When we use ciphers, we normally assume insecure communication. A popular
example setting is that Alice would like to send messages to Bob, but Eve is also
listening to the communication. The goal of Alice is to make sure that even if Eve
can intercept what was sent, she will not be able to find the original message. To do
so, Alice will first encrypt the message, or the plaintext, and send the ciphertext to
Bob, instead of the original message. Bob will then decrypt the ciphertext to get the
plaintext. For this communication to work, there must be a key for encryption and
decryption. It is clear that the decryption key should be secret from Eve, and a basic
requirement is that the algorithm for encryption/decryption should be designed in
a way that Eve cannot easily brute force the plaintext with the knowledge of the
ciphertext.
Definition 2.1.1 A cryptosystem is a tuple .(P, C, K, E, D) with the following
properties:
• .P is a finite set of plaintexts, called plaintext space.
• .C is a finite set of ciphertexts, called ciphertext space.
• .K is a finite set of keys, called key space.

• .E = { Ek : k ∈ K }, where .Ek : P → C is an encryption function.

• .D = { Dk : k ∈ K }, where .Dk : C → P is a decryption function.

• For each .e ∈ K, there exists .d ∈ K such that .Dd (Ee (p)) = p for all .p ∈ P.
If .e = d, the cryptosystem is called a symmetric (key) cryptosystem. Otherwise, it is
called a public key/asymmetric cryptosystem.
Take any $c_1 = E_e(p_1)$, $c_2 = E_e(p_2)$ from the ciphertext space $\mathcal{C}$, where $e \in \mathcal{K}$. Let $d \in \mathcal{K}$ be the corresponding decryption key for e. If $c_1 = c_2$, then by definition,

\[ p_1 = D_d(c_1) = D_d(c_2) = p_2. \]

Thus, $E_e$ is an injective function (see Definition 1.1.2). We also note that if $\mathcal{P} = \mathcal{C}$, $E_k$ is a permutation of $\mathcal{P}$ (see Definition 1.2.3).

There are mainly two types of symmetric ciphers: block ciphers and stream
ciphers.
Definition 2.1.2 (Block Cipher) A block cipher is a symmetric key cryptosystem
with $\mathcal{P} = \mathcal{C} = \mathcal{A}^n$ for some alphabet $\mathcal{A}$ and positive integer n. The integer n is called the block
length.
For classical ciphers that we will see in Sects. 2.2.1–2.2.5, $\mathcal{A} = \mathbb{Z}_{26}$. For modern
cryptosystems that we will discuss in Sect. 3.1, $\mathcal{A} = \mathbb{F}_2 = \{0, 1\}$.
Now, if we have a long plaintext $p = p_1 p_2 \dots$, where each $p_i \in \mathcal{A}^n$ is one block
of plaintext, and a key k, using a block cipher, we can obtain the ciphertext string c as
follows:²

\[ c = c_1 c_2 \cdots = E_k(p_1) E_k(p_2) \dots \]

For a stream cipher, in contrast, $\mathcal{P} = \mathcal{C} = \mathcal{A}$, i.e., plaintexts and ciphertexts are single digits. Encryptions are computed
on each digit of the plaintext. In particular, suppose we have a plaintext string $p = p_1 p_2 \dots$ (where $p_i \in \mathcal{A}$) and a key k. We first compute a key stream $z = z_1 z_2 \dots$
using the key k; then the ciphertext is obtained as follows:

\[ c = c_1 c_2 \cdots = E_{z_1}(p_1) E_{z_2}(p_2) \dots \]

A stream cipher is said to be synchronous if the key stream only depends on the
chosen key k but not on the encrypted plaintext. In this case, the sender and the
receiver can both compute the keystream synchronously. In Sect. 2.2.7 we will see
a classical synchronous stream cipher called one-time pad.
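The following Python sketch illustrates the structure of a synchronous stream cipher: a keystream is derived from the key alone and combined with the plaintext digit by digit using XOR. The keystream generator here (SHA-256 applied to the key and a counter) is an assumption made purely for illustration and is not a vetted cipher design; the names `keystream` and `xor_stream` are hypothetical, and the digits are bytes rather than bits for convenience.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream z_1 z_2 ...: depends only on the key, not on the plaintext."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Combine each digit of data with the matching keystream digit using XOR."""
    return bytes(d ^ z for d, z in zip(data, keystream(key, len(data))))


if __name__ == "__main__":
    key = b"hypothetical key"
    ciphertext = xor_stream(key, b"ATTACK AT DAWN")
    # XOR is its own inverse, so the same function decrypts.
    print(xor_stream(key, ciphertext))
```

Because the keystream does not depend on the plaintext, the sender and the receiver can both compute it in advance from the shared key.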

2.1.2.1 Converting Message to Plaintext

An important aspect to clarify is how the message that Alice intends to send is
represented as plaintext.
For classical ciphers that we will discuss in Sect. 2.2, we will only consider
messages consisting of English letters (A–Z), and we map each letter to an element in
.Z26 . Table 2.1 lists the details of the mapping from letters to .Z26 . Thus the plaintext

spaces are vector spaces over .Z26 .

Table 2.1 Converting English letters to elements in .Z26


A B C D E F G H I J K L M N O P Q R S T
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

U V W X Y Z
20 21 22 23 24 25

² Such an encryption mode is called an ECB mode, and more encryption modes will be introduced in Sect. 2.3.

Table 2.2 Examples of methods for converting message symbols to bytes. In each table, the second column is the binary representation of the byte value, and the third column is the corresponding hexadecimal representation

(a) ASCII
Symbol  Binary    Hex
A       01000001  41
B       01000010  42
a       01100001  61
b       01100010  62
?       00111111  3F

(b) ISO/IEC 8859-1 (Latin-1)
Symbol  Binary    Hex
Á       11000001  C1
Ä       11000100  C4
Í       11001101  CD
×       11010111  D7
÷       11110111  F7

In modern computers, we store data in binary digits, which can be viewed as
variables ranging over $\mathbb{F}_2$, or bits (see Definition 1.2.17). An 8-bit binary string is
called a byte (see Definition 1.3.7). Computers often operate on a few bytes at a
time. For example, a 64-bit processor operates on 8 bytes at a time. In computer
architecture, a word is defined as the unit of data of (at most) a certain bit length
that can be addressed and moved between storage and the processor. Therefore, for
a 64-bit processor, the word size is 64 bits.
We have discussed that a byte can be represented as a decimal number between
0 and 255 or as a hexadecimal number between $00_{16}$ and $\mathrm{FF}_{16}$ (see Remark 1.3.3).
When modern cryptographic algorithms are used, the messages are converted to
plaintexts which are n-bit binary strings (i.e., vectors in $\mathbb{F}_2^n$), where n is a multiple
of 8. For example, Table 2.2 lists the representation of some single symbols as
bytes using the ASCII and ISO/IEC 8859-1 (Latin-1) encodings. The byte values in Table 2.2b coincide with the Unicode code points of the symbols; UTF-8 agrees with ASCII on the symbols in Table 2.2a but encodes each symbol in Table 2.2b in two bytes. The second column gives the
binary representation of the byte value, and the third column is the corresponding
hexadecimal representation.
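The conversion from symbols to bytes can be checked with Python's built-in `str.encode`. The helper `show` below is a hypothetical name; it prints each byte of the encoded symbol in binary and hexadecimal. The example also shows that a symbol outside the ASCII range may need more than one byte, depending on the encoding chosen.

```python
def show(symbol: str, encoding: str) -> str:
    """Return the bytes of symbol under the given encoding as 'binary/hex' pairs."""
    raw = symbol.encode(encoding)
    return " ".join(f"{b:08b}/{b:02X}" for b in raw)


if __name__ == "__main__":
    print(show("A", "ascii"))          # one byte
    print(show("\u00c1", "latin-1"))   # A with acute accent, one byte in Latin-1
    print(show("\u00c1", "utf-8"))     # the same symbol needs two bytes in UTF-8
```

Whatever encoding is used, sender and receiver have to agree on it, since the cipher itself only sees the resulting byte string.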

2.1.3 Security of Cryptosystems

When the security of a cryptosystem is analyzed, Kerckhoffs’ principle is always
followed.
Definition 2.1.3 (Kerckhoffs’ Principle) The security of a cryptosystem should
depend only on the secrecy of the key.
In other words, everything is public knowledge except for the secret key.
To discuss the security of cryptosystems, we should also specify the attack
assumptions. Normally, they consist of the attacker’s knowledge and the attacker’s
goal. Ciphertext-only attack assumes the attacker has access to a collection of
ciphertexts. Known plaintext attack assumes the attacker has a collection of plaintext
and ciphertext pairs. And in chosen plaintext attack, the attacker has access to
the encryption mechanism such that they can choose plaintexts and obtain the
corresponding ciphertexts. The attacker’s goal can be the recovery of the plaintext
or the recovery of the key.
By Kerckhoffs’ principle (Definition 2.1.3), we assume the attacker has knowledge
of the cipher design and communication context, e.g., the sender is a student
and might use words like “exam,” “assignment,” etc.
A ciphertext-only attack scenario is the weakest attacker model and also the
most realistic one. For example, an intercepted encrypted network traffic falls
into this category. As an example of a known plaintext attack scenario, one can
think of the cryptanalysis of Enigma during World War II. There were situations
when the German military broadcast the same message encrypted by different
cryptosystems—for some recipients, it was encrypted by a so-called dockyard
cipher (a manual cipher, relatively easy to cryptanalyze), and for some, it was
encrypted by Enigma [Mah45]. If both messages were intercepted, the allies would
possess both the plaintext and the ciphertext, thus making it a known plaintext
attack on Enigma. When it comes to chosen plaintext attacks, one can imagine a
scenario when an encryption device is captured, and the attacker can send queries
to it and receive the ciphertexts. As the key would normally be stored in secure
storage, the attacker needs to use the plaintext–ciphertext pairs to recover it. This
is a common scenario for hardware attacks. While in the traditional cryptanalysis
setting, a chosen plaintext attack is infeasible for modern ciphers, hardware attacks
can recover the key relatively efficiently, depending on the attacker’s assumptions
and the attack type.
In this book, we say a cipher is broken if the secret key is recovered.³ A cipher
is said to be perfectly secure if, in a ciphertext-only attack setting, the attacker
cannot obtain any information about the plaintext no matter how much computing
power they have. A cipher is secure in practice if there is no known attack that
can break it within a reasonable amount of time and with a reasonable amount
of computing power. A cipher is said to be computationally secure if breaking it
requires computing power that is not available in practice.
In Sect. 2.2.7, we will introduce a classical cipher that achieves perfect secrecy.
However, we will see that the key management of this cipher makes it impractical
for modern usage. Modern cryptosystems that are popular today are considered to
be computationally secure. Most of the ciphers are designed in a way that the effort
taken to break them grows exponentially with the number of bits of the secret key,
which is called key length. Thus, key length is an important factor in the security of
modern ciphers.

2.2 Classical Ciphers

In this section, we will discuss some classical ciphers and analyze their security.
We focus on the case when messages consist of English letters. Those letters are
identified with elements in $\mathbb{Z}_{26}$ as shown in Table 2.1. For easy reading, we will
not distinguish letters and elements in $\mathbb{Z}_{26}$. For example, when the message is A, we
may say that the plaintext is A or the plaintext is 0, and similarly for the ciphertext.

³ In a more general sense, breaking a cipher means finding a weakness in the cipher algorithm that can be exploited with a complexity less than brute force [Sch00].

2.2.1 Shift Cipher

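Definition 2.2.1 (Shift Cipher) Let $\mathcal{P} = \mathcal{C} = \mathcal{K} = \mathbb{Z}_{26}$. For each $k \in \mathcal{K}$, define

\[ E_k : \mathbb{Z}_{26} \to \mathbb{Z}_{26}, \ p \mapsto p + k \bmod 26; \quad D_k : \mathbb{Z}_{26} \to \mathbb{Z}_{26}, \ c \mapsto c - k \bmod 26. \]

The cryptosystem $(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$, where $\mathcal{E} = \{E_k : k \in \mathcal{K}\}$ and $\mathcal{D} = \{D_k : k \in \mathcal{K}\}$, is called the shift cipher.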
By Theorem 1.4.2, Z26 is a commutative ring with addition and multiplication
modulo 26. We also discussed that subtracting k corresponds to adding the additive
inverse of k (see Remark 1.4.2).
Example 2.2.1 Let $k = 2$. We have

\[ -k = -2 \bmod 26 = 24 \bmod 26. \]

Suppose the message is A, then the corresponding plaintext is 0 (see Table 2.1). The
ciphertext is given by

\[ E_k(\mathrm{A}) = 0 + 2 \bmod 26 = 2 \bmod 26 = \mathrm{C}. \]

When we decrypt the ciphertext using the same key, we get our original message:

\[ D_k(\mathrm{C}) = 2 - 2 \bmod 26 = 2 + 24 \bmod 26 = 0 \bmod 26 = \mathrm{A}. \]
We note that encrypting using a key k is the same as shifting the letters by k
positions, hence the name “shift cipher.”
Example 2.2.2 When $k = 5$,

\[ E_k(\mathrm{A}) = 0 + 5 \bmod 26 = \mathrm{F}, \quad E_k(\mathrm{Z}) = 25 + 5 \bmod 26 = 4 \bmod 26 = \mathrm{E}. \]

To encrypt a message, we can follow Table 2.3 and replace letters in the
first row with those in the second row. Suppose the message is I STUDY IN
BRATISLAVA. Then the corresponding ciphertext (omitting the white spaces) is
NXYZIDNSGWFYNXQFAF.

Table 2.3 Shift cipher with k = 5. The second row represents the ciphertexts for the letters in the first row

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
F G H I J K L M N O P Q R S T U V W X Y Z A B C D E

When $k = 3$, the cipher is called the Caesar Cipher, which was used by Julius
Caesar around 50 B.C. It is unknown how effective the Caesar cipher was at the
time. But it is likely to have been reasonably secure since most of Caesar’s enemies
would have been illiterate and they might have also assumed the messages were
written in an unknown foreign language.
Now, suppose that, as an attacker, we know that the ciphertext is NXYZIDNSGWFYNXQFAF.
By Kerckhoffs’ principle (Definition 2.1.3), we can assume that we also know
the communication language is English. How can we find the corresponding
plaintext?
With a moment’s thought, it is easy to see that we can simply try all the
possible keys until we find a plaintext that makes sense. For example, let k =
1, then N should be decrypted to M, X to W, and so on. Eventually, we get
MWXYHCMRFVEXMWPEZE, which does not make sense. So we continue: when k = 2,
we get LVWXGBLQEUDWLVODYD. When k = 3, we have KUVWFAKPDTCVKUNCXC, and
for k = 4, we get JTUVEZJOCSBUJTMBWB. Finally, letting k = 5, we get a proper
sentence ISTUDYINBRATISLAVA. Since there are only 25 possible keys (the key is
not equal to 0), with a known ciphertext, it is easy to find the original plaintext and
the key.
Such a method of trying every possible key until the correct one is found is called
an exhaustive key search. We have demonstrated that with an exhaustive key search,
we can break the shift cipher, i.e., find the key.
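The shift cipher and the exhaustive key search can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes messages over the letters A–Z, dropping any other character (such as white space) as in Example 2.2.2.

```python
A = ord("A")


def shift_encrypt(message: str, k: int) -> str:
    """Apply E_k letter by letter: p -> p + k mod 26 (non-letters are dropped)."""
    letters = (ch for ch in message.upper() if "A" <= ch <= "Z")
    return "".join(chr((ord(ch) - A + k) % 26 + A) for ch in letters)


def shift_decrypt(ciphertext: str, k: int) -> str:
    """Apply D_k, i.e., add the additive inverse of k modulo 26."""
    return shift_encrypt(ciphertext, -k)


def exhaustive_key_search(ciphertext: str) -> dict:
    """Return the candidate plaintext for every nonzero key k = 1, ..., 25."""
    return {k: shift_decrypt(ciphertext, k) for k in range(1, 26)}


if __name__ == "__main__":
    c = shift_encrypt("I STUDY IN BRATISLAVA", 5)
    print(c)
    for k, candidate in exhaustive_key_search(c).items():
        print(k, candidate)
```

The attacker still has to recognize which of the 25 candidates is meaningful; here that step is left to a human reader, whereas an automated attack would typically score candidates against a dictionary or letter frequencies.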

2.2.2 Affine Cipher

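Recall that $\mathbb{Z}_n^*$ is the set of elements $x \in \mathbb{Z}_n$ such that $\gcd(x, n) = 1$ (Definition 1.4.5).
Definition 2.2.2 (Affine Cipher) Let $\mathcal{P} = \mathcal{C} = \mathbb{Z}_{26}$ and $\mathcal{K} = \{ (a, b) \mid a \in \mathbb{Z}_{26}^*,\ b \in \mathbb{Z}_{26} \}$. For each key $(a, b)$, define

\[
E_{(a,b)} : \mathbb{Z}_{26} \to \mathbb{Z}_{26},\quad p \mapsto ap + b \bmod 26; \qquad
D_{(a,b)} : \mathbb{Z}_{26} \to \mathbb{Z}_{26},\quad c \mapsto a^{-1}(c - b) \bmod 26.
\]

The cryptosystem $(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$, where $\mathcal{E} = \{ E_{(a,b)} : (a, b) \in \mathcal{K} \}$, $\mathcal{D} = \{ D_{(a,b)} : (a, b) \in \mathcal{K} \}$, is called the affine cipher.
Note that when $a = 1$, we have a shift cipher (Definition 2.2.1).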

Next, we will verify that the affine cipher is well-defined. In particular, we will
show the following:
• Decryption is always possible, i.e., given any $a \in \mathbb{Z}_{26}^*$ and $b, y \in \mathbb{Z}_{26}$, a solution for $x$ such that

\[ ax + b \equiv y \bmod 26 \]

always exists in $\mathbb{Z}_{26}$.
• Each encryption function $E_{(a,b)}$ is injective, i.e., different plaintexts produce
different ciphertexts, or equivalently, if the solution for $ax + b \equiv y \bmod 26$
exists, then it is unique.
Fix $a \in \mathbb{Z}_{26}^*$ and $b, y \in \mathbb{Z}_{26}$. Solving the equation

\[ ax + b \equiv y \bmod 26 \]

is equivalent to solving the equation

\[ ax \equiv y - b \bmod 26. \]

When $y$ varies over $\mathbb{Z}_{26}$, $y - b$ also varies over $\mathbb{Z}_{26}$. Thus we can focus on solutions
for

\[ ax \equiv z \bmod 26, \tag{2.1} \]

where $z \in \mathbb{Z}_{26}$. Since $a \in \mathbb{Z}_{26}^*$, by Theorem 1.4.6, Eq. 2.1 has a unique solution.
The existence of the solution proves that decryption is possible, and the uniqueness
guarantees that encryption functions are injective.
Given a key .(a, b), to find .a −1 mod 26, we can apply the extended Euclidean
algorithm (Algorithm 1.2).
Example 2.2.3 Suppose the key for the affine cipher is $(3, 1)$. By the extended
Euclidean algorithm, we can find $3^{-1} \bmod 26$:

\[ 26 = 3 \times 8 + 2,\quad 3 = 2 + 1 \implies 1 = 3 - 2 = 3 - (26 - 3 \times 8) = 3 \times 9 - 26 \implies 3^{-1} \bmod 26 = 9. \]

To encrypt the word STROM (Strom is a Slovak word which means tree), we compute (see Table 2.1)

\[
\begin{aligned}
3 \times 18 + 1 &= 55 \equiv 3 \bmod 26, & 3 \times 19 + 1 &= 58 \equiv 6 \bmod 26,\\
3 \times 17 + 1 &= 52 \equiv 0 \bmod 26, & 3 \times 14 + 1 &= 43 \equiv 17 \bmod 26,\\
3 \times 12 + 1 &= 37 \equiv 11 \bmod 26.
\end{aligned}
\]


So the ciphertext is DGARL. We can list the correspondence between plaintext and
ciphertext as follows:

S T R O M
18 19 17 14 12
3 6 0 17 11
D G A R L

We know that $26 = 2 \times 13$. By Theorem 1.4.3,

\[ \varphi(26) = 26 \times \left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{13}\right) = 12. \]

So there are 12 possible values for $a \in \mathbb{Z}_{26}^*$. And there are 26 possible values for
$b \in \mathbb{Z}_{26}$. Then the total number of possible keys $(a, b)$ is $12 \times 26 = 312$. Similarly
to the shift cipher, knowing a ciphertext, we can try each of the 312 keys until we find
a plaintext that makes sense. Thus we can break the affine cipher by exhaustive key
search.
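
As a small sanity check of Definition 2.2.2 and the key count, the following Python sketch encrypts and decrypts with an affine key and enumerates the key space. The helper names are hypothetical, and `pow(a, -1, 26)` (available from Python 3.8) is used in place of a hand-written extended Euclidean algorithm.

```python
from math import gcd

def affine_encrypt(plaintext: str, a: int, b: int) -> str:
    """Apply p -> a*p + b mod 26 to every letter (A=0, ..., Z=25)."""
    return "".join(chr((a * (ord(ch) - 65) + b) % 26 + 65) for ch in plaintext)

def affine_decrypt(ciphertext: str, a: int, b: int) -> str:
    """Apply c -> a^{-1} * (c - b) mod 26 to every letter."""
    a_inv = pow(a, -1, 26)  # raises ValueError if gcd(a, 26) != 1
    return "".join(chr(a_inv * (ord(ch) - 65 - b) % 26 + 65) for ch in ciphertext)

# All valid keys (a, b): a must be a unit modulo 26.
keys = [(a, b) for a in range(26) if gcd(a, 26) == 1 for b in range(26)]

print(affine_encrypt("STROM", 3, 1), len(keys))
```

An exhaustive key search would simply call `affine_decrypt` with each of the keys in `keys` and inspect the results.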

2.2.3 Substitution Cipher

Recall that the symmetric group of degree n, denoted $S_n$, is the set of permutations of
a set X with n elements (see Definition 1.2.4). We have discussed that a permutation
is a bijective function, and its inverse exists with respect to the composition of
functions (see Lemma 1.2.1). In particular, any permutation $\sigma \in S_{26}$ has an inverse $\sigma^{-1}$.

Definition 2.2.3 (Substitution Cipher) Let $\mathcal{P} = \mathcal{C} = \mathbb{Z}_{26}$, and $\mathcal{K} = S_{26}$. For any
key $\sigma \in S_{26}$, define

\[
E_\sigma : \mathbb{Z}_{26} \to \mathbb{Z}_{26},\quad p \mapsto \sigma(p); \qquad
D_\sigma : \mathbb{Z}_{26} \to \mathbb{Z}_{26},\quad c \mapsto \sigma^{-1}(c).
\]

The cryptosystem $(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$, where $\mathcal{E} = \{ E_\sigma : \sigma \in \mathcal{K} \}$, $\mathcal{D} = \{ D_\sigma : \sigma \in \mathcal{K} \}$,
is called the substitution cipher.
We note that an affine cipher (Definition 2.2.2) is also a substitution cipher.
Example 2.2.4 Define .σ as in Table 2.4, then the corresponding table for decryp-
tion can be computed by flipping the two rows of the table (see Table 2.5). For
example, to decrypt UIFJNJUWUJPOHWNF, using Table 2.5, we get THE IMITATION
GAME.
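
The table flipping in Example 2.2.4 corresponds to inverting a dictionary. The sketch below assumes the key σ is given as the second row of Table 2.4 written as one 26-letter string; the variable names are our own.

```python
import string

PLAIN = string.ascii_uppercase
SIGMA_ROW = "WXYZFGHIJKLMNOPQRSTUVABCDE"  # second row of Table 2.4

sigma = dict(zip(PLAIN, SIGMA_ROW))          # encryption table
sigma_inv = {c: p for p, c in sigma.items()}  # decryption table (rows flipped)

def substitute(text: str, table: dict) -> str:
    """Replace every letter according to the given table."""
    return "".join(table[ch] for ch in text)

print(substitute("UIFJNJUWUJPOHWNF", sigma_inv))
```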

Table 2.4 Definition of .σ , a key for substitution cipher


A B C D E F G H I J K L M N O P Q R S T
W X Y Z F G H I J K L M N O P Q R S T U

U V W X Y Z
V A B C D E

Table 2.5 Definition of .σ −1 , where .σ ∈ S26 is a key for substitution cipher shown in Table 2.4
A B C D E F G H I J K L M N O P Q R S T
V W X Y Z E F G H I J K L M N O P Q R S

U V W X Y Z
T U A B C D

We have discussed that $|S_n| = n!$ (see Example 1.2.9). So the size of the key space
for the substitution cipher is $26! \approx 4 \times 10^{26}$. Modern computers run at a speed of a few
GHz, which is $\sim 10^{9}$ instructions per second. There are $\sim 10^{5}$ seconds per day, so one
computer can run $\sim 10^{14}$ instructions per day or $\sim 10^{16}$ instructions per year. If we
would like to exhaust every key for the substitution cipher, we will need $\sim 10^{10}$ years.
Compared to the age of the universe, which is 13.8 billion, i.e., $1.38 \times 10^{10}$ years,
exhaustive key search is impossible with current computation power. However, we
will show in Sect. 2.2.6 that other methods can be used to break the substitution cipher.
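
The order-of-magnitude estimate above can be reproduced with a short calculation. The sketch assumes, optimistically, that one key is tested per instruction on a single machine running at about 1 GHz; both numbers are assumptions made for illustration only.

```python
import math

key_space = math.factorial(26)        # |S_26| = 26!
keys_per_second = 10**9               # assumption: one key tested per instruction
seconds_per_year = 60 * 60 * 24 * 365

years = key_space / (keys_per_second * seconds_per_year)
print(f"26! is about {key_space:.2e}; exhaustive search needs about {years:.1e} years")
```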

2.2.4 Vigenère Cipher

For the substitution cipher, each letter of the alphabet is mapped to a unique letter. Hence
such a cipher is also called a monoalphabetic cipher. The Vigenère cipher, named after
the French cryptographer Blaise Vigenère, is a polyalphabetic cipher where one
letter can be encrypted to different letters depending on the key.
Let m be a positive integer, and let $\mathbb{Z}_{26}^m$ be the set of matrices with coefficients
in $\mathbb{Z}_{26}$ of size $1 \times m$. In other words, $\mathbb{Z}_{26}^m$ is the set of $1 \times m$ row vectors
with coefficients in $\mathbb{Z}_{26}$ (see Definition 1.3.1). As discussed in Eq. 1.4, for any
$x = (x_0, x_1, \dots, x_{m-1})$, $y = (y_0, y_1, \dots, y_{m-1})$ in $\mathbb{Z}_{26}^m$, the addition $x + y$ is
computed componentwise:

\[ x + y = (x_0 + y_0,\ x_1 + y_1,\ \dots,\ x_{m-1} + y_{m-1}), \]

where $x_i + y_i$ is computed with addition modulo 26. Recall that the additive inverse
of an element a in $\mathbb{Z}_{26}$ is given by $-a$ (see Remark 1.4.2). $x - y$ is then computed
componentwise using the additive inverses of the $y_i$.

Definition 2.2.4 (Vigenère Cipher) Let m be a positive integer, and let $\mathcal{K} = \mathcal{P} = \mathcal{C} = \mathbb{Z}_{26}^m$. For each $k \in \mathcal{K}$, define

\[
E_k : \mathbb{Z}_{26}^m \to \mathbb{Z}_{26}^m,\quad p \mapsto p + k; \qquad
D_k : \mathbb{Z}_{26}^m \to \mathbb{Z}_{26}^m,\quad c \mapsto c - k.
\]

The cryptosystem $(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$, where $\mathcal{E} = \{ E_k : k \in \mathcal{K} \}$, $\mathcal{D} = \{ D_k : k \in \mathcal{K} \}$,
is called the Vigenère cipher.
The key for a Vigenère cipher is also called a keyword since it can be written as a
string of letters. By definition, a Vigenère cipher encrypts m alphabetic characters
at a time.
Example 2.2.5 Let $m = 6$ and choose SECRET as the keyword. Thus the key is

\[ k = \begin{pmatrix} 18 & 4 & 2 & 17 & 4 & 19 \end{pmatrix}. \]

To encrypt AN EXAMPLE, we write the plaintext in groups of six letters and add the
keyword to each group letter by letter, modulo 26.

A N E X A M P L E
0 13 4 23 0 12 15 11 4
18 4 2 17 4 19 18 4 2
18 17 6 14 4 5 7 15 6
S R G O E F H P G

The ciphertext is given by SRGOEFHPG.


Example 2.2.6 Let the keyword be SKALA. So $m = 5$ and

\[ k = \begin{pmatrix} 18 & 10 & 0 & 11 & 0 \end{pmatrix}. \]

To decrypt ZSLWCAZHPR, we write the ciphertext in groups of five letters and subtract the
keyword from each group letter by letter, modulo 26. We get the plaintext HILLCIPHER.

Z S L W C A Z H P R
25 18 11 22 2 0 25 7 15 17
18 10 0 11 0 18 10 0 11 0
7 8 11 11 2 8 15 7 4 17
H I L L C I P H E R
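
Examples 2.2.5 and 2.2.6 can be checked with the following sketch, which adds or subtracts the repeating keyword letter by letter. The function names are our own, and the text is assumed to consist of uppercase letters without spaces.

```python
def _vigenere(text: str, keyword: str, sign: int) -> str:
    """Add (sign=+1) or subtract (sign=-1) the repeating keyword modulo 26."""
    out = []
    for i, ch in enumerate(text):
        k = ord(keyword[i % len(keyword)]) - 65
        out.append(chr((ord(ch) - 65 + sign * k) % 26 + 65))
    return "".join(out)

def vigenere_encrypt(plaintext: str, keyword: str) -> str:
    return _vigenere(plaintext, keyword, +1)

def vigenere_decrypt(ciphertext: str, keyword: str) -> str:
    return _vigenere(ciphertext, keyword, -1)

print(vigenere_encrypt("ANEXAMPLE", "SECRET"))
print(vigenere_decrypt("ZSLWCAZHPR", "SKALA"))
```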

The size of the key space for the Vigenère cipher is given by $26^m$. If $m = 6$, it is
about $3.1 \times 10^{8} \approx 2^{28.2}$, which is small enough to search every key using a computer.
However, for larger m, it becomes much more difficult. If $m = 25$, $26^{25} \approx 2^{117}$,
which is not feasible with current computation powers.

2.2.5 Hill Cipher

Definition 2.2.5 (Hill Cipher) Let m be an integer such that $m \ge 2$. Let $\mathcal{P} = \mathcal{C} = \mathbb{Z}_{26}^m$ and

\[ \mathcal{K} = \{ A \mid A \in M_{m \times m}(\mathbb{Z}_{26}),\ \det(A) \in \mathbb{Z}_{26}^* \}. \]

For each $A \in \mathcal{K}$, define

\[
E_A : \mathbb{Z}_{26}^m \to \mathbb{Z}_{26}^m,\quad p \mapsto pA; \qquad
D_A : \mathbb{Z}_{26}^m \to \mathbb{Z}_{26}^m,\quad c \mapsto cA^{-1}.
\]

The cryptosystem $(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$, where $\mathcal{E} = \{ E_A : A \in \mathcal{K} \}$, $\mathcal{D} = \{ D_A : A \in \mathcal{K} \}$,
is called the Hill cipher.
By Theorem 1.4.2, $\mathbb{Z}_{26}$ is a commutative ring. We have defined the determinant
of a square matrix with coefficients from a commutative ring R in Sect. 1.3.1
(Eq. 1.6). We discussed that an $m \times m$ matrix A is invertible in $M_{m \times m}(R)$ if and
only if its determinant, $\det(A)$, is a unit (see Definition 1.2.10) in R. Furthermore,
when A is invertible, its inverse can be calculated using the adjoint matrix of A
(Theorem 1.3.2). By Lemma 1.4.3, a matrix $A \in M_{m \times m}(\mathbb{Z}_{26})$ is invertible if and
only if $\gcd(\det(A), 26) = 1$, i.e., $\det(A) \in \mathbb{Z}_{26}^*$. Therefore, in the definition of the
Hill cipher, we require $\det(A) \in \mathbb{Z}_{26}^*$ so that the decryption can be computed.
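Example 2.2.7 Let

\[ A = \begin{pmatrix} 2 & 1 & 2 \\ 3 & 12 & 4 \\ 0 & 5 & 1 \end{pmatrix} \]

be a matrix in $M_{3 \times 3}(\mathbb{Z}_{26})$. We denote by $A_{ij}$ the matrix obtained from A by deleting
the ith row and the jth column. Then

\[
A_{00} = \begin{pmatrix} 12 & 4 \\ 5 & 1 \end{pmatrix},\quad
A_{01} = \begin{pmatrix} 3 & 4 \\ 0 & 1 \end{pmatrix},\quad
A_{02} = \begin{pmatrix} 3 & 12 \\ 0 & 5 \end{pmatrix}.
\]

Following the discussions in Example 1.3.6, we have

\[
\begin{aligned}
\det(A_{00}) &= 12 - 20 \bmod 26 = -8 \bmod 26,\\
\det(A_{01}) &= 3 - 0 \bmod 26 = 3 \bmod 26,\\
\det(A_{02}) &= 15 - 0 \bmod 26 = 15 \bmod 26.
\end{aligned}
\]

Similarly, we can calculate

\[
\begin{aligned}
\det(A_{10}) &= -9 \bmod 26, & \det(A_{11}) &= 2 \bmod 26, & \det(A_{12}) &= 10 \bmod 26,\\
\det(A_{20}) &= -20 \bmod 26, & \det(A_{21}) &= 2 \bmod 26, & \det(A_{22}) &= 21 \bmod 26.
\end{aligned}
\]

Let $a_{ij}$ denote the entry of A at the ith row and jth column, then by Eq. 1.6,

\[
\begin{aligned}
\det(A) &= \sum_{j=0}^{2} (-1)^j a_{0j} \det(A_{0j}) \bmod 26\\
&= (-1)^0 \times 2 \times (-8) + (-1)^1 \times 1 \times 3 + (-1)^2 \times 2 \times 15 \bmod 26\\
&= -16 - 3 + 30 \bmod 26 = 11.
\end{aligned}
\]

By the Euclidean algorithm (Algorithm 1.1), we can find $\gcd(26, 11)$:

\[ 26 = 11 \times 2 + 4,\quad 11 = 4 \times 2 + 3,\quad 4 = 3 + 1,\quad 3 = 1 \times 3 \implies \gcd(11, 26) = 1. \]

Thus A is an invertible matrix in $M_{3 \times 3}(\mathbb{Z}_{26})$.

By the extended Euclidean algorithm (Algorithm 1.2),

\[
\begin{aligned}
1 &= 4 - 3 = 4 - (11 - 4 \times 2) = 4 \times 3 - 11 = (26 - 11 \times 2) \times 3 - 11\\
&= 26 \times 3 - 11 \times 7 \implies 11^{-1} \bmod 26 = -7 \bmod 26 = 19.
\end{aligned}
\]

By Theorem 1.3.2,

\[
A^{-1} = -7 \begin{pmatrix} -8 & 9 & -20 \\ -3 & 2 & -2 \\ 15 & -10 & 21 \end{pmatrix} \bmod 26
= \begin{pmatrix} 56 & -63 & 140 \\ 21 & -14 & 14 \\ -105 & 70 & -147 \end{pmatrix} \bmod 26
= \begin{pmatrix} 4 & 15 & 10 \\ 21 & 12 & 14 \\ 25 & 18 & 9 \end{pmatrix}.
\]
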

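Example 2.2.8 Let

\[ A = \begin{pmatrix} 2 & 1 & 2 \\ 3 & 12 & 4 \\ 0 & 5 & 1 \end{pmatrix} \]

be a key for the Hill cipher. Suppose the plaintext is CIPHER. By Table 2.1, this
corresponds to $\begin{pmatrix} 2 & 8 & 15 \end{pmatrix}$ and $\begin{pmatrix} 7 & 4 & 17 \end{pmatrix}$. To encrypt, we calculate

\[
\begin{aligned}
\begin{pmatrix} 2 & 8 & 15 \end{pmatrix} \begin{pmatrix} 2 & 1 & 2 \\ 3 & 12 & 4 \\ 0 & 5 & 1 \end{pmatrix} \bmod 26 &= \begin{pmatrix} 2 & 17 & 25 \end{pmatrix},\\
\begin{pmatrix} 7 & 4 & 17 \end{pmatrix} \begin{pmatrix} 2 & 1 & 2 \\ 3 & 12 & 4 \\ 0 & 5 & 1 \end{pmatrix} \bmod 26 &= \begin{pmatrix} 0 & 10 & 21 \end{pmatrix}.
\end{aligned}
\]

And the ciphertext is CRZAKV.

Now suppose the ciphertext is DOSJBQ. By Table 2.1, this corresponds to
$\begin{pmatrix} 3 & 14 & 18 \end{pmatrix}$ and $\begin{pmatrix} 9 & 1 & 16 \end{pmatrix}$. We have calculated in Example 2.2.7 that

\[ A^{-1} = \begin{pmatrix} 4 & 15 & 10 \\ 21 & 12 & 14 \\ 25 & 18 & 9 \end{pmatrix}. \]

We can then compute the plaintext as follows:

\[
\begin{aligned}
\begin{pmatrix} 3 & 14 & 18 \end{pmatrix} \begin{pmatrix} 4 & 15 & 10 \\ 21 & 12 & 14 \\ 25 & 18 & 9 \end{pmatrix} \bmod 26 &= \begin{pmatrix} 756 & 537 & 388 \end{pmatrix} \bmod 26 = \begin{pmatrix} 2 & 17 & 24 \end{pmatrix},\\
\begin{pmatrix} 9 & 1 & 16 \end{pmatrix} \begin{pmatrix} 4 & 15 & 10 \\ 21 & 12 & 14 \\ 25 & 18 & 9 \end{pmatrix} \bmod 26 &= \begin{pmatrix} 457 & 435 & 248 \end{pmatrix} \bmod 26 = \begin{pmatrix} 15 & 19 & 14 \end{pmatrix}.
\end{aligned}
\]

And the plaintext is CRYPTO.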
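
The computations of Examples 2.2.7 and 2.2.8 can be reproduced with a small pure-Python sketch. It computes the inverse through cofactor expansion and the adjoint matrix, which is only practical for small m; all function names are our own and not part of any library.

```python
def minor(A, i, j):
    """Matrix obtained from A by deleting row i and column j."""
    return [[A[r][c] for c in range(len(A)) if c != j]
            for r in range(len(A)) if r != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def inverse_mod(A, n=26):
    """Inverse of A over Z_n via the adjoint matrix; requires det(A) to be a unit."""
    d_inv = pow(det(A) % n, -1, n)  # ValueError if det(A) is not invertible mod n
    m = len(A)
    return [[d_inv * (-1) ** (i + j) * det(minor(A, j, i)) % n for j in range(m)]
            for i in range(m)]

def hill_apply(text, M, n=26):
    """Multiply each block of len(M) letters (as a row vector) by M modulo n."""
    m = len(M)
    nums = [ord(ch) - 65 for ch in text]
    out = []
    for s in range(0, len(nums), m):
        block = nums[s:s + m]
        out.extend(sum(block[i] * M[i][j] for i in range(m)) % n for j in range(m))
    return "".join(chr(x + 65) for x in out)

A = [[2, 1, 2], [3, 12, 4], [0, 5, 1]]
A_inv = inverse_mod(A)
print(hill_apply("CIPHER", A), hill_apply("DOSJBQ", A_inv))
```

Encryption and decryption use the same routine; only the matrix changes from A to its inverse.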


Remark 2.2.1 By Definition 2.1.2, shift cipher, affine cipher, and substitution
cipher are block ciphers of block length 1. Vigenère cipher and Hill cipher are block
ciphers of block length m.

2.2.6 Cryptanalysis of Classical Ciphers

In this subsection, we will discuss the cryptanalysis of the classical ciphers
introduced in the previous subsections. Cryptanalysis comes from the Greek words
kryptós (hidden) and analýein (to analyze). The goal of cryptanalysis is to decrypt
the ciphertext without knowing the key. Successful cryptanalysis recovers the
plaintext or even the key. We recall the different assumptions of attack described
in Sect. 2.1.3.

Example 2.2.9 (Known Plaintext Attack—Hill Cipher) Let us consider a known


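plaintext attack on the Hill cipher. Suppose we know $m = 2$, i.e., $A \in M_{2 \times 2}(\mathbb{Z}_{26})$, and
we have a string of plaintext ATTACK and its corresponding ciphertext FTMTIM. By
Definition 2.2.5, we have

\[
E_A\left(\begin{pmatrix} 0 & 19 \end{pmatrix}\right) = \begin{pmatrix} 5 & 19 \end{pmatrix},\quad
E_A\left(\begin{pmatrix} 19 & 0 \end{pmatrix}\right) = \begin{pmatrix} 12 & 19 \end{pmatrix},\quad
E_A\left(\begin{pmatrix} 2 & 10 \end{pmatrix}\right) = \begin{pmatrix} 8 & 12 \end{pmatrix}.
\]

The first two plaintext–ciphertext pairs give us

\[
\begin{pmatrix} 0 & 19 \\ 19 & 0 \end{pmatrix} A \bmod 26 = \begin{pmatrix} 5 & 19 \\ 12 & 19 \end{pmatrix}. \tag{2.2}
\]

The inverse of a $2 \times 2$ matrix can be computed using Eq. 1.7, where the computations
should be done modulo 26. We have

\[
\begin{pmatrix} 0 & 19 \\ 19 & 0 \end{pmatrix}^{-1} = 3^{-1} \begin{pmatrix} 0 & 7 \\ 7 & 0 \end{pmatrix} \bmod 26 = \begin{pmatrix} 0 & 11 \\ 11 & 0 \end{pmatrix}.
\]

Together with Eq. 2.2,

\[
A = \begin{pmatrix} 0 & 11 \\ 11 & 0 \end{pmatrix} \begin{pmatrix} 5 & 19 \\ 12 & 19 \end{pmatrix} \bmod 26 = \begin{pmatrix} 2 & 1 \\ 3 & 1 \end{pmatrix}.
\]

We can verify this key using the third plaintext–ciphertext pair:

\[
\begin{pmatrix} 2 & 10 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 3 & 1 \end{pmatrix} \bmod 26 = \begin{pmatrix} 8 & 12 \end{pmatrix}.
\]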
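
The known plaintext attack of Example 2.2.9 amounts to one matrix inversion and one matrix product modulo 26. A minimal sketch, with helper names of our own choosing and assuming the plaintext matrix is invertible modulo 26:

```python
def inv2x2_mod(M, n=26):
    """Inverse of a 2x2 matrix over Z_n (the determinant must be a unit)."""
    (a, b), (c, d) = M
    det_inv = pow((a * d - b * c) % n, -1, n)
    return [[det_inv * d % n, -det_inv * b % n],
            [-det_inv * c % n, det_inv * a % n]]

def matmul_mod(X, Y, n=26):
    """Matrix product X * Y with entries reduced modulo n."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) % n
             for j in range(len(Y[0]))] for i in range(len(X))]

P = [[0, 19], [19, 0]]   # plaintext blocks AT, TA as rows
C = [[5, 19], [12, 19]]  # ciphertext blocks FT, MT as rows

A = matmul_mod(inv2x2_mod(P), C)      # recovered key
check = matmul_mod([[2, 10]], A)      # encrypt CK with the recovered key
print(A, check)
```

If the matrix built from the available plaintext blocks is not invertible, other pairs of blocks have to be tried.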

We have seen that an exhaustive key search can be used to break affine cipher,
where the attacker can find both the plaintext and the key. But this does not apply
to substitution cipher or Vigenère cipher. Next, we will discuss other cryptanalysis
methods that can be used to break those ciphers.

2.2.6.1 Frequency Analysis

By Kerckhoffs’ principle (Definition 2.1.3), we assume we know the plaintext is


an English text. We also know the cipher used for communication. We assume a
ciphertext-only attacker model, and we will show how to recover both the plaintext
and the key using frequency analysis for affine cipher and Vigenère cipher.
As the plaintext is an English text, we first analyze the probabilities for the
appearance of each letter in a standard English text. For example, Table 2.6 lists
the analysis results from [BP82]. In particular, we observe that E has the highest
probability and the second most common letter is T. Similarly, [BP82] also shows
that the most common two consecutive letters are TH, HE, IN, ..., and the most
common three consecutive letters are THE, ING, AND, ....

Table 2.6 Probabilities of each letter in a standard English text [BP82]

A 0.082 B 0.015 C 0.028 D 0.043 E 0.127 F 0.022
G 0.020 H 0.061 I 0.070 J 0.002 K 0.008 L 0.040
M 0.024 N 0.067 O 0.075 P 0.019 Q 0.001 R 0.060
S 0.063 T 0.091 U 0.028 V 0.010 W 0.023 X 0.001
Y 0.020 Z 0.001

Given a ciphertext that is encrypted using a monoalphabetic cipher (i.e., each
letter is mapped to a unique letter), we expect a permutation of the letters
in the ciphertext to have similar frequencies as in Table 2.6.
Example 2.2.10 Suppose the cipher used is an affine cipher, and we have the
following ciphertext:

VCVIRSKPOFPNZOTHOVMLVYSATISKVNVLIVSZVR.
.

We can calculate the frequencies of each letter that appear in the text:

V S I O R K P N Z T L C F H M Y A
8 4 3 3 2 2 2 2 2 2 2 1 1 1 1 1 1

The most frequent letter is V, and the second most frequent one is S. Thus, it
makes sense to assume V is the ciphertext corresponding to E and S to T. Let the key
be $(a, b)$. By Table 2.1 and Definition 2.2.2, we have the following equations:

\[ 4a + b = 21 \bmod 26, \qquad 19a + b = 18 \bmod 26, \]

which gives

\[ 15a = 23 \bmod 26. \]

By the extended Euclidean algorithm,

\[ 26 = 15 \times 1 + 11,\quad 15 = 11 \times 1 + 4,\quad 11 = 4 \times 2 + 3,\quad 4 = 3 + 1, \]

and

\[
\begin{aligned}
1 &= 4 - 3 = 4 - (11 - 4 \times 2) = -11 + 4 \times 3 = -11 + (15 - 11) \times 3\\
&= 15 \times 3 - 11 \times 4 = 15 \times 3 - (26 - 15) \times 4 = 15 \times 7 - 26 \times 4.
\end{aligned}
\]

Hence, we have $15^{-1} \bmod 26 = 7$ and

\[ a = 23 \times 15^{-1} \bmod 26 = 23 \times 7 \bmod 26 = 5 \bmod 26. \]

Furthermore, we get

\[ b = 21 - 4a \bmod 26 = 21 - 4 \times 5 \bmod 26 = 1. \]

To decrypt the message, we compute the decryption key by finding $a^{-1} \bmod 26 = 5^{-1} \bmod 26$:

\[ 26 = 5 \times 5 + 1 \implies 1 = 26 - 5 \times 5 \implies 5^{-1} \bmod 26 = -5 \bmod 26 = 21. \]

Applying the decryption key $(21, 1)$ to the ciphertext, we get the following plaintext:

EVERYTHING IS KNOWN EXCEPT FOR THE SECRET KEY.
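
The attack in Example 2.2.10 can be sketched as follows: count letters, guess that the two most frequent ciphertext letters stand for E and T, and solve the two linear congruences for (a, b). The helper `solve_affine_key` is hypothetical. The guess can fail on other ciphertexts, in which case other letter pairs from Table 2.6 would have to be tried, and the difference of the two guessed plaintext letters must be invertible modulo 26.

```python
from collections import Counter

def solve_affine_key(x1, y1, x2, y2, n=26):
    """Solve a*x1 + b = y1 and a*x2 + b = y2 (mod n) for (a, b)."""
    a = (y1 - y2) * pow((x1 - x2) % n, -1, n) % n
    b = (y1 - a * x1) % n
    return a, b

def num(ch):
    return ord(ch) - 65  # A=0, ..., Z=25

ciphertext = "VCVIRSKPOFPNZOTHOVMLVYSATISKVNVLIVSZVR"
(first, _), (second, _) = Counter(ciphertext).most_common(2)

# Guess: most frequent letter encrypts E, second most frequent encrypts T.
a, b = solve_affine_key(num("E"), num(first), num("T"), num(second))
a_inv = pow(a, -1, 26)
plaintext = "".join(chr(a_inv * (num(ch) - b) % 26 + 65) for ch in ciphertext)
print((a, b), plaintext)
```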

We note that the same technique works for the substitution cipher since it is also
monoalphabetic. But a longer ciphertext might be needed since we do not have
equations to solve for the key. Instead, we must guess the mapping between each
distinct letter in the ciphertext and the 26 letters of the alphabet (see [Sti05] Section 1.2.2).
Remark 2.2.2 Suppose the length of the keyword m is determined for Vigenère
cipher. We take every mth letter from the ciphertext and obtain m ciphertexts. Then
each of them can be considered as the ciphertext of the shift cipher with a key given
by the corresponding letter in the keyword.
Example 2.2.11 Suppose we have the following ciphertext generated with the
Vigenère cipher (Definition 2.2.4), and we know that the keyword length is $m = 3$.

SJRRIBSWRKRAOFCDACORRGSYZTCKVYXGCCSDDLCCEKOAMBHGCEKEPRS
TJOSDWXFOGMBVCCTMXHGXKNKVRCMLDLCMMNRIPDIVDAGVPZOXFOWYWI.

Take every third letter, and we have the following three ciphertexts:

SRSKODOGZKXCDCOBCESOWOBCXXKCDMRDDVOOW,
JIWRFARSTVGSLEAHEPTSXGVTHKVMLMIIAPXWI,
RBRACCRYCYCDCKMGKRJDFMCMGNRLCNPVGZFY.

We note that each of them can be considered as the ciphertext of a shift cipher,
where the keys correspond to each letter of the keyword for the Vigenère cipher (as
mentioned in Remark 2.2.2). The frequencies of each letter in the first ciphertext are
as follows:

O D C S K X R B W G Z E M V
7 5 5 3 3 3 2 2 2 1 1 1 1 1

The most frequent letter is O, and we assume O (14) is the ciphertext corresponding
to E (4). This gives us the first letter of the keyword:

\[ 14 - 4 \bmod 26 = 10 \bmod 26 = \mathtt{K}. \]

The frequencies of each letter in the second ciphertext are as follows:

I A S T V W R G L E H P X M J F K
4 3 3 3 3 2 2 2 2 2 2 2 2 2 1 1 1

Similarly, we assume E (4) is encrypted as I (8). And the second letter of the
keyword is

\[ 8 - 4 \bmod 26 = 4 \bmod 26 = \mathtt{E}. \]

The frequencies of each letter in the third ciphertext are as follows:

C R Y M G D K F N B A J L P V Z
7 5 3 3 3 2 2 2 2 1 1 1 1 1 1 1

The most frequent letter is C, so we assume E (4) is encrypted as C (2), and we have the third letter of the keyword

\[ 2 - 4 \bmod 26 = 24 \bmod 26 = \mathtt{Y}. \]

Thus we have recovered the keyword KEY. Computing decryption with the
keyword, we get the following plaintext:

IF THE DISTANCE BETWEEN TWO APPEARANCES OF THE SAME WORD
IS A MULTIPLE OF M, THE CORRESPONDING PARTS IN THE
CIPHERTEXT WILL BE THE SAME.
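
Once m is known, the keyword recovery of Example 2.2.11 can be automated. The sketch below assumes, as in the example, that the most frequent letter of each column is the encryption of E; this heuristic may fail for short or unusual texts. The function names are our own.

```python
from collections import Counter

def recover_keyword(ciphertext: str, m: int) -> str:
    """Guess each keyword letter from the most frequent letter of its column."""
    keyword = ""
    for i in range(m):
        column = ciphertext[i::m]  # every m-th letter, starting at position i
        top = Counter(column).most_common(1)[0][0]
        keyword += chr((ord(top) - ord("E")) % 26 + 65)
    return keyword

def vigenere_decrypt(ciphertext: str, keyword: str) -> str:
    return "".join(
        chr((ord(ch) - ord(keyword[i % len(keyword)])) % 26 + 65)
        for i, ch in enumerate(ciphertext)
    )

ciphertext = (
    "SJRRIBSWRKRAOFCDACORRGSYZTCKVYXGCCSDDLCCEKOAMBHGCEKEPRS"
    "TJOSDWXFOGMBVCCTMXHGXKNKVRCMLDLCMMNRIPDIVDAGVPZOXFOWYWI"
)
keyword = recover_keyword(ciphertext, 3)
plaintext = vigenere_decrypt(ciphertext, keyword)
print(keyword, plaintext[:20])
```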

Next, we will discuss two methods to determine the length m of the keyword for
a Vigenère cipher.

2.2.6.2 Kasiski Test: Vigenère Cipher

We observe that if the distance between two appearances of the same sequence of
letters in the plaintext is a multiple of m, the corresponding parts in the ciphertext
will be the same. The Kasiski test looks for identical parts of the ciphertext and records the
distance between those parts. Then we know that m is very likely a divisor of all the distance
values.
Example 2.2.12 Suppose the plaintext is

THE MEETING WILL BE IN THE CAFE AND THE STARTING TIME IS TEN

and the keyword is KEY ($m = 3$). The encryption gives us

THE MEETING WILL BE IN THE CAFE AND THE STARTING TIME IS TEN
KEY KEYKEYK EYKE YK EY KEY KEYK EYK EYK EYKEYKEY KEYK EY KEY
DLC WICDMLQ AGVP ZO ML DLC MEDO ELN XFO WRKVRSRE DMKO MQ DIL.

Ignoring spaces, the first two appearances of THE start at positions 1 and 19, i.e., at
distance 18, which is a multiple of 3, and hence the corresponding parts in the
ciphertext are the same, DLC. But the third appearance of THE starts at position 29,
at distance 10 from the second appearance, which is not a multiple of 3, and the
corresponding parts in the ciphertext are different.
On the other hand, if we have only the ciphertext, we can observe the two
identical parts DLC at distance 18, and then we can conclude that very likely
m is a divisor of 18, i.e., $m \in \{1, 2, 3, 6, 9, 18\}$. To decide the exact value of m, a longer
ciphertext is needed, or frequency analysis (see Example 2.2.11) can be applied
assuming different values of m until a meaningful plaintext is found.
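
A minimal sketch of the Kasiski test is given below. It records, for every repeated trigram, the distances between the starting positions of consecutive occurrences (measuring from start to start is our assumption about how the distance is counted) and lists the divisors of their greatest common divisor as candidate keyword lengths.

```python
from collections import defaultdict
from functools import reduce
from math import gcd

def repeated_ngram_distances(ciphertext: str, n: int = 3) -> dict:
    """Map each repeated n-gram to the distances between consecutive occurrences."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - n + 1):
        positions[ciphertext[i:i + n]].append(i)
    return {g: [b - a for a, b in zip(p, p[1:])]
            for g, p in positions.items() if len(p) > 1}

ciphertext = "DLCWICDMLQAGVPZOMLDLCMEDOELNXFOWRKVRSREDMKOMQDIL"
distances = repeated_ngram_distances(ciphertext)
all_distances = [d for ds in distances.values() for d in ds]
g = reduce(gcd, all_distances)
candidates = [m for m in range(1, g + 1) if g % m == 0]
print(distances, candidates)
```

With such a short ciphertext only one repeated trigram is found, so the list of candidates is long; more repetitions would narrow it down.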

2.2.6.3 Index of Coincidence: Vigenère Cipher

Definition 2.2.6 Let $x = x_1 x_2 \dots x_n$ be a string of n alphabetic characters. The
index of coincidence of x, denoted by $I_c(x)$, is the probability that two random
elements of x are identical.
Example 2.2.13 Let x be a long random text. If we randomly choose a letter from
x, we expect that the probability for each letter to be chosen is close to $1/26$. Then,
if we randomly choose two letters from x, the probability for both of them to be
a given letter, say A, is close to $1/26^2$. Summing over the 26 letters, the index of
coincidence for x will be close to

\[ I_c(x) \approx 26 \left(\frac{1}{26}\right)^2 = 0.038. \]

Example 2.2.14 Let x be a long English text. If we randomly choose a letter from
x, we expect that the probabilities for each letter to be chosen are similar to the
values listed in Table 2.6. If we randomly choose two letters from x, the probability
for both letters to be A is then given by $0.082^2$, the probability for both to be B is
$0.015^2$, etc. Thus, the index of coincidence for x can be approximated as

\[ I_c(x) \approx \sum_{i=0}^{25} p_i^2 = 0.065, \]

where $p_i$ denotes the probability of the ith letter in Table 2.6.

Remark 2.2.3 If x is a ciphertext string obtained using any monoalphabetic cipher,
we would expect $I_c(x)$ to be close to 0.065. The individual probabilities for different
letters will be permuted, but the sum will be unchanged.
Let $f_0, f_1, \dots, f_{25}$ denote the frequencies of letters A, B, ..., Z in x. If we randomly
choose two letters from x, the probability that both of them are the ith letter is given by

\[ \frac{\binom{f_i}{2}}{\binom{n}{2}}. \]

We have the following formula for $I_c(x)$:

\[ I_c(x) = \frac{\sum_{i=0}^{25} \binom{f_i}{2}}{\binom{n}{2}} = \frac{\sum_{i=0}^{25} f_i (f_i - 1)}{n(n - 1)}. \tag{2.3} \]

Example 2.2.15 Let x be the ciphertext from Example 2.2.11. The total number of
letters is 110, and the frequencies of each letter are

C R O D S K G M V X I W A B F Y T L E P J Z H N
12 9 7 7 6 6 6 6 5 5 4 4 4 3 3 3 3 3 3 3 2 2 2 2

By Eq. 2.3, the index of coincidence of x is

\[ I_c(x) = \frac{1}{110 \times 109} (12 \times 11 + 9 \times 8 + \dots + 2 \times 1) = \frac{534}{11990} \approx 0.04454. \]

Suppose we are given a ciphertext $c = c_1 c_2 \dots c_n$ output from the Vigenère cipher. To find the length
of the keyword m, for each $m \ge 1$, we construct substrings of c by taking every mth
letter:

\[
\begin{aligned}
\mathbf{c}_1 &= c_1 c_{m+1} \dots\\
\mathbf{c}_2 &= c_2 c_{m+2} \dots\\
&\ \ \vdots\\
\mathbf{c}_m &= c_m c_{2m} \dots
\end{aligned}
\]

If m is the keyword length, we expect $I_c(\mathbf{c}_i)$ to be close to 0.065 (see Remark 2.2.3).
Otherwise, $\mathbf{c}_i$ will be more random and $I_c(\mathbf{c}_i)$ will be closer to 0.038 (see
Example 2.2.13).

Example 2.2.16 Suppose we have the same ciphertext as in Example 2.2.11, and
we do not know the value of m.
Assume m = 1, we have calculated that

\[ I_c(c) = 0.04454 \]

in Example 2.2.15.
Assume m = 2, we have

$\mathbf{c}_1$ = SRISRROCAORSZCVXCSDCEOMHCKPSJSWFGBCTXGKKRMDCMRPIDGPOFWW
$\mathbf{c}_2$ = JRBWKAFDCRGYTKYGCDLCKABGEERTODXOMVCMHXNVCLLMNIDVAVZXOYI

and

\[ I_c(\mathbf{c}_1) = 0.05253, \qquad I_c(\mathbf{c}_2) = 0.03636. \]

Assume m = 3,

$\mathbf{c}_1$ = SRSKODOGZKXCDCOBCESOWOBCXXKCDMRDDVOOW
$\mathbf{c}_2$ = JIWRFARSTVGSLEAHEPTSXGVTHKVMLMIIAPXWI
$\mathbf{c}_3$ = RBRACCRYCYCDCKMGKRJDFMCMGNRLCNPVGZFY

and

\[ I_c(\mathbf{c}_1) = 0.07958, \qquad I_c(\mathbf{c}_2) = 0.04054, \qquad I_c(\mathbf{c}_3) = 0.06984. \]

Thus it is more likely that m = 3. The exact value can be verified by frequency
analysis as shown in Example 2.2.11 to see if the recovered plaintext is meaningful.
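
Eq. 2.3 and the column-wise test of Example 2.2.16 translate directly into code. In the sketch below, `index_of_coincidence` and `column_ics` are names of our own choosing.

```python
from collections import Counter

def index_of_coincidence(x: str) -> float:
    """Eq. 2.3: sum of f_i * (f_i - 1) divided by n * (n - 1)."""
    n = len(x)
    return sum(f * (f - 1) for f in Counter(x).values()) / (n * (n - 1))

def column_ics(c: str, m: int) -> list:
    """Index of coincidence of each substring obtained by taking every m-th letter."""
    return [index_of_coincidence(c[i::m]) for i in range(m)]

ciphertext = (
    "SJRRIBSWRKRAOFCDACORRGSYZTCKVYXGCCSDDLCCEKOAMBHGCEKEPRS"
    "TJOSDWXFOGMBVCCTMXHGXKNKVRCMLDLCMMNRIPDIVDAGVPZOXFOWYWI"
)
for m in range(1, 5):
    print(m, [round(v, 5) for v in column_ics(ciphertext, m)])
```

A keyword length is a plausible candidate when the values for its columns are, on average, closer to 0.065 than to 0.038.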

2.2.7 One-Time Pad

In this subsection, we will discuss a type of synchronous stream cipher (see
Sect. 2.1.2) called one-time pad, which was invented by Gilbert Vernam in 1917.
Definition 2.2.7 (One-Time Pad) Given a positive integer n, let $\mathcal{P} = \mathcal{C} = \mathcal{K} = \mathbb{F}_2^n$. For any $k \in \mathcal{K}$, define

\[
E_k : \mathbb{F}_2^n \to \mathbb{F}_2^n,\quad p \mapsto p \oplus k; \qquad
D_k : \mathbb{F}_2^n \to \mathbb{F}_2^n,\quad c \mapsto c \oplus k.
\]

The cryptosystem $(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$, where $\mathcal{E} = \{ E_k : k \in \mathcal{K} \}$, $\mathcal{D} = \{ D_k : k \in \mathcal{K} \}$,
is called the one-time pad.
Recall that vector addition in $\mathbb{F}_2^n$ is defined as bitwise XOR, denoted by $\oplus$ (see
Definition 1.3.6).

For encryption, we require the key to be chosen randomly with uniform
probability (see Definition 1.7.3) from $\mathcal{K}$. This requirement will be justified in
Theorem 2.2.1. Furthermore, we note that if the attacker has knowledge of one
pair of plaintext p and its corresponding ciphertext c, they can recover the key by
computing $p \oplus c = p \oplus p \oplus k = k$. Thus each key can be used only once.
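
The following sketch illustrates the one-time pad on byte strings (so n is a multiple of 8) and the key recovery from a single plaintext and ciphertext pair. The messages are made up for illustration, and `secrets.token_bytes` is used as a source of uniformly random key bytes.

```python
import secrets

def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Bitwise XOR of two byte strings of equal length."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return bytes(a ^ b for a, b in zip(x, y))

message = b"ATTACK AT DAWN"
key = secrets.token_bytes(len(message))   # fresh uniform key, used only once
ciphertext = xor_bytes(message, key)

recovered_message = xor_bytes(ciphertext, key)   # decryption
recovered_key = xor_bytes(message, ciphertext)   # p XOR c = k

# Any other message of the same length is consistent with the ciphertext
# under some key, so guessing keys alone reveals nothing.
other = b"RETREAT AT TWO"
other_key = xor_bytes(ciphertext, other)
print(recovered_message, xor_bytes(ciphertext, other_key))
```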
One distinct feature of the one-time pad from the previously introduced classical
ciphers is that it achieves perfect secrecy (see Sect. 2.1.3). Before proving this, we
will first formalize the notion of perfect secrecy.
Let $\mathcal{P}$, $\mathcal{C}$, and $\mathcal{K}$ denote the plaintext space, ciphertext space, and key space,
respectively, for a given cryptosystem. The random experiment we are interested in
is encryption using one key and one plaintext for communication. The sample space
(see Sect. 1.7) is $\Omega = \mathcal{P} \times \mathcal{K}$.
Let $\mathbf{p} := \{ (p, k) \mid k \in \mathcal{K} \}$ denote the event that p is encrypted. Similarly,
$\mathbf{k} := \{ (p, k) \mid p \in \mathcal{P} \}$ denotes the event that k is used for encryption, and $\mathbf{c} :=
\{ (p, k) \mid E_k(p) = c \}$ denotes the event that c is the ciphertext. Note that $\mathbf{p}$ and
$\mathbf{k}$ are independent.
By Kerckhoffs’ principle, $P(\mathbf{p})$ and $P(\mathbf{k})$ are known to the attacker. Then the
cryptosystem is perfectly secure if $\mathbf{p}$ and $\mathbf{c}$ are independent (Definition 1.7.4) for
any p and c, or equivalently (Eq. 1.28)

\[ P(\mathbf{p} \cap \mathbf{c}) = P(\mathbf{p}) P(\mathbf{c}), \quad \text{i.e.,} \quad P(\mathbf{p} \mid \mathbf{c}) = P(\mathbf{p}). \]


Example 2.2.17 (An Example of Cipher that Is Not Perfectly Secure) Let

\[ \mathcal{P} = \{ 0, 1 \}, \qquad \mathcal{K} = \{ x, y \}, \qquad \mathcal{C} = \{ \alpha, \beta \}. \]

Define the encryption functions as follows:

\[ E_x(0) = E_y(1) = \alpha, \qquad E_x(1) = E_y(0) = \beta. \]

Suppose

\[ P(0) = \frac{1}{3}, \quad P(1) = \frac{2}{3}, \quad P(x) = \frac{1}{5}, \quad P(y) = \frac{4}{5}. \]

Then

\[ P(\alpha) = P(x \cap 0) + P(y \cap 1) = P(x)P(0) + P(y)P(1) = \frac{3}{5}, \]

and

\[ P(0 \mid \alpha) = \frac{P(0) P(\alpha \mid 0)}{P(\alpha)} = \frac{P(0) P(x)}{P(\alpha)} = \frac{1}{9}. \]

We have

\[ P(1 \mid \alpha) = 1 - P(0 \mid \alpha) = \frac{8}{9}, \]

and

\[ P(\beta) = 1 - P(\alpha) = \frac{2}{5}. \]

Similarly, we get

\[ P(0 \mid \beta) = \frac{2}{3}, \qquad P(1 \mid \beta) = \frac{1}{3}. \]

Thus $P(p \mid c) \neq P(p)$ for all $p \in \mathcal{P}$, $c \in \mathcal{C}$, and the cipher is not perfectly secure.
In particular, if the attacker knows the ciphertext is $\alpha$, they can conclude that it is
more likely that the plaintext is 1 rather than 0, and if the ciphertext is $\beta$, they can
conclude that it is more likely for the plaintext to be 0.
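
The probabilities in Example 2.2.17 can be verified with exact rational arithmetic. In this sketch the ciphertexts are named `"alpha"` and `"beta"`, and the key distribution is passed as a parameter so that, for comparison, a uniform key distribution can be tried as well; the function names are our own.

```python
from fractions import Fraction as F

P_PLAIN = {0: F(1, 3), 1: F(2, 3)}
# Encryption table: (key, plaintext) -> ciphertext
ENC = {("x", 0): "alpha", ("y", 1): "alpha", ("x", 1): "beta", ("y", 0): "beta"}

def p_cipher(c, p_key):
    """P(c): total probability of all (key, plaintext) pairs that encrypt to c."""
    return sum(p_key[k] * P_PLAIN[p] for (k, p), out in ENC.items() if out == c)

def posterior(p, c, p_key):
    """P(p | c) = P(p and c) / P(c)."""
    joint = sum(p_key[k] * P_PLAIN[q] for (k, q), out in ENC.items()
                if out == c and q == p)
    return joint / p_cipher(c, p_key)

biased = {"x": F(1, 5), "y": F(4, 5)}
uniform = {"x": F(1, 2), "y": F(1, 2)}
print(posterior(0, "alpha", biased), posterior(0, "alpha", uniform))
```

With the uniform key distribution the posterior probabilities equal the prior ones, which is the situation required for perfect secrecy.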
We recall uniform probability measures from Definition 1.7.3.
Theorem 2.2.1 One-time pad is perfectly secure if and only if the probability
measure on the key space is uniform.
Proof Fix a positive integer n, and let $\mathcal{P} = \mathcal{C} = \mathcal{K} = \mathbb{F}_2^n$. For any $p \in \mathcal{P}$ and $c \in \mathcal{C}$,
if c is the ciphertext corresponding to p, then we know the key used is $k_{p,c} := p \oplus c$.
Thus

\[ P(c \mid p) = P(k_{p,c}). \]

($\Longrightarrow$) Fix $c \in \mathcal{C}$. For any p, we have

\[ P(k_{p,c}) = P(c \mid p) = \frac{P(p \cap c)}{P(p)} = \frac{P(p) P(c)}{P(p)} = P(c), \]

which shows that the probability of $k_{p,c}$ is not dependent on p and the probabilities
of all $k_{p,c}$ are the same for this fixed c. When p takes all possible values in $\mathcal{P}$, we
have all possible values of $k_{p,c} \in \mathcal{K}$. Thus we can conclude that $P(k)$ is the same
for all $k \in \mathcal{K}$.
($\Longleftarrow$) Since $\{ q \mid q \in \mathcal{P} \}$ is a finite partition of $\Omega$, by Theorem 1.7.2, for any $c \in \mathcal{C}$
and any $p \in \mathcal{P}$,

\[ P(p \mid c) = \frac{P(c \mid p) P(p)}{\sum_{q \in \mathcal{P}} P(c \mid q) P(q)} = \frac{P(k_{p,c}) P(p)}{\sum_{q \in \mathcal{P}} P(k_{q,c}) P(q)}. \]

Since the probability measure on the key space is uniform,

\[ P(k) = \frac{1}{|\mathcal{K}|}, \quad \forall k \in \mathcal{K}. \]

Also, $\sum_{q \in \mathcal{P}} P(q) = 1$. We have

\[ P(p \mid c) = \frac{P(k_{p,c}) P(p)}{\sum_{q \in \mathcal{P}} P(k_{q,c}) P(q)} = \frac{P(p)}{\sum_{q \in \mathcal{P}} P(q)} = P(p). \]



We note that a brute-force search of the key does not work for the one-time pad: by trying all keys,
the attacker can obtain any plaintext of the same length as the original plaintext.
However, key management is the bottleneck of the one-time pad. With a plaintext
of length n, we will also need a key of length n. Furthermore, as we have mentioned
earlier, each key can only be used once. Thus it is necessary to share a key of the
same length as the message each time before the communication. This makes it
impractical to use the one-time pad.

2.3 Encryption Modes

We have seen a few examples of classical block ciphers. For messages that are
longer than the block length, the way we encrypted them (e.g., see Examples 2.2.8
and 2.2.5) can be described by Fig. 2.2. Similarly, the decryption method we have
applied (e.g., see Examples 2.2.8 and 2.2.6) corresponds to Fig. 2.3.
In general, when we use a symmetric block cipher of block length n to encrypt a
long message, we first divide this long message into blocks of plaintexts of length
n. Then we apply certain encryption mode to encrypt the plaintext blocks. If the last
block has a length of less than n, padding might be required. Different methods exist
for padding, e.g., using a constant or using a random number.
The simplest encryption mode is the mode we have been using so far, which is
called electronic codebook (ECB) mode. ECB mode is easy to use, but the main
drawback is that the encryption of identical plaintext blocks produces identical
ciphertext blocks. For an extreme case, if the plaintext is either all 0s or all 1s, it
would be easy for the attacker to deduce the message given a collection of plaintext
and ciphertext pairs. Due to this property, it is also easy to recognize patterns of
the plaintext in the ciphertext, which makes statistical attacks easier (e.g., frequency
analysis of the affine cipher described in Example 2.2.10). For example, Fig. 2.4b
gives an example for encryption using ECB mode. Compared to the original image
in Fig. 2.4a, we can see a clear pattern of the plaintext from the ciphertext.

Fig. 2.2 ECB mode for encryption

Fig. 2.3 ECB mode for decryption

Fig. 2.4 Original picture and encrypted picture with ECB and CBC modes

Fig. 2.5 CBC mode for encryption

To avoid such problems, we can use the cipher block chaining (CBC) mode. The
encryption and decryption are shown in Figs. 2.5 and 2.6, respectively, where IV
stands for initialization vector. An IV has the same length as the plaintext block and
is public. We can see that with CBC, the same plaintext is encrypted differently with
different IVs. Figure 2.4a encrypted with CBC mode is shown in Fig. 2.4c, where
no clear pattern can be seen.
Furthermore, if a plaintext block is changed, the corresponding ciphertext block
will also be changed, affecting all the subsequent ciphertext blocks. Hence CBC
mode can also be useful for authentication.
However, with CBC mode, the receiver needs to wait for the previous ciphertext
block to arrive to decrypt the next ciphertext block. In real-time applications, output
feedback (OFB) mode can be used to make communication more efficient. As
shown in Figs. 2.7 and 2.8, the encryption function is not used for encrypting the
plaintext blocks, rather it is used for generating a key sequence. Ciphertext blocks
are computed by XORing the plaintext blocks and the key sequence. Such a design
allows the receiver and the sender to generate the key sequence simultaneously
before the ciphertext is sent.

Fig. 2.6 CBC mode for decryption

Fig. 2.7 OFB mode for encryption

Fig. 2.8 OFB mode for decryption

In a way, OFB mode can be considered as a synchronous stream cipher (see
Sect. 2.1.2). Another advantage of OFB mode is that padding is not needed.
However, the encryption of a plaintext block does not depend on the previous blocks,
which makes it easier for the attacker to modify the ciphertext blocks.
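
To make the difference between the three modes concrete, the sketch below runs them over a toy cipher with 8-bit blocks. The toy cipher (an affine map on the integers modulo 256) is purely hypothetical and offers no security. The mode functions follow the standard definitions of ECB, CBC, and OFB, which we assume are the ones depicted in Figs. 2.2 to 2.8.

```python
def make_toy_cipher(a: int, b: int):
    """Toy 8-bit 'block cipher' x -> a*x + b mod 256 (a must be odd). Not secure."""
    a_inv = pow(a, -1, 256)
    return (lambda x: (a * x + b) % 256), (lambda y: a_inv * (y - b) % 256)

def ecb_encrypt(enc, blocks):
    return [enc(p) for p in blocks]           # each block encrypted independently

def cbc_encrypt(enc, blocks, iv):
    out, prev = [], iv
    for p in blocks:
        prev = enc(p ^ prev)                  # chain with the previous ciphertext block
        out.append(prev)
    return out

def cbc_decrypt(dec, blocks, iv):
    out, prev = [], iv
    for c in blocks:
        out.append(dec(c) ^ prev)
        prev = c
    return out

def ofb_apply(enc, blocks, iv):
    """OFB encryption and decryption are the same: XOR with the key sequence."""
    out, state = [], iv
    for x in blocks:
        state = enc(state)                    # key sequence does not depend on the data
        out.append(x ^ state)
    return out

enc, dec = make_toy_cipher(a=77, b=19)
plaintext = [65, 65, 65, 65]                  # four identical plaintext blocks
ecb = ecb_encrypt(enc, plaintext)
cbc = cbc_encrypt(enc, plaintext, iv=1)
ofb = ofb_apply(enc, plaintext, iv=1)
print(ecb, cbc, ofb)
```

With identical plaintext blocks, the ECB output repeats the same ciphertext block, while the CBC and OFB outputs do not; note that OFB never calls the decryption function of the block cipher.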

2.4 Further Reading

We refer the readers to [Sti05, Chapter 1] for more discussions on classical ciphers
and to [MVOV18] for a detailed presentation on different cryptographic primitives.
As for encryption modes and padding schemes, we refer the readers to [PP09,
Chapter 5].
In Sect. 2.2.7 we introduced a classical stream cipher—one-time pad. The area of
stream ciphers, albeit less discussed in the cryptography books than its block cipher
counterpart, encompasses many modern algorithm designs. We do not go into detail
in this book; interested readers will find more information in [KPP+ 22].
The physical attacks we will present in Chaps. 4 and 5 are for symmetric
block ciphers, one particular public key cipher (RSA), and RSA signatures. There
is also plenty of research on physical attacks on other cryptographic primitives,
e.g., hash functions [HH11, HLMS14, KMBM17], post-quantum public key
algorithms [MWK+ 22, PSKH18, XIU+ 21, PPM17], or stream ciphers [BMV07,
BT12, KDB+ 22].
Chapter 3
Modern Cryptographic Algorithms
and Their Implementations

We have defined cryptosystem/cipher in Definition 2.1.1. When the keys for


encryption and decryption are the same, it is a symmetric cipher. Otherwise, it is
a public-key/asymmetric cipher. In general, symmetric key ciphers are faster, but
they require key exchange before communication.
In this chapter, we will detail the designs of three symmetric block ciphers—
DES (Sect. 3.1.1), AES (Sect. 3.1.2), and PRESENT (Sect. 3.1.3) as well as one
public key cipher—RSA (Sect. 3.3). We will also discuss how RSA can be used
for digital signatures (Sect. 3.4). Moreover, we will present different techniques for
implementing those algorithms (Sects. 3.2 and 3.5).

3.1 Symmetric Block Ciphers

For the construction of symmetric block ciphers, two important principles are
followed by modern cryptographers—confusion and diffusion. Shannon first intro-
duced them in his famous paper [Sha45].
Confusion obscures the relationship between the ciphertext and the key. To
achieve this, each part of the ciphertext should depend on several parts of the key.
For example, the Vigenère cipher has low confusion: each letter of the plaintext and each letter of the key
influence exactly one letter of the ciphertext. Consequently, we can use the Kasiski
test (Sect. 2.2.6.2) or index of coincidence (Sect. 2.2.6.3) to attack the Vigenère
cipher. Diffusion obscures the statistical relationship between the plaintext and the
ciphertext. Each change in the plaintext is spread over the ciphertext, with the
redundancies being dissipated. For example, monoalphabetic ciphers (Sect. 2.2.4)
have very low diffusion: the distributions of letters in the plaintext correspond directly
to those in the ciphertext. That is also why frequency analysis (Sect. 2.2.6.1) can be
applied to break those ciphers.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 131
X. Hou, J. Breier, Cryptography and Embedded Systems Security,
https://doi.org/10.1007/978-3-031-62205-2_3

As mentioned in Sect. 2.1.2, for modern symmetric block ciphers, $\mathcal{P} = \mathcal{C} = \mathbb{F}_2^n$
for a positive integer n, which is called the block length of the cipher. Furthermore,
the key space is also a vector space over $\mathbb{F}_2$ and its dimension is called the key length
of the cipher. Each key $k \in \mathcal{K}$ is called a master key.
A symmetric block cipher design specifies a round function and a key schedule.
Encryption of a plaintext block consists of a few rounds of round functions, possibly
with minor differences. Each round function takes the cipher’s current state as an
input and outputs the next state. The key schedule takes the master key k and outputs
the keys for each round, which are called round keys. In most cases, the key schedule
is an invertible function. In particular, given one or more round keys, the master key
can be calculated.
By Kerckhoffs' principle, round functions and key schedule specifications are
public, but the master key (and hence also the round keys) is secret. In physical attacks
that we will discuss in the later parts of the book, the attacker normally aims to
recover some round key(s) and then use the inverse key schedule to find the master
key.
To be more specific, suppose we have a symmetric block cipher with round
function F and Nr rounds in total. Let $K_i$ denote the round key for round
i and $S_i$ denote the cipher state at the end of round i. For a plaintext $p \in \mathbb{F}_2^n$, the
corresponding ciphertext $c \in \mathbb{F}_2^n$ can be computed as follows:$^1$

\[
\begin{aligned}
S_0 &= p,\\
S_1 &= F(S_0, K_1),\\
S_2 &= F(S_1, K_2),\\
&\;\;\vdots\\
S_{Nr} &= F(S_{Nr-1}, K_{Nr}),\\
c &= S_{Nr}.
\end{aligned}
\]

To perform decryption, we require that for any given round key $K_i$, $F(\cdot, K_i)$ has an
inverse, i.e.,

\[
F^{-1}(F(x, K_i), K_i) = x, \quad \forall x \in \mathbb{F}_2^n.
\]

In this case, given ciphertext c, plaintext p can be computed as follows:

\[
\begin{aligned}
S_{Nr} &= c,\\
S_{Nr-1} &= F^{-1}(S_{Nr}, K_{Nr}),\\
&\;\;\vdots\\
S_1 &= F^{-1}(S_2, K_2),\\
S_0 &= F^{-1}(S_1, K_1),\\
p &= S_0.
\end{aligned}
\]

$^1$ The round function for the last round might be a bit different, as for the case of AES (see
Sect. 3.1.2).

We recall for a vector space over .F2 , vector addition is given by bitwise XOR,
denoted .⊕ (Definition 1.3.6). XOR with the round key is a common operation in
round functions of symmetric block ciphers.
Another common function is a substitution function called Sbox, denoted SB,

\[
\mathrm{SB}: \mathbb{F}_2^{\omega_1} \to \mathbb{F}_2^{\omega_2}.
\]

Normally $\omega_1$ and/or $\omega_2$ is a divisor of the block length n and a few Sboxes are
applied in one round function. When $\omega_1 = \omega_2$, SB is a permutation on $\mathbb{F}_2^{\omega_1}$ and we
say that the Sbox is an $\omega_1$-bit Sbox.
There are mainly two types of symmetric block ciphers—Feistel cipher and
Substitution–permutation network (SPN) cipher.
For a Feistel cipher, the cipher state at the beginning/end of each round is divided
into two halves of equal length. The cipher state at the end of round i is denoted as
$L_i$ and $R_i$, where L stands for left and R stands for right. The round function F is
defined as

\[
(L_i, R_i) = F(L_{i-1}, R_{i-1}, K_i), \quad \text{where } L_i = R_{i-1},\ R_i = L_{i-1} \oplus f(R_{i-1}, K_i). \tag{3.1}
\]

We note that f is a function that does not need to have an inverse since the function
F defined as in Eq. 3.1 is always invertible:

\[
L_{i-1} = R_i \oplus f(L_i, K_i), \quad R_{i-1} = L_i.
\]

Furthermore, the ciphertext is normally given by $R_{Nr} \| L_{Nr}$ (i.e., swapping the left
and right side of the cipher state at the end of the last round). In this case, if we let
$R_i$ and $L_i$ denote the right and left part of the cipher state at the end of round i in
the decryption, then the decryption computation is the same as in Eq. 3.1 except that
the round keys are used in the reverse order compared to encryption. An illustration of Feistel
cipher can be seen in Fig. 3.1.
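As a minimal sketch of Eq. 3.1, the following Python code implements one Feistel round, its inverse, and the observation that decryption can reuse the encryption routine with the round keys reversed. The function `toy_f`, the 32-bit half size, and the four round keys are hypothetical choices made only for illustration; they do not correspond to any real cipher.

```python
# Minimal Feistel sketch on 64-bit blocks with 32-bit halves.
# toy_f and the round keys are hypothetical, chosen only for illustration.
MASK32 = 0xFFFFFFFF


def toy_f(r, k):
    """A round function f(R, K) that is not designed to be invertible."""
    return ((r & k) * 0x9E3779B1 + (r >> 3)) & MASK32


def feistel_round(l, r, k, f=toy_f):
    """Eq. 3.1: L_i = R_{i-1}, R_i = L_{i-1} XOR f(R_{i-1}, K_i)."""
    return r, l ^ f(r, k)


def feistel_round_inv(l, r, k, f=toy_f):
    """Inverse round: L_{i-1} = R_i XOR f(L_i, K_i), R_{i-1} = L_i."""
    return r ^ f(l, k), l


def feistel_encrypt(block, round_keys, f=toy_f):
    l, r = block >> 32, block & MASK32
    for k in round_keys:
        l, r = feistel_round(l, r, k, f)
    return (r << 32) | l  # the output is R_Nr || L_Nr (halves swapped)


def feistel_decrypt(block, round_keys, f=toy_f):
    # Same computation as encryption, with the round keys in reverse order.
    return feistel_encrypt(block, round_keys[::-1], f)


keys = [0x0F1E2D3C, 0x4B5A6978, 0x8796A5B4, 0xC3D2E1F0]
pt = 0x0123456789ABCDEF
ct = feistel_encrypt(pt, keys)
```

Note that `feistel_round_inv` never inverts `toy_f`; it only evaluates it again, which is why f itself may be any function.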
Let $\omega$ be a divisor of n, the block length, and let $\ell = n/\omega$. The design of an SPN
cipher encryption is shown in Fig. 3.2, where SB is an $\omega$-bit Sbox. In most cases,
$\omega = 4$ or $8$.

Each round of an SPN cipher normally consists of bitwise XOR with the round
key, application of $\ell$ parallel $\omega$-bit Sboxes, and a permutation on $\mathbb{F}_2^n$. The encryption
starts with XOR with a round key and also ends with XOR with a round key before
outputting the ciphertext. Otherwise, the cipher states in the second (or the last)
round are all known to the attacker. Those two operations are called whitening. For
decryption, the inverses of the Sbox and the permutation are computed, and round keys are
XOR-ed with the cipher state in reverse order compared to that for encryption.

Fig. 3.1 An illustration of Feistel cipher encryption algorithm

Fig. 3.2 An illustration of SPN cipher encryption algorithm

Fig. 3.3 An illustration of DES encryption algorithm

3.1.1 DES

Let us first look at one Feistel cipher—Data Encryption Standard (DES). DES was
developed at IBM by a team led by Horst Feistel and the design was based on Lucifer
cipher [Sor84]. It was used as the NIST standard from 1977 to 2005. Furthermore,
it has a significant influence on the development of cipher design.
The block length of DES is $n = 64$, i.e., $\mathcal{P} = \mathcal{C} = \mathbb{F}_2^{64}$. Hence $L_i, R_i \in \mathbb{F}_2^{32}$.
The master key length is 56, i.e., $\mathcal{K} = \mathbb{F}_2^{56}$. The round key length is 48. The total
number of rounds Nr $= 16$. An illustration of DES encryption is shown in Fig. 3.3.
Each DES round function follows the structure as described in Eq. 3.1.
Before the first round function, the encryption starts with an initial permutation
(IP). The inverse of IP, called the final permutation (IP$^{-1}$), is applied to the
cipher state after the last round before outputting the ciphertext. Initial and final
permutations are included for the ease of loading plaintext/ciphertext. Initial and
final permutations are shown in Table 3.1. For example, in IP, the 1st bit of the
output is from the 58th bit of the input. The 2nd bit of the output is from the 50th
bit of the input.

Table 3.1 Initial permutation (IP) and final permutation (IP$^{-1}$) in DES algorithm

(a) IP
58 50 42 34 26 18 10 2
60 52 44 36 28 20 12 4
62 54 46 38 30 22 14 6
64 56 48 40 32 24 16 8
57 49 41 33 25 17 9 1
59 51 43 35 27 19 11 3
61 53 45 37 29 21 13 5
63 55 47 39 31 23 15 7

(b) IP$^{-1}$
40 8 48 16 56 24 64 32
39 7 47 15 55 23 63 31
38 6 46 14 54 22 62 30
37 5 45 13 53 21 61 29
36 4 44 12 52 20 60 28
35 3 43 11 51 19 59 27
34 2 42 10 50 18 58 26
33 1 41 9 49 17 57 25
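The way Table 3.1 is read can be sketched in Python as follows. The helper name `permute` and the representation of a block as a Python integer are assumptions made for this illustration; bits are numbered from 1 starting at the leftmost bit, following the DES convention used in this section.

```python
# Tables copied from Table 3.1, read row by row.
IP = [58, 50, 42, 34, 26, 18, 10, 2, 60, 52, 44, 36, 28, 20, 12, 4,
      62, 54, 46, 38, 30, 22, 14, 6, 64, 56, 48, 40, 32, 24, 16, 8,
      57, 49, 41, 33, 25, 17, 9, 1, 59, 51, 43, 35, 27, 19, 11, 3,
      61, 53, 45, 37, 29, 21, 13, 5, 63, 55, 47, 39, 31, 23, 15, 7]
IP_INV = [40, 8, 48, 16, 56, 24, 64, 32, 39, 7, 47, 15, 55, 23, 63, 31,
          38, 6, 46, 14, 54, 22, 62, 30, 37, 5, 45, 13, 53, 21, 61, 29,
          36, 4, 44, 12, 52, 20, 60, 28, 35, 3, 43, 11, 51, 19, 59, 27,
          34, 2, 42, 10, 50, 18, 58, 26, 33, 1, 41, 9, 49, 17, 57, 25]


def permute(value, table, in_width):
    """Output bit i (1-based, leftmost first) is input bit table[i - 1]."""
    out = 0
    for src in table:
        bit = (value >> (in_width - src)) & 1  # input bit number src
        out = (out << 1) | bit
    return out


block = 0x0123456789ABCDEF
after_ip = permute(block, IP, 64)
restored = permute(after_ip, IP_INV, 64)
```

The same helper also works for tables whose output is shorter or longer than the input, since the output length is simply the length of the table.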

Fig. 3.4 Function f in DES round function

Note For DES specification, we consider the 1st bit of a value as the leftmost
bit in its binary representation. For example, the 1st bit of $3 = 011_2$ is 0, the
2nd bit is 1 and the last bit is 1.

At the ith round, the function f in the round function of DES takes input $R_{i-1} \in \mathbb{F}_2^{32}$
and round key $K_i \in \mathbb{F}_2^{48}$, then outputs a 32-bit intermediate value as follows:

\[
f(R_{i-1}, K_i) = P_{\mathrm{DES}}(\mathrm{Sboxes}(E_{\mathrm{DES}}(R_{i-1}) \oplus K_i)).
\]

Firstly, $R_{i-1}$ is passed to an expansion function $E_{\mathrm{DES}}: \mathbb{F}_2^{32} \to \mathbb{F}_2^{48}$. Then the output
$E_{\mathrm{DES}}(R_{i-1})$ is XOR-ed with the round key $K_i$, producing a 48-bit intermediate
value. This 48-bit value is divided into eight 6-bit subblocks. Eight distinct Sboxes,
$\mathrm{SB}^j_{\mathrm{DES}}: \mathbb{F}_2^6 \to \mathbb{F}_2^4$ ($1 \le j \le 8$), are applied, one to each 6-bit subblock. Finally, the
resulting 32-bit intermediate value goes through a permutation function $P_{\mathrm{DES}}:
\mathbb{F}_2^{32} \to \mathbb{F}_2^{32}$. An illustration of f is shown in Fig. 3.4.

Details of the expansion function $E_{\mathrm{DES}}$ are given in Table 3.2. Sixteen bits of the input
are repeated, so each of them affects two bits of the output and thereby influences two Sboxes. Such a
design makes the dependency of the output bits on the input bits spread faster and
achieves higher diffusion.
Table 3.2 Expansion function $E_{\mathrm{DES}}: \mathbb{F}_2^{32} \to \mathbb{F}_2^{48}$ in DES round function. The 1st bit of the output
is given by the 32nd bit of the input. The 2nd bit of the output is given by the 1st bit of the input
32 1 2 3 4 5 4 5 6 7 8 9 8 9 10 11
12 13 12 13 14 15 16 17 16 17 18 19 20 21 20 21
22 23 24 25 24 25 26 27 28 29 28 29 30 31 32 1

Table 3.3 $\mathrm{SB}^1_{\mathrm{DES}}$ in DES round function


14 4 13 1 2 15 11 8 3 10 6 12 5 9 0 7
0 15 7 4 14 2 13 1 10 6 12 11 9 5 3 8
4 1 14 8 13 6 2 11 15 12 9 7 3 10 5 0
15 12 8 2 4 9 1 7 5 11 3 14 10 0 6 13

Table 3.4 Permutation function $P_{\mathrm{DES}}: \mathbb{F}_2^{32} \to \mathbb{F}_2^{32}$ in DES round function. The 1st bit of the
output is given by the 16th bit of the input. The 2nd bit of the output comes from the 7th bit of
the input
16 7 20 21 29 12 28 17 1 15 23 26 5 18 31 10
2 8 24 14 32 27 3 9 19 13 30 6 22 11 4 25

The design of the first Sbox is shown in Table 3.3, and the rest of the Sboxes
are detailed in Appendix C. To use those tables, take an input of one Sbox, say
$b_1 b_2 b_3 b_4 b_5 b_6$; the output is found in row $b_1 b_6$ and column $b_2 b_3 b_4 b_5$, where rows
and columns are counted from 0. We note
that each row of each of the Sbox tables is a permutation of integers $0, 1, \dots, 15$.
Example 3.1.1 Suppose the input of $\mathrm{SB}^1_{\mathrm{DES}}$ is

\[
b_1 b_2 b_3 b_4 b_5 b_6 = 100110.
\]

According to Table 3.3, the row number is given by $b_1 b_6 = 10_2 = 2$. The column number
is given by $b_2 b_3 b_4 b_5 = 0011_2 = 3$. Hence the output is $8 = 1000_2$. Similarly (see
Table C.1 (b)),

\[
\mathrm{SB}^3_{\mathrm{DES}}(100110) = 9 = 1001_2.
\]
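The table lookup of Example 3.1.1 can be sketched in Python as follows. The table is copied from Table 3.3; the function name `des_sbox` and the encoding of the 6-bit input as an integer with $b_1$ as the most significant bit are assumptions made for this illustration.

```python
# SB^1_DES, copied from Table 3.3 (4 rows, 16 columns).
SB1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def des_sbox(table, x):
    """Look up a 6-bit input b1..b6 (b1 is the leftmost bit) in a DES Sbox table."""
    row = (((x >> 5) & 1) << 1) | (x & 1)  # row index b1 b6
    col = (x >> 1) & 0xF                   # column index b2 b3 b4 b5
    return table[row][col]


output = des_sbox(SB1, 0b100110)  # the input of Example 3.1.1
```

The other seven Sboxes from Appendix C can be passed to the same function once their tables are written in the same row-by-row form.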

The details of the permutation function $P_{\mathrm{DES}}$ are given in Table 3.4.
The key schedule of DES takes a 64-bit master key as input and outputs round
keys of length 48. An illustration of the key schedule is in Fig. 3.5, where PC stands
for permuted choice.
Each 8th bit of the master key is a parity-check bit of the previous 7 bits, chosen so
that every byte has odd parity (i.e., it is the complement of the XORed value of those 7 bits).
PC1 reduces the 64-bit input to 56 bits by ignoring those
parity-check bits and outputs a permutation of the remaining 56 bits. Then the output
is divided into two 28-bit halves (see Table 3.5). Each half rotates left by one or two
bits, depending on the round (see Table 3.6). Finally, PC2 selects 48 bits out of 56
bits, permutes them, and outputs the round key (see Table 3.7).

Fig. 3.5 DES key schedule

Table 3.5 Left and right part of the intermediate values in DES key schedule after PC1. The 1st
bit of the left part comes from the 57th bit of the master key (input to PC1)
Left Right
57 49 41 33 25 17 9 63 55 47 39 31 23 15
1 58 50 42 34 26 18 7 62 54 46 38 30 22
10 2 59 51 43 35 27 14 6 61 53 45 37 29
19 11 3 60 52 44 36 21 13 5 28 20 12 4

Table 3.6 Number of key bits rotated per round in DES key schedule
Round 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Rotation 1 1 2 2 2 2 2 2 1 2 2 2 2 2 2 1

Table 3.7 PC2 in DES key schedule


14 17 11 24 1 5 3 28 15 6 21 10 23 19 12 4
26 8 16 7 27 20 13 2 41 52 31 37 47 55 30 40
51 45 33 48 44 49 39 56 34 53 46 42 50 36 29 32

For some master keys, the key schedule outputs the same round key for all 16
rounds. Those master keys are called weak keys. Weak keys should not be
used. It can be shown that there are in total four of them (written in hexadecimal,
including the parity-check bits):
• 01010101 01010101,
• FEFEFEFE FEFEFEFE,
• E0E0E0E0 F1F1F1F1,
• 1F1F1F1F 0E0E0E0E.
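The key schedule described above, and the behavior of the four weak keys, can be checked with a short Python sketch. The tables are copied from Tables 3.5, 3.6, and 3.7; the helper names (`select_bits`, `rotl28`, `des_round_keys`) and the integer representation of keys are assumptions made for this illustration.

```python
# PC1 split into its left and right halves (Table 3.5).
PC1_LEFT = [57, 49, 41, 33, 25, 17, 9, 1, 58, 50, 42, 34, 26, 18,
            10, 2, 59, 51, 43, 35, 27, 19, 11, 3, 60, 52, 44, 36]
PC1_RIGHT = [63, 55, 47, 39, 31, 23, 15, 7, 62, 54, 46, 38, 30, 22,
             14, 6, 61, 53, 45, 37, 29, 21, 13, 5, 28, 20, 12, 4]
# PC2 (Table 3.7) and the per-round rotation amounts (Table 3.6).
PC2 = [14, 17, 11, 24, 1, 5, 3, 28, 15, 6, 21, 10, 23, 19, 12, 4,
       26, 8, 16, 7, 27, 20, 13, 2, 41, 52, 31, 37, 47, 55, 30, 40,
       51, 45, 33, 48, 44, 49, 39, 56, 34, 53, 46, 42, 50, 36, 29, 32]
ROTATIONS = [1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1]


def select_bits(value, table, in_width):
    """Output bit i (1-based, leftmost first) is input bit table[i - 1]."""
    out = 0
    for src in table:
        out = (out << 1) | ((value >> (in_width - src)) & 1)
    return out


def rotl28(x, n):
    """Rotate a 28-bit value left by n bits."""
    return ((x << n) | (x >> (28 - n))) & 0xFFFFFFF


def des_round_keys(master_key):
    """Return the sixteen 48-bit round keys for a 64-bit master key."""
    c = select_bits(master_key, PC1_LEFT, 64)
    d = select_bits(master_key, PC1_RIGHT, 64)
    keys = []
    for n in ROTATIONS:
        c, d = rotl28(c, n), rotl28(d, n)
        keys.append(select_bits((c << 28) | d, PC2, 56))
    return keys


WEAK_KEYS = [0x0101010101010101, 0xFEFEFEFEFEFEFEFE,
             0xE0E0E0E0F1F1F1F1, 0x1F1F1F1F0E0E0E0E]
distinct_counts = [len(set(des_round_keys(k))) for k in WEAK_KEYS]
```

For the weak keys, each 28-bit half after PC1 consists of all 0s or all 1s, so the rotations have no effect and every round key is the same.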
Remark 3.1.1 From the design of the DES key schedule, we can see that with the
knowledge of any round key, the attacker can recover 48 bits of the master key.
The remaining 8 can be found by brute force. Alternatively, with the knowledge of
another round key, the master key can be recovered.

3.1.2 AES

In 1997, NIST published a call for cryptographic algorithms as a replacement for
DES. In October 2000, Rijndael was selected as the winner and certain versions of
Rijndael were set as the Advanced Encryption Standard (AES). Rijndael was invented
by Belgian cryptographers Joan Daemen and Vincent Rijmen and optimized for
software efficiency on 8- and 32-bit processors.
For AES, block length $n = 128$, number of rounds Nr $= 10, 12, 14$ with
corresponding key lengths $128, 192, 256$. The corresponding algorithms are hence
named AES-128, AES-192, and AES-256 respectively. The original design of
Rijndael also allows for other key lengths and block lengths, as shown in Table 3.8,
where blue-colored values are specifications adopted by AES.
The encryption algorithm starts with an initial AddRoundKey operation. Then
the round function for the first Nr$-1$ rounds consists of four operations: SubBytes,
ShiftRows, MixColumns, and AddRoundKey. Finally, the last round (round Nr)
consists of SubBytes, ShiftRows, and AddRoundKey. AddRoundKey is bitwise XOR
with the round key and SubBytes is the application of 8-bit Sboxes. ShiftRows
permutes the bytes and MixColumns is a function on 32-bit values (four bytes).
Figure 3.6 illustrates the AES round function.
The inverses of SubBytes, ShiftRows, and MixColumns are denoted as
InvSubBytes, InvShiftRows, and InvMixColumns respectively. The first round
of AES decryption computes AddRoundKey, InvShiftRows, and InvSubBytes.
Then the round function for the next Nr$-1$ rounds consists of AddRoundKey,
InvMixColumns, InvShiftRows, and InvSubBytes. Finally, there is an additional
AddRoundKey operation. The round keys for decryption are in reverse order compared to
those for encryption.

Table 3.8 Specifications of Rijndael design (number of rounds for each key length and block
length). The values adopted by AES, colored blue in the original, are those for block length 128
with key lengths 128, 192, and 256, i.e., 10, 12, and 14
                 Block length
Key length   128  160  192  224  256
128           10   11   12   13   14
160           11   11   12   13   14
192           12   12   12   13   14
224           13   13   13   13   14
256           14   14   14   14   14

Fig. 3.6 AES round function for round i, $1 \le i \le$ Nr$-1$. SB, SR, MC, and AK stand for SubBytes,
ShiftRows, MixColumns, and AddRoundKey respectively
To give more details on the AES round function, we represent the AES cipher
state as a four-by-four matrix of bytes:

\[
\begin{pmatrix}
s_{00} & s_{01} & s_{02} & s_{03}\\
s_{10} & s_{11} & s_{12} & s_{13}\\
s_{20} & s_{21} & s_{22} & s_{23}\\
s_{30} & s_{31} & s_{32} & s_{33}
\end{pmatrix}. \tag{3.2}
\]

Recall that one byte is a vector in $\mathbb{F}_2^8$ and can be represented as a hexadecimal
number between 00 and FF (see Definition 1.3.7 and Remark 1.3.3). As discussed
in Sect. 1.5.1, a byte can also be identified as an element in $\mathbb{F}_2[x]/(f(x))$, where
$f(x) = x^8 + x^4 + x^3 + x + 1 \in \mathbb{F}_2[x]$ is an irreducible polynomial over $\mathbb{F}_2$.
Remark 3.1.2 We refer to $\begin{pmatrix} s_{i0} & s_{i1} & s_{i2} & s_{i3} \end{pmatrix}$ as the $(i+1)$th row of the cipher state,
and

\[
\begin{pmatrix} s_{0j}\\ s_{1j}\\ s_{2j}\\ s_{3j} \end{pmatrix}
\]

as the $(j+1)$th column of the cipher state.


The 8-bit Sbox in AES can be described using Table 3.9, where the row is given by the
upper nibble of the input and the column by the lower nibble. For example,
$\mathrm{SB}_{\mathrm{AES}}(12) = \mathrm{C9}$. Different from the eight Sboxes in DES, AES Sbox can also
be defined algebraically. Let

Table 3.9 AES Sbox


0 1 2 3 4 5 6 7 8 9 A B C D E F
0 63 7C 77 7B F2 6B 6F C5 30 01 67 2B FE D7 AB 76
1 CA 82 C9 7D FA 59 47 F0 AD D4 A2 AF 9C A4 72 C0
2 B7 FD 93 26 36 3F F7 CC 34 A5 E5 F1 71 D8 31 15
3 04 C7 23 C3 18 96 05 9A 07 12 80 E2 EB 27 B2 75
4 09 83 2C 1A 1B 6E 5A A0 52 3B D6 B3 29 E3 2F 84
5 53 D1 00 ED 20 FC B1 5B 6A CB BE 39 4A 4C 58 CF
6 D0 EF AA FB 43 4D 33 85 45 F9 02 7F 50 3C 9F A8
7 51 A3 40 8F 92 9D 38 F5 BC B6 DA 21 10 FF F3 D2
8 CD 0C 13 EC 5F 97 44 17 C4 A7 7E 3D 64 5D 19 73
9 60 81 4F DC 22 2A 90 88 46 EE B8 14 DE 5E 0B DB
A E0 32 3A 0A 49 06 24 5C C2 D3 AC 62 91 95 E4 79
B E7 C8 37 6D 8D D5 4E A9 6C 56 F4 EA 65 7A AE 08
C BA 78 25 2E 1C A6 B4 C6 E8 DD 74 1F 4B BD 8B 8A
D 70 3E B5 66 48 03 F6 0E 61 35 57 B9 86 C1 1D 9E
E E1 F8 98 11 69 D9 8E 94 9B 1E 87 E9 CE 55 28 DF
F 8C A1 89 0D BF E6 42 68 41 99 2D 0F B0 54 BB 16

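\[
A = \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0\\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0\\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0\\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1\\
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1\\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1\\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1\\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1
\end{pmatrix}, \quad
a = \begin{pmatrix} 0\\ 1\\ 1\\ 0\\ 0\\ 0\\ 1\\ 1 \end{pmatrix},
\]

then

\[
\mathrm{SB}_{\mathrm{AES}}(z) =
\begin{cases}
A z^{-1} + a & z \neq 0\\
a & z = 0
\end{cases} \tag{3.3}
\]

where $z^{-1}$ is the inverse of z as an element in $\mathbb{F}_2[x]/(f(x))$ (see Sect. 1.5.1).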
Example 3.1.2 $\mathrm{SB}_{\mathrm{AES}}(00) = a = 01100011_2 = 63$.
Example 3.1.3 Suppose the input of AES Sbox is $03 = 00000011_2$, which
corresponds to $x + 1 \in \mathbb{F}_2[x]/(f(x))$. We have shown in Example 1.5.21 that
$03^{-1} = 11110110_2$. Then

\[
A \begin{pmatrix} 1\\ 1\\ 1\\ 1\\ 0\\ 1\\ 1\\ 0 \end{pmatrix} + a
= \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0\\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0\\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0\\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1\\
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1\\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1\\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1\\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} 1\\ 1\\ 1\\ 1\\ 0\\ 1\\ 1\\ 0 \end{pmatrix}
+ \begin{pmatrix} 0\\ 1\\ 1\\ 0\\ 0\\ 0\\ 1\\ 1 \end{pmatrix}
= \begin{pmatrix} 0\\ 0\\ 0\\ 1\\ 1\\ 0\\ 0\\ 0 \end{pmatrix}
+ \begin{pmatrix} 0\\ 1\\ 1\\ 0\\ 0\\ 0\\ 1\\ 1 \end{pmatrix}
= \begin{pmatrix} 0\\ 1\\ 1\\ 1\\ 1\\ 0\\ 1\\ 1 \end{pmatrix}.
\]

So $\mathrm{SB}_{\mathrm{AES}}(03) = 01111011_2 = \mathrm{7B}$, which agrees with Table 3.9.
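Equation 3.3 and the two examples above can be reproduced with a short Python sketch. The rows of $A$ are encoded as bytes with the leftmost matrix entry as the most significant bit, matching the way the vectors are written in Example 3.1.3. The function names and the choice of computing the inverse as $z^{254}$ (instead of the extended Euclidean algorithm) are assumptions made for this illustration.

```python
def gf_mul(a, b):
    """Multiply two bytes in F_2[x]/(x^8 + x^4 + x^3 + x + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B  # reduce modulo f(x)
        b >>= 1
    return result


def gf_inv(z):
    """Return z^254, which equals z^(-1) for z != 0 and 0 for z = 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, z)
    return result


# Rows of the matrix A in Eq. 3.3, top to bottom, and the constant a.
A_ROWS = [0xF8, 0x7C, 0x3E, 0x1F, 0x8F, 0xC7, 0xE3, 0xF1]
A_CONST = 0x63


def affine(rows, const, z):
    """Compute (matrix * z) + const over F_2, most significant bit first."""
    out = 0
    for row in rows:
        parity = bin(row & z).count("1") & 1  # inner product over F_2
        out = (out << 1) | parity
    return out ^ const


def aes_sbox(z):
    """SB_AES(z) = A z^(-1) + a, with the convention that 0 maps to a."""
    return affine(A_ROWS, A_CONST, gf_inv(z))
```

Because `gf_inv(0)` returns 0 and $A \cdot 0 = 0$, the case $z = 0$ of Eq. 3.3 needs no special handling in this sketch.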


For decryption, we need to compute the inverse of SubBytes, InvSubBytes. Let
g denote the function

\[
g(z) = Az + a.
\]

Then by Eq. 3.3, InvSubBytes computes

\[
\mathrm{SB}^{-1}_{\mathrm{AES}}(z) =
\begin{cases}
(g^{-1}(z))^{-1} & g^{-1}(z) \neq 0\\
0 & g^{-1}(z) = 0
\end{cases},
\]

where $g^{-1}(z)$ is given by (see [DR02])

\[
g^{-1}(z) = \begin{pmatrix}
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0\\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 1\\
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0\\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0\\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1\\
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0\\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1\\
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0
\end{pmatrix} z
+ \begin{pmatrix} 0\\ 0\\ 0\\ 0\\ 0\\ 1\\ 0\\ 1 \end{pmatrix}.
\]

InvSubBytes can also be described using a table, as detailed in Table 3.10.


Example 3.1.4 Let $z = 63 = 01100011_2$. Then

Table 3.10 Inverse of AES Sbox


0 1 2 3 4 5 6 7 8 9 A B C D E F
0 52 09 6A D5 30 36 A5 38 BF 40 A3 9E 81 F3 D7 FB
1 7C E3 39 82 9B 2F FF 87 34 8E 43 44 C4 DE E9 CB
2 54 7B 94 32 A6 C2 23 3D EE 4C 95 0B 42 FA C3 4E
3 08 2E A1 66 28 D9 24 B2 76 5B A2 49 6D 8B D1 25
4 72 F8 F6 64 86 68 98 16 D4 A4 5C CC 5D 65 B6 92
5 6C 70 48 50 FD ED B9 DA 5E 15 46 57 A7 8D 9D 84
6 90 D8 AB 00 8C BC D3 0A F7 E4 58 05 B8 B3 45 06
7 D0 2C 1E 8F CA 3F 0F 02 C1 AF BD 03 01 13 8A 6B
8 3A 91 11 41 4F 67 DC EA 97 F2 CF CE F0 B4 E6 73
9 96 AC 74 22 E7 AD 35 85 E2 F9 37 E8 1C 75 DF 6E
A 47 F1 1A 71 1D 29 C5 89 6F B7 62 0E AA 18 BE 1B
B FC 56 3E 4B C6 D2 79 20 9A DB C0 FE 78 CD 5A F4
C 1F DD A8 33 88 07 C7 31 B1 12 10 59 27 80 EC 5F
D 60 51 7F A9 19 B5 4A 0D 2D E5 7A 9F 93 C9 9C EF
E A0 E0 3B 4D AE 2A F5 B0 C8 EB BB 3C 83 53 99 61
F 17 2B 04 7E BA 77 D6 26 E1 69 14 63 55 21 0C 7D

\[
g^{-1}(z) = \begin{pmatrix}
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0\\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 1\\
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0\\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0\\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1\\
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0\\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1\\
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0
\end{pmatrix}
\begin{pmatrix} 0\\ 1\\ 1\\ 0\\ 0\\ 0\\ 1\\ 1 \end{pmatrix}
+ \begin{pmatrix} 0\\ 0\\ 0\\ 0\\ 0\\ 1\\ 0\\ 1 \end{pmatrix}
= \begin{pmatrix} 0\\ 0\\ 0\\ 0\\ 0\\ 1\\ 0\\ 1 \end{pmatrix}
+ \begin{pmatrix} 0\\ 0\\ 0\\ 0\\ 0\\ 1\\ 0\\ 1 \end{pmatrix}
= \begin{pmatrix} 0\\ 0\\ 0\\ 0\\ 0\\ 0\\ 0\\ 0 \end{pmatrix},
\]

which is equal to 00. And we have $\mathrm{SB}^{-1}_{\mathrm{AES}}(63) = 00$.
Example 3.1.5 Let $z = \mathrm{8C} = 10001100_2$. Then

\[
g^{-1}(z) = \begin{pmatrix}
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0\\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 1\\
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0\\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0\\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1\\
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0\\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1\\
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0
\end{pmatrix}
\begin{pmatrix} 1\\ 0\\ 0\\ 0\\ 1\\ 1\\ 0\\ 0 \end{pmatrix}
+ \begin{pmatrix} 0\\ 0\\ 0\\ 0\\ 0\\ 1\\ 0\\ 1 \end{pmatrix}
= \begin{pmatrix} 0\\ 1\\ 0\\ 1\\ 1\\ 1\\ 1\\ 0 \end{pmatrix}
+ \begin{pmatrix} 0\\ 0\\ 0\\ 0\\ 0\\ 1\\ 0\\ 1 \end{pmatrix}
= \begin{pmatrix} 0\\ 1\\ 0\\ 1\\ 1\\ 0\\ 1\\ 1 \end{pmatrix},
\]

which corresponds to

\[
x^6 + x^4 + x^3 + x + 1 \in \mathbb{F}_2[x]/(f(x)).
\]

By the Euclidean algorithm

\[
\begin{aligned}
f(x) &= (x^2 + 1)(x^6 + x^4 + x^3 + x + 1) + (x^5 + x^3 + x^2),\\
x^6 + x^4 + x^3 + x + 1 &= x(x^5 + x^3 + x^2) + (x + 1),\\
x^5 + x^3 + x^2 &= (x^4 + x^3 + x + 1)(x + 1) + 1.
\end{aligned}
\]

Then by the extended Euclidean algorithm

\[
\begin{aligned}
1 &= (x^5 + x^3 + x^2) + (x^4 + x^3 + x + 1)(x + 1)\\
&= (x^5 + x^3 + x^2) + (x^4 + x^3 + x + 1)\big((x^6 + x^4 + x^3 + x + 1) + x(x^5 + x^3 + x^2)\big)\\
&= (x^4 + x^3 + x + 1)(x^6 + x^4 + x^3 + x + 1) + (x^5 + x^4 + x^2 + x + 1)(x^5 + x^3 + x^2)\\
&= (x^4 + x^3 + x + 1)(x^6 + x^4 + x^3 + x + 1)\\
&\quad + (x^5 + x^4 + x^2 + x + 1)\big(f(x) + (x^2 + 1)(x^6 + x^4 + x^3 + x + 1)\big)\\
&= (x^5 + x^4 + x^2 + x + 1) f(x) + (x^7 + x^6 + x^5 + x^4)(x^6 + x^4 + x^3 + x + 1).
\end{aligned}
\]

And we have

\[
(x^6 + x^4 + x^3 + x + 1)^{-1} \bmod f(x) = x^7 + x^6 + x^5 + x^4 = 11110000_2 = \mathrm{F0},
\]

which gives $\mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{8C}) = \mathrm{F0}$.
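Examples 3.1.4 and 3.1.5 can be checked with the following Python sketch of InvSubBytes. The rows of the matrix in $g^{-1}$ are encoded as bytes with the leftmost entry as the most significant bit; the function names and the computation of field inverses as $z^{254}$ are assumptions made for this illustration.

```python
def gf_mul(a, b):
    """Multiply two bytes in F_2[x]/(x^8 + x^4 + x^3 + x + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B  # reduce modulo f(x)
        b >>= 1
    return result


def gf_inv(z):
    """Return z^254, which equals z^(-1) for z != 0 and 0 for z = 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, z)
    return result


# Rows of the matrix in g^(-1), top to bottom, and its constant vector.
G_INV_ROWS = [0x52, 0x29, 0x94, 0x4A, 0x25, 0x92, 0x49, 0xA4]
G_INV_CONST = 0x05


def g_inverse(z):
    """Apply the inverse affine map g^(-1) to a byte, MSB first."""
    out = 0
    for row in G_INV_ROWS:
        out = (out << 1) | (bin(row & z).count("1") & 1)
    return out ^ G_INV_CONST


def aes_inv_sbox(z):
    """InvSubBytes on one byte: first g^(-1), then the field inverse."""
    return gf_inv(g_inverse(z))
```

Evaluating `aes_inv_sbox` on all 256 inputs gives a way to cross-check the entries of Table 3.10.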
As the name suggests, the ShiftRows operation shifts the bytes in the rows of the
cipher state. Recall the representation of the AES cipher state from Eq. 3.2. Then
the ShiftRows operation can be described by the following transformation:

\[
\begin{pmatrix}
s_{00} & s_{01} & s_{02} & s_{03}\\
s_{10} & s_{11} & s_{12} & s_{13}\\
s_{20} & s_{21} & s_{22} & s_{23}\\
s_{30} & s_{31} & s_{32} & s_{33}
\end{pmatrix}
\mapsto
\begin{pmatrix}
s_{00} & s_{01} & s_{02} & s_{03}\\
s_{11} & s_{12} & s_{13} & s_{10}\\
s_{22} & s_{23} & s_{20} & s_{21}\\
s_{33} & s_{30} & s_{31} & s_{32}
\end{pmatrix}.
\]

The first row does not change. The second row rotates left by one byte. The third
row rotates left by two bytes. Finally, the last row rotates left by three bytes.
In another representation, let us denote the input of ShiftRows using cipher state
representation in Eq. 3.2. Let the output of ShiftRows be a matrix B with entries $b_{ij}$
($0 \le i, j \le 3$). Then

\[
\begin{pmatrix} b_{0j}\\ b_{1j}\\ b_{2j}\\ b_{3j} \end{pmatrix}
=
\begin{pmatrix} s_{0j}\\ s_{1,(j+1 \bmod 4)}\\ s_{2,(j+2 \bmod 4)}\\ s_{3,(j+3 \bmod 4)} \end{pmatrix},
\quad 0 \le j < 4. \tag{3.4}
\]

For decryption, the inverse of ShiftRows, InvShiftRows, can be easily deduced: the
$(i+1)$th row is rotated right by $i$ bytes.
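Equation 3.4 translates almost directly into code. In the following Python sketch the cipher state is a list of four rows, so that `state[i][j]` plays the role of $s_{ij}$; this representation and the function names are assumptions made for this illustration.

```python
def shift_rows(state):
    """Eq. 3.4: b_ij = s_{i,(j+i) mod 4}, i.e., row i rotates left by i bytes."""
    return [[state[i][(j + i) % 4] for j in range(4)] for i in range(4)]


def inv_shift_rows(state):
    """Inverse of shift_rows: row i rotates right by i bytes."""
    return [[state[i][(j - i) % 4] for j in range(4)] for i in range(4)]


# A state whose byte at row i, column j is the hexadecimal value "ij".
state = [[16 * i + j for j in range(4)] for i in range(4)]
shifted = shift_rows(state)
```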


The MixColumns function takes each of the four columns of the cipher state
(Eq. 3.2)

\[
\begin{pmatrix} s_{0j}\\ s_{1j}\\ s_{2j}\\ s_{3j} \end{pmatrix}, \quad j = 0, 1, 2, 3,
\]

as input. The column is considered as a polynomial over $\mathbb{F}_2[x]/(f(x))$:

\[
s_{3j}x^3 + s_{2j}x^2 + s_{1j}x + s_{0j}.
\]

MixColumns multiplies $s_{3j}x^3 + s_{2j}x^2 + s_{1j}x + s_{0j}$ with another polynomial over
$\mathbb{F}_2[x]/(f(x))$ given by

\[
g(x) = 03x^3 + 01x^2 + 01x + 02.
\]

The multiplication is computed modulo $x^4 + 1$. This design choice is based on
specific diffusion and performance goals. We will not go into the details in this
book; interested readers can refer to [DR02]. Let $d(x) = d_3x^3 + d_2x^2 + d_1x + d_0$
denote the product of $s_{3j}x^3 + s_{2j}x^2 + s_{1j}x + s_{0j}$ with $g(x)$ modulo $x^4 + 1$. We have

\[
\begin{aligned}
d(x) &= (s_{3j}x^3 + s_{2j}x^2 + s_{1j}x + s_{0j})(03x^3 + 01x^2 + 01x + 02) \bmod (x^4 + 1)\\
&= 03s_{3j}x^6 + (01s_{3j} + 03s_{2j})x^5 + (01s_{3j} + 01s_{2j} + 03s_{1j})x^4\\
&\quad + (02s_{3j} + 01s_{2j} + 01s_{1j} + 03s_{0j})x^3 + (02s_{2j} + 01s_{1j} + 01s_{0j})x^2\\
&\quad + (02s_{1j} + 01s_{0j})x + 02s_{0j} \bmod (x^4 + 1)\\
&= (02s_{3j} + 01s_{2j} + 01s_{1j} + 03s_{0j})x^3 + (03s_{3j} + 02s_{2j} + 01s_{1j} + 01s_{0j})x^2\\
&\quad + (01s_{3j} + 03s_{2j} + 02s_{1j} + 01s_{0j})x + 01s_{3j} + 01s_{2j} + 03s_{1j} + 02s_{0j}.
\end{aligned} \tag{3.5}
\]

Thus, MixColumns can be considered as multiplying the input column by a matrix:

\[
\begin{pmatrix} d_0\\ d_1\\ d_2\\ d_3 \end{pmatrix}
=
\begin{pmatrix}
02 & 03 & 01 & 01\\
01 & 02 & 03 & 01\\
01 & 01 & 02 & 03\\
03 & 01 & 01 & 02
\end{pmatrix}
\begin{pmatrix} s_{0j}\\ s_{1j}\\ s_{2j}\\ s_{3j} \end{pmatrix}. \tag{3.6}
\]

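Example 3.1.6 Suppose

\[
\begin{pmatrix} s_{0j}\\ s_{1j}\\ s_{2j}\\ s_{3j} \end{pmatrix}
=
\begin{pmatrix} \mathrm{D4}\\ \mathrm{BF}\\ \mathrm{5D}\\ 30 \end{pmatrix}.
\]

Then

\[
\begin{pmatrix}
02 & 03 & 01 & 01\\
01 & 02 & 03 & 01\\
01 & 01 & 02 & 03\\
03 & 01 & 01 & 02
\end{pmatrix}
\begin{pmatrix} \mathrm{D4}\\ \mathrm{BF}\\ \mathrm{5D}\\ 30 \end{pmatrix}
=
\begin{pmatrix} 04\\ 66\\ 81\\ \mathrm{E5} \end{pmatrix}.
\]

For example, we have calculated in Examples 1.5.17 and 1.5.20 that

\[
02 \times \mathrm{D4} = 10110011_2, \quad 03 \times \mathrm{BF} = 11011010_2.
\]

The first entry of the product is then given by

\[
10110011 \oplus 11011010 \oplus 01011101 \oplus 00110000 = 00000100 = 04.
\]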
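The matrix form of MixColumns in Eq. 3.6 and the numbers of Example 3.1.6 can be reproduced with a short Python sketch. The function names and the representation of a column as a list of four integers are assumptions made for this illustration.

```python
def gf_mul(a, b):
    """Multiply two bytes in F_2[x]/(x^8 + x^4 + x^3 + x + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B  # reduce modulo f(x)
        b >>= 1
    return result


# The matrix of Eq. 3.6.
MIX = [
    [0x02, 0x03, 0x01, 0x01],
    [0x01, 0x02, 0x03, 0x01],
    [0x01, 0x01, 0x02, 0x03],
    [0x03, 0x01, 0x01, 0x02],
]


def mix_column(col):
    """Multiply one state column [s0, s1, s2, s3] by the matrix of Eq. 3.6."""
    out = []
    for row in MIX:
        acc = 0
        for coeff, s in zip(row, col):
            acc ^= gf_mul(coeff, s)  # addition in the field is XOR
        out.append(acc)
    return out


result = mix_column([0xD4, 0xBF, 0x5D, 0x30])
```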

Remark 3.1.3 For any

\[
a = \begin{pmatrix} a_0\\ a_1\\ a_2\\ a_3 \end{pmatrix}, \quad
b = \begin{pmatrix} b_0\\ b_1\\ b_2\\ b_3 \end{pmatrix},
\]

we have

\[
\mathrm{MixColumns}(a + b) = \mathrm{MixColumns}(a) + \mathrm{MixColumns}(b),
\]

where the addition is computed modulo $f(x)$. As discussed in Remark 1.5.2, this
addition is equivalent to XOR. Consequently, we have

\[
\mathrm{MixColumns}(a \oplus b) = \mathrm{MixColumns}(a) \oplus \mathrm{MixColumns}(b).
\]


The inverse of MixColumns, InvMixColumns, is defined by multiplying each
column of the cipher state by the inverse of $g(x)$ (Eq. 3.1.2) modulo $x^4 + 1$. We
note that

\[
x^4 + 1 = (x + 1)^4
\]

as a polynomial over $\mathbb{F}_2[x]/(f(x))$. Since 1 is not a root of $g(x)$, $x + 1$ does not
divide $g(x)$, which gives

\[
\gcd(g(x), x^4 + 1) = 1.
\]

We have shown that $\mathbb{F}_2[x]/(f(x))$ is a field in Sect. 1.5.1. $g(x)^{-1} \bmod x^4 + 1$ can
be computed using the extended Euclidean algorithm, similarly to Example 1.5.10.
We have

\[
g(x)^{-1} \bmod x^4 + 1 = \mathrm{0B}x^3 + \mathrm{0D}x^2 + 09x + \mathrm{0E}.
\]

It can be shown in the same way as in Eq. 3.5 that multiplication by
$g(x)^{-1} \bmod x^4 + 1$ is equivalent to multiplication by the following matrix

\[
\begin{pmatrix}
\mathrm{0E} & \mathrm{0B} & \mathrm{0D} & 09\\
09 & \mathrm{0E} & \mathrm{0B} & \mathrm{0D}\\
\mathrm{0D} & 09 & \mathrm{0E} & \mathrm{0B}\\
\mathrm{0B} & \mathrm{0D} & 09 & \mathrm{0E}
\end{pmatrix}. \tag{3.7}
\]
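The claim that $\mathrm{0B}x^3 + \mathrm{0D}x^2 + 09x + \mathrm{0E}$ is the inverse of $g(x)$ modulo $x^4 + 1$ can be checked numerically. The following Python sketch works with the polynomial view rather than the matrix of Eq. 3.7; polynomials are stored as coefficient lists `[c0, c1, c2, c3]`, which is the same order as a state column from top to bottom. The function names are assumptions made for this illustration.

```python
def gf_mul(a, b):
    """Multiply two bytes in F_2[x]/(x^8 + x^4 + x^3 + x + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B  # reduce modulo f(x)
        b >>= 1
    return result


def poly_mul_mod(a, b):
    """Multiply a(x) * b(x) modulo x^4 + 1, using x^4 = 1."""
    d = [0, 0, 0, 0]
    for i in range(4):
        for j in range(4):
            d[(i + j) % 4] ^= gf_mul(a[i], b[j])
    return d


G = [0x02, 0x01, 0x01, 0x03]      # g(x) = 03x^3 + 01x^2 + 01x + 02
G_INV = [0x0E, 0x09, 0x0D, 0x0B]  # 0Bx^3 + 0Dx^2 + 09x + 0E


def mix_column(col):
    return poly_mul_mod(G, col)


def inv_mix_column(col):
    return poly_mul_mod(G_INV, col)


product = poly_mul_mod(G, G_INV)
```

Reducing modulo $x^4 + 1$ only requires wrapping the exponent around modulo 4, which is what the index `(i + j) % 4` does.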

We will discuss the AES key schedule for key length 128, which corresponds to
Nr $= 10$. The algorithms for other key lengths are defined similarly (see [DR02]
for more details). The key schedule algorithm is named KeyExpansion, shown in
Algorithm 3.1. The master key k is written as a four-by-four array of bytes, denoted
by $K[4][4]$ in the algorithm. KeyExpansion expands $K[4][4]$ to a $4 \times 44$ array of
bytes, denoted by $W[4][44]$. Since Nr $= 10$, in total we need 11 round keys. The ith
round key is given by the columns 4i to $4(i + 1) - 1$ of W. Note that the 0th round
key, i.e., the round key for whitening at the beginning of the encryption, is given
by the first 4 columns of W, which are equal to the master key (lines 1–3). The round
constants, denoted Rcon (line 6), form an array of ten bytes, computed as follows:

\[
\mathrm{Rcon}[1] = x^0 = 01, \quad \text{and} \quad \mathrm{Rcon}[j] = x\,\mathrm{Rcon}[j - 1] = x^{j-1}, \text{ for } j > 1.
\]

We have

\[
\mathrm{Rcon} = \{01, 02, 04, 08, 10, 20, 40, 80, \mathrm{1B}, 36\}.
\]

Algorithm 3.1: KeyExpansion (AES-128 key schedule)
Input: K[4][4] // master key written as a four-by-four array of bytes
Output: W[4][44]
1 for j = 0, j < 4, j++ do
2     for i = 0, i < 4, i++ do
3         W[i][j] = K[i][j]
4 for j = 4, j < 44, j++ do
5     if j mod 4 == 0 then
6         W[0][j] = W[0][j − 4] ⊕ SB_AES(W[1][j − 1]) ⊕ Rcon[j/4]
7         for i = 1, i < 4, i++ do
8             W[i][j] = W[i][j − 4] ⊕ SB_AES(W[(i + 1) mod 4][j − 1])
9     else
10        for i = 0, i < 4, i++ do
11            W[i][j] = W[i][j − 4] ⊕ W[i][j − 1]
12 return W

Fig. 3.7 Key schedule for AES-128

The key schedule is also depicted in Fig. 3.7, where the round keys are
represented as four-by-four grids and each box corresponds to one byte. The rotation ⪡
rotates the right-most column by one byte

\[
\begin{pmatrix} y_0\\ y_1\\ y_2\\ y_3 \end{pmatrix}
\mapsto
\begin{pmatrix} y_1\\ y_2\\ y_3\\ y_0 \end{pmatrix}.
\]

Remark 3.1.4 We note that with the knowledge of any round key for AES-128
encryption, the attacker can recover the master key using the inverse of the key
schedule.
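Algorithm 3.1 and Remark 3.1.4 can be sketched together in Python. The code below stores W as a list of 44 columns, builds the Sbox from Eq. 3.3 instead of hard-coding Table 3.9, and assumes that the 16 key bytes fill the array K column by column. These choices, the function names, and the use of a 0-indexed `RCON` list (so that Rcon[j/4] becomes `RCON[j // 4 - 1]`) are assumptions made for this illustration. The sample key and expected words used to exercise it are the AES-128 key expansion example from the AES standard.

```python
def gf_mul(a, b):
    """Multiply two bytes in F_2[x]/(x^8 + x^4 + x^3 + x + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox():
    """Build the AES Sbox from Eq. 3.3 (field inverse, then affine map)."""
    sbox = []
    for z in range(256):
        inv = 1
        for _ in range(254):       # z^254 is z^(-1) for z != 0 and 0 for z = 0
            inv = gf_mul(inv, z)
        s = inv
        for shift in range(1, 5):  # the matrix A written as byte rotations
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]


def key_expansion(master_key):
    """Algorithm 3.1: expand 16 key bytes into 44 columns of 4 bytes each."""
    w = [list(master_key[4 * j:4 * j + 4]) for j in range(4)]
    for j in range(4, 44):
        prev = w[j - 1]
        if j % 4 == 0:
            # rotate the previous column by one byte, apply the Sbox, add Rcon
            t = [SBOX[prev[(i + 1) % 4]] for i in range(4)]
            t[0] ^= RCON[j // 4 - 1]
        else:
            t = prev
        w.append([w[j - 4][i] ^ t[i] for i in range(4)])
    return w


def previous_round_key(cols, r):
    """Given the 4 columns of round key r (1 <= r <= 10), return round key r - 1."""
    prev = [None] * 4
    for c in (3, 2, 1):  # these columns only need XORs of known columns
        prev[c] = [cols[c][i] ^ cols[c - 1][i] for i in range(4)]
    t = [SBOX[prev[3][(i + 1) % 4]] for i in range(4)]
    t[0] ^= RCON[r - 1]
    prev[0] = [cols[0][i] ^ t[i] for i in range(4)]
    return prev


def recover_master_key(cols, r):
    """Invert the key schedule from round key r back to the master key."""
    while r > 0:
        cols = previous_round_key(cols, r)
        r -= 1
    return bytes(b for col in cols for b in col)


key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")
w = key_expansion(key)
recovered = recover_master_key(w[40:44], 10)
```

The inversion works column by column: three columns of the previous round key are plain XORs of known columns, and the remaining one needs a single Sbox evaluation on an already recovered column.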

3.1.3 PRESENT

PRESENT was proposed in 2007 [BKL+07] as a symmetric block cipher optimized
for hardware implementation. It has block length $n = 64$, number of rounds Nr
$= 31$, and a key length of either 80 or 128. The Sbox for PRESENT is a 4-bit Sbox.
When the key length is 80, the algorithm is called PRESENT-80.
The round function of PRESENT consists of addRoundKey, sBoxLayer, and
pLayer. After 31 rounds, addRoundKey is applied again before the ciphertext output
(see Fig. 3.8).

Note As opposed to DES specification, for PRESENT specification, we
consider the 0th bit of a value as the right-most bit in its binary representation.
For example, the 0th bit of $3 = 011_2$ is 1, the 1st bit is 1 and the 2nd bit is 0.

Fig. 3.8 An illustration of PRESENT encryption algorithm

addRoundKey takes the current 64-bit cipher state

\[
b_{63} b_{62} \dots b_0
\]

and XORs it bitwise with the round key

\[
K_i = \kappa^i_{63} \dots \kappa^i_0, \quad (1 \le i \le 32),
\]

that is,

\[
b_j = b_j \oplus \kappa^i_j, \quad 0 \le j \le 63.
\]

sBoxLayer applies sixteen 4-bit Sboxes, one to each nibble of the current cipher state.
The 4-bit Sbox is given by Table 3.11. For example, if the input is 0, the output is C.
pLayer permutes the 64 bits of the cipher state using the following formula:

\[
\mathrm{pLayer}(j) = \left\lfloor \frac{j}{4} \right\rfloor + (j \bmod 4) \times 16,
\]

where j denotes the bit position. For example, the 0th bit of the input stays as the
0th bit of the output, and the 1st bit of the input goes to the 16th bit of the output. It
can also be described using Table 3.12.
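The three operations of the PRESENT round function can be sketched bit by bit in Python. The cipher state is held in a Python integer whose bit 0 is the right-most bit, following the Note above; the function names are assumptions made for this illustration, and the code favors clarity over speed.

```python
# PRESENT Sbox (Table 3.11).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def add_round_key(state, round_key):
    """Bitwise XOR of the 64-bit state with a 64-bit round key."""
    return state ^ round_key


def sbox_layer(state):
    """Apply the 4-bit Sbox to each of the sixteen nibbles."""
    out = 0
    for i in range(16):
        nibble = (state >> (4 * i)) & 0xF
        out |= SBOX[nibble] << (4 * i)
    return out


def p_layer_position(j):
    """New position of input bit j: floor(j / 4) + (j mod 4) * 16."""
    return j // 4 + (j % 4) * 16


def p_layer(state):
    """Move every bit j of the state to position p_layer_position(j)."""
    out = 0
    for j in range(64):
        out |= ((state >> j) & 1) << p_layer_position(j)
    return out


after_sbox = sbox_layer(0)        # every nibble becomes C
after_p = p_layer(after_sbox)
```

Since C is $1100_2$, bits 2 and 3 of every nibble are set after sBoxLayer, and pLayer gathers them in the upper 32 bits of the state.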
Figure 3.9 shows two rounds of PRESENT.
Here we detail the key schedule for PRESENT-80. We refer the readers to
[BKL+07] for the key schedule for the 128-bit master key. Let us denote the variable
storing the key by $k_{79} k_{78} \dots k_0$. At round i, the round key is given by

\[
K_i = \kappa^i_{63} \kappa^i_{62} \dots \kappa^i_0 = k_{79} k_{78} \dots k_{16}.
\]

Table 3.11 PRESENT Sbox
Input   0 1 2 3 4 5 6 7 8 9 A B C D E F
Output  C 5 6 B 9 0 A D 3 E F 8 4 7 1 2

Table 3.12 PRESENT pLayer


0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 16 32 48 1 17 33 49 2 18 34 50 3 19 35 51
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
4 20 36 52 5 21 37 53 6 22 38 54 7 23 39 55
32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
8 24 40 56 9 25 41 57 10 26 42 58 11 27 43 59
48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
12 28 44 60 13 29 45 61 14 30 46 62 15 31 47 63

Fig. 3.9 Two rounds of PRESENT

Fig. 3.10 PRESENT-80 key schedule

After extracting the round key, the variable $k_{79} k_{78} \dots k_0$ is updated using the
following steps:
1. Left rotate by 61 bits, $k_{79} k_{78} \dots k_1 k_0 = k_{18} k_{17} \dots k_{20} k_{19}$;
2. $k_{79} k_{78} k_{77} k_{76} = \mathrm{SB}_{\mathrm{PRESENT}}(k_{79} k_{78} k_{77} k_{76})$;
3. $k_{19} k_{18} k_{17} k_{16} k_{15} = k_{19} k_{18} k_{17} k_{16} k_{15} \oplus$ round_counter;
where $\mathrm{SB}_{\mathrm{PRESENT}}$ stands for the PRESENT Sbox (Table 3.11) and round_counter
$= 1, 2, \dots, 31$. A graphical illustration is shown in Fig. 3.10.

Remark 3.1.5 With the knowledge of any round key for PRESENT-80, the attacker
can recover 64 bits of the master key. The remaining 16 bits can be recovered by
brute force. Alternatively, with the knowledge of another round key, the master key
can also be revealed.
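Putting the round function and the key schedule together gives a compact Python sketch of PRESENT-80 encryption. The 80-bit key register and the 64-bit state are held in Python integers, and the function names are assumptions made for this illustration. The all-zero key and plaintext used at the end correspond to a test vector published with the cipher; the expected ciphertext is quoted from memory of that specification and should be compared against [BKL+07].

```python
# PRESENT Sbox (Table 3.11).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
MASK80 = (1 << 80) - 1


def sbox_layer(state):
    out = 0
    for i in range(16):
        out |= SBOX[(state >> (4 * i)) & 0xF] << (4 * i)
    return out


def p_layer(state):
    out = 0
    for j in range(64):
        out |= ((state >> j) & 1) << (j // 4 + (j % 4) * 16)
    return out


def update_key(key, round_counter):
    """One update of the 80-bit key register (steps 1 to 3)."""
    key = ((key << 61) | (key >> 19)) & MASK80               # 1. rotate left by 61
    key = (SBOX[key >> 76] << 76) | (key & ((1 << 76) - 1))  # 2. Sbox on k79..k76
    return key ^ (round_counter << 15)                       # 3. XOR into k19..k15


def round_keys_80(key):
    """Return the 32 round keys K_1, ..., K_32 of PRESENT-80."""
    keys = []
    for round_counter in range(1, 32):
        keys.append(key >> 16)  # K_i = k79 k78 ... k16
        key = update_key(key, round_counter)
    keys.append(key >> 16)      # K_32, used for the final addRoundKey
    return keys


def present80_encrypt(plaintext, key):
    keys = round_keys_80(key)
    state = plaintext
    for k in keys[:31]:         # 31 rounds: addRoundKey, sBoxLayer, pLayer
        state = p_layer(sbox_layer(state ^ k))
    return state ^ keys[31]     # final addRoundKey


ciphertext = present80_encrypt(0, 0)
```

The first round key is simply the upper 64 bits of the master key, which illustrates Remark 3.1.5: knowing one round key leaves only 16 key bits to be searched.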

3.2 Implementations of Symmetric Block Ciphers

In Sect. 3.1, we saw that there are mainly three building blocks for a symmetric
block cipher: bitwise XOR with round key, Sbox, and permutation. In this section,
we will discuss how to implement each of them. While we mainly focus on the
software implementations of PRESENT and AES, the main ideas apply in general
to other ciphers with similar constructions.
It is easy in both software and hardware to implement bitwise XOR with a round
key. In hardware, there is an XOR gate and almost every processor has a dedicated
XOR instruction.

3.2.1 Implementing Sboxes

In software, a naïve way to implement an Sbox is to use a lookup table. The table is
stored as an array in random access memory or flash memory. The storage space
required for an Sbox $\mathrm{SB}: \mathbb{F}_2^{\omega_1} \to \mathbb{F}_2^{\omega_2}$ is $\omega_2 \times 2^{\omega_1}$ bits. For example, PRESENT
has a 4-bit Sbox (Table 3.11) and the storage required is $2^4 \times 4 = 64$ bits, or 8
bytes. A lookup table implementation of the PRESENT Sbox in pseudocode is shown
in Algorithm 3.2.

Algorithm 3.2: A lookup table implementation of PRESENT Sbox in pseudocode
1 integer array [0..15] Sbox = {C, 5, 6, B, 9, 0, A, D, 3, E, F, 8, 4, 7, 1, 2}
2 s = Sbox[s] // table lookup

As current computer architectures normally use word sizes of at least one byte
(generally multiple bytes), it is not efficient to implement the Sbox nibble-wise.
To optimize the execution time, we can merge two PRESENT Sbox table lookups
(Algorithm 3.3).

Algorithm 3.3: A more efficient lookup table implementation of PRESENT Sbox in pseudocode
1 integer array [0..15] Sbox = {C, 5, 6, B, 9, 0, A, D, 3, E, F, 8, 4, 7, 1, 2}
2 integer big_s = Sbox[s & 0F] // lower nibble; & denotes bitwise AND (see Definition 1.3.6)
3 big_s = big_s ∨ (Sbox[(s ≫ 4) & 0F] ≪ 4) // upper nibble; ∨ denotes bitwise OR (see Remark 1.3.2)
4 s = big_s // state update

However, even though we can utilize the space more efficiently, the additional
operations take extra computing time. To avoid the bit shifts and boolean
operations, it is better to combine two $4 \times 4$ Sbox tables into one bigger
$8 \times 8$ table (Algorithm 3.4):
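
$$
\begin{array}{cccc}
\mathrm{SB}(0)\,|\,\mathrm{SB}(0) & \mathrm{SB}(0)\,|\,\mathrm{SB}(1) & \cdots & \mathrm{SB}(0)\,|\,\mathrm{SB}(\mathrm{F}) \\
\mathrm{SB}(1)\,|\,\mathrm{SB}(0) & \mathrm{SB}(1)\,|\,\mathrm{SB}(1) & \cdots & \mathrm{SB}(1)\,|\,\mathrm{SB}(\mathrm{F}) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{SB}(\mathrm{F})\,|\,\mathrm{SB}(0) & \mathrm{SB}(\mathrm{F})\,|\,\mathrm{SB}(1) & \cdots & \mathrm{SB}(\mathrm{F})\,|\,\mathrm{SB}(\mathrm{F})
\end{array}
$$

Algorithm 3.4: A lookup table implementation combining two PRESENT Sboxes in parallel in pseudocode
1 integer array [0..255] Sbox = {CC, C5, ..., C1, C2, 5C, 55, ..., 51, 52, ..., 2C, 25, ..., 21, 22}
2 s = Sbox[s] // table lookup of two nibbles in parallel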
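
To make the trade-off between Algorithms 3.3 and 3.4 concrete, the following is a minimal Python sketch. The names `SBOX4`, `SBOX8`, `sbox_byte_nibblewise` and `sbox_byte_table` are ours and purely illustrative; the only data taken from the text is the PRESENT Sbox itself.

```python
# PRESENT Sbox, indexed by the 4-bit input value 0..15.
SBOX4 = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
         0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def sbox_byte_nibblewise(s: int) -> int:
    """Two 4-bit lookups plus shifts and masks, in the spirit of Algorithm 3.3."""
    low = SBOX4[s & 0x0F]
    high = SBOX4[(s >> 4) & 0x0F]
    return (high << 4) | low


# 256-entry table in the spirit of Algorithm 3.4:
# the entry for input byte (x, y) holds SB(x) | SB(y).
SBOX8 = [(SBOX4[b >> 4] << 4) | SBOX4[b & 0x0F] for b in range(256)]


def sbox_byte_table(s: int) -> int:
    """A single lookup that substitutes both nibbles at once."""
    return SBOX8[s]


print([f"{v:02X}" for v in SBOX8[:3]])  # ['CC', 'C5', 'C6']
```

The combined table costs $8 \times 2^8 = 2048$ bits instead of 64, in exchange for one lookup per byte and no shifts or masks.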

3.2.2 Implementing Permutations

The efficiency of the implementation is highly dependent on the design of the
permutation. For AES ShiftRows, whole bytes are permuted, which makes it easier to
implement. For PRESENT pLayer, the bit-level permutation is “free” in hardware:
we just need to reorder the wires, and no new gates are required. However, in software,
extracting each bit and putting it in the right position is time-consuming.

3.2.2.1 Implementing pLayer

In this part, we will discuss two methods for implementing PRESENT pLayer by
combining it with sBoxLayer.
The first method is straightforward. We will construct sixteen $4 \times 64$ lookup
tables, TB1, TB2, ..., TB16. The input of TBi is given by the ith nibble of the
cipher state at the input of sBoxLayer. The outputs are 64-bit values with mostly
0s except for 4 bits that are related to this ith input nibble through sBoxLayer and
pLayer.
Let us consider TB1, whose input is the first nibble of the cipher state at the input
of sBoxLayer. By Table 3.12, the Sbox output corresponding to this nibble should
go to bits 0, 16, 32 and 48 of the output of pLayer. Thus, each entry of TB1 is a
64-bit value with bits in positions 0, 16, 32 and 48 given by the Sbox output, and
the other bits are all 0.
Example 3.2.1 For example, if the input is A, the Sbox output should be $\mathrm{F} = 1111_2$
and

$$\mathrm{TB1}[\mathrm{A}] = 0\ldots010\ldots010\ldots010\ldots1,$$

where the 0th, 16th, 32nd and 48th bits are 1. Similarly, the PRESENT Sbox output for
input B is $1000_2$, and

$$\mathrm{TB1}[\mathrm{B}] = 0\ldots010\ldots0,$$

where the 48th bit is 1.

Example 3.2.2 TB2 takes the second nibble of the cipher state as input. The output
bits should be positioned at 1, 17, 33 and 49. Thus

$$\mathrm{TB2}[\mathrm{B}] = 0\ldots010\ldots0,$$

where only the 49th bit is 1.


As for the memory consumption, a $4 \times 64$ table takes $64 \times 2^4 = 1024$ bits and those
sixteen tables take 16384 bits of memory. Compared to one Sbox table, which is 64
bits, this is much bigger, but these tables also implement pLayer of PRESENT. The
speed can be further improved by merging two Sbox computations and constructing
eight $8 \times 64$ lookup tables, which halves the number of table lookups per round.
The memory consumption, however, grows to $8 \times 64 \times 2^8 = 131072$ bits.
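
The construction of TB1, ..., TB16 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: we take the pLayer of Table 3.12 to be the standard PRESENT bit permutation $i \mapsto 16i \bmod 63$ (with bit 63 fixed), we store the state as a 64-bit integer whose bit 0 is the least significant bit, and the list `TB` is 0-indexed, so `TB[0]` plays the role of TB1. All function names are ours.

```python
SBOX4 = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
         0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def p(i: int) -> int:
    """Assumed pLayer: bit i of the input moves to bit p(i) of the output."""
    return 63 if i == 63 else (16 * i) % 63


def build_tb():
    """tables[i][x]: 64-bit word with the bits of SB(x), for nibble i, already moved by pLayer."""
    tables = []
    for i in range(16):
        table = []
        for x in range(16):
            word = 0
            for j in range(4):
                if (SBOX4[x] >> j) & 1:
                    word |= 1 << p(4 * i + j)
            table.append(word)
        tables.append(table)
    return tables


TB = build_tb()


def sbox_then_player(state: int) -> int:
    """sBoxLayer followed by pLayer: sixteen lookups combined with bitwise OR."""
    out = 0
    for i in range(16):
        out |= TB[i][(state >> (4 * i)) & 0xF]
    return out


def reference(state: int) -> int:
    """Slow bit-by-bit version, used only to cross-check the tables."""
    after_sbox = 0
    for i in range(16):
        after_sbox |= SBOX4[(state >> (4 * i)) & 0xF] << (4 * i)
    out = 0
    for i in range(64):
        out |= ((after_sbox >> i) & 1) << p(i)
    return out


print(hex(TB[0][0xA]))  # 0x1000100010001, i.e. bits 0, 16, 32 and 48 are set
```

The three entries worked out in Examples 3.2.1 and 3.2.2 correspond to `TB[0][0xA]`, `TB[0][0xB]` and `TB[1][0xB]` in this sketch.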
The second method [GHNZ09, PV13] requires a deeper look at the pLayer
design. The aim is to design four $8 \times 8$ tables that output the corresponding Sbox
values and permute the bits of each byte of the sBoxLayer input.
If we analyze Table 3.12 and Fig. 3.9, we can see that in round i:
• The 0th bits of bytes at positions 0, 2, 4, 6 in pLayer output come from the 0th
nibble of the input of pLayer, which corresponds to the 0th nibble of the cipher
state at sBoxLayer input of round i;
• The 1st bits of bytes at positions 0, 2, 4, 6 in pLayer output correspond to the 1st
nibble of the cipher state at sBoxLayer input;
• The 2nd bits of bytes at positions 0, 2, 4, 6 in pLayer output correspond to the
2nd nibble of the cipher state at sBoxLayer input;
• The 3rd bits of bytes at positions 0, 2, 4, 6 in pLayer output correspond to the 3rd
nibble of the cipher state at sBoxLayer input;
• ...
• The 7th bits of bytes at positions 0, 2, 4, 6 in pLayer output correspond to the 7th
nibble of the cipher state at sBoxLayer input.
Similar observations hold for bytes at positions 1, 3, 5, 7, whose bits come from
the 8th to 15th nibbles.
We can have the following four tables for the implementation of sBoxLayer and
pLayer:
• Table one takes the 0th byte (bits 0–7) of sBoxLayer input, the corresponding
output will be the 0th and 1st bits for bytes at positions 0, 2, 4, 6 (bits
0, 1, 16, 17, 32, 33, 48, 49) in the output of pLayer;
• Table two takes the 1st byte (bits 8–15) of sBoxLayer input, the corresponding
output will be the 2nd and 3rd bits for bytes at positions 0, 2, 4, 6 (bits
2, 3, 18, 19, 34, 35, 50, 51) in the output of pLayer;
• Table three takes the 2nd byte (bits 16–23) of sBoxLayer input, the corresponding
output will be the 4th and 5th bits for bytes at positions 0, 2, 4, 6 (bits
4, 5, 20, 21, 36, 37, 52, 53) in the output of pLayer;
• Table four takes the 3rd byte (bits 24–31) of sBoxLayer input, the corresponding
output will be the 6th and 7th bits for bytes at positions 0, 2, 4, 6 (bits
6, 7, 22, 23, 38, 39, 54, 55) in the output of pLayer.

The same tables can also be used for the remaining four bytes of the cipher
state:
• Table one takes the 4th byte (bits 32–39) of sBoxLayer input, the corresponding
output will be the 0th and 1st bits for bytes at positions 1, 3, 5, 7 (bits
8, 9, 24, 25, 40, 41, 56, 57) in the output of pLayer;
• Table two takes the 5th byte (bits 40–47) of sBoxLayer input, the corresponding
output will be the 2nd and 3rd bits for bytes at positions 1, 3, 5, 7 (bits
10, 11, 26, 27, 42, 43, 58, 59) in the output of pLayer;
• Table three takes the 6th byte (bits 48–55) of sBoxLayer input, the corresponding
output will be the 4th and 5th bits for bytes at positions 1, 3, 5, 7 (bits
12, 13, 28, 29, 44, 45, 60, 61) in the output of pLayer;
• Table four takes the 7th byte (bits 56–63) of sBoxLayer input, the corresponding
output will be the 6th and 7th bits for bytes at positions 1, 3, 5, 7 (bits
14, 15, 30, 31, 46, 47, 62, 63) in the output of pLayer.

Since the input for each table is one byte, we will be computing two Sboxes in
parallel. In Algorithm 3.4 we have seen the algorithm for such a computation. To
see how the four tables are computed, we will detail the first three entries of each
table. The other entries are calculated with similar methods.
First, we note that to combine two Sboxes, the lookup table starts with

CC C5 C6 ...

As mentioned above, one type of input intended for Table one is bits at
positions 0–7 of sBoxLayer input; those bits correspond to bits at positions 0–7
at sBoxLayer output. The corresponding output of Table one are bits at positions
0, 1, 16, 17, 32, 33, 48, 49 of pLayer output. According to the pLayer design
(Table 3.12), we will need to permute bits at positions 0–7 to 0, 4, 1, 5, 2, 6, 3, 7 so that
they will give us bits at positions 0, 1, 16, 17, 32, 33, 48, 49 of pLayer output. For
example, if the input of Table one is 00, the corresponding sBoxLayer output is CC
$= 11001100$. Reading the bits of a byte from left to right as positions 0 to 7 (the
convention under which the masks in Algorithm 3.5 select the leftmost two bits for
positions 0 and 1), the permutation gives $11110000 = \mathrm{F0}$.
Similarly, we get that Table one starts with

F0 B1 B4 ... (3.8)


If we consider the other set of inputs intended for Table one, which are bits at
positions 32–39, they should be first permuted to 32, 36, 33, 37, 34, 38, 35, 39 so
that the output will be bits at positions 8, 9, 24, 25, 40, 41, 56, 57. Then we arrive
at the same values as in Eq. 3.8.
For Table two, the output will later be positioned at the 2nd and 3rd positions in
the eight bytes of the pLayer output. A natural choice is to design it so that the output
can be combined with the outputs of other tables with a binary operation, e.g., $\vee$. In
particular, since the output of Table one starts with bits from positions 0, 1 and 8, 9,
the output of Table two will put bits from positions 2, 3 and 10, 11 in the 2nd and
3rd positions. Thus, Table two permutes bits 8–15 to 11, 15, 8, 12, 9, 13, 10, 14,
which then will give bits at 50, 51, 2, 3, 18, 19, 34, 35 for pLayer output. Similarly,
bits 40–47 will be permuted to 43, 47, 40, 44, 41, 45, 42, 46 and give bits at
58, 59, 10, 11, 26, 27, 42, 43 for pLayer output. The first few entries of Table two
are as follows:

3C 6C 2D ...

Table three first permutes bits from 16–23 (resp. 48–55) to 18, 22, 19, 23, 16,
20, 17, 21 (resp. 50, 54, 51, 55, 48, 52, 49, 53), which then give bits 36, 37, 52, 53, 4, 5,
20, 21 (resp. 44, 45, 60, 61, 12, 13, 28, 29) of pLayer output. The table starts with

0F 1B 4B ...

Table four first permutes bits from 24–31 (resp. 56–63) to 25, 29, 26, 30, 27, 31,
24, 28 (resp. 57, 61, 58, 62, 59, 63, 56, 60), which then give bits 22, 23, 38, 39, 54,
55, 6, 7 (resp. 30, 31, 46, 47, 62, 63, 14, 15) of pLayer output. The table starts with

C3 C6 D2 ...
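
The four tables can be generated mechanically from the permutations listed above. The Python sketch below is a reconstruction under explicit assumptions rather than the code of [GHNZ09, PV13]: positions inside a byte are counted from the left (position 0 is the leftmost bit), which is the reading that reproduces the listed table entries; the pLayer is taken to be $i \mapsto 16i \bmod 63$ with bit 63 fixed; and output bytes other than byte 0 are obtained by rotating the looked-up values before masking, which is our guess at the elided steps. All names are ours.

```python
SBOX4 = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
         0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
SBOX8 = [(SBOX4[b >> 4] << 4) | SBOX4[b & 0xF] for b in range(256)]

# Source position for each output position, taken from the text modulo 8.
ORDER = [[0, 4, 1, 5, 2, 6, 3, 7],   # Table one
         [3, 7, 0, 4, 1, 5, 2, 6],   # Table two
         [2, 6, 3, 7, 0, 4, 1, 5],   # Table three
         [1, 5, 2, 6, 3, 7, 0, 4]]   # Table four
MASKS = [0xC0, 0x30, 0x0C, 0x03]


def bit(byte: int, pos: int) -> int:
    """Bit at position pos of a byte, position 0 being the leftmost bit."""
    return (byte >> (7 - pos)) & 1


def build_table(order):
    """Apply two Sboxes, then move the bit at position order[k] to position k."""
    return [sum(bit(SBOX8[x], src) << (7 - k) for k, src in enumerate(order))
            for x in range(256)]


TABLES = [build_table(order) for order in ORDER]


def rotl8(v: int, r: int) -> int:
    return ((v << r) | (v >> (8 - r))) & 0xFF


def sbox_player(b):
    """b: the 8 state bytes at sBoxLayer input; returns the 8 bytes after pLayer."""
    c = [0] * 8
    for half in (0, 1):                     # bytes b0..b3, then b4..b7
        a = [TABLES[k][b[4 * half + k]] for k in range(4)]
        for r in range(4):                  # fills output byte 2r + half
            c[2 * r + half] = sum(rotl8(a[k], 2 * r) & MASKS[k] for k in range(4))
    return c


def reference(b):
    """Bit-by-bit sBoxLayer and pLayer under the same numbering, for cross-checking."""
    bits = [bit(SBOX8[b[j // 8]], j % 8) for j in range(64)]
    out = [0] * 64
    for j in range(64):
        out[63 if j == 63 else (16 * j) % 63] = bits[j]
    return [sum(out[8 * k + pos] << (7 - pos) for pos in range(8)) for k in range(8)]


print([f"{v:02X}" for v in TABLES[1][:3]])  # ['3C', '6C', '2D']
```

Each round then needs eight table lookups plus a handful of rotations, masks and ORs, and the four tables together occupy only $4 \times 8 \times 2^8 = 8192$ bits.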

A pseudocode for the implementation is detailed in Algorithm 3.5. We represent
the ith byte of the cipher state at sBoxLayer input by $b_i$ ($i = 0, 1, 2, \ldots, 7$). The
algorithm demonstrates how the bits 0–7 of the pLayer output can be computed.
Other bits can be calculated similarly. In line 1, we pass the 0th byte of the cipher
state at sBoxLayer input, $b_0$, to Table one. The table lookup result is stored in
a1, which gives us bits 0, 1, 16, 17, 32, 33, 48, 49 of pLayer output. In line 5, the
leftmost two bits of s1 are given by the leftmost two bits of a1, which correspond
to bits at positions 0 and 1 in pLayer output. Similarly, s2 (resp. s3, s4) stores bits
at positions 2, 3 (resp. 4, 5 and 6, 7) of the pLayer output. Then those eight bits are
combined in line 9 to produce the 0th byte of the pLayer output.

Algorithm 3.5: An implementation that combines sBoxLayer and pLayer for PRESENT
Input: b7, b6, b5, b4, b3, b2, b1, b0, Table_one, Table_two, Table_three, Table_four
// b7, b6, b5, b4, b3, b2, b1, b0 is the cipher state at the input of sBoxLayer, each bi represents one byte
// Table_one = {F0, B1, B4, ...}
// Table_two = {3C, 6C, 2D, ...}
// Table_three = {0F, 1B, 4B, ...}
// Table_four = {C3, C6, D2, ...}
Output: cipher state at the output of pLayer
// compute bytes at positions 0, 2, 4, 6 in pLayer output
1 a1 = Table_one[b0] // look up bits 0, 1, 16, 17, 32, 33, 48, 49
2 a2 = Table_two[b1] // look up bits 50, 51, 2, 3, 18, 19, 34, 35
3 a3 = Table_three[b2] // look up bits 36, 37, 52, 53, 4, 5, 20, 21
4 a4 = Table_four[b3] // look up bits 22, 23, 38, 39, 54, 55, 6, 7
// computing bits 0–7 of pLayer output
5 s1 = a1 & C0 // extract output bits 0, 1; & denotes bitwise AND (see Definition 1.3.6)
6 s2 = a2 & 30 // extract output bits 2, 3
7 s3 = a3 & 0C // extract output bits 4, 5
8 s4 = a4 & 03 // extract output bits 6, 7
9 b0 = s1 ∨ s2 ∨ s3 ∨ s4 // combine bits; ∨ denotes bitwise OR (see Remark 1.3.2)
// other bits of bytes at positions 0, 2, 4, 6 in pLayer output
10 ...
// compute bytes at positions 1, 3, 5, 7 in pLayer output
11 a1 = Table_one[b4] // look up bits 8, 9, 24, 25, 40, 41, 56, 57
12 a2 = Table_two[b5] // look up bits 58, 59, 10, 11, 26, 27, 42, 43
13 a3 = Table_three[b6] // look up bits 44, 45, 60, 61, 12, 13, 28, 29
14 a4 = Table_four[b7] // look up bits 30, 31, 46, 47, 62, 63, 14, 15
15 ...

3.2.2.2 AES T-tables

This part discusses an implementation method combining SubBytes, ShiftRows, and
MixColumns for the AES round function. Let SB denote the AES Sbox.

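Recall that the cipher state of AES can be represented by a four-by-four matrix
of bytes (Eq. 3.2). Let us denote the input of SubBytes by a matrix $S$. The outputs
of SubBytes, ShiftRows, and MixColumns are represented by matrices $A$, $B$, and
$D$ respectively. By definition, $a_{ij} = \mathrm{SB}(s_{ij})$, $0 \le i, j < 4$. By Eqs. 3.4 and 3.5,

$$
\begin{pmatrix} b_{0j} \\ b_{1j} \\ b_{2j} \\ b_{3j} \end{pmatrix}
=
\begin{pmatrix} a_{0j} \\ a_{1(j+1 \bmod 4)} \\ a_{2(j+2 \bmod 4)} \\ a_{3(j+3 \bmod 4)} \end{pmatrix},
\qquad
\begin{pmatrix} d_{0j} \\ d_{1j} \\ d_{2j} \\ d_{3j} \end{pmatrix}
=
\begin{pmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{pmatrix}
\begin{pmatrix} b_{0j} \\ b_{1j} \\ b_{2j} \\ b_{3j} \end{pmatrix},
\qquad j = 0, 1, 2, 3.
$$

We have

$$
\begin{aligned}
\begin{pmatrix} d_{0j} \\ d_{1j} \\ d_{2j} \\ d_{3j} \end{pmatrix}
&=
\begin{pmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{pmatrix}
\begin{pmatrix} \mathrm{SB}(s_{0j}) \\ \mathrm{SB}(s_{1(j+1 \bmod 4)}) \\ \mathrm{SB}(s_{2(j+2 \bmod 4)}) \\ \mathrm{SB}(s_{3(j+3 \bmod 4)}) \end{pmatrix} \\
&=
\begin{pmatrix} 02 \\ 01 \\ 01 \\ 03 \end{pmatrix} \mathrm{SB}(s_{0j})
\oplus
\begin{pmatrix} 03 \\ 02 \\ 01 \\ 01 \end{pmatrix} \mathrm{SB}(s_{1(j+1 \bmod 4)})
\oplus
\begin{pmatrix} 01 \\ 03 \\ 02 \\ 01 \end{pmatrix} \mathrm{SB}(s_{2(j+2 \bmod 4)})
\oplus
\begin{pmatrix} 01 \\ 01 \\ 03 \\ 02 \end{pmatrix} \mathrm{SB}(s_{3(j+3 \bmod 4)}),
\end{aligned}
$$

where $j = 0, 1, 2, 3$. For $a \in \mathbb{F}_2^8$, define

$$
T_0(a) := \begin{pmatrix} 02 \\ 01 \\ 01 \\ 03 \end{pmatrix} \mathrm{SB}(a), \quad
T_1(a) := \begin{pmatrix} 03 \\ 02 \\ 01 \\ 01 \end{pmatrix} \mathrm{SB}(a), \quad
T_2(a) := \begin{pmatrix} 01 \\ 03 \\ 02 \\ 01 \end{pmatrix} \mathrm{SB}(a), \quad
T_3(a) := \begin{pmatrix} 01 \\ 01 \\ 03 \\ 02 \end{pmatrix} \mathrm{SB}(a).
$$

Then

$$
\begin{pmatrix} d_{0j} \\ d_{1j} \\ d_{2j} \\ d_{3j} \end{pmatrix}
= T_0(s_{0j}) \oplus T_1(s_{1(j+1 \bmod 4)}) \oplus T_2(s_{2(j+2 \bmod 4)}) \oplus T_3(s_{3(j+3 \bmod 4)}).
$$

Thus the four tables $T_0, T_1, T_2, T_3$ of size $8 \times 32$ can be used to implement SubBytes,
ShiftRows, and MixColumns. Those four tables are called T-tables for AES. We
note that to store the T-tables we need processors with a word size of 32 or above.
They cannot be used for the last round of AES as there is no MixColumns operation
in that round.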
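
As an illustration of the derivation above, the following Python sketch builds the four T-tables and checks them against a direct computation of SubBytes, ShiftRows and MixColumns on one state. Several choices here are ours and not taken from the text: the Sbox is recomputed from its standard definition (field inverse followed by the affine map) instead of being hard-coded, each table entry is packed into a 32-bit word with the top byte of the column first, and all function names are hypothetical.

```python
def xtime(a: int) -> int:
    """Multiply by 02 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return (a ^ 0x11B) & 0xFF if a & 0x100 else a


def gmul(a: int, b: int) -> int:
    """Shift-and-add multiplication in GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = xtime(a)
        b >>= 1
    return r


def aes_sbox():
    """AES Sbox: multiplicative inverse (0 maps to 0), then the affine transformation."""
    box = []
    for a in range(256):
        inv = 0 if a == 0 else next(x for x in range(1, 256) if gmul(a, x) == 1)
        s = inv
        for k in range(1, 5):
            s ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
        box.append(s ^ 0x63)
    return box


def pack(col) -> int:
    """Pack a column (top byte first) into one 32-bit word."""
    return (col[0] << 24) | (col[1] << 16) | (col[2] << 8) | col[3]


SB = aes_sbox()
MIX = [(2, 3, 1, 1), (1, 2, 3, 1), (1, 1, 2, 3), (3, 1, 1, 2)]
# T[k][a] is column k of the MixColumns matrix multiplied by SB(a).
T = [[pack([gmul(MIX[i][k], SB[a]) for i in range(4)]) for a in range(256)]
     for k in range(4)]


def columns_ttable(S):
    """Output columns d_j via four lookups and three XORs per column."""
    return [T[0][S[0][j]] ^ T[1][S[1][(j + 1) % 4]]
            ^ T[2][S[2][(j + 2) % 4]] ^ T[3][S[3][(j + 3) % 4]] for j in range(4)]


def columns_direct(S):
    """SubBytes, ShiftRows and MixColumns computed step by step."""
    A = [[SB[S[i][j]] for j in range(4)] for i in range(4)]
    B = [[A[i][(j + i) % 4] for j in range(4)] for i in range(4)]
    out = []
    for j in range(4):
        col = [0, 0, 0, 0]
        for i in range(4):
            for k in range(4):
                col[i] ^= gmul(MIX[i][k], B[k][j])
        out.append(pack(col))
    return out


print(hex(T[0][0]))  # 0xc66363a5
```

The four tables occupy $4 \times 32 \times 2^8 = 32768$ bits in total, and each output column costs four lookups and three XORs.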
3.2.3 Bitsliced Implementations

Bitsliced implementation of symmetric block ciphers was first introduced by Eli
Biham for implementing DES [Bih97]. The goal of a bitsliced implementation is
to simulate a hardware implementation in software so that several plaintext blocks
can be encrypted in parallel. The operations in symmetric block ciphers will be
represented as a sequence of logical operations. Naturally, the implementations
should be adjusted based on the specific underlying hardware, in particular the word
size of the architecture (see Sect. 2.1.2). We will see that with word size $\omega$, we can
encrypt $\omega$ blocks of plaintext in parallel.

3.2.3.1 Algebraic Normal Form

To introduce bitsliced implementation, we will need to discuss the algebraic normal
form for a Boolean function. Let n be a positive integer in this part.
Definition 3.2.1 A Boolean function is a function $\varphi: \mathbb{F}_2^n \to \mathbb{F}_2$.
From the definition, we can see that a Boolean function has $2^n$ possible input
values. For each input value, there are 2 possible output values. Thus, in total, we
have $2^{2^n}$ possible Boolean functions defined for $\mathbb{F}_2^n \to \mathbb{F}_2$. In particular, a Boolean
function can be specified by giving the output values for all inputs; such a table is
called a truth table.
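Example 3.2.3 The parity-check bit defined for 3 bits is a Boolean function

$$\varphi: \mathbb{F}_2^3 \to \mathbb{F}_2, \qquad x_2 x_1 x_0 \mapsto x_0 + x_1 + x_2.$$

Its truth table is given by:

$$
\begin{array}{c|cccccccc}
x_2 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
x_1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\
x_0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ \hline
\varphi(x) & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 1
\end{array}
$$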

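Example 3.2.4 Now let us consider the Boolean function defined as follows:

$$\varphi_0: \mathbb{F}_2^4 \to \mathbb{F}_2, \qquad x \mapsto \mathrm{SB}_{\mathrm{PRESENT}}(x)_0,$$

where $\mathrm{SB}_{\mathrm{PRESENT}}(x)_0$ is the 0th bit of $\mathrm{SB}_{\mathrm{PRESENT}}(x)$, the PRESENT Sbox output
corresponding to $x$. The truth table of $\varphi_0$ is given by the first five rows and the
second last row (the row for $\varphi_0(x)$) of Table 3.13. For example, if the input is 0, the Sbox
output is C $= 1100$. Then $\varphi_0(x) = 0$.

Table 3.13 The Boolean function $\varphi_0$ takes input $x$ and outputs the 0th bit of $\mathrm{SB}_{\mathrm{PRESENT}}(x)$. The
second last row lists the output of $\varphi_0$ for different input values. The last row lists the coefficients
(Eq. 3.10) for the algebraic normal form of $\varphi_0$

$$
\begin{array}{l|cccccccccccccccc}
x & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & \mathrm{A} & \mathrm{B} & \mathrm{C} & \mathrm{D} & \mathrm{E} & \mathrm{F} \\ \hline
x_3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
x_2 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
x_1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\
x_0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ \hline
\mathrm{SB}_{\mathrm{PRESENT}}(x) & \mathrm{C} & 5 & 6 & \mathrm{B} & 9 & 0 & \mathrm{A} & \mathrm{D} & 3 & \mathrm{E} & \mathrm{F} & 8 & 4 & 7 & 1 & 2 \\
\varphi_0(x) & 0 & 1 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 1 & 1 & 0 \\
\lambda_x & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}
$$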

Definition 3.2.2 Fix $v = v_{n-1} v_{n-2} \ldots v_1 v_0 \in \mathbb{F}_2^n$. We define the indicator
function for $v$, denoted $1_v$, as follows:

$$1_v: \mathbb{F}_2^n \to \mathbb{F}_2, \qquad x \mapsto \prod_{i:\, v_i = 1} x_i \prod_{i:\, v_i = 0} (1 - x_i).$$

With this definition, for any $\varphi: \mathbb{F}_2^n \to \mathbb{F}_2$, we can express $\varphi$ in the following
polynomial expression:

$$\varphi(x) = \sum_{v \in \mathbb{F}_2^n} \varphi(v) 1_v(x).$$

After simplification, $\varphi$ can be written as

$$\varphi(x) = \sum_{v \in \mathbb{F}_2^n} \lambda_v \left( \prod_{i=0}^{n-1} x_i^{v_i} \right),$$

which is called the algebraic normal form representation of the Boolean function $\varphi$.
Example 3.2.5 Continuing Example 3.2.3, we can find the algebraic normal form
of $\varphi$ as follows:

$$
\begin{aligned}
\varphi(x) &= \sum_{v \in \mathbb{F}_2^n} \varphi(v) 1_v(x) = 1_{001}(x) + 1_{010}(x) + 1_{100}(x) + 1_{111}(x) \\
&= x_0 (1 - x_1)(1 - x_2) + x_1 (1 - x_0)(1 - x_2) + x_2 (1 - x_0)(1 - x_1) + x_0 x_1 x_2 \\
&= x_0 + x_1 + x_2 - 2(x_0 x_1 + x_0 x_2 + x_1 x_2) + 4 x_0 x_1 x_2 = x_0 + x_1 + x_2,
\end{aligned}
$$

where the last equality holds because $2 = 4 = 0$ in $\mathbb{F}_2$.

It can be proven that the algebraic normal form of a Boolean function is unique.²
Theorem 3.2.1 Every Boolean function $\varphi: \mathbb{F}_2^n \to \mathbb{F}_2$ has a unique algebraic
normal form representation

$$\varphi(x) = \sum_{v \in \mathbb{F}_2^n} \lambda_v \left( \prod_{i=0}^{n-1} x_i^{v_i} \right). \qquad (3.9)$$

The coefficients $\lambda_v \in \mathbb{F}_2$ are given by

$$\lambda_v = \sum_{w \le v} \varphi(w), \qquad (3.10)$$

where $w \le v$ means that $w_i \le v_i$ for all $0 \le i \le n - 1$.

We note that there are $2^{2^n}$ Boolean functions defined for $\mathbb{F}_2^n \to \mathbb{F}_2$. Furthermore,
there are $2^{2^n}$ choices for the coefficients $\lambda_v$ ($\lambda_v = 0, 1$ and there are $2^n$ distinct $v$).
Thus the number of distinct expressions on both sides of Eq. 3.9 coincides.
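Example 3.2.6 Continuing Example 3.2.3. By Eq. 3.10,

$$\lambda_{110} = \varphi(000) + \varphi(100) + \varphi(010) + \varphi(110) = 0 + 1 + 1 + 0 = 0.$$

Similarly, we can calculate all the coefficients $\lambda$:

$$
\begin{aligned}
\lambda_{000} &= 0, & \lambda_{001} &= 1, & \lambda_{010} &= 1, & \lambda_{011} &= 1 + 1 = 0, \\
\lambda_{100} &= 1, & \lambda_{101} &= 0, & \lambda_{110} &= 0, & \lambda_{111} &= 0.
\end{aligned}
$$

By Eq. 3.9,

$$\varphi(x) = \sum_{v \in \mathbb{F}_2^n} \lambda_v \left( \prod_{i=0}^{n-1} x_i^{v_i} \right) = \lambda_{001} x_0 + \lambda_{010} x_1 + \lambda_{100} x_2 = x_0 + x_1 + x_2,$$

which agrees with Example 3.2.5.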


Example 3.2.7 Continuing Example 3.2.4, we can calculate $\lambda_v$ using Eq. 3.10.
Those values are given by the last row of Table 3.13. For example,

$$\lambda_{1100} = \varphi_0(0000) + \varphi_0(1000) + \varphi_0(0100) + \varphi_0(1100) = 0 + 1 + 1 + 0 = 0.$$

By Eq. 3.9,

$$
\begin{aligned}
\varphi_0(x) &= \sum_{v \in \mathbb{F}_2^n} \lambda_v \left( \prod_{i=0}^{n-1} x_i^{v_i} \right) = \lambda_{0001} x_0 + \lambda_{0100} x_2 + \lambda_{0110} x_1 x_2 + \lambda_{1000} x_3 \\
&= x_0 + x_2 + x_1 x_2 + x_3. \qquad (3.11)
\end{aligned}
$$

For example, if the input is 0 $= 0000$, the PRESENT Sbox output is C $= 1100$, then
the output of $\varphi_0$ is 0 and

$$x_0 + x_2 + x_1 x_2 + x_3 = 0 + 0 + 0 + 0 = 0.$$

If the input is 7 $= 0111$, the PRESENT Sbox output is D $= 1101$, then the output of
$\varphi_0$ is 1 and

$$x_0 + x_2 + x_1 x_2 + x_3 = 1 + 1 + 1 + 0 = 1.$$

² For the proof, see, e.g., [MS77, page 372] and [O’D14, page 149].
.

Similarly, we can define $\varphi_i(x) = \mathrm{SB}_{\mathrm{PRESENT}}(x)_i$ for $i = 1, 2, 3$, where
$\mathrm{SB}_{\mathrm{PRESENT}}(x)_i$ is the ith bit of the PRESENT Sbox output for $x$. We can calculate
the algebraic normal form for each of the $\varphi_i$ in a similar way (see Appendix D). They
are given by:

$$
\begin{aligned}
\varphi_1(x) &= x_1 + x_3 + x_1 x_3 + x_2 x_3 + x_0 x_1 x_2 + x_0 x_1 x_3 + x_0 x_2 x_3, & (3.12) \\
\varphi_2(x) &= 1 + x_2 + x_3 + x_0 x_1 + x_0 x_3 + x_1 x_3 + x_0 x_1 x_3 + x_0 x_2 x_3, & (3.13) \\
\varphi_3(x) &= 1 + x_0 + x_1 + x_3 + x_1 x_2 + x_0 x_1 x_2 + x_0 x_1 x_3 + x_0 x_2 x_3. & (3.14)
\end{aligned}
$$
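
The coefficients in Eq. 3.10 are easy to compute by machine. The Python sketch below applies Eq. 3.10 literally to a truth table and recovers the monomials of Eqs. 3.11 to 3.14 from the PRESENT Sbox; it is a direct (quadratic-time) evaluation chosen for readability, and the function names are ours.

```python
SBOX4 = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
         0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def anf_coefficients(truth_table):
    """lambda_v = XOR of phi(w) over all w <= v (Eq. 3.10); inputs indexed as integers."""
    size = len(truth_table)
    coeffs = []
    for v in range(size):
        acc = 0
        for w in range(size):
            if (w & ~v) == 0:          # every bit set in w is also set in v
                acc ^= truth_table[w]
        coeffs.append(acc)
    return coeffs


def monomials(coeffs) -> str:
    """Readable ANF: each v with lambda_v = 1 contributes the product of x_i with v_i = 1."""
    terms = []
    for v, lam in enumerate(coeffs):
        if lam:
            names = [f"x{i}" for i in range(v.bit_length()) if (v >> i) & 1]
            terms.append("*".join(names) if names else "1")
    return " + ".join(terms)


def sbox_bit(i: int):
    """Truth table of phi_i, the ith output bit of the PRESENT Sbox."""
    return [(SBOX4[x] >> i) & 1 for x in range(16)]


parity = [0, 1, 1, 0, 1, 0, 0, 1]                 # truth table of Example 3.2.3
print(monomials(anf_coefficients(parity)))        # x0 + x1 + x2
print(monomials(anf_coefficients(sbox_bit(0))))   # x0 + x2 + x1*x2 + x3
```

Running `monomials(anf_coefficients(sbox_bit(i)))` for $i = 1, 2, 3$ lists the same monomials as Eqs. 3.12 to 3.14, in a different order.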

3.2.3.2 Bitsliced Implementation of PRESENT

In this part, we will use PRESENT as a running example to show how the bitsliced
implementation of a symmetric block cipher is designed.
First, we discuss how to transform the plaintext blocks into bitsliced format. As a
simple example, let us consider block length 3 and a 4-bit architecture, which allows
us to encrypt 4 blocks of plaintext simultaneously. We take 4 plaintext blocks, say

$$p_1 = 010, \quad p_2 = 110, \quad p_3 = 001, \quad p_4 = 100.$$

The bitsliced format of the $p_j$ is given by a $3 \times 4$ array, denoted $S$, where each column
is given by one block of plaintext:

$$S = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}.$$

In particular, if we let $S[x]$ denote the xth row of $S$, then $S[0]$ corresponds to the 0th
bits of the $p_j$, $S[1]$ corresponds to the 1st bits of the $p_j$, and $S[2]$ corresponds to the 2nd
bits of the $p_j$.
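
The conversion is just a transposition of the bit matrix. A minimal Python sketch, with function names of our choosing and rows represented as lists of bits for readability (an actual implementation would pack each row into one machine word):

```python
def to_bitsliced(blocks, block_len):
    """Row x collects bit x of every block; column j belongs to block j."""
    return [[(p >> x) & 1 for p in blocks] for x in range(block_len)]


def from_bitsliced(rows):
    """Inverse conversion: rebuild each block from its column."""
    n_blocks = len(rows[0])
    return [sum(rows[x][j] << x for x in range(len(rows))) for j in range(n_blocks)]


blocks = [0b010, 0b110, 0b001, 0b100]    # p1, p2, p3, p4 from the example
S = to_bitsliced(blocks, 3)
for row in S:
    print(row)
# [0, 0, 1, 0]
# [1, 1, 0, 0]
# [0, 1, 0, 1]
```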
Next, we will show how to encrypt 8 plaintext blocks in parallel with PRESENT
assuming an 8-bit architecture. Let $p_1, p_2, \ldots, p_8$ be 8 plaintext blocks, each of
length 64. We convert them into bitsliced format as described above and store them
in a $64 \times 8$ array $S_0$, where $S_0[y]$ contains the yth bits of each plaintext block.
Furthermore, for each round key $K_i$, we construct a $64 \times 8$ array Keyi whose
columns are given by $K_i$, i.e.,

$$\mathrm{Key}i[y][z] = K_i[y] \qquad \forall\, 0 \le z < 8. \qquad (3.15)$$

The bitsliced implementation of the ith round of PRESENT is given in
Algorithm 3.6. Line 1 implements addRoundKey. For example, when $i = 1$, the xth bit
of each plaintext (row x of $S_0$) is XORed with the xth bit of $K_1$ (row x of Key1). To
implement the Sbox in bitsliced format, we refer to the algebraic normal forms for
each output bit of the Sbox as a function of the input bits (see Example 3.2.7). We
recall that addition and multiplication in $\mathbb{F}_2$ can be implemented as logical XOR ($\oplus$)
and logical AND ($\&$) respectively (see Definition 1.2.17). There are in total 16 Sboxes
and we consider each of them in one iteration of the loop in line 2. $x_0, x_1, x_2, x_3$ defined in line 3
are arrays of size 8, each storing one bit of Sbox input from all eight encryption
computations. Lines 4–7 compute eight Sboxes in parallel, each corresponding to
the encryption of one plaintext block. The 0th bits of the Sbox outputs are given by
the 4bth bits of the cipher state at the end of sBoxLayer, where $0 \le b \le 15$. Line 4
computes the 0th bits of the Sbox outputs using Eq. 3.11. Similarly, lines 5, 6, 7
compute the 1st, 2nd and 3rd bit of Sbox outputs using Eqs. 3.12, 3.13 and 3.14
respectively. Since the constant 1 in Eqs. 3.13 and 3.14 applies to each of the eight
parallel computations, in lines 6 and 7 it stands for an array of eight 1s. Finally,
pLayer is implemented by line 8 onward using Table 3.12. We
note that $S_i[0]$ (line 8) is an array of 8 bits and we are permuting the 0th bit of the cipher
state for 8 encryptions simultaneously. The same can be done for the remaining 63
bits.
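
One round as described above can be sketched in Python as follows. The sketch stores each row of the bitsliced state as one 8-bit integer (bit j of a row belongs to block j), assumes that Table 3.12 is the standard PRESENT permutation $i \mapsto 16i \bmod 63$ with bit 63 fixed, and compares the result with a straightforward one-block-at-a-time round. The helper names and the test values are ours.

```python
SBOX4 = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
         0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
ONES = 0xFF   # the constant 1 for all 8 parallel blocks


def p(i: int) -> int:
    return 63 if i == 63 else (16 * i) % 63


def bitsliced_round(S, key_rows):
    """S, key_rows: 64 rows of 8 bits; row y holds bit y of all 8 blocks."""
    S = [s ^ k for s, k in zip(S, key_rows)]                 # addRoundKey
    state = [0] * 64
    for b in range(16):                                      # sBoxLayer, Eqs. 3.11-3.14
        x0, x1, x2, x3 = S[4 * b: 4 * b + 4]
        state[4 * b] = x0 ^ x2 ^ (x1 & x2) ^ x3
        state[4 * b + 1] = (x1 ^ x3 ^ (x1 & x3) ^ (x2 & x3) ^ (x0 & x1 & x2)
                            ^ (x0 & x1 & x3) ^ (x0 & x2 & x3))
        state[4 * b + 2] = (ONES ^ x2 ^ x3 ^ (x0 & x1) ^ (x0 & x3) ^ (x1 & x3)
                            ^ (x0 & x1 & x3) ^ (x0 & x2 & x3))
        state[4 * b + 3] = (ONES ^ x0 ^ x1 ^ x3 ^ (x1 & x2) ^ (x0 & x1 & x2)
                            ^ (x0 & x1 & x3) ^ (x0 & x2 & x3))
    out = [0] * 64
    for y in range(64):                                      # pLayer moves whole rows
        out[p(y)] = state[y]
    return out


def to_rows(blocks):
    return [sum(((blk >> y) & 1) << j for j, blk in enumerate(blocks)) for y in range(64)]


def from_rows(rows, n_blocks):
    return [sum(((rows[y] >> j) & 1) << y for y in range(64)) for j in range(n_blocks)]


def reference_round(block: int, key: int) -> int:
    """One block at a time: XOR, table-based sBoxLayer, bit-by-bit pLayer."""
    x = block ^ key
    y = sum(SBOX4[(x >> (4 * i)) & 0xF] << (4 * i) for i in range(16))
    return sum(((y >> i) & 1) << p(i) for i in range(64))


blocks = [(0x0123456789ABCDEF * (2 * j + 1)) % 2**64 for j in range(8)]
key = 0x0F1E2D3C4B5A6978
rows = bitsliced_round(to_rows(blocks), to_rows([key] * 8))   # key rows as in Eq. 3.15
print(from_rows(rows, 8) == [reference_round(b, key) for b in blocks])  # True
```

Note that pLayer costs no bit manipulation at all here: it only renames rows, which mirrors the "free" rewiring in hardware.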
It is easy to see that with 32-bit (resp. 64-bit) architecture, we can encrypt
32 (resp. 64) plaintext blocks in parallel. We note that bitsliced implementations
are mostly used for bit-oriented ciphers (e.g., DES, PRESENT). For byte-oriented
ciphers (e.g., AES), table-based implementations will likely give better performance.

Algorithm 3.6: Bitsliced implementation of round i of PRESENT, 1 ≤ i ≤ 31
Input: S_{i−1}, Keyi // S_{i−1} is the output of round i − 1. When i = 1, S_0 contains the plaintext blocks in bitsliced format.
// Keyi is the ith round key K_i in bitsliced format given in Eq. 3.15.
Output: S_i: output of round i
// addRoundKey
1 S_{i−1} = S_{i−1} ⊕ Keyi // bitwise XOR
// sBoxLayer
2 for b = 0, b < 16, b++ do
    // Bits of Sbox inputs
3   x0 = S_{i−1}[4b], x1 = S_{i−1}[4b + 1], x2 = S_{i−1}[4b + 2], x3 = S_{i−1}[4b + 3]
    // 0th bit of Sbox output
4   state[4b] = x0 ⊕ x2 ⊕ (x1 & x2) ⊕ x3
    // 1st bit of Sbox output
5   state[4b + 1] = x1 ⊕ x3 ⊕ (x1 & x3) ⊕ (x2 & x3) ⊕ (x0 & x1 & x2) ⊕ (x0 & x1 & x3) ⊕ (x0 & x2 & x3)
    // 2nd bit of Sbox output; the constant 1 stands for an array of 1s
6   state[4b + 2] = 1 ⊕ x2 ⊕ x3 ⊕ (x0 & x1) ⊕ (x0 & x3) ⊕ (x1 & x3) ⊕ (x0 & x1 & x3) ⊕ (x0 & x2 & x3)
    // 3rd bit of Sbox output; the constant 1 stands for an array of 1s
7   state[4b + 3] = 1 ⊕ x0 ⊕ x1 ⊕ x3 ⊕ (x1 & x2) ⊕ (x0 & x1 & x2) ⊕ (x0 & x1 & x3) ⊕ (x0 & x2 & x3)
// pLayer
8 S_i[0] = state[0]
9 S_i[16] = state[1]
10 S_i[32] = state[2]
11 ...
12 return S_i

3.3 RSA

In Sect. 2.1.2 we have mentioned that there are symmetric key and asymmetric
cryptosystems. Up to now, we have only seen symmetric cryptosystems, both
classical and modern designs. For a symmetric key cipher, a prior communication
of the master key (key exchange) is required before any ciphertext is transmitted.
With only a symmetric key cipher, the key exchange may be difficult to achieve
due to, e.g., large distances or too many parties being involved. In practice, this is
where asymmetric key cryptosystems come into use.
For example, Alice would like to communicate with Bob using AES. To
exchange the master key, k, for AES, she will encrypt k by a public key cryptosystem
using Bob’s public key e. Let $c = E_e(k)$. The resulting ciphertext c will be sent to
Bob, and Bob can decrypt it with his secret private key d, $k = D_d(c)$. Then Alice
and Bob can communicate with key k using AES.
Clearly, we require that it is computationally infeasible to find the private
key d given the public key e. In practice, this is guaranteed by some intractable
problem.³ However, the cipher might not be secure in the future. For example, if a
quantum computer with enough qubits is manufactured, it can break many public key
cryptosystems [EJ96]. Furthermore, we note that a public key cipher is not perfectly
secure (see Sect. 2.2.7) as the attacker can brute force the key.
In this section, we will be discussing one public key cryptosystem: RSA. It
was published in 1977 and named after its inventors Ron Rivest, Adi Shamir, and
Leonard Adleman. RSA is the first public key cryptosystem, and it is still in use today.
The security relies on the difficulty of finding the factorization of a composite
positive integer.

³ A problem is intractable if there does not exist an efficient algorithm to solve it.
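Definition 3.3.1 (RSA) Let $n = pq$, where $p, q$ are distinct prime numbers. Let
$\mathcal{P} = \mathcal{C} = \mathbb{Z}_n$, $\mathcal{K} = \mathbb{Z}^*_{\varphi(n)} - \{1\}$. For any $e \in \mathcal{K}$, define encryption

$$E_e: \mathbb{Z}_n \to \mathbb{Z}_n, \qquad m \mapsto m^e \bmod n,$$

and the corresponding decryption

$$D_d: \mathbb{Z}_n \to \mathbb{Z}_n, \qquad c \mapsto c^d \bmod n,$$

where $d = e^{-1} \bmod \varphi(n)$.

The cryptosystem $(\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})$, where $\mathcal{E} = \{E_e : e \in \mathcal{K}\}$, $\mathcal{D} = \{D_d : d \in \mathcal{K}\}$,
is called RSA.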
Recall by Theorem 1.4.3, $\varphi(n) = (p - 1)(q - 1)$ and by Definition 1.4.5, $\mathbb{Z}^*_{\varphi(n)}$
consists of elements in $\mathbb{Z}_{\varphi(n)}$ that are coprime to $\varphi(n)$, or equivalently, that have
multiplicative inverses modulo $\varphi(n)$.
For encryption, the message sender needs to have knowledge of n and e. They
are the public key for RSA. n is called the RSA modulus and e is called the encryption
exponent. The private key d for decryption is kept secret. In this case, only the
private key owner can decrypt the message sent to him.
To generate the keys for RSA, we first generate randomly and independently two
large primes p and q. Then we compute $n = pq$. Normally p and q are supposed
to have equal lengths. For example, take p and q to be 512-bit primes, and n will be
a 1024-bit modulus. Next $e \in \mathbb{Z}^*_{\varphi(n)}$ is chosen. Since $\varphi(n)$ is even, e is odd. Finally,
we compute $d = e^{-1} \bmod \varphi(n)$.
Example 3.3.1 As a toy example, suppose Bob would like to generate his private
and public keys for RSA. Bob randomly generates $p = 3$ and $q = 5$. Then he
computes $n = 15$ and

$$\varphi(n) = (3 - 1) \times (5 - 1) = 2 \times 4 = 8.$$

From $\mathbb{Z}^*_8 = \{1, 3, 5, 7\}$, Bob chooses $e = 3$. By the extended Euclidean algorithm,
he computes

$$8 = 3 \times 2 + 2, \quad 3 = 2 \times 1 + 1 \implies 1 = 3 - 2 \times 1 = 3 - (8 - 3 \times 2) = -8 + 3 \times 3.$$

Hence his private key is $d = 3^{-1} \bmod 8 = 3$.

Suppose Alice would like to send plaintext $m = 2$ to Bob, using Bob’s public
key $n = 15$ and $e = 3$. Alice computes

$$c = m^e \bmod n = 2^3 \bmod 15 = 8 \bmod 15.$$

After receiving the ciphertext c from Alice, Bob computes the plaintext using his
private key

$$m = c^d \bmod n = 8^3 \bmod 15 = 512 \bmod 15 = 2 \bmod 15.$$
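
The key generation, encryption and decryption steps of this toy example can be reproduced with a short Python sketch. The function names are ours, the primes are supplied by the caller instead of being generated at random, and the numbers are far too small for any real use.

```python
def extended_gcd(a: int, b: int):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y


def keygen(p: int, q: int, e: int):
    """Return (n, e, d) for given primes p, q and a chosen public exponent e."""
    n, phi = p * q, (p - 1) * (q - 1)
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e must be coprime to phi(n)")
    return n, e, x % phi


def encrypt(m: int, n: int, e: int) -> int:
    return pow(m, e, n)


def decrypt(c: int, n: int, d: int) -> int:
    return pow(c, d, n)


n, e, d = keygen(3, 5, 3)
c = encrypt(2, n, e)
print(n, e, d, c, decrypt(c, n, d))  # 15 3 3 8 2
```

In this tiny example $d$ happens to equal $e$, which would of course defeat the purpose of keeping $d$ private.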

Example 3.3.2 Now we will look at slightly larger values for p and q. Let $p = 29$,
$q = 41$, then $n = 1189$ and $\varphi(n) = 28 \times 40 = 1120$. It is easy to verify that
$3 \nmid \varphi(n)$. Let us choose $e = 3$. By the extended Euclidean algorithm

$$1120 = 3 \times 373 + 1 \implies 1 = 1120 - 3 \times 373.$$

Hence

$$d = -373 \bmod 1120 = 747.$$

To send plaintext $m = 2$ to Bob, Alice computes

$$c = m^e \bmod n = 2^3 \bmod 1189 = 8 \bmod 1189.$$

To decrypt, Bob calculates

$$m = c^d \bmod n = 8^{747} \bmod 1189 = 2 \bmod 1189.$$

Since

$$747 = 512 + 128 + 64 + 32 + 8 + 2 + 1,$$

we start from $8^2 = 64$ and compute by repeated squaring

$$
\begin{aligned}
8^{4} \bmod 1189 &= 4096 \bmod 1189 = 529, & 8^{8} \bmod 1189 &= 529^2 \bmod 1189 = 426, \\
8^{16} \bmod 1189 &= 426^2 \bmod 1189 = 748, & 8^{32} \bmod 1189 &= 748^2 \bmod 1189 = 674, \\
8^{64} \bmod 1189 &= 674^2 \bmod 1189 = 78, & 8^{128} \bmod 1189 &= 78^2 \bmod 1189 = 139, \\
8^{256} \bmod 1189 &= 139^2 \bmod 1189 = 297, & 8^{512} \bmod 1189 &= 297^2 \bmod 1189 = 223.
\end{aligned}
$$

And we have

$$
\begin{aligned}
8^{512+128} \bmod 1189 &= 223 \times 139 \bmod 1189 = 83, \\
8^{64+32} \bmod 1189 &= 78 \times 674 \bmod 1189 = 256, \\
8^{8+2+1} \bmod 1189 &= 426 \times 64 \times 8 \bmod 1189 = 525, \\
8^{747} \bmod 1189 &= 83 \times 256 \times 525 \bmod 1189 = 2.
\end{aligned}
$$
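
The hand computation above is an instance of binary exponentiation. A possible Python sketch, which also records the successive squarings so that they can be compared with the values in the example (the function name and return format are ours):

```python
def square_and_multiply(base: int, exponent: int, modulus: int):
    """Right-to-left binary exponentiation.

    Returns (base**exponent % modulus, [base**(2**k) % modulus for each bit k]).
    """
    result = 1
    square = base % modulus
    squares = []
    while exponent:
        squares.append(square)
        if exponent & 1:                 # this power of two appears in the exponent
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1
    return result, squares


m, squares = square_and_multiply(8, 747, 1189)
print(m)        # 2
print(squares)  # [8, 64, 529, 426, 748, 674, 78, 139, 297, 223]
```

Python's built-in `pow(8, 747, 1189)` gives the same result; the explicit loop is shown only to mirror the example.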

Next, we explain why the decryption works. By the choice of e and d,

$$ed \equiv 1 \bmod \varphi(n) \implies ed = \varphi(n) a + 1 \text{ for some } a \in \mathbb{Z}.$$

Then

$$c^d = (m^e)^d = m^{\varphi(n) a + 1} = m^{(p-1)(q-1)a}\, m.$$

By Corollary 1.4.3,

$$c^d \equiv m \bmod p, \qquad c^d \equiv m \bmod q.$$

Since p and q are distinct prime numbers and $n = pq$, by the Chinese Remainder
Theorem (see Theorem 1.4.7 and Example 1.4.19)

$$c^d \equiv m \bmod n.$$

We note that, if p or q is known to the attacker, they can factorize n and compute
$\varphi(n)$. Then with e, d can be computed using the extended Euclidean algorithm
(Algorithm 1.2). Thus all of $p$, $q$, $\varphi(n)$ should be kept secret.


RSA can only be secure if computing d from n and e is intractable. Of course,
if the attacker can factorize n with an efficient algorithm, then RSA is broken.
However, there is no proof to conclude if factorizing n is intractable or not. Up to
now, the best-known algorithm for integer factorization has been used to factorize
an RSA modulus of bit length 768 [KAF+10]. In practice, the most commonly used
bit lengths for the RSA modulus n are 1024, 2048, and 4096. Interestingly, it has been proved that
if d is known, then n can be factorized with an efficient algorithm (see [Buc04,
Page 172]). On the other hand, there is no proof that RSA is secure if factoring is
computationally infeasible; there might be other ways to attack RSA [May03].
Normally e is chosen to be small to make the encryption efficient. However, e
cannot be too small. It has been shown that, in the case of a small e, a quarter of the
bits of d (the least significant ones) suffice to recover d [BDF98]. Also, d cannot be
too small: it was proven that if $d < n^{0.292}$, then RSA can be broken [BD00].

3.4 RSA Signatures

In this section, we discuss how RSA can be used for digital signatures.
As mentioned in Sect. 2.1, digital signatures provide a means for an entity to
bind its identity to a message stored in electronic form. This normally means that
the sender uses their private key to sign the (hashed) message. Whoever has access
to the public key can then verify the origin of the message. For example, the message
can be electronic contracts or electronic bank transactions.
In more detail, suppose Alice signs a message m with a private key d and
generates signature s. The receiver Bob receives the message and the signature. He
can then verify s with public key e and a verification algorithm. Given m and s, the
verification algorithm returns true to indicate a valid signature and false otherwise.
To use RSA for digital signature, we again let p and q be two distinct primes. Let
$n = pq$. We choose $e \in \mathbb{Z}^*_{\varphi(n)}$ and compute $d = e^{-1} \bmod \varphi(n)$. Same as for RSA,
the public key consists of e and n, and d is the private key. p, q and $\varphi(n)$ should be
kept secret.
To sign a message m, Alice computes the signature

$$s = m^d \bmod n.$$

Then Alice sends both m and s to Bob. To verify the signature, Bob computes

$$s^e \bmod n.$$

If

$$s^e \equiv m \bmod n,$$

then the verification algorithm outputs true, and false otherwise.

Up to now, the only method known to compute s from $m \bmod n$ is using d, so if
the verification algorithm outputs true, Bob can conclude that Alice is the owner of
d.
Example 3.4.1 Alice chooses $p = 5$ and $q = 7$. Then $n = 35$ and $\varphi(n) = 24$.
Suppose Alice chooses $e = 5$, which is coprime to 24. By the extended Euclidean
algorithm

$$24 = 5 \times 4 + 4, \quad 5 = 4 + 1 \implies 1 = 5 - (24 - 5 \times 4) = 24 \times (-1) + 5 \times 5,$$

we have $d = e^{-1} \bmod 24 = 5$. To sign message $m = 10$, Alice computes

$$s = m^d \bmod n = 10^5 \bmod 35 = 5.$$

Alice sends both the message $m = 10$ and signature $s = 5$ to Bob. Bob verifies the
signature

$$s^e \bmod n = 5^5 \bmod 35 = 10 = m.$$
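
Signing and verification are one modular exponentiation each. The following minimal Python sketch reproduces the numbers of this example; the function names are ours and no hashing or padding is applied.

```python
def sign(m: int, n: int, d: int) -> int:
    """Signature s = m^d mod n, computed with the private key d."""
    return pow(m, d, n)


def verify(m: int, s: int, n: int, e: int) -> bool:
    """Accept if and only if s^e mod n equals m mod n."""
    return pow(s, e, n) == m % n


n, e, d = 35, 5, 5           # toy key from the example
s = sign(10, n, d)
print(s, verify(10, s, n, e), verify(11, s, n, e))  # 5 True False
```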

The most common attack on a digital signature is to create a valid signature for a
message without knowing the secret key. Such an attack is called forgery. If the goal
is to create a valid signature for a given message that was not signed by Alice before, it
is called selective forgery. If the goal is to create a valid signature for at least one
message not signed by Alice before, whichever message that may be, then the attack
is called existential forgery.
There are normally three attacker assumptions. A key-only attack assumes the
attacker only has knowledge of the public key (n and e). A known message attack
considers an attacker who has a list of messages previously signed by Alice, together
with their signatures. In a chosen message attack, the
attacker can request Alice’s signature on a list of messages.
Next, we discuss the security of RSA signatures with respect to forgery attacks.
First, we consider a known message existential forgery attack. Suppose the attacker, Eve, knows messages $m_1, m_2$ and their corresponding signatures $s_1$ and $s_2$. Eve computes

$$s = s_1 s_2 \bmod n, \quad m = m_1 m_2 \bmod n.$$

Since

$$s = m_1^d m_2^d \bmod n = (m_1 m_2)^d \bmod n = m^d \bmod n,$$

$s$ is a valid signature for $m$.


A chosen message selective forgery attack works as follows. Eve chooses a message $m \in \mathbb{Z}_n$ and takes any message $m_1 \in \mathbb{Z}_n^*$ that is different from $m$. She computes

$$m_2 = m m_1^{-1} \bmod n.$$

Eve obtains valid signatures

$$s_1 = m_1^d \bmod n \quad \text{and} \quad s_2 = m_2^d \bmod n$$

for $m_1$ and $m_2$. Then she computes

$$s = s_1 s_2 \bmod n.$$

Since

$$s = m_1^d m_2^d \bmod n = (m_1 m_2)^d \bmod n = m^d \bmod n,$$

$s$ is a valid signature for $m$.
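Both forgeries rely only on the multiplicative property of textbook RSA signatures. The following Python sketch replays them on the toy key of Example 3.4.1; `alice_sign` is a hypothetical stand-in for the signing oracle, and the concrete messages 2, 3 and 10 are chosen here purely for illustration.

```python
# Toy key from Example 3.4.1; Eve only ever uses e, n and Alice's answers.
p, q, e = 5, 7, 5
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))   # known to Alice only


def alice_sign(m):
    """Signing oracle: textbook RSA signature of m."""
    return pow(m, d, n)


def verify(m, s):
    return pow(s, e, n) == m % n


# Known message existential forgery: combine two observed signatures.
m1, m2 = 2, 3
s1, s2 = alice_sign(m1), alice_sign(m2)
m_forged = (m1 * m2) % n
s_forged = (s1 * s2) % n
print(m_forged, s_forged, verify(m_forged, s_forged))   # 6 6 True

# Chosen message selective forgery: Eve wants a signature on `target`
# without ever asking Alice to sign `target` itself.
target = 10
c1 = 2                                   # any element of Z_n^* other than target
c2 = (target * pow(c1, -1, n)) % n       # c2 = target * c1^{-1} mod n
s_target = (alice_sign(c1) * alice_sign(c2)) % n
print(c2, s_target, verify(target, s_target))           # 5 5 True
```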


In view of those attacks, RSA signatures are commonly used together with a fast public hash function $h$ (see Sect. 2.1.1). To sign a message $m$, Alice computes the signature

$$s = h(m)^d \bmod n.$$

Then she sends both $m$ and $s$ to Bob. Bob computes $s^e \bmod n$ and $h(m)$. If

$$s^e \bmod n = h(m),$$

then Bob concludes the signature is valid.



With a hash function, the two attacks discussed above will not work. Suppose Eve knows messages $m_1, m_2$ and their corresponding signatures $s_1$ and $s_2$. She can compute $h(m_1)$ and $h(m_2)$ as $h$ is public. However, to repeat the known message existential forgery attack, she needs to find $m$ such that $h(m) = h(m_1)h(m_2) \bmod n$, which is computationally infeasible according to property $(c)$ of hash functions listed in Sect. 2.1.1.
Suppose Eve chooses a message $m$ and computes $h(m)$. To repeat the chosen message selective forgery attack, she needs to find a message whose hash equals a prescribed value $y \in \mathbb{Z}_n^*$ (for example, $m_2$ with $h(m_2) = h(m)h(m_1)^{-1} \bmod n$ for a chosen $m_1$). For the same reason as above, this is computationally infeasible.
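A minimal sketch of hash-then-sign follows. Several choices here are assumptions made only for illustration and do not come from the text: SHA-256 (via Python's `hashlib`) plays the role of $h$, its digest is reduced modulo $n$ to land in $\mathbb{Z}_n$, and the modulus is built from the two Mersenne primes $2^{31}-1$ and $2^{61}-1$ with $e = 65537$. The key is far too small for real use, and practical schemes apply a standardized padding rather than a bare reduction modulo $n$.

```python
import hashlib

# Illustrative key (assumption): two Mersenne primes, not a secure key size.
p = 2**31 - 1
q = 2**61 - 1
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))


def h(message: bytes) -> int:
    """Public hash into Z_n: SHA-256 digest reduced modulo n (illustrative)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes) -> int:
    """s = h(m)^d mod n."""
    return pow(h(message), d, n)


def verify(message: bytes, s: int) -> bool:
    """Accept iff s^e mod n equals h(m)."""
    return pow(s, e, n) == h(message)


s = sign(b"pay Bob 10")
print(verify(b"pay Bob 10", s))      # True
print(verify(b"pay Bob 1000", s))    # a different message is rejected

# Multiplying two signatures now yields a signature on h(m1)*h(m2) mod n,
# and Eve would still need a message hashing to exactly that value.
s_mult = (sign(b"a") * sign(b"b")) % n
print(verify(b"ab", s_mult))
```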

3.5 Implementations of RSA Cipher and RSA Signatures

In this section, we discuss several methods for implementing RSA and RSA signature computations. Section 3.5.1 presents three methods for implementing modular exponentiation. As we will see, those methods require the computation of other modular operations. Then in Sect. 3.5.2, we discuss how to efficiently implement modular multiplication.

3.5.1 Implementing Modular Exponentiation

To implement RSA or RSA signatures, we need to compute

$$a^d \bmod n$$

for some integer $a \in \mathbb{Z}_n$, where $n = pq$ is a product of two distinct primes and $d \in \mathbb{Z}^*_{\varphi(n)}$. We could carry out $d - 1$ modular multiplications, but this is inefficient for large $d$. In practice, the bit length of $d$ is in the thousands, which makes this naïve method infeasible. We will discuss three methods to make modular exponentiation computations faster.

3.5.1.1 Square and Multiply Algorithm

In this part, let $n \ge 2$ be an integer and $d \in \mathbb{Z}_{\varphi(n)}$. We discuss how to calculate

$$a^d \bmod n$$

for $a \in \mathbb{Z}_n$.
By Theorem 1.1.1, we can write $d$ in the following form

$$d = \sum_{i=0}^{\ell_d - 1} d_i 2^i,$$

where $d_i \in \{0, 1\}$ for $0 \le i \le \ell_d - 1$, and

$$d = d_{\ell_d - 1} \dots d_2 d_1 d_0$$

is the binary representation of $d$. Then we have

$$a^d = a^{\sum_{i=0}^{\ell_d - 1} d_i 2^i} = \prod_{i=0}^{\ell_d - 1} \left(a^{2^i}\right)^{d_i} = \prod_{0 \le i < \ell_d,\ d_i = 1} a^{2^i}.$$

Thus, to compute $a^d \bmod n$, we can first compute $a^{2^i}$ for $0 \le i < \ell_d$. Then $a^d$ is the product of those $a^{2^i}$ for which $d_i = 1$. Compared to the naïve calculation, which requires $d - 1$ multiplications, this method needs only on the order of $\log_2 d$ multiplications (at most about $2\ell_d$ when the squarings are counted as well).
This observation leads us to the square and multiply algorithm listed in Algorithm 3.7. Line 5 computes $a^{2^{i+1}}$ in loop $i$. We check each bit of $d$ (line 3); if the $i$th bit of $d$ is 1, then the result is multiplied by $a^{2^i}$ (line 4). As this algorithm starts from the least significant bit of $d$, i.e., $d_0$, it is also called the right-to-left square and multiply algorithm. Accordingly, the left-to-right square and multiply algorithm is listed in Algorithm 3.8. We can see that compared to Algorithm 3.7, Algorithm 3.8 requires one less variable and hence less storage.

Algorithm 3.7: Right-to-left square and multiply algorithm for computing modular exponentiation
Input: n, a, d // $n \in \mathbb{Z}$, $n \ge 2$; $a \in \mathbb{Z}_n$; $d \in \mathbb{Z}_{\varphi(n)}$ has bit length $\ell_d$
Output: $a^d \bmod n$
1  result = 1, t = a
2  for i = 0, i < ℓ_d, i++ do
       // ith bit of d is 1
3      if d_i = 1 then
           // multiply by $a^{2^i}$
4          result = result ∗ t mod n
       // $t = a^{2^{i+1}}$
5      t = t ∗ t mod n
6  return result

Algorithm 3.8: Left-to-right square and multiply algorithm for computing modular exponentiation
Input: n, a, d // $n \in \mathbb{Z}$, $n \ge 2$; $a \in \mathbb{Z}_n$; $d \in \mathbb{Z}_{\varphi(n)}$ has bit length $\ell_d$
Output: $a^d \bmod n$
1  t = 1
2  for i = ℓ_d − 1, i ≥ 0, i−− do
3      t = t ∗ t mod n
       // ith bit of d is 1
4      if d_i = 1 then
5          t = a ∗ t mod n
6  return t

Example 3.5.1 Let $n = 15$, $d = 3 = 11_2$, $a = 2$. Computing

$$a^d \bmod n = 2^3 \bmod 15 = 8 \bmod 15 = 8$$

using Algorithm 3.7, we get the values of the variables at the end of each loop as follows:

i    d_i    t    result
0    1      4    2
1    1      1    8

The returned value is 8. Similarly, using Algorithm 3.8, the intermediate values are:

i    d_i    t
1    1      2
0    1      8

where in the last loop, line 3 computes $t = 4$ and line 5 calculates $t = 8 \bmod 15 = 8$.
Example 3.5.2 Let $n = 23$, $d = 4 = 100_2$, $a = 5$. Computing

$$a^d \bmod n = 5^4 \bmod 23 = 625 \bmod 23 = 4$$

using Algorithm 3.7, we get the values of the variables at the end of each loop as follows:

i    d_i    t     result
0    0      2     1
1    0      4     1
2    1      16    4

The final result is 4. Using Algorithm 3.8, in the first loop ($i = 2$), line 3 computes $t = 1 \bmod 23$ and line 5 calculates $t = 1 \times 5 \bmod 23 = 5 \bmod 23$. The intermediate values are:

i    d_i    t
2    1      5
1    0      2
0    0      4

The final output is 4.
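The two variants can be sketched in Python as follows. The function names are chosen here for illustration; Python's arbitrary-precision integers stand in for the multi-word arithmetic discussed later, and the bits $d_i$ are read with shifts instead of a stored bit array.

```python
def pow_right_to_left(a, d, n):
    """Right-to-left square and multiply (in the spirit of Algorithm 3.7)."""
    result, t = 1 % n, a % n
    while d > 0:
        if d & 1:                    # current least significant bit d_i is 1
            result = (result * t) % n
        t = (t * t) % n              # t becomes a^(2^(i+1)) mod n
        d >>= 1
    return result


def pow_left_to_right(a, d, n):
    """Left-to-right square and multiply (in the spirit of Algorithm 3.8)."""
    t = 1 % n
    for i in range(d.bit_length() - 1, -1, -1):
        t = (t * t) % n              # always square
        if (d >> i) & 1:             # multiply by a only when d_i is 1
            t = (a * t) % n
    return t


print(pow_right_to_left(2, 3, 15), pow_left_to_right(2, 3, 15))   # 8 8
print(pow_right_to_left(5, 4, 23), pow_left_to_right(5, 4, 23))   # 4 4
```

The printed values match Examples 3.5.1 and 3.5.2; the right-to-left version keeps two running values (`result` and `t`), the left-to-right version only one.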

3.5.1.2 Montgomery Powering Ladder

Same as in Sect. 3.5.1.1, in this part, let $n \ge 2$ be an integer and $d \in \mathbb{Z}_{\varphi(n)}$. We introduce another method, the Montgomery powering ladder, to compute $a^d \bmod n$ for $a \in \mathbb{Z}_n$.

The Montgomery powering ladder was first introduced for efficient computations of elliptic curve scalar multiplications [Mon87]. It was then adopted for computing exponentiation in any abelian group [JY03]. We will present the details of the method as used for modular exponentiation. In particular, the operation we consider here is modular multiplication on $\mathbb{Z}_n$.
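Recall that we have the following binary representation of $d$

$$d = \sum_{i=0}^{\ell_d - 1} d_i 2^i.$$

For $0 \le j \le \ell_d - 1$, define

$$L_j := \sum_{i=j}^{\ell_d - 1} d_i 2^{i-j}, \quad H_j := L_j + 1.$$

Then

$$2L_{j+1} = 2\sum_{i=j+1}^{\ell_d - 1} d_i 2^{i-(j+1)} = \sum_{i=j+1}^{\ell_d - 1} d_i 2^{i-j} = -d_j + \sum_{i=j}^{\ell_d - 1} d_i 2^{i-j} = -d_j + L_j.$$

We have

$$L_j = 2L_{j+1} + d_j = L_{j+1} + H_{j+1} + d_j - 1 = 2H_{j+1} + d_j - 2,$$

and

$$L_j = \begin{cases} 2L_{j+1} & \text{if } d_j = 0 \\ L_{j+1} + H_{j+1} & \text{if } d_j = 1 \end{cases}, \quad H_j = \begin{cases} L_{j+1} + H_{j+1} & \text{if } d_j = 0 \\ 2H_{j+1} & \text{if } d_j = 1 \end{cases}.$$

Then for any $a \in \mathbb{Z}_n$,

$$a^{L_j} = \begin{cases} (a^{L_{j+1}})^2 & \text{if } d_j = 0 \\ a^{L_{j+1}} a^{H_{j+1}} & \text{if } d_j = 1 \end{cases}, \quad a^{H_j} = \begin{cases} a^{L_{j+1}} a^{H_{j+1}} & \text{if } d_j = 0 \\ (a^{H_{j+1}})^2 & \text{if } d_j = 1 \end{cases}. \tag{3.16}$$

Since

$$L_0 = \sum_{i=0}^{\ell_d - 1} d_i 2^i = d,$$

to compute $a^d \bmod n$ is equivalent to computing $a^{L_0} \bmod n$. By Eq. 3.16,

$$a^{L_0} = \begin{cases} (a^{L_1})^2 & \text{if } d_0 = 0 \\ a^{L_1} a^{H_1} & \text{if } d_0 = 1 \end{cases}.$$

Similarly, $a^{L_1}$ and $a^{H_1}$ can be computed with $a^{L_2}$ and $a^{H_2}$. Thus, we can start from the most significant bit of $d$, $d_{\ell_d - 1}$, compute $a^{L_{\ell_d - 1}}$ and $a^{H_{\ell_d - 1}}$, then calculate $a^{L_{\ell_d - 2}}$ and $a^{H_{\ell_d - 2}}$ with Eq. 3.16, and so on. Note that

$$L_{\ell_d - 1} = d_{\ell_d - 1}, \quad H_{\ell_d - 1} = d_{\ell_d - 1} + 1$$

and

$$a^{L_{\ell_d - 1}} = \begin{cases} 1 & \text{if } d_{\ell_d - 1} = 0 \\ a & \text{if } d_{\ell_d - 1} = 1 \end{cases}, \quad a^{H_{\ell_d - 1}} = \begin{cases} a & \text{if } d_{\ell_d - 1} = 0 \\ a^2 & \text{if } d_{\ell_d - 1} = 1 \end{cases}. \tag{3.17}$$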

Details of the Montgomery powering ladder for implementing modular exponentiation are shown in Algorithm 3.9, where at the end of the $j$th iteration, $R_0$ and $R_1$ correspond to $a^{L_j}$ and $a^{H_j}$ respectively. When $j = \ell_d - 1$, lines 4–9 implement Eq. 3.17. For $j < \ell_d - 1$, lines 4–9 implement Eq. 3.16.
The computations of lines 5 and 6 (respectively lines 8 and 9) can be done in parallel by first storing the computation results in temporary variables and then assigning them to $R_1$ and $R_0$ (respectively $R_0$ and $R_1$).
Example 3.5.3 Same as in Example 3.5.1, let $n = 15$, $d = 3 = 11_2$, $a = 2$. We have calculated that $a^d \bmod n = 8$. To compute it with Algorithm 3.9, the intermediate values are

$$\begin{aligned} j = 1,\ d_1 = 1: &\quad R_0 = R_0 R_1 \bmod n = 2, \quad R_1 = R_1^2 \bmod n = 2^2 \bmod 15 = 4, \\ j = 0,\ d_0 = 1: &\quad R_0 = R_0 R_1 \bmod n = 2 \times 4 \bmod 15 = 8, \end{aligned}$$

and the final result is 8.


Algorithm 3.9: Montgomery powering ladder for computing modular exponentiation
Input: n, a, d // $n \in \mathbb{Z}$, $n \ge 2$; $a \in \mathbb{Z}_n$; $d \in \mathbb{Z}_{\varphi(n)}$ has bit length $\ell_d$
Output: $a^d \bmod n$
1   R_0 = 1
2   R_1 = a
3   for j = ℓ_d − 1, j ≥ 0, j−− do
4       if d_j = 0 then
5           R_1 = R_0 R_1 mod n   // $a^{H_j} = a^{L_{j+1}} a^{H_{j+1}}$ for $j < \ell_d - 1$
6           R_0 = R_0^2 mod n     // $a^{L_j} = (a^{L_{j+1}})^2$ for $j < \ell_d - 1$
7       else
8           R_0 = R_0 R_1 mod n   // $a^{L_j} = a^{L_{j+1}} a^{H_{j+1}}$ for $j < \ell_d - 1$
9           R_1 = R_1^2 mod n     // $a^{H_j} = (a^{H_{j+1}})^2$ for $j < \ell_d - 1$
10  return R_0

Example 3.5.4 Here we repeat the computation in Example 3.5.2. Let $n = 23$, $d = 4 = 100_2$, $a = 5$. We know that $a^d \bmod n = 4$. With Algorithm 3.9, the intermediate values are

$$\begin{aligned} j = 2,\ d_2 = 1: &\quad R_0 = R_0 R_1 \bmod n = 5, \quad R_1 = R_1^2 \bmod n = 5^2 \bmod 23 = 25 \bmod 23 = 2, \\ j = 1,\ d_1 = 0: &\quad R_1 = R_0 R_1 \bmod n = 5 \times 2 \bmod 23 = 10, \quad R_0 = R_0^2 \bmod n = 5^2 \bmod 23 = 2, \\ j = 0,\ d_0 = 0: &\quad R_1 = R_0 R_1 \bmod n = 2 \times 10 \bmod 23 = 20, \quad R_0 = R_0^2 \bmod n = 2^2 \bmod 23 = 4, \end{aligned}$$

and the final result is 4.
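A direct Python transcription of the ladder is given below as a sketch. The function name is chosen here for illustration, and the bits of $d$ are read with shifts. In each iteration exactly one multiplication and one squaring are performed, whichever branch is taken.

```python
def montgomery_ladder(a, d, n):
    """Montgomery powering ladder for a^d mod n (in the spirit of Algorithm 3.9).

    After processing bit j, r0 holds a^(L_j) mod n and r1 holds a^(H_j) mod n,
    so r1 == r0 * a (mod n) throughout.
    """
    r0, r1 = 1 % n, a % n
    for j in range(d.bit_length() - 1, -1, -1):
        if (d >> j) & 1 == 0:
            r1 = (r0 * r1) % n    # uses the old r0
            r0 = (r0 * r0) % n
        else:
            r0 = (r0 * r1) % n    # uses the old r1
            r1 = (r1 * r1) % n
    return r0


print(montgomery_ladder(2, 3, 15))   # 8, as in Example 3.5.3
print(montgomery_ladder(5, 4, 23))   # 4, as in Example 3.5.4
```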

3.5.1.3 Chinese Remainder Theorem (CRT) Based RSA

In this part, we focus on the case when $n = pq$ is the RSA modulus ($p$, $q$ are distinct odd primes) and $d \in \mathbb{Z}^*_{\varphi(n)}$ is the private key.
By the Chinese Remainder Theorem (see Theorem 1.4.7 and Example 1.4.19), finding the solution for

$$x \equiv a^d \bmod n$$

is equivalent to solving

$$x \equiv a^d \bmod p, \quad x \equiv a^d \bmod q.$$

By Corollary 1.4.3, we can compute

$$x_p := a^{d \bmod (p-1)} \bmod p, \quad x_q := a^{d \bmod (q-1)} \bmod q,$$

and solve for

$$x \equiv x_p \bmod p, \quad x \equiv x_q \bmod q. \tag{3.18}$$

An implementation that computes $a^d \bmod n$ by solving Eq. 3.18 is called a CRT-based RSA implementation.
By Eqs. 1.19 and 1.20, we compute

$$M_q = q, \quad M_p = p, \quad y_q = M_q^{-1} \bmod p = q^{-1} \bmod p, \quad y_p = M_p^{-1} \bmod q = p^{-1} \bmod q,$$

and

$$x = x_p y_q q + x_q y_p p \bmod n \tag{3.19}$$

gives us the solution to Eq. 3.18.

Calculating $x$ by Eq. 3.19 is called Gauss’s algorithm for CRT, while Garner’s algorithm calculates

$$x = x_p + ((x_q - x_p) y_p \bmod q)\, p. \tag{3.20}$$

We will show that Eq. 3.20 indeed gives the solution to Eq. 3.18. First, it is straightforward to see $x \equiv x_p \bmod p$. Furthermore, since $y_p p \equiv 1 \bmod q$,

$$x \equiv x_p + (x_q - x_p) \equiv x_q \bmod q.$$

Since $x_p \in \mathbb{Z}_p$, $x_p < p$. Similarly, $(x_q - x_p) y_p \bmod q \le q - 1$. And

$$x = x_p + ((x_q - x_p) y_p \bmod q)\, p < p + (q - 1) p = n.$$

Thus $x \in \mathbb{Z}_n$.
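The two recombination formulas can be compared with a short Python sketch. The function names are chosen here for illustration, Python's built-in `pow` stands in for the exponentiation algorithms of this section, and the modular inverses are computed with `pow(x, -1, m)` (Python 3.8 or later) instead of an explicit extended Euclidean algorithm.

```python
def crt_parts(a, d, p, q):
    """x_p = a^(d mod (p-1)) mod p and x_q = a^(d mod (q-1)) mod q."""
    return pow(a, d % (p - 1), p), pow(a, d % (q - 1), q)


def gauss(xp, xq, p, q):
    """Eq. 3.19: x = xp*yq*q + xq*yp*p mod n."""
    yq = pow(q, -1, p)
    yp = pow(p, -1, q)
    return (xp * yq * q + xq * yp * p) % (p * q)


def garner(xp, xq, p, q):
    """Eq. 3.20: x = xp + ((xq - xp)*yp mod q)*p, no reduction modulo n."""
    yp = pow(p, -1, q)
    return xp + (((xq - xp) * yp) % q) * p


# Parameters p = 29, q = 41, d = 747 with input c = 155.
p, q, d, c = 29, 41, 747, 155
xp, xq = crt_parts(c, d, p, q)
print(xp, xq)                                   # 21 9
print(gauss(xp, xq, p, q), garner(xp, xq, p, q), pow(c, d, p * q))   # 50 50 50
```

In a real implementation $y_p$ (and $y_q$ for Gauss’s algorithm) would be precomputed once per key rather than on every call.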
Example 3.5.5 Let us consider the toy example from Example 3.3.1. We have

$$p = 3, \quad q = 5, \quad n = 15, \quad \varphi(n) = 8, \quad e = 3, \quad d = 3.$$

Bob receives ciphertext $c = 8$ from Alice. Instead of computing the plaintext directly using

$$m = c^d \bmod n = 8^3 \bmod 15,$$

we compute

$$m_p = c^{d \bmod (p-1)} \bmod p = 8^{3 \bmod 2} \bmod 3 = 8 \bmod 3 = 2,$$

$$m_q = c^{d \bmod (q-1)} \bmod q = 8^{3 \bmod 4} \bmod 5 = 512 \bmod 5 = 2.$$

By the extended Euclidean algorithm,

$$5 = 3 \times 1 + 2, \quad 3 = 2 + 1 \implies 1 = 3 - (5 - 3) = 3 \times 2 - 5.$$

Thus

$$y_p = p^{-1} \bmod q = 3^{-1} \bmod 5 = 2 \bmod 5,$$

$$y_q = q^{-1} \bmod p = 5^{-1} \bmod 3 = -1 \bmod 3 = 2 \bmod 3.$$

By Gauss’s algorithm,

$$m = m_p y_q q + m_q y_p p \bmod n = 2 \times 2 \times 5 + 2 \times 2 \times 3 \bmod 15 = 32 \bmod 15 = 2.$$

By Garner’s algorithm,

$$m = m_p + ((m_q - m_p) y_p \bmod q)\, p = 2 + 0 = 2.$$

Both algorithms give us the original plaintext from Alice.


Example 3.5.6 Here we look at Example 3.3.2. We have

$$p = 29, \quad q = 41, \quad n = 1189, \quad \varphi(n) = 1120, \quad e = 3, \quad d = 747,$$

and ciphertext $c = 8$. Then

$$m_p = c^{d \bmod (p-1)} \bmod p = 8^{747 \bmod 28} \bmod 29 = 8^{19} \bmod 29 = 2,$$

$$m_q = c^{d \bmod (q-1)} \bmod q = 8^{747 \bmod 40} \bmod 41 = 8^{27} \bmod 41 = 2.$$

By the extended Euclidean algorithm

$$41 = 29 + 12, \quad 29 = 12 \times 2 + 5, \quad 12 = 5 \times 2 + 2, \quad 5 = 2 \times 2 + 1,$$

and

$$\begin{aligned} 1 &= 5 - 2 \times (12 - 5 \times 2) = -2 \times 12 + (29 - 12 \times 2) \times 5 \\ &= 29 \times 5 - 12 \times (41 - 29) = -41 \times 12 + 29 \times 17. \end{aligned}$$

We have

$$y_p = p^{-1} \bmod q = 29^{-1} \bmod 41 = 17 \bmod 41,$$

$$y_q = q^{-1} \bmod p = 41^{-1} \bmod 29 = -12 \bmod 29 = 17 \bmod 29.$$

By Gauss’s algorithm,

$$m = m_p y_q q + m_q y_p p \bmod n = 2 \times 17 \times 41 + 2 \times 17 \times 29 \bmod 1189 = 2380 \bmod 1189 = 2.$$

By Garner’s algorithm,

$$m = m_p + ((m_q - m_p) y_p \bmod q)\, p = 2 + 0 = 2.$$

Example 3.5.7 Same as in Example 3.5.6, we keep

$$p = 29, \quad q = 41, \quad n = 1189, \quad \varphi(n) = 1120, \quad e = 3, \quad d = 747.$$

Then we have

$$y_p = 17, \quad y_q = 17.$$

Let $c = 155$, then

$$m_p = c^{d \bmod (p-1)} \bmod p = 155^{747 \bmod 28} \bmod 29 = 10^{19} \bmod 29 = 21,$$

$$m_q = c^{d \bmod (q-1)} \bmod q = 155^{747 \bmod 40} \bmod 41 = 32^{27} \bmod 41 = 9.$$

To compute $10^{19} \bmod 29$, we note that

$$\begin{aligned} 10^2 \bmod 29 &= 100 \bmod 29 = 13, \\ 10^4 \bmod 29 &= 13^2 \bmod 29 = 24, \\ 10^8 \bmod 29 &= 24^2 \bmod 29 = 25, \\ 10^{16} \bmod 29 &= 25^2 \bmod 29 = 16. \end{aligned}$$

Thus

$$10^{19} \bmod 29 = 10^{16} \times 10^2 \times 10 \bmod 29 = 16 \times 13 \times 10 \bmod 29 = 21.$$

Similarly,

$$\begin{aligned} 32^2 \bmod 41 &= 40, \\ 32^3 \bmod 41 &= 32 \times 40 \bmod 41 = 9, \\ 32^9 \bmod 41 &= 9^3 \bmod 41 = 32, \\ 32^{27} \bmod 41 &= 32^3 \bmod 41 = 9. \end{aligned}$$

By Gauss’s algorithm,

$$m = m_p y_q q + m_q y_p p \bmod n = 21 \times 17 \times 41 + 9 \times 17 \times 29 \bmod 1189 = 19074 \bmod 1189 = 50.$$

By Garner’s algorithm,

$$m = m_p + ((m_q - m_p) y_p \bmod q)\, p = 21 + ((9 - 21) \times 17 \bmod 41) \times 29 = 21 + 1 \times 29 = 50.$$

Example 3.5.8 Let us consider Example 3.4.1 for RSA signature computation. We have

$$p = 5, \quad q = 7, \quad n = 35, \quad \varphi(n) = 24, \quad e = 5, \quad d = 5, \quad m = 10.$$

To sign message $m = 10$, Alice computes

$$s_p = m^{d \bmod (p-1)} \bmod p = 10^{5 \bmod 4} \bmod 5 = 0,$$

$$s_q = m^{d \bmod (q-1)} \bmod q = 10^{5 \bmod 6} \bmod 7 = 5.$$

By the extended Euclidean algorithm

$$7 = 5 + 2, \quad 5 = 2 \times 2 + 1 \implies 1 = 5 - 2 \times (7 - 5) = 5 \times 3 - 2 \times 7.$$

We have

$$y_p = p^{-1} \bmod q = 3 \bmod 7, \quad y_q = q^{-1} \bmod p = -2 \bmod 5 = 3.$$

By Gauss’s algorithm,

$$s = s_p y_q q + s_q y_p p \bmod n = 0 \times 3 \times 7 + 5 \times 3 \times 5 \bmod 35 = 75 \bmod 35 = 5.$$

By Garner’s algorithm,

$$s = s_p + ((s_q - s_p) y_p \bmod q)\, p = 0 + (5 \times 3 \bmod 7) \times 5 = 1 \times 5 = 5.$$


Compared to Gauss’s algorithm, Garner’s algorithm does not require the final modulo $n$ reduction.
CRT-based RSA implementation can improve the efficiency of the computation in several ways. Firstly, $y_p$ and $y_q$ can be precomputed, which saves time during communication. Secondly, the intermediate values during the computation have only about half the bit length of those in the computation of $a^d \bmod n$, since they are in $\mathbb{Z}_p$ or $\mathbb{Z}_q$ rather than $\mathbb{Z}_n$. Moreover, $x_p = a^{d \bmod (p-1)} \bmod p$ and $x_q = a^{d \bmod (q-1)} \bmod q$ can be calculated by the square and multiply algorithm (Algorithms 3.7 and 3.8) or the Montgomery powering ladder (Algorithm 3.9) to further improve the efficiency. In this case, since $d \bmod (p - 1)$ and $d \bmod (q - 1)$ are much smaller than $d$, computing $x_p$ or $x_q$ requires fewer multiplications than computing $a^d \bmod p$ or $a^d \bmod q$.

3.5.2 Implementing Modular Multiplication

From the previous subsection, we see that to have more efficient modular exponentiation implementations, we need to compute modular addition, subtraction, inverse, and multiplication. For modular addition and subtraction, we can just compute the corresponding integer operations and then perform a single reduction modulo the modulus. For inverse modulo an integer, as has been mentioned a few times, we can utilize the extended Euclidean algorithm. Next, we will discuss two methods for implementing modular multiplication.
Throughout this subsection, let $n$ be an integer of bit length $\ell_n$, in particular

$$2^{\ell_n - 1} \le n < 2^{\ell_n}. \tag{3.21}$$

Let $a, b \in \mathbb{Z}_n$ be two integers. Then $0 \le a, b < n$. We would like to compute

$$R := ab \bmod n.$$

Let us assume the computer’s word size (see Sect. 2.1.2) is $\omega$. Define

$$\kappa := \left\lceil \frac{\ell_n}{\omega} \right\rceil, \quad \text{i.e.,} \quad (\kappa - 1)\omega < \ell_n \le \kappa\omega.$$

We can write

$$a = a_{\kappa-1} \| a_{\kappa-2} \| \dots \| a_0, \quad b = b_{\kappa-1} \| b_{\kappa-2} \| \dots \| b_0, \quad 0 \le a_i, b_j < 2^{\omega} \text{ for } 0 \le i, j < \kappa,$$

where $\|$ indicates concatenation. Note that some $a_i$ or $b_j$ might be 0 if the bit length of $a$ or $b$ is less than $\ell_n$. Furthermore, we have

$$a = \sum_{i=0}^{\kappa-1} a_i (2^{\omega})^i, \quad b = \sum_{j=0}^{\kappa-1} b_j (2^{\omega})^j. \tag{3.22}$$

Then the product of $a$ and $b$ is given by

$$t = ab = t_{2\kappa-1} \| t_{2\kappa-2} \| \dots \| t_0,$$

where, before carries are propagated into the next word,

$$t_x = \sum_{i,j,\ i+j=x} a_i b_j, \quad 0 \le x \le 2\kappa - 1.$$

Such a multiplication method can be described by Algorithm 3.10.

Algorithm 3.10: Standard multiplication
Input: a, b // $a, b \in \mathbb{Z}_n$, where $n \ge 2$ is an integer of bit length $\ell_n$
Output: $ab$
1  for i = 0, 1, 2, …, 2κ − 1, t_i = 0   // κ = ⌈ℓ_n/ω⌉, where ω is the word size of the computer
   // for each b_j
2  for j = 0, j < κ, j++ do
3      T_1 = 0
       // for each a_i
4      for i = 0, i < κ, i++ do
           // T_1 and T_0 have bit length at most ω
5          T_1 || T_0 = t_{i+j} + a_i b_j + T_1
6          t_{i+j} = T_0
7      t_{j+κ} = T_1
8  return t_{2κ−1} || t_{2κ−2} || … || t_0

One drawback of Algorithm 3.10 is that a variable with double word size is being processed in line 5. To see this, the maximum value of the right-hand side in line 5 is

$$(2^{\omega} - 1) + (2^{\omega} - 1)(2^{\omega} - 1) + (2^{\omega} - 1) = 2^{2\omega} - 1.$$

Moreover, to compute $R = t \bmod n$, division by $n$ will be required.


Example 3.5.9 As a simple example, let us consider word size $\omega = 2$ and let $n = 15$ be a 4-bit integer. Let $a = 13 = 1101_2$ and $b = 5 = 0101_2$. We have

$$a_0 = 01_2, \quad a_1 = 11_2, \quad b_0 = 01_2, \quad b_1 = 01_2, \quad \kappa = \left\lceil \frac{4}{2} \right\rceil = 2.$$

The product $t = t_3 \| t_2 \| t_1 \| t_0$ has bit length at most 8. Computations in lines 5–7 for each loop are as follows (all values in binary):

$$\begin{aligned} T_1 \| T_0 &= t_0 + a_0 b_0 + T_1 = 00 + 01 + 00 = 0001, & t_0 &= 01, \\ T_1 \| T_0 &= t_1 + a_1 b_0 + T_1 = 00 + 11 + 00 = 0011, & t_1 &= 11, \\ T_1 \| T_0 &= t_1 + a_0 b_1 + T_1 = 11 + 01 + 00 = 0100, & t_1 &= 00, \\ T_1 \| T_0 &= t_2 + a_1 b_1 + T_1 = 00 + 11 + 01 = 0100, & t_2 &= 00, \ t_3 = 01. \end{aligned}$$

The values for each variable in Algorithm 3.10 are listed below:

j    i    a_i    b_j    T_1    T_0    t_3    t_2    t_1    t_0
0    0    01     01     00     01     00     00     00     01
0    1    11     01     00     11     00     00     11     01
1    0    01     01     01     00     00     00     00     01
1    1    11     01     01     00     01     00     00     01

As expected, we get

$$t = 01000001_2 = 65 = 13 \times 5.$$

Furthermore, if we would like to continue the computation and find $ab \bmod 15$, we will divide 65 by 15 and calculate the remainder, which is 5.
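The word-by-word multiplication can be sketched in Python as follows. The helper names `to_words` and `from_words` and the explicit `ell_n` and `omega` parameters are chosen here for illustration; Python integers are unbounded, so the $\omega$-bit words are emulated with masks and shifts.

```python
def to_words(x, kappa, omega):
    """Split x into kappa words of omega bits, least significant word first."""
    mask = (1 << omega) - 1
    return [(x >> (omega * i)) & mask for i in range(kappa)]


def from_words(words, omega):
    """Inverse of to_words: concatenate the words back into one integer."""
    return sum(w << (omega * k) for k, w in enumerate(words))


def standard_multiply(a, b, ell_n, omega):
    """Word-wise schoolbook product (in the spirit of Algorithm 3.10).

    Returns the 2*kappa words t_0, ..., t_{2*kappa-1} of a*b.
    """
    kappa = -(-ell_n // omega)               # ceil(ell_n / omega)
    mask = (1 << omega) - 1
    aw, bw = to_words(a, kappa, omega), to_words(b, kappa, omega)
    t = [0] * (2 * kappa)
    for j in range(kappa):                   # for each word b_j
        carry = 0                            # T_1
        for i in range(kappa):               # for each word a_i
            s = t[i + j] + aw[i] * bw[j] + carry   # fits in 2*omega bits
            carry, t[i + j] = s >> omega, s & mask
        t[j + kappa] = carry
    return t


words = standard_multiply(13, 5, ell_n=4, omega=2)
print(words, from_words(words, 2))           # [1, 0, 0, 1] 65
```

The list is printed least significant word first, so `[1, 0, 0, 1]` corresponds to $t_3 \| t_2 \| t_1 \| t_0 = 01\,00\,00\,01$ as in Example 3.5.9.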

3.5.2.1 Blakely’s Method

First proposed in 1983 [Bla83], Blakely’s method for computing modular multiplication interleaves the multiplication steps with the reduction steps. The product $ab$ is computed as follows

$$t = ab = \left( \sum_{i=0}^{\kappa-1} a_i (2^{\omega})^i \right) b = \sum_{i=0}^{\kappa-1} (2^{\omega})^i a_i b,$$

where the $a_i$ are given in Eq. 3.22. Algorithm 3.11 lists the steps for computing

$$R = t \bmod n = ab \bmod n$$

with Blakely’s method.

Note that in line 3,

$$R \le 2^{\omega}(n - 1) + (2^{\omega} - 1)(n - 1) = (2^{\omega+1} - 1)n - (2^{\omega+1} - 1) < (2^{\omega+1} - 1)n.$$
Algorithm 3.11: Blakely’s method for computing modular multiplication
Input: n, a, b // $n \in \mathbb{Z}$, $n \ge 2$ has bit length $\ell_n$; $a, b \in \mathbb{Z}_n$
Output: $ab \bmod n$
1  R = 0
   // κ = ⌈ℓ_n/ω⌉, where ω is the word size of the computer
2  for i = κ − 1, i ≥ 0, i−− do
3      R = 2^ω R + a_i b
4      R = R mod n
5  return R

Thus, line 4 can be replaced by comparing $R$ with $n$ at most $2^{\omega+1} - 2$ times and subtracting $n$ from $R$ whenever $R \ge n$:
1  for j = 1, 2, …, 2^{ω+1} − 2 do
2      if R ≥ n then
3          R = R − n
4      else break

In this way, we can avoid dividing by $n$ to compute the remainder. In particular, when $\omega = 1$, $2^{\omega+1} - 2 = 2$, and we have Algorithm 3.12, which is the original proposal from Blakely [Bla83, Koç94].

Algorithm 3.12: Blakely’s method for computing modular multiplication by taking $\omega = 1$
Input: n, a, b // $n \in \mathbb{Z}$, $n \ge 2$ has bit length $\ell_n$; $a, b \in \mathbb{Z}_n$
Output: $ab \bmod n$
1  R = 0
2  for i = ℓ_n − 1, i ≥ 0, i−− do
3      R = 2R + a_i b
4      if R ≥ n then R = R − n
5      if R ≥ n then R = R − n
6  return R

Example 3.5.10 Same as in Example 3.5.9, let the word size $\omega = 2$, and

$$a = 13 = 1101_2, \quad b = 5, \quad n = 15, \quad \ell_n = 4, \quad \kappa = 2.$$

We have

$$a_0 = 01_2 = 1, \quad a_1 = 11_2 = 3.$$

Let us calculate $ab \bmod n$ using Algorithm 3.11. For $i = 1$,

$$R = 0 + 3 \times 5 \bmod 15 = 0 \bmod 15.$$

And for $i = 0$,

$$R = 0 + 1 \times 5 \bmod 15 = 5 \bmod 15.$$

We have the final result $13 \times 5 \bmod 15 = 5$.


Example 3.5.11 Let

$$a = 55 = 110111_2, \quad b = 46, \quad n = 69, \quad \omega = 2.$$

$n$ is a 7-bit integer. Then

$$a_0 = 11_2 = 3, \quad a_1 = 01_2 = 1, \quad a_2 = 11_2 = 3, \quad a_3 = 0, \quad \kappa = \left\lceil \frac{7}{2} \right\rceil = 4.$$

Computing $ab \bmod n$ with Algorithm 3.11 gives us the following intermediate values:

i = 3    line 3    R = 0
         line 4    R = 0
i = 2    line 3    R = 3 × 46 = 138
         line 4    R = 138 mod 69 = 0
i = 1    line 3    R = 1 × 46 = 46
         line 4    R = 46 mod 69 = 46
i = 0    line 3    R = 2^2 × 46 + 3 × 46 = 322
         line 4    R = 322 mod 69 = 46

We have $ab \bmod n = 46$.
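The interleaved multiply-and-reduce loop, with the reduction done by repeated subtraction instead of division, might look as follows in Python. This is a sketch: the function name and the default `omega=2` are chosen here for illustration, and the words $a_i$ are extracted with shifts and masks.

```python
def blakely_modmul(a, b, n, omega=2):
    """Interleaved modular multiplication a*b mod n (in the spirit of Algorithm 3.11).

    Assumes 0 <= a, b < n. The reduction uses at most 2^(omega+1) - 2
    conditional subtractions per word, as derived in the text.
    """
    kappa = -(-n.bit_length() // omega)      # ceil(ell_n / omega)
    mask = (1 << omega) - 1
    r = 0
    for i in range(kappa - 1, -1, -1):
        a_i = (a >> (omega * i)) & mask      # i-th omega-bit word of a
        r = (r << omega) + a_i * b           # R = 2^omega * R + a_i * b
        for _ in range((1 << (omega + 1)) - 2):
            if r < n:
                break
            r -= n
    return r


print(blakely_modmul(13, 5, 15))             # 5, as in Example 3.5.10
print(blakely_modmul(55, 46, 69))            # 46, as in Example 3.5.11
print(blakely_modmul(55, 46, 69, omega=1))   # 46, bit-serial variant (Algorithm 3.12)
```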


Now we can expand the modular multiplication computations in the square and multiply algorithm with Blakely’s method. The details are listed in Algorithm 3.13 for the right-to-left square and multiply algorithm, and in Algorithm 3.14 for the left-to-right square and multiply algorithm.
Since $\ell_n$ is the bit length of $n$, the bit lengths of the variables “result,” “t,” and “a” are at most $\ell_n$. We can write

$$\text{result} = \sum_{j=0}^{\kappa-1} h_j (2^{\omega})^j, \quad t = \sum_{j=0}^{\kappa-1} t_j (2^{\omega})^j, \quad a = \sum_{j=0}^{\kappa-1} a_j (2^{\omega})^j.$$

Algorithm 3.13: Right-to-left square and multiply algorithm with Blakely’s method for modular multiplication
Input: n, a, d // $n \in \mathbb{Z}$, $n \ge 2$ has bit length $\ell_n$; $a \in \mathbb{Z}_n$; $d \in \mathbb{Z}_{\varphi(n)}$ has bit length $\ell_d$
Output: $a^d \bmod n$
1   result = 1
2   t = a
3   for i = 0, i < ℓ_d, i++ do
        // ith bit of d is 1
4       if d_i = 1 then
            // lines 5–9 implement result = result ∗ t mod n
5           R = 0
            // κ = ⌈ℓ_n/ω⌉, where ω is the word size of the computer
6           for j = κ − 1, j ≥ 0, j−− do
7               R = 2^ω R + h_j t
8               R = R mod n
9           result = R
        // lines 10–14 implement t = t ∗ t mod n
10      R = 0
11      for j = κ − 1, j ≥ 0, j−− do
12          R = 2^ω R + t_j t
13          R = R mod n
14      t = R
15  return result

Then, in Algorithm 3.13, lines 5–9 implement $\text{result} = \text{result} * t \bmod n$ (line 4 of Algorithm 3.7) and lines 10–14 implement $t = t * t \bmod n$ (line 5 of Algorithm 3.7). Similarly, in Algorithm 3.14, lines 3–7 implement $t = t * t \bmod n$ (line 3 of Algorithm 3.8) and lines 9–13 implement $t = a * t \bmod n$ (line 5 of Algorithm 3.8).
Example 3.5.12 Let us repeat the computation in Example 3.5.2 with Blakely’s method. We will calculate

$$a^d \bmod n = 5^4 \bmod 23 = 625 \bmod 23 = 4.$$

Suppose the computer word size is $\omega = 2$. $n = 23 = 10111_2$ has $\ell_n = 5$ bits, so $\kappa = \lceil 5/2 \rceil = 3$. Lines 1 and 2 in Algorithm 3.13 give

$$\text{result} = 1, \quad h_0 = 01, \quad h_1 = 00, \quad h_2 = 00, \quad t = 5 = 0101_2, \quad t_0 = 01, \quad t_1 = 01, \quad t_2 = 00.$$

Algorithm 3.14: Left-to-right square and multiply algorithm with Blakely’s method for modular multiplication
Input: n, a, d // $n \in \mathbb{Z}$, $n \ge 2$ has bit length $\ell_n$; $a \in \mathbb{Z}_n$; $d \in \mathbb{Z}_{\varphi(n)}$ has bit length $\ell_d$
Output: $a^d \bmod n$
1   t = 1
2   for i = ℓ_d − 1, i ≥ 0, i−− do
        // lines 3–7 implement t = t ∗ t mod n
3       R = 0
        // κ = ⌈ℓ_n/ω⌉, where ω is the word size of the computer
4       for j = κ − 1, j ≥ 0, j−− do
5           R = 2^ω R + t_j t
6           R = R mod n
7       t = R
        // ith bit of d is 1
8       if d_i = 1 then
            // lines 9–13 implement t = a ∗ t mod n
9           R = 0
10          for j = κ − 1, j ≥ 0, j−− do
11              R = 2^ω R + a_j t
12              R = R mod n
13          t = R
14  return t

The intermediate values during the computation with Algorithm 3.13 are:

i = 0, d_0 = 0
    loop line 11    j = 2    R = 0
                    j = 1    R = 2^ω R + t_1 t mod n = 5 mod 23
                    j = 0    R = 2^ω R + t_0 t mod n = 2^2 × 5 + 1 × 5 mod 23 = 2 mod 23
    line 14         t = 2    t_0 = 10, t_1 = 00, t_2 = 00
i = 1, d_1 = 0
    loop line 11    j = 2    R = 0
                    j = 1    R = 0
                    j = 0    R = t_0 t mod n = 2 × 2 mod 23 = 4 mod 23
    line 14         t = 4    t_0 = 00, t_1 = 01, t_2 = 00
i = 2, d_2 = 1
    loop line 6     j = 2    R = 0
                    j = 1    R = 0
                    j = 0    R = h_0 t mod n = 4 mod 23
    line 9          result = 4

And the output is 4. Similarly, with Algorithm 3.14, line 1 gives

$$t = 1, \quad t_0 = 01, \quad t_1 = 00, \quad t_2 = 00.$$

We also have

$$a = 5, \quad a_0 = 01, \quad a_1 = 01, \quad a_2 = 00.$$

The intermediate values are

i = 2, d_2 = 1
    loop line 4     j = 2    R = 0
                    j = 1    R = 0
                    j = 0    R = t_0 t mod n = 1 mod 23
    line 7          t = 1    t_0 = 01, t_1 = 00, t_2 = 00
    loop line 10    j = 2    R = 0
                    j = 1    R = 2^ω R + a_1 t = 1 mod 23
                    j = 0    R = 2^ω R + a_0 t = 2^2 + 1 = 5 mod 23
    line 13         t = 5    t_0 = 01, t_1 = 01, t_2 = 00
i = 1, d_1 = 0
    loop line 4     j = 2    R = 0
                    j = 1    R = t_1 t mod n = 5 mod 23
                    j = 0    R = 2^ω R + t_0 t mod n = 2^2 × 5 + 5 mod 23 = 25 mod 23 = 2 mod 23
    line 7          t = 2    t_0 = 10, t_1 = 00, t_2 = 00
i = 0, d_0 = 0
    loop line 4     j = 2    R = 0
                    j = 1    R = 0
                    j = 0    R = t_0 t mod n = 2 × 2 mod 23 = 4 mod 23
    line 7          t = 4

The output is also 4.
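The combination of the left-to-right square and multiply loop with the interleaved multiplication can be sketched in Python as below. The names and the returned trace list are illustrative choices made here; the inner reduction uses repeated subtraction, which the text shows is equivalent to the `mod n` step.

```python
def blakely_modmul(a, b, n, omega=2):
    """Interleaved a*b mod n for 0 <= a, b < n, word size omega."""
    kappa = -(-n.bit_length() // omega)
    mask = (1 << omega) - 1
    r = 0
    for j in range(kappa - 1, -1, -1):
        r = (r << omega) + ((a >> (omega * j)) & mask) * b
        while r >= n:                 # at most 2^(omega+1) - 2 subtractions
            r -= n
    return r


def pow_left_to_right_blakely(a, d, n, omega=2):
    """Left-to-right square and multiply with interleaved multiplication
    (in the spirit of Algorithm 3.14). Returns the result and the value
    of t at the end of each iteration of the outer loop."""
    t = 1 % n
    trace = []
    for i in range(d.bit_length() - 1, -1, -1):
        t = blakely_modmul(t, t, n, omega)       # t = t * t mod n
        if (d >> i) & 1:
            t = blakely_modmul(a, t, n, omega)   # t = a * t mod n
        trace.append(t)
    return t, trace


print(pow_left_to_right_blakely(5, 4, 23))       # (4, [5, 2, 4])
```

The trace `[5, 2, 4]` agrees with the values of $t$ after the iterations $i = 2, 1, 0$ in Example 3.5.12.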


Similarly, we can adopt Blakely’s method in the Montgomery powering ladder (Algorithm 3.9) and we get Algorithm 3.15 for computing modular exponentiation. Since $\ell_n$ is the bit length of $n$, the bit lengths of the variables $R_0$ and $R_1$ are at most $\ell_n$. We can write

$$R_0 = \sum_{i=0}^{\kappa-1} R_{0i} (2^{\omega})^i, \quad R_1 = \sum_{i=0}^{\kappa-1} R_{1i} (2^{\omega})^i.$$

Then lines 5–9 implement $R_1 = R_0 R_1 \bmod n$ (line 5 of Algorithm 3.9). Lines 10–14 implement $R_0 = R_0^2 \bmod n$ (line 6 of Algorithm 3.9). Lines 16–20 implement $R_0 = R_0 R_1 \bmod n$ (line 8 of Algorithm 3.9). Lines 21–25 implement $R_1 = R_1^2 \bmod n$ (line 9 of Algorithm 3.9).

Algorithm 3.15: Montgomery powering ladder with Blakely’s method for computing modular multiplication
Input: n, a, d // $n \in \mathbb{Z}$, $n \ge 2$; $a \in \mathbb{Z}_n$; $d \in \mathbb{Z}_{\varphi(n)}$ has bit length $\ell_d$
Output: $a^d \bmod n$
1   R_0 = 1
2   R_1 = a
3   for j = ℓ_d − 1, j ≥ 0, j−− do
4       if d_j = 0 then
            // lines 5–9 implement R_1 = R_0 R_1 mod n
5           R = 0
6           for i = κ − 1, i ≥ 0, i−− do
                // κ = ⌈ℓ_n/ω⌉, where ω is the word size of the computer
7               R = 2^ω R + R_{0i} R_1
8               R = R mod n
9           R_1 = R
            // lines 10–14 implement R_0 = R_0^2 mod n
10          R = 0
11          for i = κ − 1, i ≥ 0, i−− do
12              R = 2^ω R + R_{0i} R_0
13              R = R mod n
14          R_0 = R
15      else
            // lines 16–20 implement R_0 = R_0 R_1 mod n
16          R = 0
17          for i = κ − 1, i ≥ 0, i−− do
18              R = 2^ω R + R_{0i} R_1
19              R = R mod n
20          R_0 = R
            // lines 21–25 implement R_1 = R_1^2 mod n
21          R = 0
22          for i = κ − 1, i ≥ 0, i−− do
23              R = 2^ω R + R_{1i} R_1
24              R = R mod n
25          R_1 = R
26  return R_0

Example 3.5.13 Here we repeat the computation in Example 3.5.4 with Algorithm 3.15. Let

$$n = 23, \quad d = 4 = 100_2, \quad a = 5.$$

We have calculated that $a^d \bmod n = 4$. Same as in Example 3.5.12, we assume $\omega = 2$. Then we have $\ell_n = 5$ and $\kappa = 3$. With Algorithm 3.15, lines 1 and 2 give

$$R_0 = 1, \quad R_{00} = 01, \quad R_{01} = 00, \quad R_{02} = 00, \quad R_1 = 5, \quad R_{10} = 01, \quad R_{11} = 01, \quad R_{12} = 00.$$

The intermediate values are

j = 2, d_2 = 1
    loop line 17    i = 2    R = 0
                    i = 1    R = 0
                    i = 0    R = R_{00} R_1 mod n = 5 mod 23
    line 20         R_0 = 5     R_{00} = 01, R_{01} = 01, R_{02} = 00
    loop line 22    i = 2    R = 0
                    i = 1    R = 2^ω R + R_{11} R_1 mod n = 5 mod 23
                    i = 0    R = 2^ω R + R_{10} R_1 mod n = 2^2 × 5 + 5 mod 23 = 2
    line 25         R_1 = 2     R_{10} = 10, R_{11} = 00, R_{12} = 00
j = 1, d_1 = 0
    loop line 6     i = 2    R = 0
                    i = 1    R = R_{01} R_1 mod n = 2 mod 23
                    i = 0    R = 2^ω R + R_{00} R_1 mod n = 2^2 × 2 + 2 mod 23 = 10
    line 9          R_1 = 10    R_{10} = 10, R_{11} = 10, R_{12} = 00
    loop line 11    i = 2    R = 0
                    i = 1    R = 2^ω R + R_{01} R_0 mod n = 5 mod 23
                    i = 0    R = 2^ω R + R_{00} R_0 mod n = 2^2 × 5 + 5 mod 23 = 2
    line 14         R_0 = 2     R_{00} = 10, R_{01} = 00, R_{02} = 00
j = 0, d_0 = 0
    loop line 6     i = 2    R = 0
                    i = 1    R = 0
                    i = 0    R = R_{00} R_1 mod n = 2 × 10 mod 23 = 20
    line 9          R_1 = 20
    loop line 11    i = 2    R = 0
                    i = 1    R = 0
                    i = 0    R = R_{00} R_0 mod n = 2 × 2 mod 23 = 4
    line 14         R_0 = 4

Hence the output is 4.

3.5.2.2 Montgomery’s Method

In this part, we discuss another method for computing modular multiplication, attributed to Peter Montgomery [Mon85].

Suppose $n$ is odd and let $r = 2^{\ell_n}$. In particular, $\gcd(n, r) = 1$. By Bézout’s identity (Theorem 1.1.3), there exist integers $r^{-1}$ and $\hat{n}$ such that

$$r r^{-1} - n \hat{n} = 1. \tag{3.23}$$

We have discussed that such a pair of integers $r^{-1}$ and $\hat{n}$ can be found with the extended Euclidean algorithm.
Remark 3.5.1 We note that for any positive integer $t$,

$$r r^{-1} - n \hat{n} + trn - trn = 1 \implies r(r^{-1} + tn) - n(\hat{n} + tr) = 1. \tag{3.24}$$

Then $r^{-1} + tn$ and $\hat{n} + tr$ can replace $r^{-1}$ and $\hat{n}$ in Eq. 3.23.

For the rest of this part, we further require that $\hat{n}$ is positive.
Example 3.5.14 Let $n = 15$. Then $\ell_n = 4$ and $r = 2^4 = 16$. By the extended Euclidean algorithm

$$16 = 15 + 1 \implies 1 = 16 - 15,$$

we have $r^{-1} = 1$ and $\hat{n} = 1$.


Example 3.5.15 Let $n = 23$. Then $\ell_n = 5$ and $r = 2^5 = 32$. By the extended Euclidean algorithm

$$32 = 23 + 9, \quad 23 = 9 \times 2 + 5, \quad 9 = 5 + 4, \quad 5 = 4 + 1,$$

and

$$1 = 5 - (9 - 5) = -9 + (23 - 9 \times 2) \times 2 = 23 \times 2 - 5 \times (32 - 23) = 23 \times 7 - 32 \times 5.$$

Hence $r^{-1} = -5$ and $\hat{n} = -7$. To make $\hat{n}$ positive, we can take $t = 1$ as in Eq. 3.24, and we have

$$r^{-1} = -5 + n = -5 + 23 = 18, \quad \hat{n} = -7 + r = -7 + 32 = 25.$$

We can check that

$$18r - 25n = 18 \times 32 - 25 \times 23 = 1.$$

Example 3.5.16 Let $n = 57$. Then $\ell_n = 6$ and $r = 2^6 = 64$. By the extended Euclidean algorithm

$$64 = 57 + 7, \quad 57 = 7 \times 8 + 1 \implies 1 = 57 - (64 - 57) \times 8 = -64 \times 8 + 57 \times 9,$$

and we have $r^{-1} = -8$ and $\hat{n} = -9$. To get a positive $\hat{n}$, we replace them (see Remark 3.5.1) by

$$r^{-1} = -8 + n = -8 + 57 = 49, \quad \hat{n} = -9 + r = -9 + 64 = 55.$$

We can check that

$$49r - 55n = 49 \times 64 - 55 \times 57 = 1.$$

Example 3.5.17 Let $n = 1189$. Then $\ell_n = 11$ and $r = 2^{11} = 2048$. By the extended Euclidean algorithm

$$\begin{aligned} 2048 &= 1189 + 859, & 1189 &= 859 + 330, & 859 &= 330 \times 2 + 199, & 330 &= 199 + 131, \\ 199 &= 131 + 68, & 131 &= 68 + 63, & 68 &= 63 + 5, & 63 &= 5 \times 12 + 3, \\ 5 &= 3 + 2, & 3 &= 2 + 1, \end{aligned}$$

and

$$\begin{aligned} 1 &= 3 - 2 = (63 - 5 \times 12) \times 2 - 5 = 63 \times 2 - (68 - 63) \times 25 \\ &= (131 - 68) \times 27 - 68 \times 25 \\ &= 131 \times 27 - (199 - 131) \times 52 = (330 - 199) \times 79 - 199 \times 52 \\ &= 330 \times 79 - (859 - 330 \times 2) \times 131 \\ &= (1189 - 859) \times 341 - 859 \times 131 = 1189 \times 341 - (2048 - 1189) \times 472 \\ &= 2048 \times (-472) - 1189 \times (-813). \end{aligned}$$

We have $r^{-1} = -472$ and $\hat{n} = -813$. To have a positive $\hat{n}$, we take

$$r^{-1} = -472 + 1189 = 717, \quad \hat{n} = 2048 - 813 = 1235.$$
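The constants of Examples 3.5.14 to 3.5.17 can be reproduced with a small Python sketch of the extended Euclidean algorithm. The function names are illustrative choices made here. The sketch normalizes $r^{-1}$ into the range $[0, n)$ and then derives $\hat{n}$ from Eq. 3.23, which always yields a positive $\hat{n}$ for odd $n \ge 3$; this is one convenient convention among the admissible pairs described in Remark 3.5.1.

```python
def extended_gcd(x, y):
    """Return (g, u, v) with u*x + v*y == g == gcd(x, y)."""
    old_r, r = x, y
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        quo = old_r // r
        old_r, r = r, old_r - quo * r
        old_u, u = u, old_u - quo * u
        old_v, v = v, old_v - quo * v
    return old_r, old_u, old_v


def montgomery_constants(n):
    """Return (r, r_inv, n_hat) with r = 2^ell_n and r*r_inv - n*n_hat == 1."""
    r = 1 << n.bit_length()
    g, u, _ = extended_gcd(r, n)       # u*r + v*n == 1 when n is odd
    if g != 1:
        raise ValueError("n must be odd")
    r_inv = u % n                      # shift by a multiple of n (Remark 3.5.1)
    n_hat = (r * r_inv - 1) // n       # exact division by Eq. 3.23
    return r, r_inv, n_hat


for n in (15, 23, 57, 1189):
    print(n, montgomery_constants(n))
# 15 (16, 1, 1)
# 23 (32, 18, 25)
# 57 (64, 49, 55)
# 1189 (2048, 717, 1235)
```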

Before computing $R = ab \bmod n$, we first introduce Algorithm 3.16, denoted MonPro, which calculates $abr^{-1} \bmod n$ given $a$, $b$, $n$, and $\hat{n}$.
By Eq. 3.23,

$$1 + n\hat{n} \equiv 0 \bmod r.$$

Then in line 3 of Algorithm 3.16, since $m \equiv t\hat{n} \bmod r$,

$$t + mn \equiv t + t\hat{n}n = t(1 + \hat{n}n) \equiv 0 \bmod r,$$

so $t + mn$ is divisible by $r$, and the output $u$ is an integer. By our choice of $r = 2^{\ell_n}$ and Eq. 3.21,

$$t = ab < rn.$$

From line 2 we know $m < r$. Hence in line 3,

$$u < \frac{rn + rn}{r} = 2n,$$

which shows that lines 4–5 calculate $u \bmod n$. Furthermore,

$$u \equiv \frac{ab + mn}{r} \equiv abr^{-1} + mnr^{-1} \equiv abr^{-1} \bmod n.$$

Thus, Algorithm 3.16 indeed outputs $abr^{-1} \bmod n$.

Algorithm 3.16: MonPro, Montgomery product algorithm
Input: n, r, n̂, a, b // $n$ is an odd integer of bit length $\ell_n$; $r = 2^{\ell_n}$; $\hat{n}$ is a positive integer satisfying Eq. 3.23; $a, b \in \mathbb{Z}_n$
Output: $abr^{-1} \bmod n$
1  t = ab
2  m = t n̂ mod r
3  u = (t + mn)/r
4  if u ≥ n then
5      u = u − n
6  return u

Let .x = x𝓁x −1 x𝓁x −2 . . . x1 x0 be a positive integer of bit length .𝓁x . By definition


(see Theorem 1.1.1), we know that

x −1
𝓁∑
x=
. xi 2i .
i=0

If .𝓁x ≥ 𝓁n , for any .i ≥ 𝓁n , .xi 2i is a multiple of .r = 2𝓁n . Thus,

−1,𝓁n −1}
min{𝓁x∑
x mod r =
. xi 2i .
i=0

In other words, to compute .x mod r, we just keep the least significant .𝓁n bits of x.
Note that the integer .r − 1 has binary representation given by a binary string with
.𝓁n 1s. We have

x mod r = x & (r − 1).


.

We know that .a, b ≥ 0. Since we also choose .n̂ > 0, line 2 can be replaced by
3.5 Implementations of RSA Cipher and RSA Signatures 193

m = t n̂ & (r − 1).
.

In case x is a multiple of r. We have


| 𝓁x −1
| ∑
2𝓁n ||
. xi 2i .
i=0

It is easy to show that .xi = 0 for .0 ≤ i < 𝓁n . And

x −1
𝓁∑
x
. = xi 2i . (3.25)
r
i=𝓁n

For any positive integer $s \le \ell_x$, we define right shift of $x$ by $s$ bits to be the integer $x_{\ell_x-1} x_{\ell_x-2} \dots x_s$.⁴ We write

$$x \gg s := x_{\ell_x-1} x_{\ell_x-2} \dots x_s. \tag{3.26}$$

Compared with Eq. 3.25, division by $r$ is equivalent to right shift by $\ell_n$. We have shown that $t + mn$ in line 3 is a multiple of $r$. Then line 3 can be replaced by

$$u = (t + mn) \gg \ell_n.$$
In summary, Algorithm 3.16 can be rewritten as Algorithm 3.17. The discussions above demonstrate the main advantage of using MonPro over a standard modular multiplication method: the operations modulo $n$ are replaced by operations modulo $r$, which can be simplified to an AND operation. Furthermore, to compute division by $r$, we can simply do a right shift.
Example 3.5.18 Let $n = 15$. Then

$$\ell_n = 4, \quad r = 2^4 = 16, \quad r - 1 = 15.$$

We have

$$53 \bmod r = 5, \quad 53 \mathbin{\&} 15 = 110101_2 \mathbin{\&} 1111_2 = 101_2 = 5.$$

Furthermore,

$$\frac{240}{r} = \frac{240}{16} = 15, \quad 240 \gg 4 = 11110000_2 \gg 4 = 1111_2 = 15.$$

4 Note that when .s = 𝓁x , we have .x ⪢ s = 0.



Algorithm 3.17: MonPro, Montgomery product algorithm


Input: $n, r, \hat{n}, a, b$ // $n$ is an odd integer of bit length $\ell_n$; $r = 2^{\ell_n}$; $\hat{n}$ is a positive integer satisfying Eq. 3.23; $a, b \in \mathbb{Z}_n$
Output: $abr^{-1} \bmod n$
1 $t = ab$
2 $m = t\hat{n} \mathbin{\&} (r - 1)$ // for a non-negative integer, mod $r$ is equivalent to computing AND with $r - 1$. This line implements line 2 of Algorithm 3.16.
3 $u = (t + mn) \gg \ell_n$ // for a non-negative multiple of $r$, shift right by $\ell_n$ bits is equivalent to division by $r = 2^{\ell_n}$. This line implements line 3 of Algorithm 3.16.
4 if $u \ge n$ then
5     $u = u - n$
6 return $u$
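The mask-and-shift formulation of Algorithm 3.17 maps directly onto integer operations. The following Python sketch is for illustration only: the function name `mon_pro` is our own, it assumes that `r` is a power of two with `r > n` and that `n_hat` satisfies Eq. 3.23, and it makes no attempt to run in constant time.

```python
def mon_pro(n, r, n_hat, a, b):
    """Montgomery product a * b * r^(-1) mod n, following Algorithm 3.17.

    Assumes n is odd, r = 2**ell > n, 1 + n * n_hat is a multiple of r,
    and 0 <= a, b < n.
    """
    ell = r.bit_length() - 1       # r = 2**ell
    t = a * b                      # line 1
    m = (t * n_hat) & (r - 1)      # line 2: reduction mod r via AND
    u = (t + m * n) >> ell         # line 3: exact division by r via shift
    if u >= n:                     # lines 4-5: final conditional subtraction
        u -= n
    return u
```

With the parameters of Example 3.5.19, `mon_pro(23, 32, 25, 22, 22)` evaluates to 18, matching the hand computation.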

Example 3.5.19 Let $n = 23$. Then $\ell_n = 5$ and $r = 2^5 = 32$. In Example 3.5.15 we have discussed that $r^{-1} = 18$ and $\hat{n} = 25$. We will compute a few modular multiplications which will be useful for Example 3.5.27.
Let $a = 22$, $b = 22$. Following Algorithm 3.16, we have

$$t = ab = 22 \times 22 = 484,$$
$$m = t\hat{n} \bmod r = 484 \times 25 \bmod 32 = 4,$$
$$u = \frac{t + mn}{r} = \frac{484 + 4 \times 23}{32} = 18,$$

and the output is 18. Indeed, $abr^{-1} \bmod n = 22 \times 22 \times 18 \bmod 23 = 18$.
Let $a = 18$, $b = 18$. We have

$$t = ab = 18 \times 18 = 324,$$
$$m = t\hat{n} \bmod r = 324 \times 25 \bmod 32 = 4,$$
$$u = \frac{t + mn}{r} = \frac{324 + 4 \times 23}{32} = 13,$$

and the output is 13. We can verify that $abr^{-1} \bmod n = 18 \times 18 \times 18 \bmod 23 = 13$.
Let $a = 9$, $b = 13$. We have

$$t = ab = 9 \times 13 = 117,$$
$$m = t\hat{n} \bmod r = 117 \times 25 \bmod 32 = 13,$$
$$u = \frac{t + mn}{r} = \frac{117 + 13 \times 23}{32} = 13,$$

and the output is 13. We can verify that $abr^{-1} \bmod n = 9 \times 13 \times 18 \bmod 23 = 13$.
Let $a = 13$, $b = 13$. We have

$$t = ab = 169,$$
$$m = t\hat{n} \bmod r = 169 \times 25 \bmod 32 = 1,$$
$$u = \frac{t + mn}{r} = \frac{169 + 1 \times 23}{32} = 6,$$

and the output is 6. We can verify that $abr^{-1} \bmod n = 13 \times 13 \times 18 \bmod 23 = 6$.
Let $a = 13$, $b = 1$. We have

$$t = ab = 13,$$
$$m = t\hat{n} \bmod r = 13 \times 25 \bmod 32 = 5,$$
$$u = \frac{t + mn}{r} = \frac{13 + 5 \times 23}{32} = 4,$$

and the output is 4. We can verify that $abr^{-1} \bmod n = 13 \times 18 \bmod 23 = 4$.
Let $a = 9$, $b = 9$. We have

$$t = ab = 81,$$
$$m = t\hat{n} \bmod r = 81 \times 25 \bmod 32 = 9,$$
$$u = \frac{t + mn}{r} = \frac{81 + 9 \times 23}{32} = 9,$$

and the output is 9. We can verify that $abr^{-1} \bmod n = 9 \times 9 \times 18 \bmod 23 = 9$.
Let $a = 9$, $b = 22$. We have

$$t = ab = 198,$$
$$m = t\hat{n} \bmod r = 198 \times 25 \bmod 32 = 22,$$
$$u = \frac{t + mn}{r} = \frac{198 + 22 \times 23}{32} = 22,$$

and the output is 22. We can verify that $abr^{-1} \bmod n = 9 \times 22 \times 18 \bmod 23 = 22$.
Example 3.5.20 Let $n = 15$, $a = 3$, $b = 5$. We have discussed in Example 3.5.14 that $r = 2^4 = 16$, $r^{-1} = 1$ and $\hat{n} = 1$. Following Algorithm 3.16, we have

$$t = ab = 3 \times 5 = 15,$$
$$m = t\hat{n} \bmod r = 15 \times 1 \bmod 16 = 15,$$
$$u = \frac{t + mn}{r} = \frac{15 + 15 \times 15}{16} = 15.$$

Since $u = 15 \ge n$, lines 4–5 subtract $n$, and the output is 0. Indeed, $abr^{-1} \bmod n = 15 \bmod 15 = 0$.


Example 3.5.21 Let $n = 57$, $a = 3$, $b = 5$. We have discussed in Example 3.5.16 that $r = 64$, $r^{-1} = 49$ and $\hat{n} = 55$. Following Algorithm 3.16, we have

$$t = ab = 3 \times 5 = 15,$$
$$m = t\hat{n} \bmod r = 15 \times 55 \bmod 64 = 825 \bmod 64 = 57,$$
$$u = \frac{t + mn}{r} = \frac{15 + 57 \times 57}{64} = 51,$$

and the output is 51. We can check that

$$abr^{-1} \bmod n = 3 \times 5 \times 49 \bmod 57 = 735 \bmod 57 = 51.$$

Example 3.5.22 Let $n = 57$, $a = 21$, $b = 5$. We know from Example 3.5.16 that

$$r = 64, \quad r^{-1} = 49, \quad \hat{n} = 55.$$

Following Algorithm 3.16, we have

$$t = ab = 21 \times 5 = 105,$$
$$m = t\hat{n} \bmod r = 105 \times 55 \bmod 64 = 15,$$
$$u = \frac{t + mn}{r} = \frac{105 + 15 \times 57}{64} = 15,$$

and the output is 15. We can check that

$$abr^{-1} \bmod n = 21 \times 5 \times 49 \bmod 57 = 5145 \bmod 57 = 15.$$

For any $a \in \mathbb{Z}_n$, we define the $n$-residue of $a$ with respect to $r$ as

$$a_r := ar \bmod n.$$

Example 3.5.23 Let $n = 15$ and $a = 3$. Then $r = 16$ and

$$a_r = ar \bmod n = 3 \times 16 \bmod 15 = (3 \bmod 15) \times (16 \bmod 15) = 3 \times 1 \bmod 15 = 3.$$

Example 3.5.24 Let $n = 57$ and $a = 3$. Then $r = 64$ and

$$a_r = ar \bmod n = 3 \times 64 \bmod 57 = (3 \bmod 57)(64 \bmod 57) = 3 \times 7 \bmod 57 = 21.$$

To compute $R = ab \bmod n$, we note that

$$R = ab \bmod n = a_r b r^{-1} \bmod n = \mathrm{MonPro}(a_r, b).$$

We refer to such a computation as Montgomery’s method for modular multiplication. Details are given in Algorithm 3.18.

Algorithm 3.18: Montgomery’s method for computing modular multiplication


Input: $n, r, a, b$ // $n$ is an odd integer of bit length $\ell_n$; $r = 2^{\ell_n}$; $a, b \in \mathbb{Z}_n$
Output: $ab \bmod n$
1 Compute a positive $\hat{n}$ with the extended Euclidean algorithm (Algorithm 1.2)
2 $a_r = ar \bmod n$
3 $u = \mathrm{MonPro}(n, r, \hat{n}, a_r, b)$ // Algorithm 3.16 or 3.17
4 return $u$

Example 3.5.25 Let $n = 15$, $a = 3$, and $b = 5$. We have discussed that $a_r = 3$ (see Example 3.5.23) and $\mathrm{MonPro}(3, 5) = 0$ (see Example 3.5.20). Then by Algorithm 3.18, $ab \bmod n = 0$. Indeed, $ab \bmod n = 3 \times 5 \bmod 15 = 0$.

Example 3.5.26 Let $n = 57$, $a = 3$, and $b = 5$. We know that $a_r = 21$ (see Example 3.5.24) and $\mathrm{MonPro}(21, 5) = 15$ (see Example 3.5.22). Then by Algorithm 3.18, $ab \bmod n = 15$. We can check that $ab \bmod n = 3 \times 5 \bmod 57 = 15$.
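The steps of Algorithm 3.18 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the book's implementation: the names `mon_pro` and `mont_mul` are ours, and instead of the extended Euclidean algorithm (Algorithm 1.2) the sketch obtains $\hat{n}$ from Python's built-in modular inverse `pow(n, -1, r)` (available from Python 3.8), using the fact that Eq. 3.23 is equivalent to $\hat{n} \equiv -n^{-1} \bmod r$.

```python
def mon_pro(n, r, n_hat, a, b):
    """Montgomery product a * b * r^(-1) mod n (Algorithm 3.17)."""
    ell = r.bit_length() - 1       # r = 2**ell
    t = a * b
    m = (t * n_hat) & (r - 1)      # t * n_hat mod r
    u = (t + m * n) >> ell         # (t + m * n) / r
    return u - n if u >= n else u


def mont_mul(n, a, b):
    """Compute a * b mod n for odd n with one MonPro call (Algorithm 3.18)."""
    r = 1 << n.bit_length()        # r = 2**ell_n, so r > n
    n_hat = (-pow(n, -1, r)) % r   # positive n_hat with 1 + n * n_hat = 0 mod r
    a_r = (a * r) % n              # n-residue of a
    return mon_pro(n, r, n_hat, a_r, b)
```

For instance, `mont_mul(57, 3, 5)` reproduces the value 15 from Example 3.5.26. Note that the sketch still performs one ordinary reduction modulo $n$ to obtain `a_r`, which is the cost discussed in the next paragraph.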
Utilizing MonPro for computing modular multiplication as in Algorithm 3.18 is not optimal as it requires computing $ar \bmod n$ for each multiplication. Even though $\hat{n}$ can be precomputed by the extended Euclidean algorithm, it is time-consuming. MonPro will be more useful when multiple multiplications are computed. We will discuss a more efficient way of using MonPro.
By Corollary 1.4.2, the set

$$\mathbb{Z}_n^r := \{a_r = ar \bmod n \mid a \in \mathbb{Z}_n\}$$

contains the same elements modulo $n$ as $\mathbb{Z}_n$. We define addition $+_{\mathrm{Mon}}$ and multiplication $\times_{\mathrm{Mon}}$ operations on $\mathbb{Z}_n^r$ as follows:

$$a_r +_{\mathrm{Mon}} b_r := (a + b)_r, \quad a_r \times_{\mathrm{Mon}} b_r := (ab)_r.$$

Then we have the following lemma.


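Lemma 3.5.1 $(\mathbb{Z}_n^r, +_{\mathrm{Mon}}, \times_{\mathrm{Mon}})$ is a commutative ring with additive identity $0_r$ and multiplicative identity $1_r$.
Proof Firstly, $(a + b)_r = (a + b)r \bmod n$ and $(ab)_r = abr \bmod n$ are both in $\mathbb{Z}_n^r$. Thus $\mathbb{Z}_n^r$ is closed under $+_{\mathrm{Mon}}$ and $\times_{\mathrm{Mon}}$.
Associativity and commutativity of $+_{\mathrm{Mon}}$ follow from those of addition in $\mathbb{Z}_n$. The identity element for $+_{\mathrm{Mon}}$ is $0_r = 0 \bmod n$ since for any $a_r \in \mathbb{Z}_n^r$,

$$a_r +_{\mathrm{Mon}} 0_r = (a + 0)_r = a_r.$$

The inverse of $a_r$ with respect to $+_{\mathrm{Mon}}$ is $(-a)_r$, where $-a$ is the inverse of $a$ in $\mathbb{Z}_n$ with respect to addition modulo $n$:

$$a_r +_{\mathrm{Mon}} (-a)_r = (a + (-a))_r = 0_r.$$

We have proved that $(\mathbb{Z}_n^r, +_{\mathrm{Mon}})$ is an abelian group.
Now, for any $a_r, b_r, c_r \in \mathbb{Z}_n^r$,

$$(a_r \times_{\mathrm{Mon}} b_r) \times_{\mathrm{Mon}} c_r = (ab)_r \times_{\mathrm{Mon}} c_r = (abc)_r = abcr \bmod n,$$
$$a_r \times_{\mathrm{Mon}} (b_r \times_{\mathrm{Mon}} c_r) = a_r \times_{\mathrm{Mon}} (bc)_r = (abc)_r = abcr \bmod n.$$

Hence

$$(a_r \times_{\mathrm{Mon}} b_r) \times_{\mathrm{Mon}} c_r = a_r \times_{\mathrm{Mon}} (b_r \times_{\mathrm{Mon}} c_r)$$

and $\times_{\mathrm{Mon}}$ is associative. Commutativity of $\times_{\mathrm{Mon}}$ follows from that of multiplication in $\mathbb{Z}_n$. Moreover,

$$a_r \times_{\mathrm{Mon}} (b_r +_{\mathrm{Mon}} c_r) = a_r \times_{\mathrm{Mon}} (b + c)_r = (a(b + c))_r = (ab + ac)_r = (ab)_r +_{\mathrm{Mon}} (ac)_r = a_r \times_{\mathrm{Mon}} b_r +_{\mathrm{Mon}} a_r \times_{\mathrm{Mon}} c_r,$$

so the distributive law holds for $\times_{\mathrm{Mon}}$ and $+_{\mathrm{Mon}}$. The identity element for $\times_{\mathrm{Mon}}$ is $1_r = r \bmod n$ since

$$a_r \times_{\mathrm{Mon}} 1_r = 1_r \times_{\mathrm{Mon}} a_r = a_r.$$

Hence, $(\mathbb{Z}_n^r, +_{\mathrm{Mon}}, \times_{\mathrm{Mon}})$ is a commutative ring (see Definition 1.2.8).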


Remark 3.5.2 We note that

$$a_r \times_{\mathrm{Mon}} b_r = (ab)_r = abr \bmod n = (ar)(br)r^{-1} \bmod n = a_r b_r r^{-1} \bmod n = \mathrm{MonPro}(a_r, b_r).$$

Thus, $\mathrm{MonPro}(a_r, b_r)$ implements the multiplication in the ring $(\mathbb{Z}_n^r, +_{\mathrm{Mon}}, \times_{\mathrm{Mon}})$.
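For a small modulus, the identity in Remark 3.5.2 can be checked exhaustively. The sketch below is only a numerical sanity check, not a proof; the helper names `mon_pro` and `check_montgomery_ring` are our own, and the parameters in the usage note are those of Examples 3.5.14 to 3.5.16.

```python
def mon_pro(n, r, n_hat, a, b):
    """Montgomery product a * b * r^(-1) mod n (Algorithm 3.17)."""
    ell = r.bit_length() - 1
    t = a * b
    m = (t * n_hat) & (r - 1)
    u = (t + m * n) >> ell
    return u - n if u >= n else u


def check_montgomery_ring(n, r, n_hat):
    """Return True if MonPro(a_r, b_r) == (ab)_r and a_r x_Mon 1_r == a_r
    hold for all a, b in Z_n."""
    def to_residue(a):
        return (a * r) % n         # n-residue a_r of a

    one_r = to_residue(1)          # multiplicative identity 1_r = r mod n
    for a in range(n):
        a_r = to_residue(a)
        if mon_pro(n, r, n_hat, a_r, one_r) != a_r:
            return False
        for b in range(n):
            if mon_pro(n, r, n_hat, a_r, to_residue(b)) != to_residue(a * b):
                return False
    return True
```

Calling `check_montgomery_ring(23, 32, 25)` returns `True`, in line with Lemma 3.5.1 and Remark 3.5.2.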
Now we can apply the Montgomery product algorithm MonPro (Algorithm 3.16
or 3.17) for computing multiplications in the right-to-left (Algorithm 3.7) and the
left-to-right (Algorithm 3.8) square and multiply algorithms. The details are listed
in Algorithms 3.19 and 3.20.
By Lemma 3.5.1 and Remark 3.5.2, lines 5 and 6 in Algorithm 3.19 compute

$$\mathrm{result}_r = \mathrm{result}_r \times_{\mathrm{Mon}} t_r \quad \text{and} \quad t_r = t_r \times_{\mathrm{Mon}} t_r$$


Algorithm 3.19: Montgomery right-to-left square and multiply algorithm


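Input: $n, r, \hat{n}, a, d$ // $n$ is an odd integer of bit length $\ell_n$; $r = 2^{\ell_n}$; $\hat{n}$ is given by Eq. 3.23; $a \in \mathbb{Z}_n$; $d \in \mathbb{Z}_{\varphi(n)}$ has bit length $\ell_d$
Output: $a^d \bmod n$
1 $\mathrm{result}_r = r \bmod n$
2 $t_r = ar \bmod n$
3 for $i = 0$, $i < \ell_d$, $i{+}{+}$ do
      // ith bit of d is 1
4     if $d_i = 1$ then
5         $\mathrm{result}_r = \mathrm{MonPro}(n, r, \hat{n}, \mathrm{result}_r, t_r)$ // $\mathrm{result}_r = \mathrm{result}_r \times_{\mathrm{Mon}} t_r$
6     $t_r = \mathrm{MonPro}(n, r, \hat{n}, t_r, t_r)$ // $t_r = t_r \times_{\mathrm{Mon}} t_r$
7 $\mathrm{result} = \mathrm{MonPro}(n, r, \hat{n}, \mathrm{result}_r, 1)$ // $\mathrm{result} = \mathrm{result}_r \times r^{-1} \bmod n$
8 return result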

respectively. It follows from Algorithm 3.7 that lines 1–6 in Algorithm 3.19 calculate

$$\mathrm{result}_r = (a_r)^d = (a^d)_r,$$

where the power $(a_r)^d$ is taken with respect to $\times_{\mathrm{Mon}}$. Then line 7 removes $r$ from $(a^d)_r$ and outputs the final result.

Algorithm 3.20: Montgomery left-to-right square and multiply algorithm


Input: $n, r, \hat{n}, a, d$ // $n$ is an odd integer of bit length $\ell_n$; $r = 2^{\ell_n}$; $\hat{n}$ is given by Eq. 3.23; $a \in \mathbb{Z}_n$; $d \in \mathbb{Z}_{\varphi(n)}$ has bit length $\ell_d$
Output: $a^d \bmod n$
1 $t_r = r \bmod n$
2 $a_r = ar \bmod n$
3 for $i = \ell_d - 1$, $i \ge 0$, $i{-}{-}$ do
4     $t_r = \mathrm{MonPro}(n, r, \hat{n}, t_r, t_r)$ // $t_r = t_r \times_{\mathrm{Mon}} t_r$
5     if $d_i = 1$ then
6         $t_r = \mathrm{MonPro}(n, r, \hat{n}, t_r, a_r)$ // $t_r = t_r \times_{\mathrm{Mon}} a_r$
7 $t = \mathrm{MonPro}(n, r, \hat{n}, t_r, 1)$ // $t = t_r r^{-1} \bmod n$
8 return $t$

Similarly, lines 4 and 6 in Algorithm 3.20 compute

$$t_r = t_r \times_{\mathrm{Mon}} t_r \quad \text{and} \quad t_r = t_r \times_{\mathrm{Mon}} a_r$$

respectively. It follows from Algorithm 3.8 that lines 1–6 in Algorithm 3.20 calculate

$$t_r = (a_r)^d = (a^d)_r,$$

where the power $(a_r)^d$ is taken with respect to $\times_{\mathrm{Mon}}$. Then line 7 removes $r$ from $(a^d)_r$ and outputs the final result.
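Algorithms 3.19 and 3.20 can be sketched in Python as below. The function names are our own, the loops scan the bits of `d` with shifts rather than an explicit bit array, and the code is a plain illustration: it branches on the exponent bits and is therefore not a side-channel-resistant implementation.

```python
def mon_pro(n, r, n_hat, a, b):
    """Montgomery product a * b * r^(-1) mod n (Algorithm 3.17)."""
    ell = r.bit_length() - 1
    t = a * b
    m = (t * n_hat) & (r - 1)
    u = (t + m * n) >> ell
    return u - n if u >= n else u


def mont_exp_right_to_left(n, r, n_hat, a, d):
    """a**d mod n following Algorithm 3.19."""
    result_r = r % n                       # 1_r
    t_r = (a * r) % n                      # a_r
    for i in range(d.bit_length()):        # least significant bit first
        if (d >> i) & 1:
            result_r = mon_pro(n, r, n_hat, result_r, t_r)
        t_r = mon_pro(n, r, n_hat, t_r, t_r)
    return mon_pro(n, r, n_hat, result_r, 1)   # remove the factor r


def mont_exp_left_to_right(n, r, n_hat, a, d):
    """a**d mod n following Algorithm 3.20."""
    t_r = r % n                            # 1_r
    a_r = (a * r) % n
    for i in reversed(range(d.bit_length())):  # most significant bit first
        t_r = mon_pro(n, r, n_hat, t_r, t_r)
        if (d >> i) & 1:
            t_r = mon_pro(n, r, n_hat, t_r, a_r)
    return mon_pro(n, r, n_hat, t_r, 1)        # remove the factor r
```

Only the two conversions `r % n` and `(a * r) % n` at the start use an ordinary reduction modulo $n$; every multiplication inside the loop goes through `mon_pro`.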
Example 3.5.27 Let

$$n = 23, \quad d = 4 = 100_2, \quad a = 5.$$

In Example 3.5.2, we have computed

$$a^d \bmod n = 5^4 \bmod 23 = 625 \bmod 23 = 4$$

with the square and multiply algorithm. In Example 3.5.12 we showed the steps when modular multiplications in the square and multiply algorithm are done with Blakley’s method. Now we calculate the same modular exponentiation with the square and multiply algorithm and Montgomery’s method for modular multiplication.
According to Example 3.5.15,

$$r = 32, \quad r^{-1} = 18, \quad \hat{n} = 25.$$

For the detailed computations with MonPro below, we refer to Example 3.5.19.
Following Algorithm 3.19, lines 1 and 2 give

$$\mathrm{result}_r = 32 \bmod 23 = 9, \quad t_r = 5 \times 32 \bmod 23 = 22.$$

For $i = 0$, $d_0 = 0$, line 6 computes

$$t_r = \mathrm{MonPro}(23, 32, 25, 22, 22) = 18.$$

For $i = 1$, $d_1 = 0$, line 6 computes

$$t_r = \mathrm{MonPro}(23, 32, 25, 18, 18) = 13.$$

For $i = 2$, $d_2 = 1$, line 5 computes

$$\mathrm{result}_r = \mathrm{MonPro}(23, 32, 25, 9, 13) = 13.$$

Then line 6 computes (note that this computation does not affect the final output)

$$t_r = \mathrm{MonPro}(23, 32, 25, 13, 13) = 6.$$

Finally line 7 computes

$$\mathrm{result} = \mathrm{MonPro}(23, 32, 25, 13, 1) = 4.$$

Following Algorithm 3.20, lines 1 and 2 give

$$t_r = 32 \bmod 23 = 9, \quad a_r = 5 \times 32 \bmod 23 = 22.$$

For $i = 2$, $d_2 = 1$, line 4 computes

$$t_r = \mathrm{MonPro}(23, 32, 25, 9, 9) = 9.$$

Then line 6 computes

$$t_r = \mathrm{MonPro}(23, 32, 25, 9, 22) = 22.$$

For $i = 1$, $d_1 = 0$, line 4 computes

$$t_r = \mathrm{MonPro}(23, 32, 25, 22, 22) = 18.$$

For $i = 0$, $d_0 = 0$, line 4 computes

$$t_r = \mathrm{MonPro}(23, 32, 25, 18, 18) = 13.$$

Finally, line 7 computes the output

$$t = \mathrm{MonPro}(23, 32, 25, 13, 1) = 4.$$

We can also apply the Montgomery product algorithm (Algorithm 3.16 or 3.17) to the Montgomery powering ladder (Algorithm 3.9) for computing modular exponentiation. We have Algorithm 3.21.

Algorithm 3.21: Montgomery powering ladder with Montgomery’s method for modular multiplication

Input: $n, r, \hat{n}, a, d$ // $n$ is an odd integer of bit length $\ell_n$; $r = 2^{\ell_n}$; $\hat{n}$ is given by Eq. 3.23; $a \in \mathbb{Z}_n$; $d \in \mathbb{Z}_{\varphi(n)}$ has bit length $\ell_d$
Output: $a^d \bmod n$
1 $R_0 = r \bmod n$
2 $R_1 = ar \bmod n$
3 for $j = \ell_d - 1$, $j \ge 0$, $j{-}{-}$ do
4     if $d_j = 0$ then
5         $R_1 = \mathrm{MonPro}(n, r, \hat{n}, R_0, R_1)$ // $R_1 = R_0 \times_{\mathrm{Mon}} R_1$
6         $R_0 = \mathrm{MonPro}(n, r, \hat{n}, R_0, R_0)$ // $R_0 = R_0 \times_{\mathrm{Mon}} R_0$
7     else
8         $R_0 = \mathrm{MonPro}(n, r, \hat{n}, R_0, R_1)$ // $R_0 = R_0 \times_{\mathrm{Mon}} R_1$
9         $R_1 = \mathrm{MonPro}(n, r, \hat{n}, R_1, R_1)$ // $R_1 = R_1 \times_{\mathrm{Mon}} R_1$
10 $R_0 = \mathrm{MonPro}(n, r, \hat{n}, R_0, 1)$ // $R_0 = R_0 \times r^{-1} \bmod n$
11 return $R_0$
By Lemma 3.5.1 and Remark 3.5.2, lines 5 and 6 in Algorithm 3.21 compute

$$R_1 = R_0 \times_{\mathrm{Mon}} R_1, \quad R_0 = R_0 \times_{\mathrm{Mon}} R_0$$

respectively. Similarly, lines 8 and 9 in Algorithm 3.21 compute

$$R_0 = R_0 \times_{\mathrm{Mon}} R_1, \quad R_1 = R_1 \times_{\mathrm{Mon}} R_1.$$

It follows from Algorithm 3.9 that lines 1–9 in Algorithm 3.21 calculate

$$(a_r)^d \bmod n = (a^d)_r \bmod n.$$

Then line 10 removes $r$ from $(a^d)_r \bmod n$ and outputs the final result.
Example 3.5.28 We repeat the computation in Example 3.5.27 with Algorithm 3.21. We have

$$n = 23, \quad a = 5, \quad d = 4 = 100_2, \quad r = 32, \quad r^{-1} = 18, \quad \hat{n} = 25.$$

For the detailed computations with MonPro below, we refer to Example 3.5.19.
Lines 1 and 2 in Algorithm 3.21 give

$$R_0 = r \bmod n = 32 \bmod 23 = 9, \quad R_1 = ar \bmod n = 5 \times 32 \bmod 23 = 22.$$

For $j = 2$, $d_2 = 1$, line 8 computes

$$R_0 = \mathrm{MonPro}(23, 32, 25, 9, 22) = 22.$$

Since $d_1 = d_0 = 0$, for the rest of the computations, only $R_0$ is relevant for the result. For $j = 1$, line 6 calculates

$$R_0 = \mathrm{MonPro}(23, 32, 25, 22, 22) = 18.$$

For $j = 0$, line 6 calculates

$$R_0 = \mathrm{MonPro}(23, 32, 25, 18, 18) = 13.$$

Finally, line 10 gives the output

$$R_0 = \mathrm{MonPro}(23, 32, 25, 13, 1) = 4.$$
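A Python sketch of Algorithm 3.21 is given below. The names `mon_pro` and `mont_ladder` are our own. The sketch mirrors the structure of the ladder (one multiplication and one squaring per exponent bit), but Python integers and the explicit `if` give no timing or power guarantees, so it should be read as an illustration of the data flow only.

```python
def mon_pro(n, r, n_hat, a, b):
    """Montgomery product a * b * r^(-1) mod n (Algorithm 3.17)."""
    ell = r.bit_length() - 1
    t = a * b
    m = (t * n_hat) & (r - 1)
    u = (t + m * n) >> ell
    return u - n if u >= n else u


def mont_ladder(n, r, n_hat, a, d):
    """a**d mod n with the Montgomery powering ladder (Algorithm 3.21)."""
    r0 = r % n                              # R0 = 1_r
    r1 = (a * r) % n                        # R1 = a_r
    for j in reversed(range(d.bit_length())):
        if ((d >> j) & 1) == 0:
            r1 = mon_pro(n, r, n_hat, r0, r1)   # line 5
            r0 = mon_pro(n, r, n_hat, r0, r0)   # line 6
        else:
            r0 = mon_pro(n, r, n_hat, r0, r1)   # line 8
            r1 = mon_pro(n, r, n_hat, r1, r1)   # line 9
    return mon_pro(n, r, n_hat, r0, 1)          # line 10: remove the factor r
```

With the values of Example 3.5.28, `mont_ladder(23, 32, 25, 5, 4)` returns 4.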



3.6 Further Reading

Figures We note that figures in this chapter are adjusted versions of drawings
from [Jea16]. Jean [Jea16] includes plenty of source files for various
cryptography-related illustrations.
Implementation of symmetric block ciphers For more discussions on implemen-
tations of symmetric block ciphers, we refer the readers to [Osw]. For a detailed
analysis of algebraic normal form and Boolean functions, we refer the readers
to [O’D14].
Bitsliced implementations of DES can be found in, e.g., [MPC00, Kwa00]. For
AES, [KS09] discusses a bitsliced implementation for 64-bit architecture and [SS16]
presents the design for 32-bit architecture. More efficient bitsliced implementations
of PRESENT can be found in [BGLP13] for 64-bit architecture and in [RAL17]
for 32-bit architecture.
A related novel way of implementing symmetric block ciphers called Fixslicing
was introduced in 2020 [ANP20, AP20] to achieve efficient software constant-time
implementations. The main idea is to have an alternative representation of several
rounds of the cipher by fixing the bits within a certain register to never move.
RSA security Currently, a few hundred qubits (the quantum counterpart of the
classical bit) are possible for a quantum computer [Cho22]. To break RSA, thousands of
qubits are required [GE21]. Nevertheless, post-quantum public key cryptosystems
are being proposed (see, e.g., [HPS98, BS08]) to protect communications after a
sufficiently large quantum computer is built.
Implementations of RSA For more discussions on different methods for imple-
menting RSA, we refer the readers to [Koç94]. Koç [Koç94] also discusses
how Garner’s algorithm (Eq. 3.20) can be designed for solving simultaneous linear
congruences in general. For a more efficient way to implement the extended
Euclidean algorithm, see [Sti05, Algorithm 5.3].
Digital Signatures There are other digital signatures based on different public key
cryptosystems. For more discussions, we refer the readers to [Buc04, Chapter 12].
Secret key In Sect. 2.2.6 we have seen that exhaustive key search can be used
to break shift cipher and affine cipher. The lesson is that the key space should
be big enough so the attacker cannot brute force the secret key. This size is
determined by the current computation power. For example, the 56-bit secret key
of DES was successfully broken in 1998 [Fou98]. The U.S. National Institute
of Standards and Technology (NIST) issues recommendations for key sizes for
government institutions in the USA. According to those, 80-bit keys were “retired”
in 2010 [BBB+ 07], and keys shorter than 112 bits were considered insufficient from
2015 onward [BD16]. The National Security Agency (NSA) has required AES-256
for Top Secret classification since 2015 due to the emergence of quantum
computing [Age15].
Chapter 4
Side-Channel Analysis Attacks
and Countermeasures

Side-channel analysis attacks target cryptographic implementations passively. The
attacks exploit the possibility of the attacker observing the physical characteristics of
a device that is running a cryptographic algorithm. The attacker obtains the so-called
side-channel information, e.g., power consumption, electromagnetic emanation,
execution time, etc., and then utilizes such information to recover the secret key.
In this chapter, we will focus on power analysis attacks that exploit power
consumption information. The attack methodologies can be used in a similar manner
when electromagnetic emanation (EM) is analyzed.
Although side-channel analysis attacks can refer to a wide range of attacks,
including timing analysis [Koc96], cache attacks [GMWM16], etc., in this book,
we use the terminology side-channel analysis attacks only in the narrower meaning
which refers to power analysis attacks. In short, we also write side-channel analysis
as SCA.
Device under test The device that the attacker takes measurements of is called the
device under test (DUT). For example, it can be a microcontroller running a software
implementation, or an FPGA or an ASIC realizing a hardware implementation.
For power analysis attacks we study in this chapter, we assume the attacker
has certain knowledge of the implementation, for example, how to interface with
the encryption routine, whether the computation is executed serially or in parallel,
whether the implementation is round-based or bit-sliced, or whether some types
of countermeasures are present. Generally, this type of information can also be
obtained by reverse engineering, visual inspection of the side-channel measure-
ments, or sometimes just with a simple trial-and-error technique.
Attacker goal The ultimate goal of the attacker is to recover the master key of a
symmetric block cipher or the private key of a public key cipher.
Attacker’s assumptions Based on the assumption of whether the attacker can
obtain a similar device to the target device, we distinguish two types of SCA
attacks:

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 205
X. Hou, J. Breier, Cryptography and Embedded Systems Security,
https://doi.org/10.1007/978-3-031-62205-2_4

• Non-profiled SCA. If the attacker does not have access to a similar device, but
  only to the target device or to measurements coming from the target device, we
  talk about a non-profiled SCA. In a general scenario, this attack utilizes a set
  of measurements where a fixed secret key is used to encrypt multiple (random)
  plaintexts.
• Profiled SCA. If we assume the attacker has access to a clone device of the
target device, then they can carry out a profiled SCA. This attack operates in two
phases. In the profiling phase, the attacker acquires side-channel measurements
for known plaintext/ciphertext and known key pairs. This set of data is used
to characterize or model the device. Then in the attack phase, the attacker
acquires a few measurements from the target device, which is usually identical
to the clone device, with known plaintext/ciphertext and an unknown key. These
measurements from the target device are then tested against the characterized
model from the clone device.

Source Code
The source code and measurement data for this chapter can be found at the
following link:
https://github.com/XIAOLUHOU/SCA-measurements-and-analysis----Experimental-results-for-textbook

4.1 Experimental Setting

Power analysis measures the power consumption of the DUT in the form of a voltage
change. The most convenient device to capture the voltage change over time is a
digital sampling oscilloscope—a device that takes samples of the measured voltage
signal over time. We refer to each sample point as a time sample. More information
on measurement setups is provided in Sect. 6.1.
To be able to target the correct time slot, in our experiments, a trigger signal is
raised to high (5V) during the computation that we want to capture and lowered
afterward. One measurement consists of the voltage values for each time sample in
this duration. It can be stored in an array of length equal to the total number of time
samples in the measured time interval. It can also be drawn in a graph where the x-
axis corresponds to time samples and the y-axis records the voltage values.1 Thus,
we refer to the result of one measurement as a (power) trace.

1 Note that, in the case of ChipWhisperer, which will be used for our experiments and analysis, the

y-axis does not show the actual voltage value but a 10-bit value proportional to the current going
through the shunt resistor.

Fig. 4.1 Side-channel measurement setup used for the experiments: a laptop, the ChipWhisperer-
Lite measurement board (black), and the CW308 UFO board (red) with the mounted ARM Cortex-
M4 target board (blue). Note that the benchtop oscilloscope in the back was only used for the initial
analysis—all the measurements were done by the ChipWhisperer

Device under test and oscilloscope For the experiments in this chapter, we used
a ready-to-use measurement platform NewAE ChipWhisperer-Lite. The program
code was running on a 32-bit ARM Cortex-M4 microcontroller (STM32F3) with
a clock speed of .≈ 7.4 MHz. ADC was set to capture the samples at .4× that
speed, i.e., .≈ 29.6 MHz with a 10-bit resolution. However, for plotting purposes, we
normally reduced the number of time samples. The measurement setup is depicted
in Fig. 4.1. The ChipWhisperer-Lite board is in the middle of the picture in black
color, handling the communication with the DUT and the acquisition. The red PCB
on the right is the CW308 UFO board, a breakout board with the DUT, an ARM
Cortex-M4 (blue board), mounted on top. The controlling and data processing were
done from a laptop, from the Jupyter environment available for the ChipWhisperer
platform. In the back, there is a Teledyne T3DSO3504 benchtop oscilloscope that
was used mainly for convenience purposes—to precisely locate the time intervals in
the initial analysis stage.
Figure 4.2 shows one power trace for the first five rounds of PRESENT
encryption. In order to see the trace more clearly, we have added a sequence of nop
instructions before and after the five rounds of cipher computation. This trace has in
total 18,500 time samples. Certain patterns can be seen from the trace, and we can
deduce the corresponding operations in each time interval. For example, from time
sample 0–1434 and from time sample 17,514–18,500, we have nop instructions. We
can also see the five repeated patterns in the figure and deduce the duration of each
round, as indicated in the figure by red dotted lines. In terms of time samples, one
round takes on average 3216 time samples. In this particular case, we reduced the
number of samples by a factor of 3 (simply by taking every third sample) so that
the patterns would still be visible to the reader. That means, with the ADC speed of
.≈ 29.6 MHz, one round takes .(3216 × 3)/29.6 ≈ 325.9 μs. It is important to note

Fig. 4.2 Power trace of the first five rounds of PRESENT encryption. A sequence of nop
instructions was executed before and after the cipher computation to clearly distinguish the
operations

that for presentation purposes, we used an unoptimized software implementation of
PRESENT.
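The conversion from plotted time samples to wall-clock time used above can be written out explicitly. The helper below is a small sketch with a hypothetical name; it assumes, as stated in the text, that the plotted trace keeps every third ADC sample and that the ADC runs at about 29.6 MHz.

```python
def samples_to_microseconds(num_samples, decimation, adc_mhz):
    """Duration in microseconds of `num_samples` plotted time samples.

    Each plotted sample stands for `decimation` ADC samples, and the ADC
    takes `adc_mhz` million samples per second, i.e. `adc_mhz` samples
    per microsecond.
    """
    return num_samples * decimation / adc_mhz


# One PRESENT round: 3216 plotted samples, every third ADC sample kept.
round_duration_us = samples_to_microseconds(3216, 3, 29.6)
print(f"{round_duration_us:.1f} us")
```

The printed value agrees with the estimate of about 325.9 μs per round.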
Datasets Four datasets will be analyzed in more detail in the later parts of
this chapter. All the datasets capture one round of software implementation of
PRESENT. The description of each of them is given below:
• Fixed dataset A: This dataset contains 5000 traces with a fixed round key
FEDCBA0123456789 and a fixed plaintext ABCDEF1234567890.
• Fixed dataset B: This dataset contains 5000 traces with a fixed round key
FEDCBA0123456789 and a fixed plaintext 84216BA484216BA4.
• Random plaintext dataset: This dataset contains 5000 traces with a fixed round
  key

  $$\texttt{FEDCBA0123456789} \tag{4.1}$$

  and a random plaintext for each trace.

• Random dataset: This dataset contains 10,000 traces with a random round key
and a random plaintext for each trace.

In each case, the execution of the cipher is surrounded by nop instructions so
that the round operation patterns can be clearly distinguished from the provided
plots. While the raw traces are all 5000 time samples long, for plotting and analysis
purposes, we shorten them to 3600 time samples as the later parts correspond to nop
instructions and do not contain any useful information. We also note that for these
datasets, we reduced the number of collected time samples by a factor of 3.

4.1.1 Attack Methods

There are two main classical power analysis attack methods, simple power analysis
(SPA) and differential power analysis (DPA). SPA assumes the attacker has access
to only one or a few measurements corresponding to some fixed inputs. In DPA, we
assume the attacker can take measurements for a potentially unlimited number of
different inputs. We will present several DPA attacks on symmetric block ciphers
(Sects. 4.3.1 and 4.3.2) and DPA (Sect. 4.4.1) and SPA (Sect. 4.4.2) attacks on RSA.
We will also discuss a newly proposed side-channel assisted differential plaintext
attack (SCADPA) on SPN ciphers (Sect. 4.3.3). Similar to SPA, the attack does
not require statistical analysis of the traces; only visual inspection is enough. The
number of traces needed lies between that for SPA and DPA, mostly dependent on
the measurement equipment.

4.2 Side-Channel Leakages

In the later parts of the chapter, we will see that by analyzing the power con-
sumption, we can deduce the secret key. Consequently, we also refer to the power
consumption as the leakage of the device. We consider the leakage to consist of two
parts: signal and noise. Signal refers to the part of the leakage containing useful
information for our attack; the rest is noise. For example, suppose we would like to
recover the Hamming weight of an intermediate value. In that case, the part of the
leakage correlated to the Hamming weight of that intermediate value is our signal.
Before we see how leakage can be defined and modeled, we show that it is
dependent on the operations being executed and the data being processed.
We first take the Fixed dataset A described in Sect. 4.1. The average of those
5000 traces is shown in Fig. 4.3. As mentioned in Sect. 4.1, each trace in this dataset
corresponds to one round of PRESENT computation surrounded by nop operations.
By visual inspection, we can deduce that the beginning (time samples 0–209)
and the ending (time samples 3381–3600) parts that consist of relatively uniform
patterns correspond to nop instructions. Other than that, we can see three distinct
patterns between them. Since one round of PRESENT consists of addRoundKey,
sBoxLayer, and pLayer (see Fig. 3.8), we can roughly identify each of these three
operations in the trace—they correspond to the blue (time samples 210–382),
pink (time samples 383–567), and green (time samples 568–3380) parts of the
trace, respectively. In this case, one round computation corresponds to 3170 time
samples, which is fewer than that in Fig. 4.2. Such a difference can be caused
by round counter and loop operations, register updates of round keys, etc., which
are additionally computed in the five-round PRESENT implementation. These
observations demonstrate that the leakage is dependent on the operations being
executed in the DUT.

Fig. 4.3 The averaged trace for 5000 traces from the Fixed dataset A (see Sect. 4.1). The blue,
pink, and green parts of the trace correspond to addRoundKey, sBoxLayer, and pLayer, respectively

Fig. 4.4 The averaged trace for 1000 plaintexts with the 0th bit equal to 0. The computation
corresponds to one round of PRESENT with a fixed round key

For another experiment, with the experimental setup described in Sect. 4.1, we
have conducted measurements for one round of PRESENT with a fixed round key.
A total of 1000 traces were collected, each for a random plaintext with the 0th
bit equal to 0. The averaged trace is shown in Fig. 4.4. With the same key, we
collected traces for 1000 plaintexts with the 0th bit equal to 1. And the averaged
trace is shown in Fig. 4.5. We can see that those two averaged traces are very similar.
Unsurprisingly, they also look similar to the trace in Fig. 4.3. Thus, the time interval
for each operation in the first round of PRESENT corresponds to that in Fig. 4.3 as
well.
We can gain more information when we take the difference between traces in
Figs. 4.4 and 4.5. The difference trace is shown in Fig. 4.6. There are a few peaks in
this difference trace, and apart from those peaks, most of the points are close to zero.
Those peaks indicate that the 0th bit of the plaintext is related to the computations
at the corresponding time samples. Compared with Fig. 4.3, we can see that the first
and second peaks correspond to addRoundKey and pLayer operations. In particular,

Fig. 4.5 The averaged trace for 1000 plaintexts with the 0th bit equal to 1. The computation
corresponds to one round of PRESENT with a fixed round key

Fig. 4.6 The difference between traces from Figs. 4.4 and 4.5

these observations show that the leakage is dependent on the data being processed
in the DUT.
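The difference-of-means computation behind Figs. 4.4 to 4.6 can be sketched on synthetic data. The code below does not use the measured datasets: `simulate_traces` is a hypothetical generator that produces Gaussian noise traces and adds a small bit-dependent offset at a single time sample, and all names and parameter values are our own assumptions.

```python
import random


def mean_trace(traces):
    """Point-wise average of a list of equally long traces."""
    count = len(traces)
    return [sum(column) / count for column in zip(*traces)]


def difference_of_means(traces, bits):
    """Average trace for bit = 1 minus average trace for bit = 0."""
    group0 = [trace for trace, bit in zip(traces, bits) if bit == 0]
    group1 = [trace for trace, bit in zip(traces, bits) if bit == 1]
    mean0, mean1 = mean_trace(group0), mean_trace(group1)
    return [y - x for x, y in zip(mean0, mean1)]


def simulate_traces(num_traces=2000, num_samples=50, leak_at=20, seed=1):
    """Synthetic traces: unit Gaussian noise plus a data-dependent offset
    of 1.0 at time sample `leak_at` whenever the selected bit is 1."""
    rng = random.Random(seed)
    traces, bits = [], []
    for _ in range(num_traces):
        bit = rng.randrange(2)
        trace = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
        trace[leak_at] += 1.0 * bit
        traces.append(trace)
        bits.append(bit)
    return traces, bits


traces, bits = simulate_traces()
diff = difference_of_means(traces, bits)
peak = max(range(len(diff)), key=lambda i: abs(diff[i]))
print("largest difference at time sample", peak)
```

As in Fig. 4.6, the difference trace stays close to zero except at the time sample where the selected bit influences the simulated leakage.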

Note
In the SCA attacks we will see in this book, we will only be interested in
operation- and/or data-dependent leakages.
SPA typically exploits the relationship between the executed operations
and the leakage (power consumption). DPA and SCADPA focus on the rela-
tionship between the processed data and the leakage (power consumption).

To analyze the leakage better, we model the leakage, signal, and noise at a given
point in time as random variables. In particular, for a fixed time sample $t$, let $L_t$, $X_t$,
and $N_t$ denote the random variables corresponding to the leakage, signal, and noise,
respectively. As we consider the leakage to consist of signal and noise, we can write

$$L_t = X_t + N_t. \tag{4.2}$$

Since $X_t$ contains the part of the leakage that is useful to us and the rest is noise, we
make the “independent noise assumption” (see, e.g., [Pro13]) and assume $N_t$ and
$X_t$ are independent random variables. When $X_t$ is a constant, according to Eqs. 1.36
and 1.33, we have

$$\mathrm{Var}(L_t) = \mathrm{Var}(N_t), \quad X_t = E[L_t] - E[N_t]. \tag{4.3}$$

But how do we decide when $X_t$ is constant? That depends on the information
we would like to obtain from the traces. Let us consider one round of PRESENT
computation. Suppose we are interested in the 0th Sbox output of the sBoxLayer
(the right-most Sbox in Fig. 3.9), denoted by $\boldsymbol{v}$. If we want information revealing the
exact value of $\boldsymbol{v}$, then for any given time sample $t$, the signal $X_t$ is considered to be
constant across the following dataset: measurements for computations of one round
of PRESENT with a fixed master key and plaintexts with a fixed 0th nibble. The
identical 0th nibble in plaintexts and fixed key guarantees the same 0th Sbox output.
We can also use random master keys that result in the first round key having the
same 0th nibble. If we want information revealing the Hamming weight of $\boldsymbol{v}$, then
measurements with master keys and plaintexts that result in a fixed $\mathrm{wt}(\boldsymbol{v})$ would
correspond to constant $X_t$s.
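Equation 4.3 can be illustrated with simulated values. In the sketch below the signal is a fixed number and the noise is drawn from a Gaussian distribution; the function name and all numerical values are arbitrary choices of ours and are not taken from the measurements in this chapter.

```python
import random
import statistics


def simulate_leakage(signal, noise_mean, noise_std, count, seed=0):
    """Return (leakage, noise) samples with leakage = signal + noise,
    where the signal is constant and the noise is Gaussian."""
    rng = random.Random(seed)
    noise = [rng.gauss(noise_mean, noise_std) for _ in range(count)]
    leakage = [signal + value for value in noise]
    return leakage, noise


leakage, noise = simulate_leakage(signal=0.2, noise_mean=0.01,
                                  noise_std=0.002, count=5000)

# With a constant signal, the spread of the leakage is the spread of the noise,
# and the signal is the difference of the two means.
print(statistics.pvariance(leakage), statistics.pvariance(noise))
print(statistics.fmean(leakage) - statistics.fmean(noise))
```

Up to floating-point rounding, the two variances coincide and the difference of the means recovers the constant signal, as Eq. 4.3 states.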

4.2.1 Distribution of the Leakage

Since we are only interested in either data or operation-related leakages, for a given
point in time $t$, if we fix the operation and the data, we get a constant signal, i.e.,
$X_t$ is a constant. In the following, we will show that, in this case, the experimental
results (histograms) demonstrate that it is reasonable to consider the distribution
induced by $L_t$ to be a normal distribution (see Example 1.7.23).
We note that since the noise comes from many sources, e.g., environment,
other components in the DUT, setup, etc., it can be considered as a combination
of various independent random variables. Thus, according to the central limit
theorem [Dud14],² it is reasonable to assume the distribution induced by the noise
is normal.
Let us take the Fixed dataset A described in Sect. 4.1. Figure 4.7 shows a small
part from five randomly selected traces. We can see that they are very similar, with
minor differences. As the signal is the same, the minor differences are caused by the
noise. We will further characterize the noise using histograms.

² Roughly speaking, the central limit theorem says that if we combine different independent
random variables, the resulting distribution tends to be normal.

Fig. 4.7 Part of five random traces from the Fixed dataset A (see Sect. 4.1)

Fig. 4.8 Histogram of leakages at time sample $t = 3520$ across 5000 traces from the Fixed
dataset A

Recall that the averaged trace of those 5000 traces in Fixed dataset A is shown in
Fig. 4.3. Take $t = 3520$. As we have mentioned in the discussion regarding Fig. 4.3,
this time sample corresponds to nop operations. If we plot the histogram of leakages
$L_{3520}$ across those 5000 traces, we get Fig. 4.8. Most leakages are around 0.0435,
and very few are below 0.037 or above 0.049.
Now we take another time sample $t = 2368$, which gives the highest peak in
Fig. 4.3 and corresponds to pLayer computation. The histogram of leakages $L_{2368}$
across those 5000 traces is shown in Fig. 4.9. Most leakages are around 0.213, and
very few are below 0.207 or above 0.219. Compared to Fig. 4.8, we have much
higher leakage values. This is because $t = 3520$ corresponds to nop operations, and
for $t = 2368$, we have PRESENT round computations.

Fig. 4.9 Histogram of leakages at time sample $t = 2368$ across 5000 traces from the Fixed
dataset A

For both cases, the shapes of the histograms are similar to the PDF of a
normal distribution (see Fig. 1.2). If we take a different time sample, the histogram
will be similar, with differences in the values on the x-axis. In other words, the
distribution induced by the leakages can be approximated by normal distributions.
As mentioned, all traces correspond to the same operation and data in the DUT for
a fixed time sample, resulting in a constant $X_t$. Thus, the variations in the leakage are
caused by the noise.
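The observation above can be reproduced with synthetic data. The following is a minimal sketch, assuming a constant signal of 0.213 and zero-mean normal noise with standard deviation 0.003; these values are hypothetical (loosely inspired by Fig. 4.9) and the function names `simulate_leakages` and `histogram` are ours, not measurements or code from the experiments.

```python
import random
import statistics


def simulate_leakages(signal, noise_std, num_traces, seed=0):
    """Draw leakages L_t = X_t + N_t for a constant signal and zero-mean normal noise."""
    rng = random.Random(seed)
    return [signal + rng.gauss(0.0, noise_std) for _ in range(num_traces)]


def histogram(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    low, high = min(values), max(values)
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for value in values:
        # The maximum value would land one past the last bin, so clamp it.
        index = min(int((value - low) / width), num_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical parameters, not measured data.
leakages = simulate_leakages(signal=0.213, noise_std=0.003, num_traces=5000)
counts = histogram(leakages, num_bins=15)
for count in counts:
    print("#" * (count // 20))  # crude text histogram
```

Since the signal is constant, the spread of the simulated values comes entirely from the noise term, and the printed bars should show a single bump around the signal value.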
Leakage Models One important concept for a power analysis attack is the leakage
model, namely a model that estimates how the leakage is related to the data
being processed. A good leakage model can make the attack more efficient (see
Sect. 4.3.2).
Three commonly used leakage models are identity leakage model, Hamming
distance leakage model, and Hamming weight leakage model. Assume a value $v$
is being processed in the DUT, and right before it, another value $u$ was used by
the DUT. Then, according to the identity leakage model, the leakage is correlated
(see Definition 1.7.8) to $v$. The Hamming distance leakage model assumes that the
leakage is correlated to $\mathrm{dis}(v, u)$, the Hamming distance between $v$ and $u$ (see
Eq. 1.24). Following the Hamming weight leakage model, the leakage will then be
correlated to $\mathrm{wt}(v)$, the Hamming weight of $v$ (see Definition 1.6.10). We refer the
readers to Sect. 6.1.1 for more explanations of why there are side-channel leakages
when the data in the DUT is changed.
In particular, let $\mathrm{noise} \sim N(0, \sigma^2)$ be a normal random variable with mean 0 and
variance $\sigma^2$. For the identity leakage model, the modeled leakage is given by

\[
L(v) = v + \mathrm{noise}.
\]

For the Hamming distance leakage model, we have

\[
L(v) = \mathrm{dis}(v, u) + \mathrm{noise}.
\]

Similarly, for the Hamming weight leakage model,

\[
L(v) = \mathrm{wt}(v) + \mathrm{noise}. \tag{4.4}
\]

Even though the actual leakage may not be exactly equal to the modeled leakage
$L(v)$, those leakage models can be used to approximate the behavior of the actual
leakages or for statistical analysis (see Sect. 4.3.1). For example, our previous
experiments have demonstrated that the identity leakage model is realistic since
when the data is fixed, the distribution of leakages is close to a normal distribution.
It can be shown that the other two leakage models are also realistic (see [MOP08,
Section 4.3]).
In this book, we will focus on two leakage models: the identity leakage model
and the Hamming weight leakage model.
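The three leakage models can be sketched in a few lines. This is an illustrative sketch only: the function names, the string labels for the models, and the default noise standard deviation are our assumptions, and values are represented as non-negative integers.

```python
import random


def hamming_weight(value):
    """Number of bits equal to 1 in a non-negative integer."""
    return bin(value).count("1")


def hamming_distance(value, previous):
    """Number of bit positions in which the two values differ."""
    return hamming_weight(value ^ previous)


def modeled_leakage(value, model, previous=0, sigma=1.0, rng=random):
    """Return signal + noise, with noise ~ N(0, sigma^2), for the chosen model."""
    if model == "identity":
        signal = value
    elif model == "hamming_distance":
        signal = hamming_distance(value, previous)
    elif model == "hamming_weight":
        signal = hamming_weight(value)
    else:
        raise ValueError("unknown leakage model: " + model)
    # sigma = 0 gives the noise-free modeled signal.
    noise = rng.gauss(0.0, sigma) if sigma > 0 else 0.0
    return signal + noise


# Noise-free modeled signals for the nibble v = 0xA processed after u = 0x5.
print(modeled_leakage(0xA, "identity", sigma=0.0))
print(modeled_leakage(0xA, "hamming_distance", previous=0x5, sigma=0.0))
print(modeled_leakage(0xA, "hamming_weight", sigma=0.0))
```

Passing `sigma=0.0` is convenient for checking the signal part of each model before adding noise.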

4.2.2 Estimating Leakage Distributions

In this subsection, we look at the analysis of leakages from a statistical point of
view and provide concrete examples of methods for analyzing unknown distribution
parameters discussed in Sect. 1.8.2. We consider the DUT computing PRESENT
encryption with a fixed plaintext and a fixed key. We have shown that in this case,
we can assume $L_t$ induces a normal distribution for a given time sample $t$. Let $\mu_t$
and $\sigma_t^2$ denote the mean and variance of this normal distribution.
For the running example, we focus on the Fixed dataset A as described in
Sect. 4.1. Let $t = 2368$, which gives the highest peak in Fig. 4.3 and corresponds to
the computation of pLayer. We have seen the histogram of the leakages at this time
sample in Fig. 4.9. With the terminologies from Sect. 1.8.2, a sample for $L_{2368}$ is
given by all leakage values at $t = 2368$ from those 5000 traces in Fixed dataset
A. We can use our sample to estimate the mean ($\mu_t$) and variance ($\sigma_t^2$) of the
distribution induced by $L_{2368}$ (assuming it is normal) using point estimators given
by sample mean and sample variance (see Remark 1.8.4). Let $M = 5000$ denote our
sample size.
Example 4.2.1 (Example of approximating mean and variance with sample
mean and sample variance) By Eq. 1.49, the sample mean is given by the average
of leakages at time sample 2368 across those 5000 traces. We have

\[
\bar{l}_{2368} \approx 0.2132.
\]

Following Eq. 1.53, we have also computed the sample variance

\[
s_{2368}^2 \approx 8.5196 \times 10^{-6}. \tag{4.5}
\]

Then the sample mean 0.2132 is an estimate for $\mu_{2368}$, and the sample variance
$8.5196 \times 10^{-6}$ is an estimate for $\sigma_{2368}^2$.
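The point estimates in this example are straightforward to compute from a set of traces. Below is a minimal sketch on toy data, assuming each trace is a list of leakage values indexed by time sample and that the sample variance uses the unbiased divisor $M - 1$; the helper names and the toy numbers are ours.

```python
def leakages_at(traces, t):
    """Collect the leakage value at time sample t from every trace."""
    return [trace[t] for trace in traces]


def sample_mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    """Unbiased sample variance (divides by M - 1); assumed to match the definition used here."""
    mean = sample_mean(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)


# Toy traces with two time samples each (hypothetical values).
toy_traces = [[0.10, 0.21], [0.12, 0.22], [0.11, 0.20]]
sample = leakages_at(toy_traces, 1)
print(sample_mean(sample), sample_variance(sample))
```

With real measurements, `toy_traces` would be replaced by the 5000 traces of the dataset and `t` by 2368.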

We can also estimate the mean with an interval estimator.


Example 4.2.2 (Example of interval estimator for the mean) Since we do not
know the variance of $L_{2368}$, by Eq. 1.59, a $100(1 - \alpha)$ percent confidence interval
for $\mu_{2368}$ is given by

\[
\left( \bar{l}_{2368} - t_{\alpha/2, M-1} \frac{s_{2368}}{\sqrt{M}},\ \bar{l}_{2368} + t_{\alpha/2, M-1} \frac{s_{2368}}{\sqrt{M}} \right).
\]

Take $\alpha = 0.01$. Then according to Remark 1.8.2 and Table 1.4, we get

\[
t_{0.005, 4999} \approx z_{0.005} = 2.576.
\]

By Eq. 4.5,

\[
s_{2368} = \sqrt{8.5196 \times 10^{-6}} \approx 2.9188 \times 10^{-3}.
\]

And a 99% confidence interval for $\mu_{2368}$ is

\[
\left( 0.2132 - 2.576 \times \frac{2.9188 \times 10^{-3}}{\sqrt{5000}},\ 0.2132 + 2.576 \times \frac{2.9188 \times 10^{-3}}{\sqrt{5000}} \right) \approx (0.2131, 0.2133). \tag{4.6}
\]

Assume we know the variance of $L_{2368}$ is actually given by $\sigma_{2368}^2 = 8.5196 \times 10^{-6}$.
Suppose we want to find an estimate for $\mu_{2368}$ with precision $c = 0.001$ and 99%
confidence. By Eq. 1.58, the number of traces we need to collect is given by

\[
\frac{\sigma_{2368}^2}{c^2} z_{\alpha/2}^2 = \frac{8.5196 \times 10^{-6}}{0.001^2} \times 2.576^2 \approx 57, \tag{4.7}
\]

where $1 - \alpha = 0.99$ gives $\alpha = 0.01$, and as mentioned above, $z_{0.005} = 2.576$.
Thus, we should collect at least 57 traces to get a 99% confidence interval
for $\mu_{2368}$ with precision 0.001.
Since the number of traces to be collected is more than 30, according to Eq. 1.60,
if we do not know the variance of $L_{2368}$, we can use the sample variance $s_{2368}^2$ to
compute the number of traces required. In this case, we will get the same result as
in Eq. 4.7 since we have assumed the variance to be equal to this sample variance.
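The interval in Eq. 4.6 and the trace count in Eq. 4.7 can be recomputed as follows. This is a sketch under the same large-sample approximation used above ($t_{\alpha/2, M-1} \approx z_{\alpha/2}$); the function names are ours, and the quantile comes from Python's `statistics.NormalDist` rather than from Table 1.4, so the last digits of intermediate values may differ slightly.

```python
import math
from statistics import NormalDist


def z_value(alpha):
    """Return z_alpha, the point with P(Z > z_alpha) = alpha for standard normal Z."""
    return NormalDist().inv_cdf(1.0 - alpha)


def mean_confidence_interval(sample_mean, sample_var, num_traces, alpha):
    """100(1 - alpha) percent interval for the mean, using z in place of t."""
    half_width = z_value(alpha / 2) * math.sqrt(sample_var / num_traces)
    return sample_mean - half_width, sample_mean + half_width


def traces_for_precision(variance, precision, alpha):
    """Smallest integer number of traces giving the requested precision c."""
    return math.ceil(z_value(alpha / 2) ** 2 * variance / precision ** 2)


low, high = mean_confidence_interval(0.2132, 8.5196e-6, 5000, alpha=0.01)
print(round(low, 4), round(high, 4))
print(traces_for_precision(8.5196e-6, 0.001, alpha=0.01))
```

Rounding up with `math.ceil` reflects that a fractional number of traces cannot be collected.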
Now we take the Fixed dataset B described in Sect. 4.1. We again look at
the time sample $t = 2368$. Let $L'_{2368}$ denote the random variable corresponding
to the leakage at time sample 2368 for one round encryption of the plaintext
84216BA484216BA4 with round key FEDCBA0123456789. Let $\mu'_{2368}$ and $\sigma'^2_{2368}$ denote
the mean and variance of $L'_{2368}$, respectively. Then the Fixed dataset B provides a
sample for $L'_{2368}$. Similarly to Example 4.2.1, we can compute the sample mean and
sample variance for $L'_{2368}$ with this sample, and we have

\[
\bar{l}'_{2368} \approx 0.2133, \qquad s'^2_{2368} \approx 8.6198 \times 10^{-6}. \tag{4.8}
\]

Example 4.2.3 (Example of interval estimator for the mean) Let us assume
$L_{2368}$ and $L'_{2368}$ are independent. We further assume that we know that the actual
variances for $L_{2368}$ and $L'_{2368}$ are equal to the sample variances we have computed.
Suppose we want to find an estimate for $\mu_{2368} - \mu'_{2368}$. By Eq. 1.62, a 99%
confidence interval estimate for $\mu_{2368} - \mu'_{2368}$ is given by

\[
\begin{aligned}
&\left( \bar{l}_{2368} - \bar{l}'_{2368} - z_{0.005} \sqrt{\frac{\sigma_{2368}^2}{M} + \frac{\sigma'^2_{2368}}{M}},\ \bar{l}_{2368} - \bar{l}'_{2368} + z_{0.005} \sqrt{\frac{\sigma_{2368}^2}{M} + \frac{\sigma'^2_{2368}}{M}} \right) \\
&\quad = \Big( 0.2132 - 0.2133 - 2.576 \sqrt{(8.5196 \times 10^{-6} + 8.6198 \times 10^{-6})/5000}, \\
&\qquad\quad 0.2132 - 0.2133 + 2.576 \sqrt{(8.5196 \times 10^{-6} + 8.6198 \times 10^{-6})/5000} \Big) \\
&\quad = \left( -2.5082 \times 10^{-4},\ 5.0820 \times 10^{-5} \right).
\end{aligned}
\]

On the other hand, by Eq. 1.63, to achieve an estimation with precision, say $c =
0.001$, and $100(1 - \alpha)$ percent confidence, the number of traces to collect is given by

\[
\frac{z_{\alpha/2}^2 (\sigma_{2368}^2 + \sigma'^2_{2368})}{c^2}.
\]

Take $\alpha = 0.01$, then $z_{0.005} = 2.576$, and we have

\[
\frac{z_{0.005}^2 (\sigma_{2368}^2 + \sigma'^2_{2368})}{c^2} = \frac{2.576^2 \times (8.5196 \times 10^{-6} + 8.6198 \times 10^{-6})}{0.001^2} \approx 114.
\]

If we assume we do not know the variances, but we know that $\sigma_{2368} = \sigma'_{2368}$, by
Eq. 1.67, the number of traces to collect is given by

\[
\frac{2 z_{\alpha/2}^2 s_p^2}{c^2} = \frac{2 \times 2.576^2 \times 8.5697 \times 10^{-6}}{0.001^2} \approx 114,
\]

where

\[
s_p^2 = \frac{(M - 1) s_x^2 + (M - 1) s_y^2}{M + M - 2} = \frac{s_x^2 + s_y^2}{2} = \frac{8.5196 \times 10^{-6} + 8.6198 \times 10^{-6}}{2} = 8.5697 \times 10^{-6}.
\]
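The numbers in this example can be recomputed with a short script. This is a sketch, assuming known variances equal to the sample variances as in the text; the function names are ours, and the quantile is taken from `statistics.NormalDist`, so the interval endpoints agree with the values above only up to rounding of $z_{0.005}$.

```python
import math
from statistics import NormalDist


def difference_confidence_interval(mean_x, mean_y, var_x, var_y, m_x, m_y, alpha):
    """100(1 - alpha) percent interval for mu_x - mu_y with known variances."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2)
    half_width = z * math.sqrt(var_x / m_x + var_y / m_y)
    diff = mean_x - mean_y
    return diff - half_width, diff + half_width


def pooled_variance(var_x, var_y, m_x, m_y):
    """Pooled sample variance s_p^2 of two samples."""
    return ((m_x - 1) * var_x + (m_y - 1) * var_y) / (m_x + m_y - 2)


def traces_for_difference(var_x, var_y, precision, alpha):
    """Traces per dataset needed to estimate mu_x - mu_y with precision c."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2)
    return math.ceil(z ** 2 * (var_x + var_y) / precision ** 2)


var_a, var_b = 8.5196e-6, 8.6198e-6
low, high = difference_confidence_interval(0.2132, 0.2133, var_a, var_b, 5000, 5000, 0.01)
s_p2 = pooled_variance(var_a, var_b, 5000, 5000)
print(low, high)
print(s_p2, traces_for_difference(var_a, var_b, 0.001, 0.01))
print(traces_for_difference(s_p2, s_p2, 0.001, 0.01))  # equal-variance case
```

The last line uses $s_p^2$ for both variances, which is the equal-variance formula $2 z_{\alpha/2}^2 s_p^2 / c^2$.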
Remark 4.2.1 We note that the sample variances of $L_{2368}$ and $L'_{2368}$ are very close.
This is expected as it has been shown in Eq. 4.3 that the variances $\sigma_{2368}^2$ and $\sigma'^2_{2368}$
are both equal to the variance of the noise at time sample 2368.
So far, we have seen how to analyze the leakage at one particular time sample
by approximating its distribution with a normal distribution. Similarly, we can
also approximate the distribution of leakages across different time samples. In
this case, we consider a random vector (see Definition 1.7.9) instead of a random
variable. Thus, we would approximate the distributions induced by the leakages
as multivariate normal distributions (Gaussian distributions). It can be seen from
Definition 1.7.10 that to find a good Gaussian distribution for approximating the
noise/leakage, we just need to approximate the mean vector and covariance matrix.
We will see in Sect. 4.3.2.3 that the profiling phase of the template attack is exactly
to calculate estimations for the mean vector and the covariance matrix. In reality,
leakages at different time samples are correlated (see Definition 1.7.8). However,
the effort to calculate the covariance matrix grows quadratically with the number of
considered time samples. Thus, in practice, only a small part of the traces would be
profiled with a non-diagonal covariance matrix (see Example 1.7.24).
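To make the quadratic growth concrete, here is a minimal sketch of estimating a mean vector and a covariance matrix from traces restricted to $q$ selected time samples. It assumes each trace is a list of $q$ leakage values and uses the unbiased divisor $M - 1$; the function names and the toy data are ours.

```python
def mean_vector(traces):
    """Per-time-sample average over all traces."""
    m = len(traces)
    return [sum(column) / m for column in zip(*traces)]


def covariance_matrix(traces):
    """q x q sample covariance matrix; the number of entries grows as q^2."""
    m = len(traces)
    means = mean_vector(traces)
    centered = [[x - mu for x, mu in zip(trace, means)] for trace in traces]
    q = len(means)
    return [
        [sum(row[i] * row[j] for row in centered) / (m - 1) for j in range(q)]
        for i in range(q)
    ]


# Toy data: three traces, two time samples (hypothetical values).
toy_traces = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
print(mean_vector(toy_traces))
print(covariance_matrix(toy_traces))
```

The off-diagonal entries capture the correlation between time samples; keeping only the diagonal corresponds to treating the time samples as uncorrelated.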

4.2.3 Leakage Assessment

In the rest of this chapter, we will see various attacks on cryptographic imple-
mentations. As a developer, one might want to evaluate the implementation and
conclude if it is vulnerable to SCA. On the other hand, different new attacks are
being developed, and it is impractical to verify the security of our implementation
against all of them. Leakage assessment aims to solve this problem by analyzing the
power trace and answering whether any data-dependent information can be detected
in the traces of the DUT.

Note
We note that the leakage assessment methods do not provide any conclusions
in cases where data-dependent leakage is not detected. Therefore, the absence
of data-dependent leakage indicated by a particular method does not prove
that the implementation is leakage-free.

In this part, we discuss a method for leakage assessment based on student's
t-test and Welch's t-test (see Sect. 1.8.3). The methodology is also referred to as test
vector leakage assessment (TVLA).
Consider a DUT running PRESENT encryption, and fix a time sample $t$. We also
fix an intermediate value $v$, for example, the input plaintext or one Sbox output.
We take the signal as the part of the leakage related to $v$. Let $L_t$ and $L'_t$ denote
the leakages at time sample $t$ corresponding to two encryptions with different fixed
values of $v$.
Example 4.2.4 (Example of $L_t$ and $L'_t$) If we take $v$ to be the plaintext, following
the convention, we require the key to be the same, then $L_t$ and $L'_t$ would correspond
to encryptions of two different fixed plaintexts with the same key. For example, we
can take Fixed dataset A and Fixed dataset B as samples of $L_t$ and $L'_t$ for the first
round of the encryption.
If we take $v$ to be the 0th Sbox output in the first round of PRESENT, then $L_t$ and
$L'_t$ would correspond to encryptions that result in two different 0th Sbox outputs. For
example, let us take Random dataset; we have 634 traces corresponding to $v = 0$
and 651 traces corresponding to $v = \mathrm{F}$. Those two sets of traces provide us with
samples of $L_t$ and $L'_t$ for the first round of the encryption.
As discussed before, when the signal is fixed, we assume a normal distribution
can approximate the distribution induced by $L_t$. We can write

\[
L_t = X_t + N_t, \qquad L'_t = X'_t + N'_t,
\]

with

\[
L_t \sim N(\mu_t, \sigma_t^2), \qquad L'_t \sim N(\mu'_t, \sigma'^2_t).
\]

Since (see Eq. 4.2)

\[
L_t = X_t + N_t
\]

and the signal $X_t$ is a constant, the variance of $L_t$ is given by the variance of $N_t$,
and the mean of $L_t$ is given by the sum of the constant $X_t$ and the mean of $N_t$, as
shown in Eq. 4.3. In other words,

\[
\mu_t = X_t + E[N_t], \qquad \sigma_t^2 = \mathrm{Var}(N_t). \tag{4.9}
\]

Similarly, we have

\[
\mu'_t = X'_t + E[N'_t], \qquad \sigma'^2_t = \mathrm{Var}(N'_t).
\]

As the noise is independent of the signal, the noise follows the same distribution
in both cases, and we write $N_t = N'_t$. Consequently,

\[
\mu_t - X_t = \mu'_t - X'_t, \qquad \sigma_t^2 = \sigma'^2_t. \tag{4.10}
\]

Before going into details about the TVLA methodology, we recall hypothesis
testing techniques from Sect. 1.8.3. We can use those techniques to test hypotheses
about $\mu_t$ and $\mu'_t$.
Example 4.2.5 (Example of a hypothesis) If we are interested in whether $\mu_t = 0$,
we can set a hypothesis that $\mu_t = 0$.
Example 4.2.6 (Example of two-sided hypothesis testing concerning $\mu_x$) Let $v$
be the plaintext, and we use Fixed dataset A (see Sect. 4.1) as a sample for $L_t$ (i.e.,
$L_t$ denotes the leakage at time sample $t$ for one round encryption of the plaintext
ABCDEF1234567890 with round key FEDCBA0123456789). Fix $t = 2368$, which
gives the highest peak in Fig. 4.3 and corresponds to the computation of pLayer.
Recall that in Example 4.2.1, we have calculated a sample mean of $\bar{l}_{2368} \approx 0.2132$
for $L_{2368}$. We would like to know if $\mu_{2368} = 0$. Following Eq. 1.68, we have null
and alternative hypotheses given by

\[
H_0: \mu_{2368} = 0, \qquad H_1: \mu_{2368} \neq 0.
\]

Suppose we know the variance is equal to the sample variance we have computed
in Example 4.2.1, namely we assume

\[
\sigma_{2368}^2 = 8.5196 \times 10^{-6}, \quad \text{which gives} \quad \sigma_{2368} \approx 2.9188 \times 10^{-3}.
\]

There are in total 5000 traces in Fixed dataset A. For a test with significance level
$\alpha = 0.01$, the critical region is given by Eq. 1.69, with (see Eq. 1.72)

\[
c = \frac{z_{\alpha/2}\, \sigma_{2368}}{\sqrt{5000}} = 2.576 \times \frac{2.9188 \times 10^{-3}}{\sqrt{5000}} \approx 1.06 \times 10^{-4},
\]

where $z_{0.005} = 2.576$ (see Table 1.4). Since the sample mean

\[
\bar{l}_{2368} \approx 0.2132 > c,
\]

we reject the null hypothesis and conclude that $\mu_{2368} \neq 0$. The probability that our
decision is wrong is given by $\alpha = 0.01$.
Example 4.2.7 (Example of one-sided hypothesis testing concerning $\mu_x$) With
the same notation as in Example 4.2.6, suppose we know that the mean of $L_{2368}$,
$\mu_{2368}$, is at least 0; we would like to know if it is bigger than 0. We set $\mu_0 = 0$ in
Eq. 1.75 and get the following null hypothesis and the alternative hypothesis:

\[
H_0: \mu_{2368} = 0, \qquad H_1: \mu_{2368} > 0.
\]

First, let us assume we know the variance is equal to the sample variance we
computed in Example 4.2.1. There are in total 5000 traces in Fixed dataset A, and
for a test with significance level $\alpha = 0.01$, the critical region is given by Eq. 1.76,
with (see Eq. 1.77)

\[
c = z_{0.01} \frac{\sigma_{2368}}{\sqrt{5000}} = \frac{2.326 \times 2.9188 \times 10^{-3}}{\sqrt{5000}} \approx 9.601 \times 10^{-5},
\]

where $z_{0.01} = 2.326$ (see Table 1.4). Since our sample mean

\[
\bar{l}_{2368} \approx 0.2132 > c,
\]

we reject the null hypothesis and conclude that $\mu_{2368} > 0$. The probability that our
decision is wrong is given by $\alpha = 0.01$.
Furthermore, we also would like to check how many traces are required for a
test with significance level $\alpha = 0.01$. For this, we need to choose a value of $c$.
Considering the value of the sample mean and sample variance, let us choose $c =
0.001$ in Eq. 1.78. According to Eq. 1.79, the number of traces to collect is then

\[
\frac{\sigma_{2368}^2}{c^2} z_{\alpha}^2 = \frac{8.5196 \times 10^{-6}}{0.001^2} \times 2.326^2 \approx 46. \tag{4.11}
\]

Now, suppose we do not know the variance $\sigma_{2368}^2$. Since the number of traces is
big, according to Eq. 1.80, we compute

\[
\sqrt{5000} \times \frac{\bar{l}_{2368}}{s_{2368}} = \sqrt{5000} \times \frac{0.2132}{2.9188 \times 10^{-3}} \approx 5165,
\]

which is bigger than $z_{0.01} = 2.326$. Thus, we can reject the null hypothesis and
conclude that $\mu_{2368} > 0$. The probability of a wrong decision is given by $\alpha = 0.01$.
As for the number of traces needed, by Eq. 1.81, we will use the sample variance
and reach the same result as in Eq. 4.11.
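The thresholds in Examples 4.2.6 and 4.2.7 can be recomputed as follows. This is a sketch with function names of our choosing; the `mu0` parameter generalizes the hypothesized mean (it is 0 in both examples), and the quantiles come from `statistics.NormalDist` instead of Table 1.4.

```python
import math
from statistics import NormalDist


def critical_value(sigma, num_traces, alpha, two_sided):
    """Threshold c on the sample mean for a test of H0: mu = 0 with known sigma."""
    tail = alpha / 2 if two_sided else alpha
    z = NormalDist().inv_cdf(1.0 - tail)
    return z * sigma / math.sqrt(num_traces)


def standardized_mean(sample_mean, sample_std, num_traces, mu0=0.0):
    """Large-sample statistic sqrt(M) * (mean - mu0) / s for unknown variance."""
    return math.sqrt(num_traces) * (sample_mean - mu0) / sample_std


sigma = math.sqrt(8.5196e-6)
c_two_sided = critical_value(sigma, 5000, alpha=0.01, two_sided=True)
c_one_sided = critical_value(sigma, 5000, alpha=0.01, two_sided=False)
statistic = standardized_mean(0.2132, sigma, 5000)
print(c_two_sided, c_one_sided, statistic)
print(0.2132 > c_two_sided, statistic > NormalDist().inv_cdf(0.99))
```

Both comparisons in the last line come out true, matching the decisions to reject the null hypothesis in the two examples.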
Example 4.2.8 (Example of two-sided hypothesis testing about $\mu_x$ and $\mu_y$) The
same as in Example 4.2.3, we take the leakages at $t = 2368$ from the Fixed dataset
B as a sample for $L'_{2368}$. We have computed the sample mean and sample variance
for this random variable, given in Eq. 4.8. We would like to know if the mean of
$L_{2368}$ ($\mu_{2368}$) and the mean of $L'_{2368}$ ($\mu'_{2368}$) are the same. We set the following
hypotheses (see Eq. 1.82):

\[
H_0: \mu'_{2368} = \mu_{2368}, \qquad H_1: \mu'_{2368} \neq \mu_{2368}.
\]

Assume we know the variances for both random variables are equal to the sample
variances that we have computed (see Eqs. 4.5 and 4.8). There are in total 5000
traces in both Fixed dataset A and Fixed dataset B, and for a test with significance
level $\alpha = 0.01$, the critical region is given by Eq. 1.83, with (see Eq. 1.86)

\[
c = z_{0.005} \sqrt{\frac{\sigma_{2368}^2}{5000} + \frac{\sigma'^2_{2368}}{5000}} = 2.576 \times \sqrt{\frac{8.5196 \times 10^{-6} + 8.6198 \times 10^{-6}}{5000}} \approx 0.00015,
\]

where $z_{0.005} = 2.576$ (see Table 1.4). Since the difference of our sample means

\[
\bar{l}'_{2368} - \bar{l}_{2368} \approx 0.0001 < c,
\]

we accept the null hypothesis and conclude that $\mu'_{2368} = \mu_{2368}$. The probability that
our decision is wrong is given by $\alpha = 0.01$.
Moreover, to check how many traces are needed for a test with significance level
$\alpha = 0.01$, we choose $c = 0.001$ in Eq. 1.86. According to Eq. 1.87, the number of
traces to collect is then

\[
z_{\alpha/2}^2 \frac{\sigma_{2368}^2 + \sigma'^2_{2368}}{c^2} = 2.576^2 \times \frac{8.5196 \times 10^{-6} + 8.6198 \times 10^{-6}}{0.001^2} \approx 114. \tag{4.12}
\]

In case we do not know the variances, since the number of traces in both datasets
is 5000, following the student's t-test, we compute (see Eq. 1.89)

\[
\frac{|\bar{l}_{2368} - \bar{l}'_{2368}|}{\sqrt{\dfrac{s_{2368}^2 + s'^2_{2368}}{5000}}} = \frac{0.0001}{\sqrt{\dfrac{8.5196 \times 10^{-6} + 8.6198 \times 10^{-6}}{5000}}} \approx 1.7 < z_{0.005}.
\]

We accept the null hypothesis and conclude that $\mu'_{2368} = \mu_{2368}$. The probability
that our decision is wrong is given by $\alpha = 0.01$.
Set $c = 0.001$, then the number of traces needed for a student's t-test with
significance level $\alpha = 0.01$ is given by (see Eq. 1.90)

\[
z_{0.005}^2 \frac{s_{2368}^2 + s'^2_{2368}}{c^2} = 2.576^2 \times \frac{8.5196 \times 10^{-6} + 8.6198 \times 10^{-6}}{0.001^2} \approx 114.
\]

Example 4.2.9 (Another example of two-sided hypothesis testing about $\mu_x$ and
$\mu_y$) Similar to Example 4.2.8, let us now look at a different time sample $t = 392$.
We can compute the sample mean and sample variance of $L_{392}$ with Fixed dataset
A. They are given by

\[
\bar{l}_{392} \approx -0.0525, \qquad s_{392}^2 \approx 1.5141 \times 10^{-6}.
\]

With Fixed dataset B, we get the sample mean and sample variance of $L'_{392}$ as
follows:

\[
\bar{l}'_{392} \approx -0.0501, \qquad s'^2_{392} \approx 1.4801 \times 10^{-6}.
\]

Similar to Example 4.2.8, we set the following hypotheses (see Eq. 1.82):

\[
H_0: \mu'_{392} = \mu_{392}, \qquad H_1: \mu'_{392} \neq \mu_{392}.
\]

Let $\alpha = 0.01$. Then according to student's t-test with significance level $\alpha$, we
compute (see Eq. 1.89)

\[
\frac{|\bar{l}_{392} - \bar{l}'_{392}|}{\sqrt{\dfrac{s_{392}^2 + s'^2_{392}}{5000}}} = \frac{0.0024}{\sqrt{\dfrac{1.5141 \times 10^{-6} + 1.4801 \times 10^{-6}}{5000}}} \approx 98.1 > z_{0.005} \quad (z_{0.005} = 2.576).
\]

We reject the null hypothesis and conclude that $\mu'_{392} \neq \mu_{392}$. The probability that
our decision is wrong is given by $\alpha = 0.01$.
Set $c = 0.001$, then the number of traces needed for a student's t-test with
significance level $\alpha = 0.01$ is given by (see Eq. 1.90)

\[
z_{0.005}^2 \frac{s_{392}^2 + s'^2_{392}}{c^2} = 2.576^2 \times \frac{1.5141 \times 10^{-6} + 1.4801 \times 10^{-6}}{0.001^2} \approx 20.
\]
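The test statistics of Examples 4.2.8 and 4.2.9 can be recomputed from the reported sample means and variances. This is a sketch for the case of two samples of equal size, using the large-sample comparison against $z_{\alpha/2}$ as in the text; the function names are ours.

```python
import math
from statistics import NormalDist


def equal_size_statistic(mean_x, mean_y, var_x, var_y, num_traces):
    """|mean_x - mean_y| / sqrt((s_x^2 + s_y^2) / M) for two samples of size M."""
    return abs(mean_x - mean_y) / math.sqrt((var_x + var_y) / num_traces)


def rejects_null(statistic, alpha):
    """Two-sided decision: reject H0 when the statistic exceeds z_{alpha/2}."""
    return statistic > NormalDist().inv_cdf(1.0 - alpha / 2)


# t = 2368 (Example 4.2.8) and t = 392 (Example 4.2.9), values as reported above.
stat_2368 = equal_size_statistic(0.2132, 0.2133, 8.5196e-6, 8.6198e-6, 5000)
stat_392 = equal_size_statistic(-0.0525, -0.0501, 1.5141e-6, 1.4801e-6, 5000)
print(stat_2368, rejects_null(stat_2368, 0.01))
print(stat_392, rejects_null(stat_392, 0.01))
```

The same two fixed datasets therefore give opposite decisions at the two time samples, which is why the test is applied per time sample.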

Example 4.2.10 (Example of one-sided hypothesis testing about $\mu_x$ and $\mu_y$)
With the same notations as in Example 4.2.9, suppose we know that

\[
\mu'_{392} \geq \mu_{392}.
\]

We would like to know if $\mu'_{392} > \mu_{392}$. Then we have the following hypotheses:

\[
H_0: \mu'_{392} = \mu_{392}, \qquad H_1: \mu'_{392} > \mu_{392}.
\]

Firstly, suppose we know the variances for both random variables are equal to
the sample variances that we have computed. There are 5000 traces in both Fixed
dataset A and Fixed dataset B. For a test with significance level $\alpha = 0.01$, the value
of $c$ in the critical region given by Eq. 1.92 is (see Eq. 1.93)

\[
c = z_{\alpha} \sqrt{\frac{\sigma_{392}^2 + \sigma'^2_{392}}{5000}} = 2.326 \times \sqrt{\frac{1.5141 \times 10^{-6} + 1.4801 \times 10^{-6}}{5000}} \approx 5.692 \times 10^{-5},
\]

where $z_{0.01} = 2.326$. Since

\[
\bar{l}'_{392} - \bar{l}_{392} = 0.0024 > c,
\]

we reject the null hypothesis and conclude that $\mu'_{392} > \mu_{392}$. The probability of this
choice being wrong is given by $\alpha = 0.01$.
Set $c = 0.001$. Then, the number of traces to collect for a hypothesis test with a
level of significance $\alpha = 0.01$ ($z_{\alpha} = 2.326$) is given by (see Eq. 1.94)

\[
\frac{z_{\alpha}^2 (\sigma_{392}^2 + \sigma'^2_{392})}{c^2} = \frac{2.326^2 \times (1.5141 \times 10^{-6} + 1.4801 \times 10^{-6})}{0.001^2} \approx 17.
\]
Remark 4.2.2 According to Eq. 4.10,

\[
X_t = X'_t \iff \mu_t = \mu'_t. \tag{4.13}
\]

Then Example 4.2.8 concludes that when we take the signal to be part of the
leakage related to the plaintext value, the signals at time sample 2368 for one
round encryption of plaintexts ABCDEF1234567890 and 84216BA484216BA4 with
the same round key FEDCBA0123456789 are very likely to be equal, according
to our measurements Fixed dataset A and Fixed dataset B. The probability of the
conclusions being wrong is 0.01. On the other hand, Example 4.2.9 concludes that
the signals at time sample 392 are likely to be different (with a probability of 0.01
being wrong).
Furthermore, we see that to decide if the signals are different at a particular time
sample with $c = 0.001$³ and significance level 0.01 (i.e., probability of making
wrong conclusions), we do not need that many traces.

³ We note that this value of $c$ can be considered as a precision.

Next, let us consider $v$ being the 0th Sbox output in the first round of PRESENT.
In this case, we can take $L_t$ to be the leakages for a fixed value of $v$ at time sample
$t$ and $L'_t$ to be the leakages for another fixed value of $v$ at $t$.
Example 4.2.11 When we consider $v$ to be the 0th Sbox output in the first round
of PRESENT, there are 16 different values of $v$ that we can consider. Let $L_t$ and $L'_t$
denote the random variables for leakages corresponding to $v = 0$ and $v = \mathrm{F}$ at time
sample $t$. We would like to know if the signals at time sample $t = 392$ are the same
for those two values of $v$.
Take the Random dataset. As mentioned in Example 4.2.4, we have 634 traces
for $v = 0$ and 651 traces for $v = \mathrm{F}$. We take those 634 (respectively, 651) traces
as a sample for $L_t$ (respectively, $L'_t$). The same as in Examples 4.2.8 and 4.2.9, we
make the following hypotheses:

\[
H_0: \mu'_{392} = \mu_{392}, \qquad H_1: \mu'_{392} \neq \mu_{392}.
\]

Firstly, we compute the sample means and sample variances for $L_{392}$ and $L'_{392}$:

\[
\bar{l}_{392} \approx -0.0425, \quad s_{392}^2 \approx 2.2962 \times 10^{-6}, \quad \bar{l}'_{392} \approx -0.0539, \quad s'^2_{392} \approx 2.7378 \times 10^{-6}.
\]

Let $\alpha = 0.01$. Then following student's t-test with significance level $\alpha$, we compute
(see Eq. 1.65)

\[
s_p^2 = \frac{(634 - 1) s_{392}^2 + (651 - 1) s'^2_{392}}{634 + 651 - 2} = \frac{633 \times 2.2962 \times 10^{-6} + 650 \times 2.7378 \times 10^{-6}}{1283} \approx 2.5199 \times 10^{-6}
\]

and (see Eq. 1.88)

\[
\frac{|\bar{l}_{392} - \bar{l}'_{392}|}{\sqrt{s_p^2 \left( \frac{1}{634} + \frac{1}{651} \right)}} \approx \frac{|-0.0425 + 0.0539|}{\sqrt{2.5199 \times 10^{-6} \times 3.1134 \times 10^{-3}}} \approx 128.7 > z_{0.005} \quad (z_{0.005} = 2.576).
\]

We reject the null hypothesis and conclude that $\mu'_{392} \neq \mu_{392}$. The probability that
our decision is wrong is given by $\alpha = 0.01$.
Remark 4.2.3 Note that in Examples 4.2.8 and 4.2.9, the sample sizes (number
of traces) are the same (both are 5000), but in Example 4.2.11, the sample sizes
are different for $L_t$ and $L'_t$. Thus, instead of using Eq. 1.89 as in Examples 4.2.8
and 4.2.9, we applied Eq. 1.88. But those two equations are the same when the
sample sizes are equal.
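The pooled statistic for unequal sample sizes, and its agreement with the equal-size form, can be checked numerically. This is a sketch with function names of our choosing, using the sample means and variances reported in Examples 4.2.9 and 4.2.11.

```python
import math


def pooled_variance(var_x, var_y, m_x, m_y):
    """Pooled sample variance s_p^2 for samples of sizes m_x and m_y."""
    return ((m_x - 1) * var_x + (m_y - 1) * var_y) / (m_x + m_y - 2)


def pooled_statistic(mean_x, mean_y, var_x, var_y, m_x, m_y):
    """|mean_x - mean_y| / sqrt(s_p^2 (1/m_x + 1/m_y))."""
    s_p2 = pooled_variance(var_x, var_y, m_x, m_y)
    return abs(mean_x - mean_y) / math.sqrt(s_p2 * (1.0 / m_x + 1.0 / m_y))


# Example 4.2.11: 634 traces for v = 0 and 651 traces for v = F at t = 392.
s_p2 = pooled_variance(2.2962e-6, 2.7378e-6, 634, 651)
stat_unequal = pooled_statistic(-0.0425, -0.0539, 2.2962e-6, 2.7378e-6, 634, 651)

# Example 4.2.9 data (equal sizes): pooled form versus the equal-size form.
stat_pooled = pooled_statistic(-0.0525, -0.0501, 1.5141e-6, 1.4801e-6, 5000, 5000)
stat_equal = abs(-0.0525 + 0.0501) / math.sqrt((1.5141e-6 + 1.4801e-6) / 5000)
print(s_p2, stat_unequal)
print(stat_pooled, stat_equal)
```

For equal sizes $M$, the pooled variance reduces to the average of the two sample variances and $1/M + 1/M = 2/M$, so the two statistics coincide.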
We have seen before that the leakage $L_t$ is dependent on the data being processed
in the device. In fact, as mentioned at the beginning of Sect. 4.2, some SCA attacks
(see Sects. 4.3.1, 4.3.3, and 4.4.2) exploit the dependency of the leakage on certain
intermediate values. If the leakage is not exploitable, we would expect, at least, that
the signals at time sample $t$ should be the same when the only difference is the
values of the data being processed. With our notations above, this means that we
would like to test if $X_t = X'_t$, or equivalently $\mu_t = \mu'_t$ (see Remark 4.2.2), for two
different fixed values of a certain intermediate value $v$.
Example 4.2.12 Continuing Remark 4.2.2 and the above discussion, we can
conclude that the leakages at time sample 392 for our implementation of PRESENT
on our DUT are very likely to be vulnerable to SCA attacks.
Another approach to analyzing whether the leakage is exploitable is to consider
the signals for a fixed value of $v$ and that for random values of $v$. Let $L_t^r$ denote
the random variable corresponding to the leakage at time sample $t$ for encryptions
corresponding to random values of $v$. Let $X_t^r$ and $N_t^r$ be the random variables for
the corresponding signal and noise. We have

\[
L_t^r = X_t^r + N_t^r.
\]

With our assumptions and modeling, the signal $X_t$ is a constant for a fixed value
of $v$ at time $t$. When the value of $v$ is random, $X_t^r$ is itself a random variable that
varies depending on $v$. It is not easy to approximate the distribution induced by $L_t^r$
in this case. However, following the convention, we still use a normal distribution
for the approximation.
To see that this makes sense, let us take the Random plaintext dataset and plot
the histogram of leakages at $t = 392$ across 5000 traces from this dataset. We
get Fig. 4.10. This corresponds to random values of $v$ when $v$ is taken to be the
plaintext. As another example, the histogram for leakages at $t = 392$ across the
10,000 traces from the Random dataset is shown in Fig. 4.11. In this case, we can
consider the leakage corresponds to random values of $v$ when $v$ is taken to be the
0th Sbox output. Those two figures demonstrate that it is reasonable to approximate
the distribution induced by the leakage $L_t^r$ with a normal distribution.

Fig. 4.10 Histogram of leakages at time sample $t = 392$ across 5000 traces from the Random
plaintext dataset

Fig. 4.11 Histogram of leakages at time sample $t = 392$ across 10,000 traces from the Random
dataset

Suppose

\[
L_t^r \sim N(\mu_t^r, \sigma_t^{r2}).
\]

Since the noise is independent of the signal, we have $N_t = N_t^r$. By Eqs. 1.33, 1.36,
and 4.9,

\[
\mu_t^r = E[X_t^r] + E[N_t^r], \qquad \sigma_t^{r2} = \mathrm{Var}(X_t^r) + \mathrm{Var}(N_t^r) = \mathrm{Var}(X_t^r) + \sigma_t^2.
\]

We have

\[
\mu_t - X_t = \mu_t^r - E[X_t^r], \qquad \sigma_t^{r2} \neq \sigma_t^2. \tag{4.14}
\]

Same as before, in case the leakage is not exploitable at time sample $t$, we expect
the signal to be a constant at $t$, namely

\[
X_t = X_t^r, \quad \text{and equivalently} \quad \mu_t^r = \mu_t. \tag{4.15}
\]

Consequently, our hypotheses will be the same as in Examples 4.2.8, 4.2.9,
and 4.2.11. The difference is that in this case, $\sigma_t^{r2} \neq \sigma_t^2$. For this reason, we apply
Welch's t-test instead of the student's t-test.
Example 4.2.13 We consider the signal given by the plaintext value, i.e., $v =$
plaintext. Let $t = 392$. Then we can take Fixed dataset A as a sample for $L_{392}$ and
Random plaintext dataset as a sample for $L_{392}^r$. We know that (see Example 4.2.9)

\[
\bar{l}_{392} \approx -0.0525, \qquad s_{392}^2 \approx 1.5141 \times 10^{-6}.
\]

We can also compute

\[
\bar{l}_{392}^r \approx -0.0488, \qquad s_{392}^{r2} \approx 1.1700 \times 10^{-5}.
\]

We would like to test if $\mu_t = \mu_t^r$. Thus, we set the following hypotheses:

\[
H_0: \mu_{392}^r = \mu_{392}, \qquad H_1: \mu_{392}^r \neq \mu_{392}.
\]

Let $\alpha = 0.01$. Then following Welch's t-test with significance level $\alpha$, we compute
(see Eq. 1.91)

\[
\frac{|\bar{l}_{392} - \bar{l}_{392}^r|}{\sqrt{\dfrac{s_{392}^2}{5000} + \dfrac{s_{392}^{r2}}{5000}}} = \frac{|-0.0525 + 0.0488|}{\sqrt{\dfrac{1.5141 \times 10^{-6}}{5000} + \dfrac{1.1700 \times 10^{-5}}{5000}}} \approx 72.0 > z_{0.005}.
\]

We reject the null hypothesis and conclude that $\mu_{392}^r \neq \mu_{392}$. The probability that
our decision is wrong is equal to $\alpha = 0.01$.
Example 4.2.14 Now we consider the signal to be given by the 0th Sbox output.
For the fixed signal we choose $v = 0$. Take the Random dataset. Let $L_t$ and $L_t^r$
denote the random variables corresponding to leakages for $v = 0$ and random values
of $v$ at time sample $t$, respectively.
We know that there are 634 traces for $v = 0$. Fix $t = 392$. In Example 4.2.11 we
have computed

\[
\bar{l}_{392} \approx -0.0425, \qquad s_{392}^2 \approx 2.2962 \times 10^{-6}.
\]

For the random values of $v$, we can take the whole dataset, which contains 10,000
traces, as a sample for $L_t^r$. We have

\[
\bar{l}_{392}^r \approx -0.0487, \qquad s_{392}^{r2} \approx 1.1624 \times 10^{-5}.
\]

Let $\alpha = 0.01$. Then according to Welch's t-test with significance level $\alpha$, we
compute (see Eq. 1.91)

\[
\frac{|\bar{l}_{392} - \bar{l}_{392}^r|}{\sqrt{\dfrac{s_{392}^2}{634} + \dfrac{s_{392}^{r2}}{10{,}000}}} = \frac{|-0.0425 + 0.0487|}{\sqrt{\dfrac{2.2962 \times 10^{-6}}{634} + \dfrac{1.1624 \times 10^{-5}}{10{,}000}}} \approx 89.6 > z_{0.005}.
\]

We reject the null hypothesis and conclude that $\mu_{392}^r \neq \mu_{392}$. The probability that
our decision is wrong is $\alpha = 0.01$.
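Welch's statistic for the two fixed versus random examples can be recomputed as follows. This is a sketch with function names of our choosing. The degrees-of-freedom helper uses the standard Welch-Satterthwaite approximation, which is not taken from the text above; it is included only to indicate why comparing against the normal quantile $z_{0.005}$ is reasonable for samples of this size.

```python
import math


def welch_statistic(mean_x, mean_y, var_x, var_y, m_x, m_y):
    """|mean_x - mean_y| / sqrt(s_x^2 / m_x + s_y^2 / m_y); variances may differ."""
    return abs(mean_x - mean_y) / math.sqrt(var_x / m_x + var_y / m_y)


def welch_degrees_of_freedom(var_x, var_y, m_x, m_y):
    """Welch-Satterthwaite approximation (our addition, not from the text)."""
    a, b = var_x / m_x, var_y / m_y
    return (a + b) ** 2 / (a ** 2 / (m_x - 1) + b ** 2 / (m_y - 1))


# Example 4.2.13: Fixed dataset A versus Random plaintext dataset at t = 392.
stat_plaintext = welch_statistic(-0.0525, -0.0488, 1.5141e-6, 1.1700e-5, 5000, 5000)
# Example 4.2.14: traces with v = 0 versus the whole Random dataset at t = 392.
stat_sbox = welch_statistic(-0.0425, -0.0487, 2.2962e-6, 1.1624e-5, 634, 10000)
dof = welch_degrees_of_freedom(1.5141e-6, 1.1700e-5, 5000, 5000)
print(stat_plaintext, stat_sbox, dof)
```

Both statistics are far above 2.576, and the approximate degrees of freedom are in the thousands, so the t quantile and the normal quantile are practically indistinguishable here.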
The rationale of the TVLA methodology is that if the leakage is not exploitable,
the encryptions corresponding to two different intermediate values (or the
encryption corresponding to one fixed intermediate value and that to a random intermediate
value) should exhibit identical signals. Then according to Eq. 4.13 (or Eq. 4.15), the
corresponding leakages will have the same means. With the help of the student's
t-test (or Welch's t-test), we make hypotheses about means of leakages and test if
they are equal.
Recall that for student's t-test and Welch's t-test (when the sample size is big), we
need to choose a significance level $\alpha$ and compare computations using our samples
with a threshold $z_{\alpha/2}$ (see Eqs. 1.88 and 1.91). For TVLA, following the convention,
we set $z_{\alpha/2} = 4.5$. By Eq. 1.43, this threshold corresponds to

\[
\frac{\alpha}{2} = 1 - \Phi(z_{\alpha/2}) = 1 - \Phi(4.5) = 1 - 0.9999966023268753 \approx 3.4 \times 10^{-6}.
\]

The significance level is given by

\[
\alpha \approx 6.8 \times 10^{-6}.
\]

This means that there is a $6.8 \times 10^{-4}$ percent chance that we would reject the null
hypothesis (i.e., conclude that the means are different) in case it is true (i.e., the
means are in fact the same).
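The relation between the threshold and the significance level can be checked numerically. This is a minimal sketch using the identity $2(1 - \Phi(z)) = \mathrm{erfc}(z/\sqrt{2})$ for the standard normal distribution; the function names are ours.

```python
import math


def two_sided_alpha(threshold):
    """Significance level alpha = 2 * (1 - Phi(threshold)), computed via erfc."""
    return math.erfc(threshold / math.sqrt(2.0))


def leakage_detected(statistic, threshold=4.5):
    """TVLA-style decision at one time sample: reject H0 above the threshold."""
    return abs(statistic) > threshold


print(two_sided_alpha(4.5))    # significance level for the TVLA threshold
print(two_sided_alpha(2.576))  # close to 0.01, the level used in the examples
print(leakage_detected(98.1), leakage_detected(1.7))
```

Using `math.erfc` avoids the loss of precision that comes from subtracting a value very close to 1 from 1.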
The steps for TVLA are as follows:
TVLA Step 1 Identify the cryptographic implementation for analysis. In principle,
TVLA can be used for analyzing leakages of implementations
for any type of algorithm. In practice, it is mostly used for the
analysis of symmetric block cipher implementations.
TVLA Step 2 Choose the intermediate value $v$. The choice of $v$ determines how
we measure our traces. TVLA tests if different values of $v$ result in
different signals.
TVLA Step 3 Set up the experiment and measure leakages. As we can imagine, for
the actual attacks, experimental setups are crucial factors for success.
For leakage assessment, it would be better to carry out measurements
with equipment that is expected to be used by the attackers that we would
like to protect against.
We will prepare two datasets, denoted by $T_1$ and $T_2$. To get the
first dataset $T_1$, we choose a fixed value for $v$. Then we randomly
take $M_1$ inputs for the cryptographic implementation such that the
value of $v$ is equal to this fixed value. One trace is taken for each
input.
For the second dataset $T_2$, there are two options.
(a) Fixed versus fixed. Choose a different fixed value for $v$. Then
randomly take $M_2$ inputs for the cryptographic implementation
such that the value of $v$ is equal to this fixed value. One trace is
collected for each input.
(b) Fixed versus random. Randomly take $M_2$ inputs for the cryptographic
implementation so that the value of $v$ is random. One
trace is collected for each input.
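Let us represent those two sets of traces as follows:

\[
T_1 = \left\{\boldsymbol{\ell}_1^{(1)}, \boldsymbol{\ell}_2^{(1)}, \dots, \boldsymbol{\ell}_{M_1}^{(1)}\right\}, \qquad
T_2 = \left\{\boldsymbol{\ell}_1^{(2)}, \boldsymbol{\ell}_2^{(2)}, \dots, \boldsymbol{\ell}_{M_2}^{(2)}\right\}.
\]

Each trace $\boldsymbol{\ell}_j^{(i)} = \left(l_{j1}^{(i)}, l_{j2}^{(i)}, \dots, l_{jq}^{(i)}\right)$ contains $q$ time samples ($i = 1, 2$).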
For our illustrations, we will consider two choices of .v—the
plaintext and the 0th Sbox output. When .v is given by the plaintext,
we take

T1 = Fixed dataset A,
. T2 = Fixed dataset B

for the fixed versus fixed setting and

T1 = Fixed dataset A,
. T2 = Random plaintext dataset

for the fixed versus random setting. For both cases, we will demon-
strate the results for .M1 = M2 = 5000 and .M1 = M2 = 50.
When $v$ is given by the 0th Sbox output, we take

\[
T_1 = \text{traces in Random dataset for } v = 0, \qquad
T_2 = \text{traces in Random dataset for } v = \mathrm{F}
\]

for the fixed versus fixed setting. As discussed in Example 4.2.4,
$M_1 = 634$, $M_2 = 651$. For the fixed versus random setting, we choose

\[
T_1 = \text{traces in Random dataset for } v = 0, \qquad
T_2 = \text{Random dataset}
\]

and $M_1 = 634$, $M_2 = 10{,}000$. For all our traces, $q = 3600$.
TVLA Step 4 t-Test for one time sample. Fix a time sample $t$. Let $L_t^{(1)}$ and $L_t^{(2)}$
denote the random variables corresponding to leakages at time sample
$t$ for computations resulting in datasets $T_1$ and $T_2$, respectively.
Suppose

\[
L_t^{(1)} \sim N\left(\mu_t^{(1)}, \sigma_t^{(1)2}\right), \qquad
L_t^{(2)} \sim N\left(\mu_t^{(2)}, \sigma_t^{(2)2}\right).
\]

By definition (see Eqs. 1.49 and 1.53), we compute the sample mean
and sample variance for $L_t^{(1)}$ (respectively, $L_t^{(2)}$), denoted by $\overline{l_t^{(1)}}$ and
$s_t^{(1)2}$ (respectively, $\overline{l_t^{(2)}}$ and $s_t^{(2)2}$):

\[
\overline{l_t^{(1)}} = \frac{1}{M_1} \sum_{j=1}^{M_1} l_{jt}^{(1)}, \qquad
\overline{l_t^{(2)}} = \frac{1}{M_2} \sum_{j=1}^{M_2} l_{jt}^{(2)},
\]

and

\[
s_t^{(1)2} = \frac{1}{M_1 - 1} \sum_{j=1}^{M_1} \left(l_{jt}^{(1)} - \overline{l_t^{(1)}}\right)^2, \qquad
s_t^{(2)2} = \frac{1}{M_2 - 1} \sum_{j=1}^{M_2} \left(l_{jt}^{(2)} - \overline{l_t^{(2)}}\right)^2.
\]

Then we propose the following null and alternative hypotheses:

\[
H_0: \mu_t^{(1)} = \mu_t^{(2)}, \qquad H_1: \mu_t^{(1)} \neq \mu_t^{(2)}. \tag{4.16}
\]

Depending on our setting, we choose between the student’s t-test and
Welch’s t-test. As we have discussed above, for the fixed versus fixed
setting, the noise for both cases is assumed to be the same and (see
Eq. 4.10)

\[
\sigma_t^{(1)2} = \sigma_t^{(2)2},
\]

hence the usage of student’s t-test. In the fixed versus random setting,
the noises are different, and we have (see Eq. 4.14)

\[
\sigma_t^{(1)2} \neq \sigma_t^{(2)2},
\]

hence the application of Welch’s t-test.


(a) Student’s t-test for the fixed versus fixed setting. When the
second dataset $T_2$ is measured according to the fixed versus fixed
setting, following the student’s t-test, we compute (see Eq. 1.65)

\[
s_p^2 = \frac{(M_1 - 1)\, s_t^{(1)2} + (M_2 - 1)\, s_t^{(2)2}}{M_1 + M_2 - 2}
\]

and (see Eq. 1.88)

\[
t\text{-value}_t := \frac{\overline{l_t^{(1)}} - \overline{l_t^{(2)}}}{\sqrt{s_p^2 \left(1/M_1 + 1/M_2\right)}}. \tag{4.17}
\]

(b) Welch’s t-test for the fixed versus random setting. When the
second dataset $T_2$ is measured according to the fixed versus
random setting, following Welch’s t-test, we compute

\[
t\text{-value}_t := \frac{\overline{l_t^{(1)}} - \overline{l_t^{(2)}}}{\sqrt{\dfrac{s_t^{(1)2}}{M_1} + \dfrac{s_t^{(2)2}}{M_2}}}. \tag{4.18}
\]

Then we compare $t\text{-value}_t$ with our threshold 4.5. In case

\[
t\text{-value}_t > 4.5 \quad \text{or} \quad t\text{-value}_t < -4.5,
\]

we reject the null hypothesis. Following the previous discussions,
this means that the signals at time sample $t$ are different for
computations with two fixed values of $v$ (or for a fixed value of $v$
and random values of $v$). We conclude that there is a high chance
that data-dependent leakage appears at time sample $t$.
TVLA Step 5 Repeat TVLA Step 4 for all time samples t.
We note that when the t-value is between $-4.5$ and $4.5$ for all time samples
$1, 2, \dots, q$, we cannot conclude that the implementation is safe, as there might be
other attacks that do not exploit the dependency of leakages on the chosen $v$.
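TVLA Steps 4 and 5 can be sketched in a few lines of code. The following is an illustrative sketch only: the function names `t_values` and `flagged_samples` are hypothetical, the traces are synthetic stand-ins (not the datasets of Sect. 4.1), and each trace is assumed to be a list of $q$ numbers.

```python
import math
import random
from statistics import mean, variance

def t_values(T1, T2, equal_variance):
    """t-value for every time sample, given two lists of traces of equal length q."""
    M1, M2 = len(T1), len(T2)
    result = []
    # zip(*T) regroups the traces by time sample (column-wise).
    for col1, col2 in zip(zip(*T1), zip(*T2)):
        m1, m2 = mean(col1), mean(col2)
        v1, v2 = variance(col1), variance(col2)  # sample variances (divide by M - 1)
        if equal_variance:
            # Student's t-test with pooled variance (Eq. 4.17)
            sp2 = ((M1 - 1) * v1 + (M2 - 1) * v2) / (M1 + M2 - 2)
            denom = math.sqrt(sp2 * (1 / M1 + 1 / M2))
        else:
            # Welch's t-test (Eq. 4.18)
            denom = math.sqrt(v1 / M1 + v2 / M2)
        result.append((m1 - m2) / denom)
    return result

def flagged_samples(tvals, threshold=4.5):
    """1-indexed time samples whose |t-value| exceeds the threshold."""
    return [t for t, tv in enumerate(tvals, start=1) if abs(tv) > threshold]

# Synthetic traces (assumption for illustration): q = 3, only sample 2 depends on the class.
rng = random.Random(0)
T1 = [[rng.gauss(0, 1), rng.gauss(1, 1), rng.gauss(0, 1)] for _ in range(500)]
T2 = [[rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(500)]

student = t_values(T1, T2, equal_variance=True)
welch = t_values(T1, T2, equal_variance=False)
print(flagged_samples(student), flagged_samples(welch))
```

When $M_1 = M_2$, Eqs. 4.17 and 4.18 yield the same t-value, so the two tests differ in this sketch only when the datasets have different sizes.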
Now, we show some results of the TVLA on our datasets. Let us first take .v
to be the plaintext. For the fixed versus fixed setting, we take Fixed dataset A
and Fixed dataset B as samples for our analysis. The t-values with the student’s
t-test (Eq. 4.17) are shown in Fig. 4.12, where we have used the entire datasets and
.M1 = M2 = 5000. We can see that most of the time samples have t-values outside

Fig. 4.12 t-Values (Eq. 4.17) for all time samples .1, 2, . . . , 3600 computed with Fixed dataset A
and Fixed dataset B. The signal is given by the plaintext value, and the fixed versus fixed setting is
chosen. Blue dashed lines correspond to the threshold .4.5 and .−4.5

Fig. 4.13 t-Values (Eq. 4.17) for all time samples .1, 2, . . . , 3600 computed with 50 traces from
Fixed dataset A and 50 traces from Fixed dataset B. The signal is given by the plaintext value, and
the fixed versus fixed setting is chosen. Blue dashed lines correspond to the threshold .4.5 and .−4.5

of the threshold. This is not surprising as the implementation does not have any
countermeasures. In Sect. 4.3.1 we will see that using this implementation, with just
a few traces, we can recover the first round key. If we reduce the number of traces for
computing the t-values, we will get different results. For example, when we take 50
traces, i.e., .M1 = M2 = 50, we have Fig. 4.13. Compared to Fig. 4.12, the absolute
values of t-values are much smaller. This shows that when the sample size is bigger,
it is more likely for us to capture information about the inputs from the leakages.
For the fixed versus random setting, t-values with Welch’s t-test (Eq. 4.18)
are computed with Fixed dataset A and Random plaintext dataset. The results
are shown in Figs. 4.14 and 4.15 for .M1 = M2 = 5000 and .M1 = M2 =
50, respectively. Similarly, we also observe higher .|t|-values with more traces.
Furthermore, compared to Figs. 4.12 and 4.13, the .|t|-values are much lower. This
shows that it is more likely for us to distinguish between leakages corresponding

Fig. 4.14 t-Values (Eq. 4.18) for all time samples .1, 2, . . . , 3600 computed with Fixed dataset
A and Random plaintext dataset. The signal is given by the plaintext value, and the fixed versus
random setting is chosen. Blue dashed lines correspond to the threshold .4.5 and .−4.5

Fig. 4.15 t-Values (Eq. 4.18) for all time samples .1, 2, . . . , 3600 computed with 50 traces from
Fixed dataset A and 50 traces from Random plaintext dataset. The signal is given by the plaintext
value, and the fixed versus random setting is chosen. Blue dashed lines correspond to the threshold
.4.5 and .−4.5

to two fixed plaintexts rather than between leakages for a fixed plaintext and for
random plaintexts.
Next, we take .v to be the 0th Sbox output. We use Random dataset as samples for
our random variables corresponding to leakages. For the fixed versus fixed setting,
we take the .M1 = 634 traces for .v = 0 as .T1 and .M2 = 651 traces for .v = F as .T2 .
The t-values with the student’s t-test (Eq. 4.17) are shown in Fig. 4.16. For the fixed
versus random setting, we take the .M1 = 634 traces for .v = 0 as .T1 and the whole
dataset as .T2 (.M2 = 10,000). Following Welch’s t-test, the t-values (Eq. 4.18) are
shown in Fig. 4.17. Again, we also show the results when fewer traces are used for
the computations. The t-values can be found in Figs. 4.18 and 4.19.
In summary, we have the following observations:
• When more traces are used (i.e., when the sample size is bigger), it is more likely
for us to capture information about the intermediate values from the leakages.

Fig. 4.16 t-Values (Eq. 4.17) for all time samples .1, 2, . . . , 3600 computed with traces from
Random dataset. .T1 contains .M1 = 634 traces and .T2 contains .M2 = 651 traces. The signal
is given by the 0th Sbox output, and the fixed versus fixed setting is chosen. Blue dashed lines
correspond to the threshold .4.5 and .−4.5

Fig. 4.17 t-Values (Eq. 4.18) for all time samples .1, 2, . . . , 3600 computed with traces from
Random dataset. .T1 contains .M1 = 634 traces and .T2 contains .M2 = 10,000 traces. The signal
is given by the 0th Sbox output, and the fixed versus random setting is chosen. Blue dashed lines
correspond to the threshold .4.5 and .−4.5

We will see in Sect. 4.3.2.4 that more traces indeed indicate higher chances for
the attacks to be successful.
• When $v$ is given by the 0th Sbox output, the highest $|t|$-value is obtained at time sample 392
for all cases we have analyzed. We will see that this is the point of interest (POI)
for our attack (Sect. 4.3.2).
• Compared to .v being the plaintext, the .|t|-values are in general smaller with much
fewer time samples crossing the threshold when .v is given by the 0th Sbox output.
This is unsurprising as we would expect more computations to be correlated with
the plaintext rather than a single Sbox output.

Fig. 4.18 t-Values (Eq. 4.17) for all time samples .1, 2, . . . , 3600 computed with traces from
Random dataset. Both .T1 and .T2 contain 50 traces (i.e., .M1 = M2 = 50). The signal is given
by the 0th Sbox output, and the fixed versus fixed setting is chosen. Blue dashed lines correspond
to the threshold .4.5 and .−4.5

Fig. 4.19 t-Values (Eq. 4.18) for all time samples .1, 2, . . . , 3600 computed with traces from
Random dataset. Both .T1 and .T2 contain 50 traces (i.e., .M1 = M2 = 50). The signal is given by
the 0th Sbox output, and the fixed versus random setting is chosen. Blue dashed lines correspond
to the threshold .4.5 and .−4.5

4.2.4 Signal-to-Noise Ratio

Signal-to-noise ratio (SNR) is commonly used in electrical engineering and signal
processing, and the general definition is

\[
\mathrm{SNR} = \frac{\mathrm{Var}(\text{signal})}{\mathrm{Var}(\text{noise})},
\]

where $\mathrm{Var}$ refers to the variance of a random variable (see Eq. 1.35).
In our case, for a fixed time sample $t$, $X_t$ represents the signal, which is the part of
the leakage relevant to our attack. The SNR at time $t$ is given by

\[
\mathrm{SNR}_t = \frac{\mathrm{Var}(X_t)}{\mathrm{Var}(N_t)}. \tag{4.19}
\]

$\mathrm{Var}(X_t)$ measures how much the leakage varies at time sample $t$ due to the signal.
$\mathrm{Var}(N_t)$ measures how much the leakage varies due to the noise. Thus, SNR
quantifies how much information is leaked at time sample $t$ from the measurements.
The higher the SNR, the smaller the noise is relative to the signal.
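Example 4.2.15 Suppose we are interested in the Hamming weight of an 8-bit
intermediate value at time sample $t$. In particular, the intermediate value we would
like to analyze is from $\mathbb{F}_2^8$. We further assume that the leakage $L_t$ is equal to the
modeled leakage following the Hamming weight leakage model (Eq. 4.4). Thus
$X_t = \mathrm{wt}(v)$ for $v \in \mathbb{F}_2^8$. Then the variance of the signal is given by $\mathrm{Var}(\mathrm{wt}(v))$
for $v \in \mathbb{F}_2^8$. By definition (Eq. 1.31),

\[
\begin{aligned}
\mathrm{E}[\mathrm{wt}(v)] &= \frac{1}{|\mathbb{F}_2^8|} \sum_{v \in \mathbb{F}_2^8} \mathrm{wt}(v)
= \frac{1}{2^8} \sum_{i=1}^{8} i \binom{8}{i}
= \frac{1}{2^8} \sum_{i=1}^{8} \frac{8!}{(i-1)!\,(8-i)!} \\
&= \frac{8}{2^8} \sum_{i=1}^{8} \frac{7!}{(i-1)!\,(7-(i-1))!}
= \frac{8}{2^8} \sum_{j=0}^{7} \binom{7}{j}
= \frac{8 \times 2^7}{2^8} = 4.
\end{aligned}
\]

And

\[
\begin{aligned}
\mathrm{E}\left[\mathrm{wt}(v)^2\right] &= \frac{1}{|\mathbb{F}_2^8|} \sum_{v \in \mathbb{F}_2^8} \mathrm{wt}(v)^2
= \frac{1}{2^8} \sum_{i=1}^{8} i^2 \binom{8}{i}
= \frac{1}{2^8} \sum_{i=1}^{8} i\, \frac{8!}{(i-1)!\,(8-i)!} \\
&= \frac{8}{2^8} \left( \sum_{i=1}^{8} (i-1) \frac{7!}{(i-1)!\,(8-i)!} + \sum_{i=1}^{8} \frac{7!}{(i-1)!\,(7-(i-1))!} \right) \\
&= \frac{1}{2^5} \left( 7 \sum_{i=2}^{8} \frac{6!}{(i-2)!\,(6-(i-2))!} + \sum_{j=0}^{7} \binom{7}{j} \right) \\
&= \frac{1}{2^5} \left( 2^7 + 7 \sum_{j=0}^{6} \binom{6}{j} \right) \\
&= \frac{1}{2^5} \left( 2^7 + 7 \times 2^6 \right) = 2^2 + 7 \times 2 = 18.
\end{aligned}
\]

By Eq. 1.35,

\[
\mathrm{Var}(\mathrm{wt}(v)) = \mathrm{E}\left[\mathrm{wt}(v)^2\right] - \mathrm{E}[\mathrm{wt}(v)]^2 = 18 - 4^2 = 2.
\]

Let $\sigma_t^2$ denote the variance of the noise $N_t$. We have

\[
\mathrm{SNR} = \frac{\mathrm{Var}(X_t)}{\mathrm{Var}(N_t)} = \frac{\mathrm{Var}(\mathrm{wt}(v))}{\sigma_t^2} = \frac{2}{\sigma_t^2}.
\]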
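The values $\mathrm{E}[\mathrm{wt}(v)] = 4$, $\mathrm{E}[\mathrm{wt}(v)^2] = 18$, and $\mathrm{Var}(\mathrm{wt}(v)) = 2$ from Example 4.2.15 can be cross-checked by enumerating all 256 byte values. This is a small sketch with hypothetical variable names; exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction

def hamming_weight(v: int) -> int:
    """Number of bits equal to 1 in the binary representation of v."""
    return bin(v).count("1")

n_bits = 8
values = range(2 ** n_bits)  # all elements of F_2^8, viewed as integers 0..255

mean_wt = Fraction(sum(hamming_weight(v) for v in values), 2 ** n_bits)
mean_wt_sq = Fraction(sum(hamming_weight(v) ** 2 for v in values), 2 ** n_bits)
var_wt = mean_wt_sq - mean_wt ** 2  # Var = E[wt^2] - E[wt]^2

print(mean_wt, mean_wt_sq, var_wt)
```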

Example 4.2.16 In this example, let .Lt denote the random variable corresponding
to the leakage of one round of PRESENT encryption at time t. We take the Random
dataset (see Sect. 4.1) as a sample for .Lt . Suppose we are interested in the exact
value of the 0th Sbox output in the first round of PRESENT. Let us denote this
intermediate value by .v.
Fix a time sample t. .Xt is given by the part of the leakage related to the value
of .v. To compute .Var(Xt ), we first divide the traces in Random dataset into 16 sets
according to the value of .v. Let us denote those 16 sets of traces by .A1 , A2 , . . . , A16 ,
where .As contains traces corresponding to .v = s − 1.
As discussed in Sect. 4.2.1, for a fixed value of .v, .Xt is a constant, and the leakage
and the noise can be modeled by normal random variables. Let .Lt,s and .Nt,s denote
the random variables corresponding to leakage and noise at time sample t for .v =
s − 1. Let .Xt,s denote the constant leakage in this case.
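Similar to Example 4.2.1, we can approximate the mean of $L_{t,s}$ using the sample
mean computed with set $A_s$. For example, take $t = 600$, and we have

\[
\overline{l_{600,1}} \approx 0.08212, \quad \overline{l_{600,2}} \approx 0.08221, \quad \overline{l_{600,3}} \approx 0.08209, \quad \dots
\]

By Eq. 4.3, for any $s$,

\[
X_{t,s} = \mathrm{E}\left[L_{t,s}\right] - \mathrm{E}\left[N_{t,s}\right]. \tag{4.20}
\]

Variance of $X_t$ is given by the variance of the $X_{t,s}$ values, and we have

\[
\mathrm{Var}(X_t) = \mathrm{Var}\left(\mathrm{E}\left[L_{t,s}\right] - \mathrm{E}\left[N_{t,s}\right]\right).
\]

Since any information related to $v$ is contained in $X_t$, which is independent of $N_t$,
$\mathrm{E}\left[N_{t,s}\right]$ is a constant for all $s$. We have (see Eq. 1.36)

\[
\mathrm{Var}(X_t) = \mathrm{Var}\left(\mathrm{E}\left[L_{t,s}\right]\right),
\]

which can be estimated with the sample variance of $\mathrm{E}\left[L_{t,s}\right]$. For $t = 600$, we have

\[
s_{X_{600}}^2 \approx 1.0088 \times 10^{-8}.
\]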

By Eq. 4.2,

\[
\mathrm{Var}(N_t) = \mathrm{Var}(L_t - X_t).
\]

On the other hand, since $\mathrm{E}\left[N_{t,s}\right]$ is a constant for different values of $s$, by Eqs. 1.36
and 4.20,

\[
\mathrm{Var}\left(L_t - \mathrm{E}\left[L_{t,s}\right]\right)
= \mathrm{Var}\left(L_t - X_{t,s} - \mathrm{E}\left[N_{t,s}\right]\right)
= \mathrm{Var}\left(L_t - X_{t,s}\right)
= \mathrm{Var}(L_t - X_t).
\]

Thus $\mathrm{Var}(N_t)$ can be approximated by the sample variance of $L_t - \mathrm{E}\left[L_{t,s}\right]$. For
$t = 600$, we have

\[
s_{N_{600}}^2 \approx 6.4184 \times 10^{-6}.
\]

And the SNR at time sample 600 is given by

\[
\mathrm{SNR}_{600} = \frac{\mathrm{Var}(X_{600})}{\mathrm{Var}(N_{600})} \approx \frac{s_{X_{600}}^2}{s_{N_{600}}^2}
= \frac{1.0088 \times 10^{-8}}{6.4184 \times 10^{-6}} \approx 0.00157.
\]
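The estimation procedure of Example 4.2.16 for one time sample can be sketched as follows. This is an illustrative sketch, not the code used to produce the figures: `snr_at_sample` is a hypothetical helper, the signal variance is taken as the sample variance of the per-class means, the noise variance as the sample variance of the leakages after subtracting the mean of their class, and the demonstration data are synthetic.

```python
import random
from collections import defaultdict
from statistics import mean, variance

def snr_at_sample(leakages, labels):
    """leakages[j]: leakage of trace j at a fixed time sample; labels[j]: its signal class."""
    groups = defaultdict(list)
    for l, s in zip(leakages, labels):
        groups[s].append(l)
    # Sample mean per class approximates E[L_{t,s}].
    group_means = {s: mean(ls) for s, ls in groups.items()}
    # Var(X_t): sample variance of the per-class means.
    var_signal = variance(list(group_means.values()))
    # Var(N_t): sample variance of L_t - E[L_{t,s}].
    residuals = [l - group_means[s] for l, s in zip(leakages, labels)]
    var_noise = variance(residuals)
    return var_signal, var_noise, var_signal / var_noise

# Synthetic demonstration (assumption): 4-bit value, Hamming-weight-like signal plus Gaussian noise.
rng = random.Random(0)
labels = [rng.randrange(16) for _ in range(2000)]
leakages = [0.01 * bin(v).count("1") + rng.gauss(0, 0.05) for v in labels]
var_x, var_n, snr = snr_at_sample(leakages, labels)
print(f"Var(X) = {var_x:.3e}, Var(N) = {var_n:.3e}, SNR = {snr:.4f}")
```

Passing the Hamming weight or a single bit of the intermediate value as `labels` gives the variants discussed in the later examples, since only the partition of the traces changes.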

Example 4.2.17 So far, we have discussed the definition of SNR for one point
in time. With the same method as in Example 4.2.16, we can compute the sample
variance for $\mathrm{Var}(X_t)$ and $\mathrm{Var}(N_t)$, as well as SNR values for all time samples. They
are shown in Figs. 4.20, 4.21, and 4.22, respectively.
We can see that the shape of variance of noise has similarities to one round of
PRESENT computations (e.g., Fig. 4.3). This is reasonable since most of the leakage
is not related to .v.
Furthermore, the peaks for the variance of signal and SNR correspond to each
other. The first two peaks are likely related to AddRoundKey and sBoxLayer. The
peaks after 1000 are probably caused by the permutation of 4 bits of .v (the 0th
Sbox output). These observations can be confirmed by comparing them to Fig. 4.3.
In particular, we can deduce that the peak at .t = 392 is related to the 0th Sbox
computation—as observed in Fig. 4.3, sBoxLayer starts from around time sample
382.

Fig. 4.20 Sample variance of the signal for each time sample, computed using Random dataset.
The signal is given by the exact value of the 0th Sbox output

Fig. 4.21 Sample variance of the noise for each time sample, computed using Random dataset.
The signal is given by the exact value of the 0th Sbox output

Fig. 4.22 SNR for each time sample, computed using Random dataset. The signal is given by the
exact value of the 0th Sbox output

Example 4.2.18 We again look at the Random dataset. Instead of the exact values
of $v$ as in Example 4.2.16, we focus on the Hamming weight of the 0th Sbox output,
i.e., $\mathrm{wt}(v)$. Then, in this case, for a fixed time sample $t$, we divide the traces into
five sets according to the value of $\mathrm{wt}(v)$. Let us denote those five sets of traces
by $A_1, A_2, \dots, A_5$, where $A_s$ contains traces corresponding to $\mathrm{wt}(v) = s - 1$.
Following similar computations as in Example 4.2.16, for $t = 600$, we have

\[
\overline{l_{600,1}} \approx 0.08212, \quad \overline{l_{600,2}} \approx 0.08206, \quad \overline{l_{600,3}} \approx 0.08214, \quad
\overline{l_{600,4}} \approx 0.08211, \quad \overline{l_{600,5}} \approx 0.08206.
\]

And

\[
s_{X_{600}}^2 \approx 1.1043 \times 10^{-9}, \qquad
s_{N_{600}}^2 \approx 6.4271 \times 10^{-6}, \qquad
\mathrm{SNR}_{600} \approx 0.0001718.
\]

The results for all time samples are shown in Figs. 4.23, 4.24, and 4.25.

Fig. 4.23 Sample variance of the signal for each time sample, computed using Random dataset.
The signal is given by the Hamming weight of the 0th Sbox output

Fig. 4.24 SNR for each time sample, computed using Random dataset. The signal is given by the
Hamming weight of the 0th Sbox output

Fig. 4.25 Sample variance of the noise for each time sample, computed using Random dataset.
The signal is given by the Hamming weight of the 0th Sbox output

The sample variance of the noise is very similar to Fig. 4.21 and also resembles
the leakage of PRESENT computation since most of the leakage is not related to
.wt (v). The peaks in the variance of signal and SNR also correspond to each other.

Compared to Fig. 4.22, the locations of the peaks are similar. It is worth noting that
the highest peak in both Figs. 4.22 and 4.24 is at time sample 392. As mentioned in
Example 4.2.17, this time sample corresponds to the computation of the 0th Sbox in
sBoxLayer. We also note that Fig. 4.24 has a higher SNR value than Fig. 4.22 at this
point. This suggests that the Hamming weight leakage model is closer to our DUT
leakage than the identity leakage model.
Normally in DPA attacks, we would like to focus on time samples where the
corresponding SNRs are high. We refer to those time samples as points of interest
(POIs).
Example 4.2.19 Continuing Example 4.2.17, the time sample with the highest
SNR is given by $t = 392$. We can then take this point as our POI. Alternatively, we can
take a few time samples that achieve the highest SNRs. For example, the top
three SNRs are obtained at $t = 392, 218, 1328$.
Similarly, suppose we focus on the Hamming weight of the 0th Sbox output.
Following the results from Example 4.2.18, in case we take just one POI, we have
$t = 392$. And for three POIs, we have $t = 392, 1309, 1304$.

Those POIs will be further used for our attacks in Sect. 4.3.2.
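Selecting POIs from a list of per-sample SNR values amounts to ranking the time samples. A minimal sketch, assuming the SNR values are stored in a Python list indexed from time sample 1; the function name `top_pois` and the example values are hypothetical.

```python
def top_pois(snr, count):
    """Return the `count` time samples (1-indexed) with the highest SNR, best first."""
    ranked = sorted(range(1, len(snr) + 1), key=lambda t: snr[t - 1], reverse=True)
    return ranked[:count]

# Toy SNR values for four time samples (made up for illustration).
example_snr = [0.1, 0.5, 0.2, 0.9]
print(top_pois(example_snr, 1), top_pois(example_snr, 3))
```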
Example 4.2.20 As another example, suppose instead of the exact value or Hamming
weight of the 0th Sbox output $v$, we are interested in the 0th bit of $v$.
With the same Random dataset, we divide the traces into two sets $A_1, A_2$,
corresponding to the 0th bit of $v$ equal to 0 and 1, respectively. Following similar
computations as in Example 4.2.16, for $t = 600$, we have

\[
\overline{l_{600,1}} \approx 0.08206, \qquad \overline{l_{600,2}} \approx 0.08216.
\]

And

\[
s_{X_{600}}^2 \approx 2.6879 \times 10^{-9}, \qquad
s_{N_{600}}^2 \approx 6.4256 \times 10^{-6}, \qquad
\mathrm{SNR}_{600} \approx 0.0004183.
\]

The results for all time samples are shown in Figs. 4.26, 4.27, and 4.28.
We can see that Fig. 4.28 is similar to Figs. 4.21 and 4.25. Compared to Figs. 4.22
and 4.24, there are fewer peaks in Fig. 4.27. Furthermore, the highest peak is not
around the sBoxLayer, but during pLayer computation. This is expected since now
we only consider 1 bit instead of 4 bits of .v.

Fig. 4.26 Sample variance of the signal for each time sample, computed using Random dataset.
The signal is given by the 0th bit of the 0th Sbox output

Fig. 4.27 SNR for each time sample, computed using Random dataset. The signal is given by the
0th bit of the 0th Sbox output

Fig. 4.28 Sample variance of the noise for each time sample, computed using Random dataset.
The signal is given by the 0th bit of the 0th Sbox output

4.3 Side-Channel Analysis Attacks on Symmetric Block Ciphers

In this section, we will discuss two types of attacks on symmetric block cipher
implementations: differential power analysis (DPA) in Sects. 4.3.1 and 4.3.2 and
side-channel assisted differential plaintext attack (SCADPA) in Sect. 4.3.3.

4.3.1 Non-profiled Differential Power Analysis Attacks

As mentioned in Sect. 4.2, DPA exploits the relationship between leakages at
specific time samples and the data being processed in the DUT. In this subsection,
we will focus on the non-profiled setting, where we assume the attacker only has
access to the target device (or measurements from the target device) and aims
to analyze the side-channel leakages to recover the master key of a symmetric block
cipher.
Attacker assumption In more detail, we assume the attacker has the knowledge of
the plaintext, and the goal is to recover the very first round key used at the beginning
of a symmetric block cipher—for some ciphers, e.g., PRESENT, this is the first
round key; for some ciphers, e.g., AES-128, this is the whitening key, which is equal
to the master key. We note that after getting this round key, for some ciphers, e.g.,
AES-128 (see Remark 3.1.4), the master key can be found. For some ciphers, e.g.,
DES and PRESENT-80 (see Remarks 3.1.1 and 3.1.5), part of the master key can
be found, and the remaining bits can be recovered by brute force. Otherwise, with
the knowledge of this round key, the same attack method can be used to recover the
next round key. In most cases, two round keys are enough to reveal the full master
key using the reverse key schedule.
Similar attack strategies apply if we assume the attacker has the knowledge of
the ciphertext and aims to recover the last round key. Furthermore, we also assume
that the attacker has certain knowledge of the implementation, for example, how to
interface with the encryption routine, whether the implementation is round-based
or bit-sliced, whether the computation is executed serially or in parallel, or whether
some types of countermeasures are present.

4.3.1.1 Non-profiled DPA Attack Steps

A non-profiled DPA attack on symmetric block cipher implementations consists of
the following steps:
DPA Step 1 Identify the target cryptographic implementation. DPA attacks can
be applied to unprotected implementations of any symmetric block
ciphers that have been proposed so far. As a running example, we will
look at the computation of PRESENT.
DPA Step 2 Experimental setup and measure leakages. The efficiency and suc-
cess of the attack are highly dependent on the measurement devices the
attacker has access to. For our illustrations, we follow the experimental
settings as described in Sect. 4.1.
Suppose we have taken measurements of the target implementation
with $M_p$ plaintexts. For $j = 1, \dots, M_p$, let $\boldsymbol{\ell}_j = (l_1^j, l_2^j, \dots, l_q^j)$
denote the power trace corresponding to the $j$th plaintext, where $q$ is
the total number of time samples in one trace. For our attacks, we will
use the Random plaintext dataset (see Sect. 4.1). In particular, we have
$q = 3600$ and $M_p = 5000$.

DPA Step 3 Choose the part of the key to recover. The DPA attack is normally
carried out in a divide-and-conquer manner. In particular, we focus on a
small part (e.g., a nibble or a byte) of a round key in each attack, and
each part of the round key can be recovered independently. With the
inverse key schedule, one (e.g., for AES) or two (e.g., for PRESENT,
DES) round keys will reveal the master key (see Remarks 3.1.1, 3.1.4,
and 3.1.5). Let k denote the target part of the key, and let .Mk denote
the number of possible values of k. For our attacks, we will focus on
the 0th nibble of the first round key for PRESENT and .Mk = 16.
DPA Step 4 Choose the target intermediate value. To recover the part of the key
chosen in the last step, we exploit relationships between leakages and
a certain intermediate value being processed in the DUT. The goal
is to gain information about this intermediate value, which reveals
information about our chosen part of the key. Let .v denote the target
intermediate value. We require that there is a function $\varphi$, such that

\[
v = \varphi(k, p),
\]

where $p$ denotes (part of) the plaintext. For our attack, to recover the
0th nibble of the first round key of PRESENT, we will target the 0th
Sbox output of the first round. Then we have

\[
v = \mathrm{SB}_{\mathrm{PRESENT}}(k \oplus p),
\]

where $k$ and $p$ denote the 0th nibble of the first round key and that of
the plaintext.
DPA Step 5 Compute hypothetical target intermediate values. By our choice of
the target intermediate value, a small part of the key is related to it.
Thus, when we make a guess of this part of the key, with knowledge
of the plaintext we can obtain a hypothetical value for our target
intermediate value. In particular, for each key hypothesis $\hat{k}_i$ of $k$ and
each (part of the) plaintext $p_j$, we can compute a hypothesis for $v$,
denoted $\hat{v}_{ij}$, as follows:

\[
\hat{v}_{ij} = \varphi(\hat{k}_i, p_j), \qquad i = 1, 2, \dots, M_k, \quad j = 1, 2, \dots, M_p.
\]

For our illustration, with each key hypothesis of the 0th nibble of the
first round key and each plaintext, we have a hypothetical value for the
0th Sbox output:

\[
\hat{v}_{ij} = \mathrm{SB}_{\mathrm{PRESENT}}(\hat{k}_i \oplus p_j), \qquad i = 1, 2, \dots, 16, \quad j = 1, 2, \dots, 5000,
\]

where $p_j$ is the 0th nibble of the plaintext corresponding to the attack
trace $\boldsymbol{\ell}_j$. Furthermore, we set

\[
\hat{k}_i = i - 1, \qquad i = 1, 2, \dots, 16.
\]

DPA Step 6 Choose the leakage model. For each hypothetical target intermediate
value, we can compute the hypothetical signal depending on our
leakage model

\[
H_{ij} := \mathcal{L}(\hat{v}_{ij}) - \text{noise}, \qquad i = 1, 2, \dots, M_k, \quad j = 1, 2, \dots, M_p,
\]

where we subtract the noise component from the leakage model. For
example, if we choose the Hamming weight leakage model, according
to Eq. 4.4, we have

\[
H_{ij} = \mathrm{wt}\left(\hat{v}_{ij}\right), \qquad i = 1, 2, \dots, M_k, \quad j = 1, 2, \dots, M_p.
\]

In our analysis, we will consider the identity leakage model and
the Hamming weight leakage model. In Sect. 4.3.2.2 we will discuss
another leakage model obtained by profiling the device.
DPA Step 7 Statistical analysis. In this step, we aim to use a statistical
distinguisher to distinguish the correct key hypothesis from the rest. In this
book, we will focus on the correlation coefficient (see Definition 1.7.11).
For other methodologies, we refer the readers to, e.g., [MOP08,
Chapter 6].
For a fixed key hypothesis $\hat{k}_i$, we view the modeled signal as a
random variable $H_i$ that varies when the plaintext changes. If we fix
a time sample $t$, we also consider the leakage at this time sample as
a random variable $L_t$. Then a sample for this pair of random variables
$(H_i, L_t)$ is given by

\[
\left\{ (H_{ij}, l_t^j) \mid j = 1, 2, \dots, M_p \right\}.
\]

We would like to know how good the modeled signals are compared
to the actual leakages for each key hypothesis. For the correct key
hypothesis and the time samples corresponding to POIs, we expect
the modeled signals to be “most” correlated to the actual leakages as
compared to other key hypotheses and time samples. To measure how
correlated the leakages and modeled signals are, we adopt the notion
of correlation coefficient for further analysis. For each key hypothesis
$\hat{k}_i$ ($i = 1, 2, \dots, M_k$) and each time sample $t$ ($t = 1, 2, \dots, q$),
we compute the sample correlation coefficient (see Example 1.8.1),
denoted by $r_{i,t}$, of $H_i$ and $L_t$:

\[
r_{i,t} := \frac{\sum_{j=1}^{M_p} (H_{ij} - \overline{H_i})(l_t^j - \overline{l_t})}
{\sqrt{\sum_{j=1}^{M_p} (H_{ij} - \overline{H_i})^2}\, \sqrt{\sum_{j=1}^{M_p} (l_t^j - \overline{l_t})^2}},
\qquad i = 1, 2, \dots, M_k, \quad t = 1, 2, \dots, q.
\]

In our case,

\[
r_{i,t} = \frac{\sum_{j=1}^{5000} (H_{ij} - \overline{H_i})(l_t^j - \overline{l_t})}
{\sqrt{\sum_{j=1}^{5000} (H_{ij} - \overline{H_i})^2}\, \sqrt{\sum_{j=1}^{5000} (l_t^j - \overline{l_t})^2}},
\qquad i = 1, 2, \dots, 16, \quad t = 1, 2, \dots, 3600. \tag{4.21}
\]

Since the target intermediate value .v we have chosen will be processed in our
DUT at certain points in time, we expect the leakages at those corresponding time
samples to be correlated to .v. Those time samples are our POIs. If a good leakage
model (i.e., a model that is close to the actual leakage of the DUT) is chosen, we
expect .Hi and .Lt to be correlated for the correct key hypothesis .k̂i and POIs t. Thus,
the key hypothesis that achieves the highest absolute value of .ri,t is expected to be
the correct key. Furthermore, the time samples that achieve higher absolute values
of .ri,t will be our POIs in the attack.
In practice, if all $r_{i,t}$ values are low, we will need more traces for the attack.

Note
According to Eq. 4.1, the correct value of the 0th nibble of the first round key
is given by 9.
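DPA Steps 5 to 7 can be put together in a short sketch. The code below is illustrative only: the traces are synthetic (a single leaking time sample with Gaussian noise), not the Random plaintext dataset, and the names `pearson`, `cpa`, and `best_guess` are hypothetical helpers. The Sbox table is the PRESENT Sbox from the cipher specification, which should agree with Table 3.11.

```python
import math
import random

PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def hamming_weight(v):
    return bin(v).count("1")

def pearson(xs, ys):
    """Sample correlation coefficient; assumes neither sequence is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def cpa(traces, plaintext_nibbles, model=hamming_weight):
    """r[k][t] for key guesses k = 0..15 and (0-indexed) time samples t."""
    columns = list(zip(*traces))  # leakages grouped by time sample
    r = []
    for k in range(16):
        # DPA Steps 5 and 6: hypothetical Sbox outputs and modeled signals
        H = [model(PRESENT_SBOX[k ^ p]) for p in plaintext_nibbles]
        # DPA Step 7: correlation with the leakage at every time sample
        r.append([pearson(H, col) for col in columns])
    return r

def best_guess(r):
    """Key guess with the highest absolute correlation over all time samples."""
    return max(range(16), key=lambda k: max(abs(x) for x in r[k]))

# Synthetic traces (assumption): q = 3, only the middle sample leaks wt(Sbox output).
rng = random.Random(1)
secret = 0x9
plaintexts = [rng.randrange(16) for _ in range(400)]
traces = [[rng.gauss(0, 1),
           hamming_weight(PRESENT_SBOX[secret ^ p]) + rng.gauss(0, 0.5),
           rng.gauss(0, 1)] for p in plaintexts]

r = cpa(traces, plaintexts)
print(hex(best_guess(r)))
```

Passing `model=lambda v: v` instead of the default switches the sketch to the identity leakage model.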

Example 4.3.1 As a simple example to illustrate how the sample correlation
coefficient can be computed, suppose we obtained a sample

\[
\{(1, 11), (0, 9), (1, 12), (1, 14), (0, 9)\}
\]

for a pair of random variables $(U, W)$. Then the sample mean for $U$ is given by

\[
\overline{u} = \frac{1 + 0 + 1 + 1 + 0}{5} = \frac{3}{5}.
\]

And the sample mean for $W$ is given by

\[
\overline{w} = \frac{11 + 9 + 12 + 14 + 9}{5} = \frac{55}{5} = 11.
\]

The sample correlation coefficient for $U$ and $W$ is given by

\[
\begin{aligned}
r &= \frac{\sum_{i=1}^{5} (u_i - \overline{u})(w_i - \overline{w})}
{\sqrt{\sum_{i=1}^{5} (u_i - \overline{u})^2}\, \sqrt{\sum_{i=1}^{5} (w_i - \overline{w})^2}} \\
&= \frac{0.4 \times 0 + (-0.6) \times (-2) + 0.4 \times 1 + 0.4 \times 3 + (-0.6) \times (-2)}
{\sqrt{0.4^2 \times 3 + 0.6^2 \times 2}\, \sqrt{2^2 + 1 + 3^2 + 2^2}} \approx 0.861.
\end{aligned}
\]
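The arithmetic of Example 4.3.1 can be reproduced with a few lines of code. This is a minimal sketch with a hypothetical helper `sample_correlation` that follows the formula term by term.

```python
import math

def sample_correlation(xs, ys):
    """Sample correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denom_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    denom_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return numerator / (denom_x * denom_y)

sample = [(1, 11), (0, 9), (1, 12), (1, 14), (0, 9)]
u, w = zip(*sample)  # split the pairs into the U and W observations
r = sample_correlation(u, w)
print(round(r, 3))
```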

4.3.1.2 Identity Leakage Model

Let us first consider the identity leakage model. Then in DPA Step 6, we have

\[
H_{ij} = \hat{v}_{ij}, \qquad i = 1, 2, \dots, 16, \quad j = 1, 2, \dots, 5000.
\]

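Example 4.3.2 For the Random plaintext dataset, we have

\[
p_1 = \mathrm{9}, \qquad p_2 = \mathrm{C}.
\]

As mentioned in DPA Step 5, $\hat{k}_1 = 0$, $\hat{k}_2 = 1$. Then according to Table 3.11,

\[
\begin{aligned}
H_{11} &= \hat{v}_{11} = \mathrm{SB}_{\mathrm{PRESENT}}(\hat{k}_1 \oplus p_1) = \mathrm{SB}_{\mathrm{PRESENT}}(\mathrm{0} \oplus \mathrm{9}) = \mathrm{SB}_{\mathrm{PRESENT}}(\mathrm{9}) = \mathrm{E} = 14, \\
H_{12} &= \hat{v}_{12} = \mathrm{SB}_{\mathrm{PRESENT}}(\hat{k}_1 \oplus p_2) = \mathrm{SB}_{\mathrm{PRESENT}}(\mathrm{0} \oplus \mathrm{C}) = \mathrm{SB}_{\mathrm{PRESENT}}(\mathrm{C}) = 4, \\
H_{21} &= \hat{v}_{21} = \mathrm{SB}_{\mathrm{PRESENT}}(\hat{k}_2 \oplus p_1) = \mathrm{SB}_{\mathrm{PRESENT}}(\mathrm{1} \oplus \mathrm{9}) = \mathrm{SB}_{\mathrm{PRESENT}}(\mathrm{8}) = 3, \\
H_{22} &= \hat{v}_{22} = \mathrm{SB}_{\mathrm{PRESENT}}(\hat{k}_2 \oplus p_2) = \mathrm{SB}_{\mathrm{PRESENT}}(\mathrm{1} \oplus \mathrm{C}) = \mathrm{SB}_{\mathrm{PRESENT}}(\mathrm{D}) = 7.
\end{aligned}
\]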
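The four values above can be reproduced by building the full table of hypothetical Sbox outputs. The sketch below assumes the PRESENT Sbox from the cipher specification (expected to match Table 3.11); `hypothetical_values` is a hypothetical helper, and rows are 0-indexed so that row $i - 1$ corresponds to $\hat{k}_i = i - 1$.

```python
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def hypothetical_values(plaintext_nibbles):
    """Table v_hat[k][j] = SB(k XOR p_j) for all 16 key guesses k."""
    return [[PRESENT_SBOX[k ^ p] for p in plaintext_nibbles] for k in range(16)]

# The two plaintext nibbles from the example: p_1 = 9, p_2 = C.
v_hat = hypothetical_values([0x9, 0xC])
for k in (0, 1):
    print(k, [format(v, "X") for v in v_hat[k]])
```

Under the identity leakage model, this table is directly the matrix of modeled signals $H_{ij}$.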

The sample correlation coefficients $r_{i,t}$ ($t = 1, 2, \dots, 3600$) for $i = 1, 2, \dots, 16$
are shown in Fig. 4.29. We can see that the blue plot has much bigger peaks than the
rest, which corresponds to $\hat{k}_{10} = 9$. This is the correct 0th nibble of the round key as
given in Eq. 4.1. The plot of $r_{10,t}$ (corresponding to the correct key hypothesis 9) is
shown in Fig. 4.30. We can also deduce that time samples that achieve those peaks
in Fig. 4.30 correspond to the time when $v$ (the 0th Sbox output) is being processed.
The first cluster of peaks is most likely caused by sBoxLayer computation, and

Fig. 4.29 Sample correlation coefficients .ri,t (.i = 1, 2, . . . , 16) for all time samples .t =
1, 2, . . . , 3600. Computed following Eq. 4.21 with the identity leakage model and the Random
plaintext dataset. The blue line corresponds to the correct key hypothesis .k̂10 = 9

Fig. 4.30 Sample correlation coefficients .r10,t (corresponds to the correct key hypothesis 9) for
all time samples .t = 1, 2, . . . , 3600. Computed following Eq. 4.21 with the identity leakage model
and the Random plaintext dataset

the other peak clusters are related to permutations of bits of .v in pLayer. Those
observations agree with the duration of each PRESENT round operation in Fig. 4.3.
We also notice that the biggest peak in Fig. 4.30 is obtained at .t = 392, which
corresponds to the point with the highest SNR from Fig. 4.22 (Example 4.2.17).
For further illustration, the plots of .ri,t (.t = 1, 2, . . . , 3600) for .i = 1, 5, 14
(corresponding to key hypotheses 0, 4, D) are shown in Figs. 4.31, 4.32, and 4.33,
respectively. Comparing those figures with Fig. 4.30, we can see some peaks appear
at similar time samples in all figures. This is due to the fact that the $H_i$ are not
independent random variables, and for those time samples $t$, the $H_i$ are also correlated
with $L_t$ for $i \neq 10$.
Remark 4.3.1 The correlation between the $H_i$ variables also influences the magnitude of the
correlation coefficients for the wrong key hypotheses. If the correlation between the $H_i$ variables
is higher, we would also see higher peaks in some wrong key hypotheses. For AES,
the correlations between the first AddRoundKey outputs are higher than correlations

Fig. 4.31 Sample correlation coefficients $r_{1,t}$ (corresponding to the wrong key hypothesis 0) for all time samples $t = 1, 2, \dots, 3600$. Computed following Eq. 4.21 with the identity leakage model and the Random plaintext dataset

Fig. 4.32 Sample correlation coefficients $r_{5,t}$ (corresponding to the wrong key hypothesis 4) for all time samples $t = 1, 2, \dots, 3600$. Computed following Eq. 4.21 with the identity leakage model and the Random plaintext dataset

Fig. 4.33 Sample correlation coefficients $r_{14,t}$ (corresponding to the wrong key hypothesis D) for all time samples $t = 1, 2, \dots, 3600$. Computed following Eq. 4.21 with the identity leakage model and the Random plaintext dataset

between the first SubBytes operation outputs, which is why in DPA Step 4 we chose the target intermediate value to be an Sbox output.
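The dependence between the hypotheses $H_i$ can be checked directly. The following is a minimal sketch, assuming the identity leakage model, the standard PRESENT Sbox table, and plaintext nibbles that take each of the 16 values equally often (an idealization of a random-plaintext dataset). The helper names `pearson` and `hypothesis` are illustrative and not from the text.

```python
from math import sqrt

# Standard PRESENT Sbox.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def pearson(xs, ys):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den

def hypothesis(key_nibble, plaintexts):
    """Identity-model hypothesis for one key guess: the Sbox output itself."""
    return [SBOX[key_nibble ^ p] for p in plaintexts]

# Assumption: every plaintext nibble value occurs exactly once.
plaintexts = list(range(16))
correct_key = 0x9
h_correct = hypothesis(correct_key, plaintexts)

# Correlation between the hypothesis for each key guess and the one for key 9.
corr_with_correct = {
    k: pearson(hypothesis(k, plaintexts), h_correct) for k in range(16)
}

for k, r in corr_with_correct.items():
    print(f"key hypothesis {k:X}: correlation with H for key 9 = {r:+.3f}")
```

Any wrong key hypothesis whose value is clearly nonzero here is one whose plot can show peaks at the same time samples as the correct one.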

4.3.1.3 Hamming Weight Leakage Model

In this part, let us consider the Hamming weight leakage model. In DPA Step 6, we have

$$H_{ij} = \mathrm{wt}(\hat{v}_{ij}), \quad i = 1, 2, \dots, 16, \quad j = 1, 2, \dots, 5000.$$

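Example 4.3.3 Continuing Example 4.3.2, in this case, we have

$$\begin{aligned}
H_{11} &= \mathrm{wt}(\hat{v}_{11}) = \mathrm{wt}(\mathrm{E}) = 3,\\
H_{12} &= \mathrm{wt}(\hat{v}_{12}) = \mathrm{wt}(4) = 1,\\
H_{21} &= \mathrm{wt}(\hat{v}_{21}) = \mathrm{wt}(3) = 2,\\
H_{22} &= \mathrm{wt}(\hat{v}_{22}) = \mathrm{wt}(7) = 3.
\end{aligned}$$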

The sample correlation coefficients $r_{i,t}$ ($t = 1, 2, \dots, 3600$) for $i = 1, 2, \dots, 16$ are shown in Fig. 4.34. As in Fig. 4.29, the blue plot, which corresponds to $\hat{k}_{10} = 9$, has much bigger peaks than the rest. The plot of $r_{10,t}$ is shown in Fig. 4.35. The time samples that achieve peaks in this plot are similar to those in Fig. 4.30. Plots of $r_{i,t}$ for $i = 1, 5, 14$ are shown in Figs. 4.36, 4.37, and 4.38.
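The hypothetical values in Example 4.3.3 can be reproduced with a few lines of code. This is a small sketch using the standard PRESENT Sbox and the two plaintext nibbles $p_1 = 9$ and $p_2 = \mathrm{C}$ quoted for the Random plaintext dataset. The variable names are illustrative only.

```python
# Standard PRESENT Sbox.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def hamming_weight(x):
    """Number of 1 bits in x."""
    return bin(x).count("1")

key_hypotheses = list(range(16))      # k_hat_i = i - 1 for i = 1, ..., 16
plaintext_nibbles = [0x9, 0xC]        # p_1 and p_2 only, for illustration

# v_hat[i][j] is the hypothetical 0th Sbox output (0-based indices here).
v_hat = [[SBOX[k ^ p] for p in plaintext_nibbles] for k in key_hypotheses]

# Identity leakage model: H = v_hat. Hamming weight model: H = wt(v_hat).
H_identity = v_hat
H_hw = [[hamming_weight(v) for v in row] for row in v_hat]

print(H_hw[0], H_hw[1])  # rows for key hypotheses 0 and 1 -> [3, 1] [2, 3]
```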
Remark 4.3.2 We note that the attacks we have seen recover one nibble of the first-
round key. The other nibbles can be recovered independently with a similar method
using the same traces.

Fig. 4.34 Sample correlation coefficients $r_{i,t}$ ($i = 1, 2, \dots, 16$) for all time samples $t = 1, 2, \dots, 3600$. Computed following Eq. 4.21 with the Hamming weight leakage model and the Random plaintext dataset. The blue line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$

Fig. 4.35 Sample correlation coefficients $r_{10,t}$ (corresponding to the correct key hypothesis 9) for all time samples $t = 1, 2, \dots, 3600$. Computed following Eq. 4.21 with the Hamming weight leakage model and the Random plaintext dataset

Fig. 4.36 Sample correlation coefficients $r_{1,t}$ (corresponding to the wrong key hypothesis 0) for all time samples $t = 1, 2, \dots, 3600$. Computed following Eq. 4.21 with the Hamming weight leakage model and the Random plaintext dataset

4.3.2 Profiled Differential Power Analysis

In this subsection, we will consider a profiled setting. In particular, we assume the attacker has access to a clone device and can characterize the leakages of the clone device in the profiling phase before attacking the target device in the attack phase.
Attacker assumption We assume the attacker has the knowledge of the plaintext,
and the goal is to recover the very first round key used in the encryption of a
symmetric block cipher—for some ciphers, e.g., PRESENT, this is the first round
key, and for some ciphers, e.g., AES, this is the whitening key, which is equal
to the master key. Similar attack strategies apply if we assume the attacker has
the knowledge of the ciphertext and aims to recover the last round key. We also
assume the attacker has the knowledge of the detailed implementation so that the
same program can be implemented by the attacker on the clone device. This is

Fig. 4.37 Sample correlation coefficients $r_{5,t}$ (corresponding to the wrong key hypothesis 4) for all time samples $t = 1, 2, \dots, 3600$. Computed following Eq. 4.21 with the Hamming weight leakage model and the Random plaintext dataset

Fig. 4.38 Sample correlation coefficients $r_{14,t}$ (corresponding to the wrong key hypothesis D) for all time samples $t = 1, 2, \dots, 3600$. Computed following Eq. 4.21 with the Hamming weight leakage model and the Random plaintext dataset

different from the non-profiled setting where only certain basic knowledge of the
implementation is required.
For our illustrations, we suppose the Random dataset is obtained from a clone
device, and the Random plaintext dataset is from the target device. Then before the
attack, we can analyze the Random dataset to obtain more information about the
leakage behavior of the DUT in the profiling phase.
The first major step in the profiling phase is to find the POIs, namely, the time samples that give us more information or a better signal. After identifying the POIs, in the attack phase, instead of computing the sample correlation coefficients for all time samples, we can focus on the POIs only.

4.3.2.1 Profiled DPA Attack Steps

The detailed steps for a profiled DPA attack are as follows:


P-DPA Step 1 Identify the target cryptographic implementation. This step is
the same as DPA Step 1 in Sect. 4.3.1.1. As a running example, we
will look at the computation of PRESENT.
P-DPA Step 2 Measurement of profiling traces. We first collect a set of traces for profiling using the clone device with random plaintexts and random keys. Those traces are called the profiling traces. Note that we assume the attacker has the knowledge of the plaintexts and the keys. Suppose there are in total $M_{pf}$ profiling traces, and each trace contains $q$ time samples. For our illustrations, we will use the Random dataset as profiling traces, so $M_{pf} = 10{,}000$ and $q = 3600$.

P-DPA Step 3 Choose the part of the key to recover. This step is the same as DPA Step 3 in Sect. 4.3.1.1. Let $k$ denote the target part of the key, and let $M_k$ denote the number of possible values of $k$. For our attacks, the same as in Sect. 4.3.1.1, we will focus on the 0th nibble of the first round key for PRESENT, and $M_k = 16$.
P-DPA Step 4 Choose the target intermediate value. This step is the same as DPA Step 4 in Sect. 4.3.1.1. Let $v$ denote the target intermediate value. We require that there is a function $\varphi$ such that

$$v = \varphi(k, p),$$

where $p$ denotes (part of) the plaintext. For our attack, to recover the 0th nibble of the first round key of PRESENT, we will target the 0th Sbox output of the first round. Then we have

$$v = \mathrm{SB}_{\mathrm{PRESENT}}(k \oplus p),$$

where $k$ and $p$ denote the 0th nibble of the first round key and that of the plaintext.
P-DPA Step 5 Decide on the target signal. Before we do further analysis of the profiling traces, we need to choose what information related to the target intermediate value $v$ we are looking for, for example, the Hamming weight of $v$ or the 0th bit of $v$. In our illustrations, we will look at two types of target signals, one given by the exact value of $v$ and the other given by $\mathrm{wt}(v)$, the Hamming weight of $v$.
P-DPA Step 6 Group the profiling traces. We take our set of profiling traces and divide them into $M_{signal}$ sets according to the target signal from P-DPA Step 5. Let us denote those sets by $A_1, A_2, \dots, A_{M_{signal}}$. For our illustrations, when the target signal is given by $v$, the exact value of the output of the 0th Sbox in PRESENT, we will divide our profiling traces (the Random dataset) into 16 sets, $A_1, A_2, \dots, A_{16}$, where $A_s$ contains the traces corresponding to $v = s - 1$. When the target signal is given by $\mathrm{wt}(v)$, the Hamming weight of $v$, we will divide the profiling traces into five sets, $A_1, A_2, \dots, A_5$, where $A_s$ contains the traces corresponding to $\mathrm{wt}(v) = s - 1$.

P-DPA Step 7 Modeling leakage, signal, and noise. Let us fix a time sample $t$ ($1 \le t \le q$), and let $L_t$, $X_t$, and $N_t$ denote the random variables corresponding to leakage, signal, and noise at $t$, respectively. When we fix the signal, as discussed in Sect. 4.2.1, the leakage $L_t$ and the noise $N_t$ can be modeled by normal random variables. When we focus on one particular target signal, i.e., when we only consider computations that result in traces belonging to a particular set $A_s$, let $L_{t,s}$ and $N_{t,s}$ denote the random variables corresponding to leakage and noise at $t$, respectively. We further denote the constant signal by $X_{t,s}$. Then $L_{t,s}$ and $N_{t,s}$ can be modeled by normal random variables, and the traces from $A_s$ give us a sample to analyze $L_{t,s}$ and $N_{t,s}$.
For example, in our attack, if we only look at computations that result in $v = 1$, we denote the leakage and noise at a given time sample $t$ by $L_{t,2}$ and $N_{t,2}$.
P-DPA Step 8 Compute SNR. The SNR values for each time sample $t = 1, 2, \dots, q$ can be computed in a similar manner as in Example 4.2.16. In more detail, by Eq. 4.3, for any $s$,

$$X_{t,s} = \mathrm{E}[L_{t,s}] - \mathrm{E}[N_{t,s}]. \tag{4.22}$$

Hence

$$\mathrm{Var}(X_t) = \mathrm{Var}(X_{t,s}) = \mathrm{Var}\left(\mathrm{E}[L_{t,s}] - \mathrm{E}[N_{t,s}]\right).$$

Since any information related to the target signal is contained in $X_t$, which is independent of $N_t$, $\mathrm{E}[N_{t,s}]$ is a constant for all $s$. Consequently, we have (see Eq. 1.36)

$$\mathrm{Var}(X_t) = \mathrm{Var}\left(\mathrm{E}[L_{t,s}]\right).$$

Together with Eqs. 1.36, 4.22, and 4.2, we also have

$$\mathrm{Var}\left(L_t - \mathrm{E}[L_{t,s}]\right) = \mathrm{Var}\left(L_t - X_{t,s} - \mathrm{E}[N_{t,s}]\right) = \mathrm{Var}(L_t - X_{t,s}) = \mathrm{Var}(L_t - X_t) = \mathrm{Var}(N_t).$$

Using our profiling traces, $\mathrm{Var}(X_t)$ and $\mathrm{Var}(N_t)$ can be approximated by the sample variances of $\mathrm{E}[L_{t,s}]$ and $L_t - \mathrm{E}[L_{t,s}]$, respectively. We can then approximate the SNR at time sample $t$ using

$$\mathrm{SNR}_t = \frac{\mathrm{Var}(X_t)}{\mathrm{Var}(N_t)} \approx \frac{\text{sample variance of } \mathrm{E}[L_{t,s}]}{\text{sample variance of } L_t - \mathrm{E}[L_{t,s}]}.$$

The same computations can be done for all time samples $t$.
For our attacks, SNR values following the above steps have been computed in Example 4.2.17 when the target signal is the exact value of $v$ and in Example 4.2.18 when the target signal is $\mathrm{wt}(v)$.
P-DPA Step 9 Identify the point of interest. The point of interest is given by the time sample that achieves the highest SNR value. For our attacks, in Example 4.2.19, we have analyzed the Random dataset and identified one POI for the target signal given by the exact value of $v$: $t = 392$. The same POI is obtained when the target signal is given by $\mathrm{wt}(v)$.
P-DPA Step 10 Measurement of attack traces. After getting our POI, we are ready to carry out the attack. This step is the same as DPA Step 2 from Sect. 4.3.1.1. Suppose we have taken measurements of our target device with $M_p$ plaintexts. For $j = 1, \dots, M_p$, let $\ell_j = (l^j_1, l^j_2, \dots, l^j_q)$ denote the corresponding power trace, where $q$ is the total number of time samples in one attack trace. Note that the measurements should be done in such a way that the attack traces and the profiling traces (see P-DPA Step 2) are aligned in the time domain so that the POI we have identified is the actual POI. In particular, one attack trace contains the same number of time samples as one profiling trace. We argue that this is achievable since we assume the attacker has the knowledge of the implementation and is in possession of a clone device. For our illustrations, we will use the Random plaintext dataset as our attack traces. We have $M_p = 5000$.

P-DPA Step 11 Compute hypothetical target intermediate values. This step is the same as DPA Step 5 from Sect. 4.3.1.1. For each key hypothesis $\hat{k}_i$ of $k$ and each (part of the) plaintext $p_j$, we compute a hypothesis for $v$, which is given by

$$\hat{v}_{ij} = \varphi(\hat{k}_i, p_j), \quad i = 1, 2, \dots, M_k, \quad j = 1, 2, \dots, M_p.$$

For our attacks, with each key hypothesis of the 0th nibble of the first round key and each known plaintext, we have a hypothetical value for the 0th Sbox output:

$$\hat{v}_{ij} = \mathrm{SB}_{\mathrm{PRESENT}}(\hat{k}_i \oplus p_j), \quad i = 1, 2, \dots, 16, \quad j = 1, 2, \dots, 5000,$$

where $p_j$ is the 0th nibble of the plaintext corresponding to the attack trace $\ell_j$. Furthermore, we set

$$\hat{k}_i = i - 1, \quad i = 1, 2, \dots, 16.$$

P-DPA Step 12 Identify the leakage model and compute the hypothetical signals. By our choice of the target signal from P-DPA Step 5, we have a corresponding leakage model. For example, if the target signal is the exact value of $v$, a natural choice of leakage model is the identity leakage model.
For each hypothetical target intermediate value, we can compute the hypothetical signal depending on our leakage model:

$$H_{ij} := L(\hat{v}_{ij}) - \text{noise}, \quad i = 1, 2, \dots, M_k, \quad j = 1, 2, \dots, M_p.$$

The main difference in this step as compared to DPA Step 6 in Sect. 4.3.1.1 is that our leakage model cannot be chosen arbitrarily. We should choose a leakage model based on the target signal chosen in P-DPA Step 5. In our illustrations, we will consider the identity leakage model and the Hamming weight leakage model, corresponding to the signals given by $v$ and $\mathrm{wt}(v)$, respectively.
P-DPA Step 13 Statistical analysis. For a fixed key hypothesis $\hat{k}_i$, we view the modeled signal as a random variable $H_i$ that varies when the plaintext changes. Take the time sample $t = \mathrm{POI}$ as identified in P-DPA Step 9. We consider the leakage at this time sample as a random variable $L_{\mathrm{POI}}$.
Then a sample for this pair of random variables $(H_i, L_{\mathrm{POI}})$ is given by

$$\left\{ (H_{ij}, l^j_{\mathrm{POI}}) \;\middle|\; j = 1, 2, \dots, \hat{M}_p \right\},$$

where $l^j_{\mathrm{POI}}$ is the POI-th entry of the attack trace $\ell_j$ obtained in P-DPA Step 10 ($j = 1, 2, \dots, M_p$) and $2 \le \hat{M}_p \le M_p$.⁴ With this sample, we can compute the sample correlation coefficient between $H_i$ and $L_{\mathrm{POI}}$ for each key hypothesis $\hat{k}_i$ ($i = 1, 2, \dots, M_k$):

$$r^{\hat{M}_p}_{i,\mathrm{POI}} := \frac{\sum_{j=1}^{\hat{M}_p} (H_{ij} - \bar{H}_i)(l^j_{\mathrm{POI}} - \bar{l}_{\mathrm{POI}})}{\sqrt{\sum_{j=1}^{\hat{M}_p} (H_{ij} - \bar{H}_i)^2}\, \sqrt{\sum_{j=1}^{\hat{M}_p} (l^j_{\mathrm{POI}} - \bar{l}_{\mathrm{POI}})^2}}. \tag{4.23}$$

Figure 4.39 presents the values of $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \dots, 16$) for $\mathrm{POI} = 392$ computed with the identity leakage model. The x-axis indicates the number of

⁴ When $\hat{M}_p = 1$, the denominator in Eq. 4.23 is equal to 0.

Fig. 4.39 Sample correlation coefficients $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \dots, 16$) for $\mathrm{POI} = 392$. Computed following Eq. 4.23 with the identity leakage model and the Random plaintext dataset. The blue line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$

Fig. 4.40 Sample correlation coefficients $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \dots, 16$) for $\mathrm{POI} = 392$. Computed following Eq. 4.23 with the Hamming weight leakage model and the Random plaintext dataset. The blue line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$

traces $\hat{M}_p$ used. The figure shows that with just roughly 20 traces, we can clearly distinguish the correct key hypothesis from the wrong ones.
Similarly, the results for the Hamming weight model are shown in Fig. 4.40. In this case, we need fewer than five traces to identify the correct key. This indicates that the Hamming weight leakage model is closer to our DUT leakage than the identity leakage model.
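P-DPA Steps 11 to 13 can be sketched on simulated data as follows. This is only an illustration under stated assumptions. The leakage at the POI is simulated as $-0.02 \cdot \mathrm{wt}(v)$ plus Gaussian noise with standard deviation 0.005 (both numbers are made up, not measured), and key hypotheses are ranked by the absolute value of the correlation coefficient from Eq. 4.23, which is a choice made here so that the polarity of the leakage does not matter.

```python
import random
from math import sqrt

# Standard PRESENT Sbox.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def hw(x):
    """Hamming weight leakage model."""
    return bin(x).count("1")

def identity(x):
    """Identity leakage model."""
    return x

def pearson(xs, ys):
    """Sample correlation coefficient (Eq. 4.23); 0.0 if a variance is zero."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den > 0 else 0.0

# Simulated attack traces at the POI (hypothetical device behavior).
random.seed(1)
TRUE_KEY = 0x9
NUM_TRACES = 200
plaintexts = [random.randrange(16) for _ in range(NUM_TRACES)]
leak_poi = [-0.02 * hw(SBOX[TRUE_KEY ^ p]) + random.gauss(0.0, 0.005)
            for p in plaintexts]

def correlations(num_traces, model):
    """Correlation coefficient for each key hypothesis using the first traces."""
    ps, ls = plaintexts[:num_traces], leak_poi[:num_traces]
    return [pearson([model(SBOX[k ^ p]) for p in ps], ls) for k in range(16)]

def best_key(num_traces, model):
    r = correlations(num_traces, model)
    return max(range(16), key=lambda k: abs(r[k]))

for m in (5, 20, 200):
    print(f"{m} traces: HW model -> {best_key(m, hw):X}, "
          f"identity model -> {best_key(m, identity):X}")
```

Because the simulated device leaks the Hamming weight, the Hamming weight model matches it exactly, which mirrors the observation that a leakage model closer to the DUT needs fewer traces.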

Note
A good leakage model is beneficial to our attack.

Remark 4.3.3 Besides computing the SNR in P-DPA Step 8, other methods, e.g., the t-test (Sect. 4.2.3) with a properly chosen intermediate value, can also be used to identify the POIs.
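The SNR-based POI search of P-DPA Steps 6 to 9 can be sketched as follows. The traces here are simulated, not measured. Each has only 8 time samples, and only sample index 3 carries a Hamming weight signal (all of these numbers are assumptions for illustration). The function `snr_per_sample` is a hypothetical helper that follows the approximation in P-DPA Step 8: the sample variance of the per-group mean leakage divided by the sample variance of the leakage minus its group mean.

```python
import random
from statistics import mean, variance

# Standard PRESENT Sbox.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def snr_per_sample(traces, labels):
    """Approximate SNR_t for every time sample t.

    traces: list of equal-length lists of leakages.
    labels: target-signal value of each trace (defines the sets A_s).
    """
    q = len(traces[0])
    groups = {}
    for trace, s in zip(traces, labels):
        groups.setdefault(s, []).append(trace)
    snr = []
    for t in range(q):
        # Estimate of E[L_{t,s}] for every set A_s.
        group_mean = {s: mean(tr[t] for tr in g) for s, g in groups.items()}
        signal = [group_mean[s] for s in labels]
        noise = [tr[t] - group_mean[s] for tr, s in zip(traces, labels)]
        snr.append(variance(signal) / variance(noise))
    return snr

# Simulated profiling traces with random keys and plaintexts.
random.seed(0)
Q, LEAK_T = 8, 3
traces, labels = [], []
for _ in range(2000):
    k, p = random.randrange(16), random.randrange(16)
    v = SBOX[k ^ p]
    trace = [random.gauss(0.0, 0.005) for _ in range(Q)]
    trace[LEAK_T] += -0.02 * bin(v).count("1")   # hypothetical signal
    traces.append(trace)
    labels.append(v)                              # target signal: exact value of v

snr = snr_per_sample(traces, labels)
poi = max(range(Q), key=lambda t: snr[t])
print("POI:", poi)
```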

4.3.2.2 Stochastic Leakage Model

To fully utilize the clone device in the profiled setting, we can further characterize the leakages instead of just identifying the POI. In this part, we will study a leakage model that assumes each bit of the target intermediate value (P-DPA Step 4 from Sect. 4.3.2.1) results in a different signal. In particular, suppose the target intermediate value $v = v_{m_v-1} v_{m_v-2} \dots v_1 v_0$ has bit length at most $m_v$.⁵ The stochastic leakage model assumes that

$$L(v) = \sum_{s=0}^{m_v-1} \alpha_s v_s + \text{noise}, \tag{4.24}$$

where $\text{noise} \sim N(0, \sigma^2)$ denotes the noise with mean 0 and variance $\sigma^2$, and $\alpha_s$ ($s = 0, 1, \dots, m_v - 1$) are real numbers. We refer to the $\alpha_s$ as the coefficients of the stochastic leakage model.
The attack with the stochastic leakage model follows the same steps as described in Sect. 4.3.2.1. The only difference is in P-DPA Step 12, where we need extra effort to find our leakage model by profiling. We note that since the stochastic leakage model assumes each value of $v$ has a different signal, to identify the POI, we will choose the target signal to be the exact value of $v$ in P-DPA Step 5. Then, using the leakages at the POI, we will find estimations for the $\alpha_s$ values. Those estimated values together with Eq. 4.24 provide us with hypothetical signals in P-DPA Step 12.
To estimate $\alpha_s$, we adopt the ordinary least squares method from linear regression [DPRS11]. Let

$$\ell^{pf}_j = \left( l^{j,pf}_1, l^{j,pf}_2, \dots, l^{j,pf}_q \right)$$

denote the $j$th profiling trace, where $j = 1, 2, \dots, M_{pf}$. The steps for computing estimations of the coefficients $\alpha_s$ for the stochastic leakage model are as follows:
SLM Step a Compute the vector of leakages. We only focus on the leakage at the POI from each profiling trace. Let

$$\ell_{pf} := \left( l^{1,pf}_{\mathrm{POI}}, l^{2,pf}_{\mathrm{POI}}, \dots, l^{M_{pf},pf}_{\mathrm{POI}} \right)$$

be the vector of leakages at time sample $t = \mathrm{POI}$ from all $M_{pf}$ profiling traces.
For our illustrations, we aim to recover the same part of the key and take the same target intermediate value as in Sect. 4.3.2.1. We also use the Random dataset as our profiling traces, hence $M_{pf} = 10{,}000$. As discussed in P-DPA Step 9, our $\mathrm{POI} = 392$, which corresponds to the target signal being the exact value of $v$.

⁵ When the bit length of $v$ is less than $m_v$, some bits $v_{m_v-1}, \dots$ are zero.
SLM Step b Construct matrix $M_v$ for the target intermediate values. For the $j$th profiling trace $\ell^{pf}_j$, let

$$v^{pf}_j = v^{pf}_{j(m_v-1)} \dots v^{pf}_{j1} v^{pf}_{j0}, \quad j = 1, 2, \dots, M_{pf}$$

be the corresponding target intermediate value. Then the matrix $M_v$ is given by

$$M_v := \begin{pmatrix}
v^{pf}_{10} & v^{pf}_{11} & \dots & v^{pf}_{1(m_v-1)} \\
v^{pf}_{20} & v^{pf}_{21} & \dots & v^{pf}_{2(m_v-1)} \\
\vdots & \vdots & \ddots & \vdots \\
v^{pf}_{M_{pf}0} & v^{pf}_{M_{pf}1} & \dots & v^{pf}_{M_{pf}(m_v-1)}
\end{pmatrix}. \tag{4.25}$$

Since the stochastic leakage model essentially assumes each value of $v$ has a different leakage, we require that all possible values of $v$ appear in $M_v$. Furthermore, in this case, we can guarantee that the matrix $M_v^T M_v$ is invertible (see Appendix A.2). In particular, we should take enough random plaintexts so that all values of $v$ appear. For our illustrations, $v$ is the 0th Sbox output. Hence $m_v = 4$, and we need all 16 values of $v$ to appear.

SLM Step c Compute estimated values of coefficients $\alpha_s$. The estimated values $\hat{\alpha}_s$ for $\alpha_s$ are given by

$$\left( \hat{\alpha}_0 \;\; \hat{\alpha}_1 \;\; \dots \;\; \hat{\alpha}_{m_v-1} \right)^T = \left( M_v^T M_v \right)^{-1} M_v^T \ell_{pf}^T. \tag{4.26}$$

For each actual leakage $l^{j,pf}_t$, define

$$\hat{l}^{j,pf}_t = \sum_{s=0}^{m_v-1} \hat{\alpha}_s v^{pf}_{js},$$

and let

$$\hat{\ell}_{pf} := \left( \hat{l}^{1,pf}_t, \hat{l}^{2,pf}_t, \dots, \hat{l}^{M_{pf},pf}_t \right).$$

Then by the ordinary least squares method from linear regression, the $\hat{\alpha}_s$ values computed with Eq. 4.26 minimize the Euclidean distance (Definition A.2.1) between $\hat{\ell}_{pf}$ and $\ell_{pf}$ (see, e.g., [Ros20, Section 9.8]).
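SLM Steps a to c amount to solving the normal equations of Eq. 4.26. The sketch below does this with plain Gaussian elimination on simulated leakages. The "true" coefficients in `TRUE_ALPHA` and the noise level are invented for the simulation, and `solve` and `estimate_alpha` are hypothetical helper names.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def estimate_alpha(values, leakages, m_v=4):
    """Ordinary least squares estimate (Eq. 4.26).

    values:   target intermediate value of each profiling trace.
    leakages: leakage at the POI of each profiling trace.
    """
    # Rows of M_v list the bits from v_0 up to v_{m_v - 1} (Eq. 4.25).
    rows = [[(v >> s) & 1 for s in range(m_v)] for v in values]
    MtM = [[sum(r[a] * r[b] for r in rows) for b in range(m_v)]
           for a in range(m_v)]
    Mtl = [sum(r[a] * l for r, l in zip(rows, leakages)) for a in range(m_v)]
    return solve(MtM, Mtl)

# Simulated profiling data (hypothetical coefficients and noise).
random.seed(3)
TRUE_ALPHA = [-0.020, -0.021, -0.019, -0.022]
values = [random.randrange(16) for _ in range(5000)]
leakages = [sum(TRUE_ALPHA[s] for s in range(4) if (v >> s) & 1)
            + random.gauss(0.0, 0.002) for v in values]

alpha_hat = estimate_alpha(values, leakages)
print([round(a, 5) for a in alpha_hat])
```

With all 16 values of $v$ present, $M_v^T M_v$ is invertible and the elimination never divides by zero, in line with the requirement stated in SLM Step b.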

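Example 4.3.4 The first trace in the Random dataset corresponds to the plaintext with 0th nibble $= 4$ and the key with 0th nibble $= 7$. Then in SLM Step b we have (see Table 3.11 for the PRESENT Sbox)

$$v^{pf}_1 = \mathrm{SB}_{\mathrm{PRESENT}}(4 \oplus 7) = \mathrm{SB}_{\mathrm{PRESENT}}(3) = \mathrm{B} = 1011_2.$$

Since each row of $M_v$ lists the bits from $v_0$ up to $v_3$, the first row of our matrix $M_v$ is given by

$$\begin{pmatrix} 1 & 1 & 0 & 1 \end{pmatrix}.$$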
With $\mathrm{POI} = 392$ and the Random dataset, we get the following estimated values for the coefficients $\alpha_s$:

$$\hat{\alpha}_0 \approx -0.02019, \quad \hat{\alpha}_1 \approx -0.02027, \quad \hat{\alpha}_2 \approx -0.01920, \quad \hat{\alpha}_3 \approx -0.02039.$$

According to the stochastic leakage model in Eq. 4.24, the estimated leakage of $v = v_3 v_2 v_1 v_0$ is given by

$$L(v) = \hat{\alpha}_0 v_0 + \hat{\alpha}_1 v_1 + \hat{\alpha}_2 v_2 + \hat{\alpha}_3 v_3 + \text{noise}.$$

For example,

$$L(\mathrm{E}) = L(1110) = \hat{\alpha}_1 + \hat{\alpha}_2 + \hat{\alpha}_3 + \text{noise} = -0.05986 + \text{noise},$$

and the estimated signal of $\mathrm{E} = 1110$ according to the stochastic leakage model is given by $-0.05986$. Similarly, we can compute the estimated signals for all 16 possible values of the target intermediate value $0, 1, \dots, \mathrm{F}$:

$$\begin{array}{rrrrrrrr}
0 & -0.02020 & -0.02027 & -0.04046 & -0.01920 & -0.03940 & -0.03947 & -0.05966 \\
-0.02039 & -0.04059 & -0.04066 & -0.06086 & -0.03959 & -0.05979 & -0.05986 & -0.08006.
\end{array} \tag{4.27}$$
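The table in Eq. 4.27 follows mechanically from the four estimated coefficients. A short sketch is given below. It uses the rounded values of $\hat{\alpha}_s$ quoted above, so the last digit of some entries may differ from Eq. 4.27, which was presumably computed from unrounded estimates.

```python
# Rounded coefficient estimates quoted in the text (alpha_hat_0 .. alpha_hat_3).
alpha_hat = [-0.02019, -0.02027, -0.01920, -0.02039]

def estimated_signal(v):
    """Noise-free part of Eq. 4.24: sum of alpha_hat_s over the set bits of v."""
    return sum(a for s, a in enumerate(alpha_hat) if (v >> s) & 1)

signals = [estimated_signal(v) for v in range(16)]

for v, x in enumerate(signals):
    print(f"{v:X}: {x:.5f}")
```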
We take the Random plaintext dataset as our attack traces.
Example 4.3.5 As mentioned in Example 4.3.2, $\hat{k}_1 = 0$, $\hat{k}_2 = 1$, and for the Random plaintext dataset,

$$p_1 = 9, \quad p_2 = \mathrm{C}.$$

The hypothetical intermediate values are $\hat{v}_{11} = \mathrm{E}$, $\hat{v}_{12} = 4$, $\hat{v}_{21} = 3$, and $\hat{v}_{22} = 7$ (see Example 4.3.3). Then, with the estimated signals given in Eq. 4.27 and the profiled stochastic leakage model, in P-DPA Step 12, we have

$$\begin{aligned}
H_{11} &= L(\mathrm{E}) - \text{noise} = -0.05986,\\
H_{12} &= L(4) - \text{noise} = -0.01920,\\
H_{21} &= L(3) - \text{noise} = -0.04046,\\
H_{22} &= L(7) - \text{noise} = -0.05966.
\end{aligned}$$

Fig. 4.41 Sample correlation coefficients $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \dots, 16$) for $\mathrm{POI} = 392$. Computed following Eq. 4.23 with the stochastic leakage model and the Random plaintext dataset. The blue line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$


Following P-DPA Step 13 from Sect. 4.3.2.1, the attack results are shown in Fig. 4.41. Compared to Figs. 4.39 and 4.40, the attack results based on the stochastic leakage model are similar to those based on the Hamming weight leakage model and better than those based on the identity leakage model. This shows that both the stochastic and the Hamming weight leakage models are better approximations of the DUT leakage than the identity leakage model.

4.3.2.3 Template-Based DPA

We have seen how to characterize the leakage assuming each bit of the target
intermediate value leaks differently focusing on one POI. We can also char-
acterize/profile the leakages of each possible value of the target intermediate
value (P-DPA Step 4 from Sect. 4.3.2.1) at several POIs. The result of this profiling
process is a set of templates. Then during the attack phase, instead of computing
correlation coefficients, we use those templates to see which of them fits better to
the measured power trace and deduce a probability for each key hypothesis.
As discussed in Sect. 4.2, for a computation with constant signal, the distribution
of leakages at a single time sample can be modeled with a normal distribution. And
leakages at a few time samples can be considered as a Gaussian random vector.
The goal of profiling in template-based DPA is to estimate the mean and variance
(respectively, mean vector and covariance matrix) of the normal random variable
(respectively, Gaussian random vector). The resulting estimations are our templates.
The steps for template-based DPA are similar to those in Sect. 4.3.2.1, except
for P-DPA Step 9, P-DPA Step 12, and P-DPA Step 13. P-DPA Step 9 will be
replaced by two steps (Template Step a and Template Step b below), P-DPA Step
12 will be removed, and P-DPA Step 13 will be replaced by the following Template
Step c:
Template Step a Identify point(s) of interest. The same as in P-DPA Step 9, POIs are given by the time samples that achieve the highest SNR values. The difference is that we can choose more than one POI. With more POIs, the effort for building the templates will increase, but the attack results will be better. Normally the attacker decides on the number of POIs based on experience.
Let $q_{\mathrm{POI}}$ denote the total number of chosen POIs, and let $t_1, t_2, \dots, t_{q_{\mathrm{POI}}}$ denote the time samples that have been identified as POIs. For our illustrations, we will discuss the results of using just one POI and using three POIs. It follows from Example 4.2.19 that when the target signal is the exact value of $v$, the three POIs are given by

$$t_1 = 392, \quad t_2 = 218, \quad t_3 = 1328.$$

And when the target signal is $\mathrm{wt}(v)$, we have

$$t_1 = 392, \quad t_2 = 1309, \quad t_3 = 1304.$$

Template Step b Build the templates. Let us fix a particular target signal value and only consider inputs to the cryptographic algorithm that result in traces belonging to the corresponding set $A_s$ (see P-DPA Step 6 from Sect. 4.3.2.1). Let $L_{t,s}$ denote the random variable representing the leakage for such encryption computations at time sample $t$. Then the random vector

$$L_s := (L_{t_1,s}, L_{t_2,s}, \dots, L_{t_{q_{\mathrm{POI}}},s}) \tag{4.28}$$

can be modeled by a Gaussian random vector. By Definition 1.7.10, to find the PDF of a Gaussian random vector, we need to identify its mean vector and covariance matrix. Using our profiling traces from the set $A_s$, we can compute an approximation for the mean vector, denoted $\mu_s$, using the sample means of $L_{t_u,s}$:

$$\mu_s := \left( \bar{l}_{t_1,s}, \bar{l}_{t_2,s}, \dots, \bar{l}_{t_{q_{\mathrm{POI}}},s} \right).$$

Similarly, an approximation for the covariance matrix is given by $Q_s$, where the $(u_1, u_2)$-entry of $Q_s$ is the sample covariance between $L_{t_{u_1},s}$ and $L_{t_{u_2},s}$ ($1 \le u_1, u_2 \le q_{\mathrm{POI}}$). The pair $(\mu_s, Q_s)$ is called a template. With our profiling traces, we can compute $M_{signal}$ templates.
For our illustrations, when the target signal is $v$, we will have 16 templates. And when the target signal is $\mathrm{wt}(v)$, we will have five templates.
Template Step c Statistical analysis. In this step, we would like to compute a probability for each key hypothesis given the attack traces. For a fixed key hypothesis $\hat{k}_i$, we divide the $M_p$ attack traces from P-DPA Step 10 into $M_{signal}$ sets, $A_1, A_2, \dots, A_{M_{signal}}$, depending on the hypothetical target intermediate value $\hat{v}_{ij}$ obtained in P-DPA Step 11. In particular, for an attack trace $\ell_j$, let $s_{ij}$ denote the index of the set that it belongs to, namely,

$$\ell_j \in A_{s_{ij}} \quad \text{given key hypothesis } \hat{k}_i.$$

We are only interested in the leakages at the POIs for each attack trace $\ell_j = (l^j_1, l^j_2, \dots, l^j_q)$. Define

$$\ell_{j,\mathrm{POI}} := \left( l^j_{t_1}, l^j_{t_2}, \dots, l^j_{t_{q_{\mathrm{POI}}}} \right). \tag{4.29}$$

With the mean vector $\mu_{s_{ij}}$ and the covariance matrix $Q_{s_{ij}}$ obtained in Template Step b, we can compute the probability of $\ell_j$ given $\hat{k}_i$ using the PDF of the Gaussian random vector $L_{s_{ij}}$ (see Definition 1.7.10):

$$P(\ell_j \mid \hat{k}_i) = P(L_{s_{ij}} = \ell_{j,\mathrm{POI}}) = \frac{1}{(2\pi)^{q_{\mathrm{POI}}/2} \sqrt{\det Q_{s_{ij}}}} \exp\left( -\frac{1}{2} (\ell_{j,\mathrm{POI}} - \mu_{s_{ij}})^T Q_{s_{ij}}^{-1} (\ell_{j,\mathrm{POI}} - \mu_{s_{ij}}) \right). \tag{4.30}$$

Furthermore, we can assume the measurements are independent and compute the probability of a set of $\hat{M}_p$ ($1 \le \hat{M}_p \le M_p$) traces given the key hypothesis $\hat{k}_i$:

$$P\left( \{\ell_j\}_{j=1}^{\hat{M}_p} \;\middle|\; \hat{k}_i \right) = \prod_{j=1}^{\hat{M}_p} P(\ell_j \mid \hat{k}_i). \tag{4.31}$$

By the generalized Bayes' theorem (Theorem 1.7.2), the probability of the key hypothesis $\hat{k}_i$ given a set of $\hat{M}_p$ ($\hat{M}_p \le M_p$) traces is given by

$$P\left( \hat{k}_i \;\middle|\; \{\ell_j\}_{j=1}^{\hat{M}_p} \right) = \frac{P\left( \{\ell_j\}_{j=1}^{\hat{M}_p} \;\middle|\; \hat{k}_i \right) P(\hat{k}_i)}{\sum_{m=1}^{M_k} P\left( \{\ell_j\}_{j=1}^{\hat{M}_p} \;\middle|\; \hat{k}_m \right) P(\hat{k}_m)}.$$

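Typically, the key hypothesis follows a uniform distribution in the key space, and we have

$$P(\hat{k}_m) = P(\hat{k}_i)$$

in the above equation, which gives

$$P\left( \hat{k}_i \;\middle|\; \{\ell_j\}_{j=1}^{\hat{M}_p} \right) = \frac{P\left( \{\ell_j\}_{j=1}^{\hat{M}_p} \;\middle|\; \hat{k}_i \right)}{\sum_{m=1}^{M_k} P\left( \{\ell_j\}_{j=1}^{\hat{M}_p} \;\middle|\; \hat{k}_m \right)}. \tag{4.32}$$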

For the attack, we expect the correct key hypothesis to have the highest probability. In other words, we are mainly interested in the ordering of the values $P\left( \hat{k}_i \mid \{\ell_j\}_{j=1}^{\hat{M}_p} \right)$. Since the denominators are the same for all key hypotheses in Eq. 4.32, we can ignore them. Then Eq. 4.32 is reduced to Eq. 4.31, which can be further simplified by leaving out the common term (see Eq. 4.30)

$$\frac{1}{(2\pi)^{q_{\mathrm{POI}}/2}}.$$

And we get

$$\prod_{j=1}^{\hat{M}_p} \frac{1}{\sqrt{\det Q_{s_{ij}}}} \exp\left( -\frac{1}{2} (\ell_{j,\mathrm{POI}} - \mu_{s_{ij}})^T Q_{s_{ij}}^{-1} (\ell_{j,\mathrm{POI}} - \mu_{s_{ij}}) \right).$$

By taking the natural logarithm, the ordering does not change, and we have

$$-\frac{1}{2} \sum_{j=1}^{\hat{M}_p} \left( \ln \det Q_{s_{ij}} + (\ell_{j,\mathrm{POI}} - \mu_{s_{ij}})^T Q_{s_{ij}}^{-1} (\ell_{j,\mathrm{POI}} - \mu_{s_{ij}}) \right).$$

Finally, we define the probability score of $\hat{k}_i$, denoted $\mathsf{P}_{\hat{k}_i}$, to be

$$\mathsf{P}_{\hat{k}_i} = -\sum_{j=1}^{\hat{M}_p} \left( \ln \det Q_{s_{ij}} + (\ell_{j,\mathrm{POI}} - \mu_{s_{ij}})^T Q_{s_{ij}}^{-1} (\ell_{j,\mathrm{POI}} - \mu_{s_{ij}}) \right). \tag{4.33}$$

The higher the score, the more likely the hypothesis is equal to the correct key.
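If actual posterior probabilities are wanted rather than only a ranking, the scores can be mapped back to Eq. 4.32. Comparing Eqs. 4.30, 4.31, and 4.33, the likelihood of a key hypothesis equals a constant times $\exp(\mathsf{P}_{\hat{k}_i}/2)$, and the constant cancels in Eq. 4.32. The sketch below assumes a uniform prior. The function name and the example scores are made up for illustration, and the largest exponent is subtracted before exponentiating to avoid numerical underflow.

```python
from math import exp

def posterior_from_scores(scores):
    """Posterior probabilities (Eq. 4.32) from probability scores (Eq. 4.33).

    Assumes a uniform prior over the key hypotheses.
    """
    half = [s / 2.0 for s in scores]          # score = 2 * ln(likelihood) + const
    top = max(half)
    weights = [exp(h - top) for h in half]    # shift for numerical stability
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical scores for four key hypotheses.
scores = [-120.0, -118.5, -95.0, -119.2]
post = posterior_from_scores(scores)
print([round(p, 6) for p in post])
```

The ordering of the posteriors is the same as the ordering of the scores, which is why the attack only needs Eq. 4.33.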
Remark 4.3.4 Since the computation of covariances grows quadratically with the number of chosen POIs, in practice, it is also common to assume leakages at different time samples are independent. In this case, the covariance matrix $Q_s$ in Template Step b becomes a diagonal matrix.
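A complete template attack with a single POI can be sketched on simulated data as follows. With one POI, $Q_s$ is just the sample variance, so $\det Q_s$ is that variance and the quadratic form is a squared deviation divided by it. The device model in `leak` (a Hamming weight signal plus Gaussian noise) and all numeric parameters are assumptions for illustration. The target signal is the exact value of $v$, giving 16 templates.

```python
import random
from math import log
from statistics import mean, variance

# Standard PRESENT Sbox.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

random.seed(2)

def leak(v):
    """Hypothetical leakage at the POI for intermediate value v."""
    return -0.02 * bin(v).count("1") + random.gauss(0.0, 0.005)

# Profiling phase on the clone device: random keys and plaintexts.
groups = {v: [] for v in range(16)}
for _ in range(3200):
    k, p = random.randrange(16), random.randrange(16)
    v = SBOX[k ^ p]
    groups[v].append(leak(v))

# Template Step b: one (mean, variance) pair per value of v.
templates = {v: (mean(ls), variance(ls)) for v, ls in groups.items()}

# Attack phase on the target device: fixed unknown key, known plaintexts.
TRUE_KEY = 0x9
attack_p = [random.randrange(16) for _ in range(30)]
attack_l = [leak(SBOX[TRUE_KEY ^ p]) for p in attack_p]

def score(k):
    """Probability score of key hypothesis k (Eq. 4.33 with one POI)."""
    total = 0.0
    for p, l in zip(attack_p, attack_l):
        mu, var = templates[SBOX[k ^ p]]
        total += log(var) + (l - mu) ** 2 / var
    return -total

scores = [score(k) for k in range(16)]
best = max(range(16), key=lambda k: scores[k])
print(f"recovered key nibble: {best:X}")
```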
First, let us choose the target signal to be the exact value of $v$. We have built 16 templates. Three POIs (time samples $392, 218, 1328$) were chosen as described in Template Step a. Thus for each template, the mean vector has length 3, and the covariance matrix has dimension $3 \times 3$. For example, the template for $L_1$, corresponding to the intermediate value $v = 0$, is given by

$$\mu_1 = (-0.04924, -0.04246, -0.07146),$$

$$Q_1 = \begin{pmatrix}
1.6110 \times 10^{-6} & -6.2968 \times 10^{-9} & -1.0592 \times 10^{-7} \\
-6.2968 \times 10^{-9} & 2.2925 \times 10^{-6} & 3.7191 \times 10^{-7} \\
-1.0592 \times 10^{-7} & 3.7191 \times 10^{-7} & 2.2567 \times 10^{-6}
\end{pmatrix}.$$

As another example, the template for $L_{12}$, corresponding to the intermediate value $v = \mathrm{B}$, is given by

$$\mu_{12} = (-0.04996, -0.05241, -0.07221),$$

$$Q_{12} = \begin{pmatrix}
1.6390 \times 10^{-6} & 1.6328 \times 10^{-7} & 6.3454 \times 10^{-8} \\
1.6328 \times 10^{-7} & 2.0256 \times 10^{-6} & 1.7985 \times 10^{-7} \\
6.3454 \times 10^{-8} & 1.7985 \times 10^{-7} & 2.1778 \times 10^{-6}
\end{pmatrix}.$$
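To see how the two templates above separate observations, one can evaluate the per-trace term of Eq. 4.33 for a leakage vector under each of them. In the sketch below, the template numbers are the ones quoted in the text, while the observation `l_obs` is a made-up leakage vector chosen close to $\mu_{12}$. The helpers `det3`, `inv3`, and `score_term` are illustrative names for a plain $3 \times 3$ cofactor computation.

```python
from math import log

def det3(Q):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = Q
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def inv3(Q):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = Q
    det = det3(Q)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]

def score_term(l, mu, Q):
    """One summand of Eq. 4.33: -(ln det Q + (l - mu)^T Q^{-1} (l - mu))."""
    d = [x - m for x, m in zip(l, mu)]
    Qi = inv3(Q)
    quad = sum(d[r] * Qi[r][c] * d[c] for r in range(3) for c in range(3))
    return -(log(det3(Q)) + quad)

# Templates quoted in the text (POIs 392, 218, 1328).
mu_1 = [-0.04924, -0.04246, -0.07146]            # v = 0
Q_1 = [[1.6110e-6, -6.2968e-9, -1.0592e-7],
       [-6.2968e-9, 2.2925e-6, 3.7191e-7],
       [-1.0592e-7, 3.7191e-7, 2.2567e-6]]
mu_12 = [-0.04996, -0.05241, -0.07221]           # v = B
Q_12 = [[1.6390e-6, 1.6328e-7, 6.3454e-8],
        [1.6328e-7, 2.0256e-6, 1.7985e-7],
        [6.3454e-8, 1.7985e-7, 2.1778e-6]]

# Hypothetical observed leakages at the three POIs.
l_obs = [-0.0500, -0.0520, -0.0722]

print("term under template for v = 0:", score_term(l_obs, mu_1, Q_1))
print("term under template for v = B:", score_term(l_obs, mu_12, Q_12))
```

The two mean vectors differ mainly at the second POI, so an observation near $\mu_{12}$ is penalized heavily under the template for $v = 0$.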

The probability scores for each key hypothesis are shown in Fig. 4.42, where the blue line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$. We can see that with just a few traces, the correct key hypothesis can be distinguished from the other key hypotheses.
Next, we take the target signal to be the Hamming weight of $v$, $\mathrm{wt}(v)$. Then we have five templates. The POIs were chosen as described in Template Step a: $392, 1309, 1304$. The template for $L_1$, corresponding to $\mathrm{wt}(v) = 0$, is given by

$$\mu_1 = (-0.04245, 0.08036, -0.03465),$$

$$Q_1 = \begin{pmatrix}
2.2925 \times 10^{-6} & -8.7422 \times 10^{-8} & 1.9156 \times 10^{-7} \\
-8.7422 \times 10^{-8} & 1.4864 \times 10^{-6} & -4.9987 \times 10^{-8} \\
\cdots & \cdots & \cdots
\end{pmatrix}.$$

Fig. 4.42 Probability scores (Eq. 4.33) for each key hypothesis computed with different numbers of traces from the Random plaintext dataset. The target signal is given by the exact value of $v$, the 0th Sbox output. Three POIs (time samples $392, 218, 1328$) were chosen. The blue line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$

Fig. 4.43 Probability scores (Eq. 4.33) for each key hypothesis computed with different numbers of traces from the Random plaintext dataset. The target signal is given by $\mathrm{wt}(v)$, the Hamming weight of the 0th Sbox output. Three POIs (time samples $392, 1309, 1304$) were chosen. The blue line corresponds to the correct key hypothesis $\hat{k}_{10} = 9$

The probability scores for each key hypothesis are shown in Fig. 4.43. Similar to Fig. 4.42, with just 2 or 3 traces we can distinguish the correct key hypothesis from the rest.
Attack results on other nibbles So far, we have seen practical demonstrations of how the 0th nibble of the PRESENT first round key can be recovered. As we have mentioned, DPA attacks work in a divide-and-conquer manner, recovering parts of the key in parallel using the same set of traces. As an example, we will detail the attack that recovers the 1st nibble of the first round key for PRESENT.
In P-DPA Step 1, our target cryptographic implementation is the same as before. The profiling traces from P-DPA Step 2 will still be the Random dataset. The chosen part of the key, $k$, in P-DPA Step 3 is now the 1st nibble of the first round key. Consequently, the target intermediate value, $v$, in P-DPA Step 4 will be the 1st Sbox output. We have the same relation between $k$, $p$, and $v$:

Fig. 4.44 SNR for each time sample, computed using Random dataset. The signal is given by the
exact value of the 1st Sbox output

\[
v = \mathrm{SB}_{\mathrm{PRESENT}}(k \oplus p),
\]

where k and p denote the 1st nibble of the first round key and that of the plaintext.
For the target signal in P-DPA Step 5, let us choose the exact value of $v$. Following
P-DPA Step 6–P-DPA Step 8, the SNR values are shown in Fig. 4.44. We will choose
one POI in Template Step a, which is given by the time sample corresponding to the
highest point in the figure: 404.
Following Template Step b, 16 templates were computed. For example, the
template corresponding to the 1st Sbox output $v = 0$ is given by

\[
\mu_1 = -0.039027, \qquad \sigma_1^2 = 2.1679112 \times 10^{-6}.
\]

As for attack traces in P-DPA Step 10, we use the same traces—Random plaintext
dataset. Then according to P-DPA Step 11 and Template Step c, the probability
scores for each key hypothesis are shown in Fig. 4.45. By Eq. 4.1, the correct value
of the 1st nibble of the first round key is given by 8. We can see that similar to the
template-based DPA attacks on the 0th key nibble (see Figs. 4.42 and 4.43), with
just a few traces, we can recover the correct key nibble value.
As another example, the attack results for attacking the 6th nibble of the first
round key are shown in Fig. 4.46, where by profiling, we have identified POI = 464.
By Eq. 4.1, the correct value of the 6th nibble of the first round key is given by 3.

4.3.2.4 Success Rate and Guessing Entropy

Comparing Figs. 4.42 and 4.43 to Figs. 4.39 and 4.40, we cannot draw a clear
conclusion about which attack method is better. In fact, a different ordering of the
traces in Random plaintext dataset may affect our attack results. For example, by

Fig. 4.45 Probability scores (Eq. 4.33) for each key hypothesis computed with different numbers
of traces from the Random plaintext dataset. The target signal is given by the exact value of the 1st
Sbox output. One POI (time sample 404) was chosen. The blue line corresponds to the correct
key hypothesis 8

Fig. 4.46 Probability scores (Eq. 4.33) for each key hypothesis computed with different numbers
of traces from the Random plaintext dataset. The target signal is given by the exact value of the 6th
Sbox output. One POI (time sample 464) was chosen. The blue line corresponds to the correct
key hypothesis 3

arranging the traces in reverse order, we get Figs. 4.47 and 4.48 instead of Figs. 4.39
and 4.40.
To have a fair comparison between different attack methods (e.g., different
choices of leakage models, POIs, etc.), we introduce the notion of success rate and
guessing entropy [SMY09].

Note
In this part, our aim is to evaluate the DUT and our implementation against
DPA attacks with different settings. Thus, we assume we have the knowledge
of the key for the evaluation after the attack.

Fig. 4.47 Sample correlation coefficients $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \dots, 16$) for POI = 392. Computed
following Eq. 4.23 with the identity leakage model and the Random plaintext dataset arranged in
reverse order. The vertical axis shows the sample correlation coefficient. The blue line corresponds
to the correct key hypothesis $\hat{k}_{10} = 9$

Fig. 4.48 Sample correlation coefficients $r^{\hat{M}_p}_{i,\mathrm{POI}}$ ($i = 1, 2, \dots, 16$) for POI = 392. Computed
following Eq. 4.23 with the Hamming weight leakage model and the Random plaintext dataset
arranged in reverse order. The vertical axis shows the sample correlation coefficient. The blue line
corresponds to the correct key hypothesis $\hat{k}_{10} = 9$

Fix a number of attack traces $\hat{M}_p$. For each profiled DPA attack that we
have discussed, we can assign a score to each key hypothesis after the attack: for
leakage model-based DPA attacks, the score of a key hypothesis $\hat{k}_i$ is given by
the absolute value of the corresponding sample correlation coefficient (Eq. 4.23),
and for template-based DPA attacks, the score of a key hypothesis is given by its
corresponding probability score (Eq. 4.33). Let $\mathrm{sc}_i^{\hat{M}_p}$ denote the score for the key
hypothesis $\hat{k}_i$. We have

\[
\mathrm{sc}_i^{\hat{M}_p} =
\begin{cases}
\left| r_{i,\mathrm{POI}}^{\hat{M}_p} \right| & \text{leakage model-based DPA attack, where } r_{i,\mathrm{POI}}^{\hat{M}_p} \text{ is computed following Eq. 4.23,} \\[4pt]
P_{\hat{k}_i} & \text{template-based DPA attack, where } P_{\hat{k}_i} \text{ is computed following Eq. 4.33.}
\end{cases}
\tag{4.34}
\]

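We further define $\mathrm{score}_{\hat{M}_p}$ to be a vector consisting of the scores obtained for each
key hypothesis with our DPA attack, sorted in descending order:

\[
\mathrm{score}_{\hat{M}_p} = \left( \mathrm{sc}_{i_1}^{\hat{M}_p}, \mathrm{sc}_{i_2}^{\hat{M}_p}, \dots, \mathrm{sc}_{i_{M_k}}^{\hat{M}_p} \right),
\quad \text{where } \mathrm{sc}_{i_j}^{\hat{M}_p} \geq \mathrm{sc}_{i_{j+1}}^{\hat{M}_p} \text{ for } j = 1, 2, \dots, M_k - 1.
\]

The key rank of a key hypothesis $\hat{k}_i$, denoted $\mathrm{rank}_{\hat{k}_i}^{\hat{M}_p}$, is given by the index of $\mathrm{sc}_i^{\hat{M}_p}$
in $\mathrm{score}_{\hat{M}_p}$. In particular, let $\hat{k}_c$ denote the correct key hypothesis. We have

\[
\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} = \text{index of } \mathrm{sc}_c^{\hat{M}_p} \text{ in } \mathrm{score}_{\hat{M}_p}. \tag{4.35}
\]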

With the same number of traces, we may also get different key ranks for the correct
key hypothesis due to the different plaintexts/measurements. We consider $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p}$
as a random variable whose randomness comes from the different plaintexts and
measurements.

The ultimate goal of the attack is to achieve $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} = 1$⁶ so that we can retrieve
the correct key hypothesis. Thus, we say that an attack is successful with $\hat{M}_p$ traces
if $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} = 1$. Then the success rate of an attack method with $\hat{M}_p$ traces, denoted
$\mathrm{SR}_{\hat{M}_p}$, is defined to be the probability that $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} = 1$:

\[
\mathrm{SR}_{\hat{M}_p} = P\left( \mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} = 1 \right). \tag{4.36}
\]

Empirically, we can estimate the value of $\mathrm{SR}_{\hat{M}_p}$ by computing the frequency of
$\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} = 1$ among a certain number of attacks.
For another metric, the guessing entropy for an attack method with $\hat{M}_p$ traces,
denoted $\mathrm{GE}_{\hat{M}_p}$, is given by the expectation of the random variable $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p}$:

\[
\mathrm{GE}_{\hat{M}_p} = E\left[ \mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} \right]. \tag{4.37}
\]

6 We note that if the key rank is low enough, it is possible to use key enumeration algorithms
[VCGRS13] that enable the key recovery even in the case when $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} > 1$.

With the terminologies from Sect. 1.8.2, we can approximate $\mathrm{GE}_{\hat{M}_p}$ with a point
estimator (see Remark 1.8.4) given by the sample mean of $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p}$.
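The definitions in Eqs. 4.35 to 4.37 can be illustrated with a minimal Python sketch. The function names `key_rank`, `success_rate`, and `guessing_entropy` are hypothetical and not part of the text; ties between equal scores are broken by hypothesis index here, which is an assumption, since the text does not specify a tie-breaking rule.

```python
def key_rank(scores, correct_index):
    """Return the 1-based rank of the correct key hypothesis.

    `scores[i]` is the score of hypothesis i (Eq. 4.34). Hypotheses are
    sorted by descending score; equal scores keep their index order
    (an assumption made for this sketch).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order.index(correct_index) + 1


def success_rate(ranks):
    """Empirical estimate of SR: fraction of attacks with key rank 1."""
    return sum(1 for r in ranks if r == 1) / len(ranks)


def guessing_entropy(ranks):
    """Empirical estimate of GE: sample mean of the key rank."""
    return sum(ranks) / len(ranks)
```

For instance, four simulated attacks with key ranks 1, 1, 2, and 4 give an estimated success rate of 0.5 and an estimated guessing entropy of 2.0.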
Furthermore, when we vary the number of traces $\hat{M}_p$ used for computing
$\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p}$, we will get different key ranks for the correct key hypothesis. Thus the
probability that the random variable $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p}$ equals 1 and its expectation will also vary.
To analyze how $\mathrm{SR}_{\hat{M}_p}$ and $\mathrm{GE}_{\hat{M}_p}$ change with increasing values of $\hat{M}_p$, we compute
estimations for $\mathrm{SR}_{\hat{M}_p}$ and $\mathrm{GE}_{\hat{M}_p}$ according to Algorithm 4.1.
The input of Algorithm 4.1 takes two user-specified values max_trace and
no_of_attack. max_trace is the maximum number of traces (or the biggest value
of $\hat{M}_p$) we would like to use for estimating $\mathrm{SR}_{\hat{M}_p}$ and $\mathrm{GE}_{\hat{M}_p}$. In line 3, sizes
of $S_{sr}$ and $S_{ge}$ are set to be max_trace + 1 so that the $\hat{M}_p$th entry of each array
corresponds to the estimation for $\mathrm{SR}_{\hat{M}_p}$ and $\mathrm{GE}_{\hat{M}_p}$, respectively. For a fixed value
of $\hat{M}_p$, no_of_attack is the number of attacks to simulate, or equivalently, the
number of elements in the sample of $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p}$ for computing the sample mean (i.e.,
estimation of $\mathrm{GE}_{\hat{M}_p}$) and frequency of $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p} = 1$ (i.e., estimation of $\mathrm{SR}_{\hat{M}_p}$). The
set of attack traces from P-DPA Step 10 is denoted by dataset (line 2). For each
value of $\hat{M}_p$ between 2 and max_trace (line 4), we simulate no_of_attack attacks
(line 6). Thus we randomly select $\hat{M}_p \times$ no_of_attack traces from dataset. Those
traces are stored in an array A (line 5). Each simulated attack takes $\hat{M}_p$ traces from
the array A without repetition (line 7). The key rank of the correct key hypothesis
is computed following Eq. 4.35 and the attack steps described in the earlier parts of
the section. $S_{ge}[\hat{M}_p]$ stores the sum of the key ranks of the correct key hypothesis
for each attack (line 10); then the averaged value is computed as an estimate for the
guessing entropy $\mathrm{GE}_{\hat{M}_p}$ (line 14). When the key rank of the correct key hypothesis
is 1, $S_{sr}[\hat{M}_p]$ is increased by 1 (line 12). At the end, $S_{sr}[\hat{M}_p]$ divided by the number
of total simulated attacks gives the frequency of successful attacks (line 13).
As discussed in Sect. 4.2.3, by comparing Figs. 4.12 and 4.13 (or Figs. 4.14
and 4.15), we notice that with more traces, it is more likely for us to capture infor-
mation about the inputs (or intermediate values) from the side-channel leakages.
Naturally, we expect the value of $\mathrm{SR}_{\hat{M}_p}$ to be higher and the value of $\mathrm{GE}_{\hat{M}_p}$ to
be lower when $\hat{M}_p$ is bigger. And the attack method that achieves $\mathrm{SR}_{\hat{M}_p} = 1$ or
$\mathrm{GE}_{\hat{M}_p} = 1$ with smaller $\hat{M}_p$ is considered to be a better attack.

Now we are ready to compare our attack methods with attack traces from the
Random plaintext dataset. We have discussed in Example 4.2.19 that by analyzing
the Random dataset, we identified one POI for the identity leakage model and for
the Hamming weight leakage model: 392. For comparison, we also consider the
attack with a different POI 1328 for the identity leakage model and 1304 for the
Hamming weight leakage model.

Algorithm 4.1: Computation of estimations for guessing entropy and success rate
Input: max_trace, no_of_attack // "max_trace" is the maximum number of traces we would like to use for estimating $\mathrm{SR}_{\hat{M}_p}$ and $\mathrm{GE}_{\hat{M}_p}$; for a fixed value of $\hat{M}_p$, "no_of_attack" is the number of attacks, or equivalently, the number of elements in one sample of $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p}$.
Output: Estimations of success rate $\mathrm{SR}_{\hat{M}_p}$ and estimations of guessing entropy $\mathrm{GE}_{\hat{M}_p}$ for $\hat{M}_p = 2, 3, \dots,$ max_trace
1  Follow P-DPA Step 1–P-DPA Step 11 from Sect. 4.3.2.1 to do the profiling and set up the attacks (Template Step a and Template Step b from Sect. 4.3.2.3 apply if we focus on a template-based DPA)
2  Let dataset denote the set of attack traces obtained in P-DPA Step 10
3  zero arrays of size max_trace + 1: $S_{sr}$, $S_{ge}$ // variables to store estimations of success rate and guessing entropy, initialized to zero
4  for $\hat{M}_p = 2$; $\hat{M}_p \leq$ max_trace; $\hat{M}_p$++ do
5      array of size $\hat{M}_p \times$ no_of_attack: A ← randomly chosen from dataset // randomly choose "$\hat{M}_p \times$ no_of_attack" traces from "dataset" and store in A
6      for i = 0; i < no_of_attack; i++ do
7          array of size $\hat{M}_p$: B = A[$i \times \hat{M}_p$ : $(i + 1) \times \hat{M}_p$] // take $\hat{M}_p$ traces from set A without repetition for each ith attack
8          Using the dataset B as attack traces, follow P-DPA Step 12–P-DPA Step 13 from Sect. 4.3.2.1 (Template Step c from Sect. 4.3.2.3 applies if we focus on a template-based DPA) to get the score of each key hypothesis given by Eq. 4.34
9          rk = $\mathrm{rank}_{\hat{k}_c}^{\hat{M}_p}$ // key rank of the correct key hypothesis as given in Eq. 4.35
10         $S_{ge}[\hat{M}_p]$ += rk
11         if rk == 1 then
12             $S_{sr}[\hat{M}_p]$ += 1
13     $S_{sr}[\hat{M}_p]$ = $S_{sr}[\hat{M}_p]$ / no_of_attack // compute the frequency of successful attacks
14     $S_{ge}[\hat{M}_p]$ = $S_{ge}[\hat{M}_p]$ / no_of_attack // compute the sample mean
15 return $S_{sr}$, $S_{ge}$ // $S_{sr}[\hat{M}_p]$ (respectively, $S_{ge}[\hat{M}_p]$) contains the estimation for $\mathrm{SR}_{\hat{M}_p}$ (respectively, $\mathrm{GE}_{\hat{M}_p}$).
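The control flow of Algorithm 4.1 can be sketched in Python as follows. This is only an illustration under stated assumptions: `estimate_sr_ge` and the callback `rank_of_correct_key` are hypothetical names, the callback stands in for lines 8 and 9 (running the attack on a batch of traces and returning the key rank of the correct hypothesis), and the random choice in line 5 is assumed to be sampling without replacement.

```python
import random


def estimate_sr_ge(dataset, rank_of_correct_key, max_trace, no_of_attack, seed=0):
    """Estimate success rate and guessing entropy for 2..max_trace traces.

    `dataset` is a list of attack traces (any objects); `rank_of_correct_key`
    is a caller-supplied function mapping a list of traces to the key rank
    of the correct hypothesis (a stand-in for lines 8-9 of the algorithm).
    """
    if len(dataset) < max_trace * no_of_attack:
        raise ValueError("dataset too small for the requested parameters")
    rng = random.Random(seed)
    s_sr = [0.0] * (max_trace + 1)   # line 3: success-rate estimates
    s_ge = [0.0] * (max_trace + 1)   # line 3: guessing-entropy estimates
    for m in range(2, max_trace + 1):                    # line 4
        a = rng.sample(dataset, m * no_of_attack)        # line 5
        for i in range(no_of_attack):                    # line 6
            b = a[i * m:(i + 1) * m]                     # line 7: disjoint batches
            rk = rank_of_correct_key(b)                  # lines 8-9
            s_ge[m] += rk                                # line 10
            if rk == 1:                                  # line 11
                s_sr[m] += 1                             # line 12
        s_sr[m] /= no_of_attack                          # line 13
        s_ge[m] /= no_of_attack                          # line 14
    return s_sr, s_ge                                    # line 15
```

Entries 0 and 1 of the returned lists stay at zero, mirroring the arrays of size max_trace + 1 in line 3 of the algorithm.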

As for template-based DPA, we consider two target signals: $v$ and $\mathrm{wt}(v)$. For
each target signal, we look at two choices of POIs: one POI (392) and three POIs
(392, 218, 1328 for $v$ and 392, 1309, 1304 for $\mathrm{wt}(v)$). When three POIs are chosen,
we also analyze the case when leakages at those POIs are assumed to be independent
(see Remark 4.3.4).
We note that when just one POI is considered, $\boldsymbol{L}_s$ from Eq. 4.28 becomes
a normal random variable $L_s$, and $\ell_{j,\mathrm{POI}}$ (Eq. 4.29) becomes one single point $l^{j}_{\mathrm{POI}}$.
According to the PDF of a normal random variable (Eq. 1.37), Eq. 4.30 becomes

Fig. 4.49 Estimations of success rate computed following Algorithm 4.1 for profiled DPA attacks
based on the stochastic leakage model, the identity leakage model, and the Hamming weight
leakage model using the Random plaintext dataset as attack traces

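\[
P(\ell_j \mid \hat{k}_i) = P\left(L_{s_{ij}} = l^{j}_{\mathrm{POI}}\right) = \frac{1}{\sqrt{2\sigma^2_{s_{ij}}\pi}} \exp\left( -\frac{\left(l^{j}_{\mathrm{POI}} - \mu_{s_{ij}}\right)^2}{2\sigma^2_{s_{ij}}} \right),
\]

where $\mu_{s_{ij}}$ and $\sigma^2_{s_{ij}}$ are estimations (template) for the mean and variance of $L_{s_{ij}}$.
Consequently, the score of $\hat{k}_i$ in Eq. 4.33 is given by

\[
P_{\hat{k}_i} = -\sum_{j=1}^{\hat{M}_p} \left( \ln\left(\sigma^2_{s_{ij}}\right) + \frac{\left(l^{j}_{\mathrm{POI}} - \mu_{s_{ij}}\right)^2}{\sigma^2_{s_{ij}}} \right).
\]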
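The single-POI score can be sketched in Python as below, for the case where the target signal is the exact value of the Sbox output, so that there are 16 templates. The function name `template_scores` and its argument layout are assumptions made for illustration; the sign convention (a larger score means a more likely key hypothesis) follows the descending ordering of scores used for the key rank.

```python
import math

# PRESENT Sbox (Table 3.11 in the text).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def template_scores(leakages, plaintext_nibbles, means, variances):
    """Return the 16 probability scores, one per key-nibble hypothesis.

    `leakages[j]` is the leakage at the single POI for trace j,
    `plaintext_nibbles[j]` the corresponding plaintext nibble, and
    `means[v]`, `variances[v]` the template for Sbox output value v.
    """
    scores = []
    for k in range(16):
        total = 0.0
        for leak, p in zip(leakages, plaintext_nibbles):
            v = SBOX[k ^ p]  # hypothetical intermediate value under hypothesis k
            total += math.log(variances[v]) + (leak - means[v]) ** 2 / variances[v]
        scores.append(-total)
    return scores
```

The key hypothesis with the highest returned score is the candidate for the correct key nibble.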

Following Algorithm 4.1, we can compute estimations of .SRM̂p and .GEM̂p for
our profiled DPA attacks with different settings. We have chosen

no_of_attack = 100,
. max_trace = 50.

For a fair comparison, for a given value of .M̂p , the same traces are used for all
attacks.
The results for leakage model-based profiled DPA are shown in Figs. 4.49
and 4.50. We have seen in Figs. 4.39, 4.40, and 4.41 that with the Hamming weight
or the stochastic leakage models, we can distinguish the correct key using fewer
traces as compared to using the identity leakage model. As expected, we can see
from Fig. 4.49 that fewer traces are needed for SR to reach 1 with the Hamming
weight or the stochastic leakage models. Furthermore, we can also see that attack
results for the Hamming weight or the stochastic leakage models are similar, with
the stochastic leakage model giving slightly better performance. Similarly, Fig. 4.50
shows that fewer traces are needed for GE to reach 1 using the Hamming weight or
the stochastic leakage models as compared to the identity leakage model. Moreover,
the results also demonstrate that the choice of POI is important for the attack. When
the chosen POI has a lower SNR, the attack will need many more traces.

Fig. 4.50 Estimations of guessing entropy computed following Algorithm 4.1 for profiled DPA
attacks based on the stochastic leakage model, the identity leakage model, and the Hamming weight
leakage model using the Random plaintext dataset as attack traces

Fig. 4.51 Estimations of success rate computed following Algorithm 4.1 for template-based DPA
attacks using the Random plaintext dataset as attack traces and the Random dataset as profiling
traces

The results for template-based DPA are shown in Figs. 4.51 and 4.52. Note that
in this case the results are shown for up to 20 traces instead of the 50 used for leakage
model-based DPA attacks, since far fewer traces are needed for a successful attack. We
have the following observations:
• When the target signal is given by $v$, the attack requires fewer traces as compared
to the case when the target signal is given by $\mathrm{wt}(v)$. This is expected as for the
former case we have 16 templates while for the latter we have 5. Of course, the
attack results demonstrated that we had enough traces for profiling to get good
templates. Without enough profiling traces, different attack results might appear.
• Assuming independence between the leakages at different POIs does not affect
the attack results significantly. Especially for the case when the target signal is
given by $v$ with three POIs, those two lines are overlapping.
• Using three POIs gives better results than just one POI.
• Compared to Figs. 4.49 and 4.50, template-based DPA, in general, performs
better than leakage model-based DPA. This is not surprising as more information
is retrieved from the profiling traces using template-based attacks.

Fig. 4.52 Estimations of guessing entropy computed following Algorithm 4.1 for template-based
DPA attacks using the Random plaintext dataset as attack traces and the Random dataset as
profiling traces

Fig. 4.53 Estimations of success rate computed following Algorithm 4.1 for leakage model-based
and template-based DPA attacks with the Random plaintext dataset as attack traces

For easy comparison, we have also plotted the results for template-based DPA
with one POI and leakage model-based DPA in Figs. 4.53 and 4.54.

4.3.3 Side-Channel Assisted Differential Plaintext Attack

Side-channel assisted differential plaintext attack (SCADPA) [BJB18] aims to
recover a middle round key of an SPN cipher (see Fig. 3.2) with chosen plaintext and
leakages from power traces. The motivation for such an attack is that the developer
might choose to protect only the first two/three and the last two/three rounds of a
cipher implementation in order to increase the speed (see, e.g., [THM07, SP06]).
Before we continue our discussion on SCADPA, we introduce the notion of
difference distribution table of an Sbox.

Fig. 4.54 Estimations of guessing entropy computed following Algorithm 4.1 for leakage model-based
and template-based DPA attacks with the Random plaintext dataset as attack traces

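Definition 4.3.1 For an Sbox $\mathrm{SB}: \mathbb{F}_2^{\omega_1} \to \mathbb{F}_2^{\omega_2}$, the (extended) difference distribution
table (DDT)⁷ of SB is a two-dimensional table T of size $(2^{\omega_1} - 1) \times 2^{\omega_2}$ such
that for any $0 < \delta < 2^{\omega_1}$ and $0 \leq \Delta < 2^{\omega_2}$, the entry of T at the $\Delta$th row and $\delta$th
column is given by

\[
T[\Delta, \delta] = \left\{ a \mid a \in \mathbb{F}_2^{\omega_1},\ \mathrm{SB}(a \oplus \delta) \oplus \mathrm{SB}(a) = \Delta \right\}.
\]

We refer to $\delta$ as the input difference and $\Delta$ as the output difference.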


Example 4.3.6 The difference distribution table for PRESENT Sbox $\mathrm{SB}_{\mathrm{PRESENT}}$
(Table 3.11) is detailed in Table 4.1. The row corresponding to output difference
$\Delta = 0$ is omitted since it is empty. For example,

\[
\mathrm{SB}_{\mathrm{PRESENT}}(9 \oplus 3) \oplus \mathrm{SB}_{\mathrm{PRESENT}}(9) = \mathrm{SB}_{\mathrm{PRESENT}}(\mathrm{A}) \oplus \mathrm{E} = 1111 \oplus 1110 = 0001 = 1.
\]

Hence, 9 is in the entry corresponding to $\delta = 3$ and $\Delta = 1$.

Remark 4.3.5 Suppose we know the input difference and output difference for a
particular Sbox input. Then with the DDT we can deduce the possible values of the
input. For example, suppose we know that for one PRESENT Sbox input $a$, input difference
A gives output difference 2. Then by Table 4.1, $a = 5$ or $a = \mathrm{F}$. We will utilize such
observations for SCADPA attacks and for certain fault attacks in Sect. 5.1.
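As an illustration of Definition 4.3.1, the extended DDT of the PRESENT Sbox can be computed with a few lines of Python. This is a minimal sketch: the function name `extended_ddt` and the dictionary layout keyed by (Δ, δ) are assumptions made for the example, and empty entries are simply absent from the dictionary.

```python
# PRESENT Sbox (Table 3.11 in the text).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def extended_ddt(sbox):
    """Map (output difference, input difference) to the sorted list of inputs a
    with sbox[a ^ delta_in] ^ sbox[a] == delta_out, for all delta_in != 0."""
    table = {}
    for delta_in in range(1, len(sbox)):
        for a in range(len(sbox)):
            delta_out = sbox[a ^ delta_in] ^ sbox[a]
            table.setdefault((delta_out, delta_in), []).append(a)
    return table


ddt = extended_ddt(SBOX)
# Entry used in Example 4.3.6: delta = 3, Delta = 1 contains 9.
entry_example = ddt[(0x1, 0x3)]
# Entry used in Remark 4.3.5: delta = A, Delta = 2 gives a = 5 or a = F.
entry_remark = ddt[(0x2, 0xA)]
```

Looking up `ddt[(0x2, 0xA)]` returns the candidates 5 and F from Remark 4.3.5, which is the kind of lookup the attack uses to narrow down an Sbox input.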
7 In the original definition of DDT [BS12], the entries are $|T[\Delta, \delta]|$, i.e., the cardinalities of
$T[\Delta, \delta]$.

Table 4.1 Difference distribution table for PRESENT Sbox (Table 3.11). The columns correspond to input difference $\delta$, and the rows correspond to output
difference $\Delta$. The row for $\Delta = 0$ is omitted since it is empty. A dash marks an empty entry

Δ \ δ  1     2     3     4     5     6     7     8     9     A     B     C     D     E     F
1      -     -     9A    -     36    -     078F  -     -     -     5E    -     1C    -     24BD
2      -     -     -     -     -     8E    34    -     09    5F    -     1D    67AB  2C    -
3      CDEF  46    12    -     -     -     -     3B    -     0A    -     -     58    79    -
4      -     -     47    -     8D    -     -     -     35AC  -     0B    -     2F    -     169E
5      -     CDEF  -     0145  -     -     -     -     -     2389  -     67AB  -     -     -
6      -     9B    CDEF  37    -     06    25    -     18    -     -     -     -     4A    -
7      67AB  -     03    8C    -     -     -     5D    -     -     -     2E    49    1F    -
8      -     -     -     -     -     17    AD    -     6F    4E    2389  0C    -     5B    -
9      0145  -     -     9D    BE    -     -     2A    -     -     7C    3F    -     68    -
A      -     02    56    BF    9C    -     -     -     -     7D    1A    48    3E    -     -
B      -     -     8B    -     27    35AC  -     169E  -     -     4F    -     0D    -     -
C      -     8A    -     26    0145  9F    BC    -     7E    -     -     -     -     3D    -
D      2389  57    -     -     AF    -     -     4C    -     1B    6D    -     -     0E    -
E      -     13    -     AE    -     -     -     -     24BD  6C    -     59    -     -     078F
F      -     -     -     -     -     24BD  169E  078F  -     -     -     -     -     -     35AC

Attack assumption of SCADPA For SCADPA, we have the following assumptions
for the attacker's knowledge and ability:
• The attacker does not have knowledge of the exact details of the implementation.
However, the attacker knows certain basic parameters of the implemented
algorithm, e.g., whether the implementation is round-based or bit-sliced. Such
information may also be deduced by the attacker with visual inspection of the
traces.
• The attacker can query encryptions with chosen plaintext and a fixed unknown
master key.
• We consider observable leakages in our analysis. Specifically, the adversary can
deduce from the side-channel information if a particular intermediate value is
different between two distinct encryption operations. Optionally, the attacker
may enhance the clarity of side-channel measurements by employing techniques,
such as averaging, denoising, filtering, etc.

The goal of the attacker is to recover a middle-round key. SCADPA can be
applied to any SPN cipher that has been proposed up to now.
We first define several basic notations. Suppose our target SPN
cipher has in total Nr rounds. We consider the encryption of two plaintext blocks,
denoted by $S_0$ and $S_0'$. The corresponding cipher states at the end of round i are
represented by $S_i$ and $S_i'$, respectively. A small part (e.g., a bit, a nibble, or a byte)
of the XOR difference between intermediate values of those two encryptions is said
to be active if it is nonzero. The exact value of this small part is called a differential
value.
Example 4.3.7 Let us consider AES-128. Figure 4.55 shows a possible sequence of
XOR differences between the cipher states of two encryptions, where colored squares
correspond to active bytes. The two plaintexts $S_0$ and $S_0'$ differ in the four main
diagonal bytes. After AddRoundKey and SubBytes operations, those 4 active bytes
remain active. Then ShiftRows will move the positions of those 4 active bytes. In
this particular case, MixColumns operation changes 4 active bytes to just 1 active
byte. Finally, after AddRoundKey, this active byte remains.
Example 4.3.8 In this example, we consider PRESENT. Figure 4.56 shows an
example of how the XOR differences between the cipher states can change in the
first three rounds. We adopt terminologies from DDT (Definition 4.3.1) and refer to
the value of the active nibble corresponding to the input (respectively, output) of an
active Sbox as the input difference (respectively, output difference) of this Sbox.
The plaintext pair $S_0$ and $S_0'$ differs in the 0th–15th bits, corresponding to the
rightmost four Sbox inputs. In this particular case, the output differences of those
four Sboxes are all equal to 1. In other words, the differential value of the 0th–15th

Fig. 4.55 A possible sequence of XOR differences between the cipher states of two encryptions,
where colored squares correspond to active bytes. AK, SB, SR, and MC stand for AddRoundKey,
SubBytes, ShiftRows, and MixColumns, respectively

Fig. 4.56 An example of how the XOR differences between the cipher states can change after each
round operation of PRESENT. The output differences of the four active Sboxes in round 1 are 1.
The output difference of the single active Sbox in round 2 is also 1

bits of sBoxLayer output is 1111. Thus, after the first round, $S_1 \oplus S_1'$ has 4 active
bits which correspond to the 0th Sbox input of round 2. Then we again get an output
difference 1 for this Sbox, giving us just 1 active bit in $S_2 \oplus S_2'$. Consequently, we
have one active nibble after the sBoxLayer operation in round 3.
A cipher state can be written as the concatenation of several small parts of the
same bit length $\omega$. In particular, let $\ell = n/\omega$, where n is the block length of the SPN
cipher. We have

\[
S_i = s_{i0} \,\|\, s_{i1} \,\|\, \dots \,\|\, s_{i(\ell-1)}, \qquad
S_i' = s_{i0}' \,\|\, s_{i1}' \,\|\, \dots \,\|\, s_{i(\ell-1)}', \tag{4.38}
\]

where each $s_{ij}$ and $s_{ij}'$ is a binary string of length $\omega$. A differential characteristic for
round i, denoted $\Delta S_i$, is a binary string of length $\ell$:

\[
\Delta S_i = (\Delta s_{i0}, \Delta s_{i1}, \dots, \Delta s_{i(\ell-1)}) \in \mathbb{F}_2^{\ell}.
\]

We say that the intermediate values of two encryptions $S_i$ and $S_i'$ achieve the
differential characteristic $\Delta S_i$ if

\[
s_{ij} \oplus s_{ij}'
\begin{cases}
= 0 & \text{if } \Delta s_{ij} = 0, \\
\neq 0 & \text{if } \Delta s_{ij} = 1,
\end{cases}
\qquad \forall j = 0, 1, \dots, \ell - 1.
\]

A sequence of $\Delta S_i$s

\[
\Delta S_0, \Delta S_1, \dots, \Delta S_r, \quad \text{where } r \leq Nr,
\]

is called a differential pattern. If $\mathrm{wt}(\Delta S_r) = 1$, we say that the differential pattern
converges in round r. A plaintext pair is said to achieve a differential pattern if the
corresponding intermediate values achieve each of the differential characteristics in
this differential pattern.
Example 4.3.9 [Differential pattern: AES] Continuing Example 4.3.7, we choose
$\omega = 8$, and then $\ell = 128/8 = 16$. Figure 4.55 corresponds to the following
differential pattern:

\[
\Delta S_0 = 1000010000100001, \quad \Delta S_1 = 1000000000000000. \tag{4.39}
\]

Since $\mathrm{wt}(\Delta S_1) = 1$, this differential pattern converges in round 1.


For example, let us take the following pair of plaintexts:

\[
S_0 = \texttt{4C3C3F54C7AAD34E607110C753C5E990}, \qquad
S_0' = \texttt{033C3F54C725D34E607131C753C5E90F},
\]

with the master key

\[
\texttt{34463146344638383341464542413731}. \tag{4.40}
\]

Then

\[
S_0 \oplus S_0' = \texttt{4F000000008F0000000021000000009F},
\]

which achieves the differential characteristic $\Delta S_0$ from Eq. 4.39. After one round of AES,
we have

\[
S_1 = \texttt{1F1DABAE4071BDD502563FBF63841BAE}, \qquad
S_1' = \texttt{C81DABAE4071BDD502563FBF63841BAE},
\]

and

\[
S_1 \oplus S_1' = \texttt{D7000000000000000000000000000000}.
\]

Hence $S_1$ and $S_1'$ achieve the differential characteristic $\Delta S_1$ from Eq. 4.39. The
differential value for the single active byte in $S_1 \oplus S_1'$ is D7. We can conclude that
the pair of plaintexts $S_0$ and $S_0'$ achieves the differential pattern given in Eq. 4.39.
Remark 4.3.6
• Following the convention for AES intermediate value representations
(see [NIS01]), the string of hexadecimal values is transferred to the $4 \times 4$
matrix of bytes (see Eq. 3.2) column by column. For example, $S_0 =$
4C3C3F54C7AAD34E607110C753C5E990 in the matrix format is as follows:

\[
\begin{pmatrix}
\texttt{4C} & \texttt{C7} & \texttt{60} & \texttt{53} \\
\texttt{3C} & \texttt{AA} & \texttt{71} & \texttt{C5} \\
\texttt{3F} & \texttt{D3} & \texttt{10} & \texttt{E9} \\
\texttt{54} & \texttt{4E} & \texttt{C7} & \texttt{90}
\end{pmatrix}.
\]

• When PRESENT is considered, we write the indices $j = 0, 1, \dots, \ell - 1$ in
reverse order following the notations for PRESENT cipher (see Sect. 3.1.3).
Example 4.3.10 [Differential pattern: PRESENT] Continuing Example 4.3.8, let
$\omega = 1$, then $\ell = 64/1 = 64$. Figure 4.56 corresponds to the differential pattern (written in hexadecimal):

\[
\Delta S_0 = \texttt{000000000000FFFF}, \quad \Delta S_1 = \texttt{000000000000000F}, \quad
\Delta S_2 = \texttt{0000000000000001}. \tag{4.41}
\]

Since $\mathrm{wt}(\Delta S_2) = 1$, this differential pattern converges in round 2.


For example, let us take the following pair of plaintexts:

\[
S_0 = \texttt{DCFC2D56F32EC070}, \qquad S_0' = \texttt{DCFC2D56F32E3F8F},
\]

with the master key

\[
\texttt{1234567812345678}. \tag{4.42}
\]

Then

\[
S_0 \oplus S_0' = \texttt{000000000000FFFF},
\]

which achieves the differential characteristic $\Delta S_0$ as given in Eq. 4.41. After the first
round, we get

\[
S_1 = \texttt{0A93D18CAF9C888B}, \qquad S_1' = \texttt{0A93D18CAF9C8884},
\]

which achieves the differential characteristic $\Delta S_1$ from Eq. 4.41 since $4 \oplus \mathrm{B} = \mathrm{F}$. In
other words, the differential value for the active nibble in $S_1 \oplus S_1'$ is F. Finally, after
the second round, we get

\[
S_2 = \texttt{C09B5DFC8AF48EF3}, \qquad S_2' = \texttt{C09B5DFC8AF48EF2},
\]

which achieves the differential characteristic $\Delta S_2$ from Eq. 4.41.
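The check performed by hand in Examples 4.3.9 and 4.3.10 can be sketched in Python. The helper names `differential_characteristic` and `converges` are hypothetical; states are passed as integers, and the returned string lists the parts from left to right as the hexadecimal strings are printed in the text (an assumption about presentation only, since the text numbers PRESENT indices in reverse order).

```python
def differential_characteristic(state_a, state_b, block_bits, omega):
    """Return the differential characteristic of two states as a 0/1 string.

    The block is split into block_bits // omega parts of omega bits each;
    a part contributes '1' if the XOR difference in that part is nonzero
    (the part is active) and '0' otherwise.
    """
    diff = state_a ^ state_b
    mask = (1 << omega) - 1
    parts = block_bits // omega
    bits = []
    for j in range(parts):  # leftmost part first
        shift = block_bits - omega * (j + 1)
        bits.append("1" if (diff >> shift) & mask else "0")
    return "".join(bits)


def converges(characteristic):
    """A differential pattern converges when exactly one part is active."""
    return characteristic.count("1") == 1
```

For the AES plaintext pair of Example 4.3.9 with `block_bits=128` and `omega=8`, the function returns 1000010000100001, matching $\Delta S_0$ in Eq. 4.39.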


Now, let us fix a differential characteristic $\Delta S_0$. With $2^{M_p}$ chosen plaintexts, we
can construct $2^{2M_p - 1}$ plaintext pairs that achieve the differential characteristic $\Delta S_0$.
Suppose the probability for $\Delta S_0$ to result in a differential pattern that converges
in round r is $2^{-p_r}$. Then if we would like to get at least one pair of plaintexts that
achieves a differential pattern starting with $\Delta S_0$ and converging in round r, we
need $2^{2M_p - 1} \times 2^{-p_r} \geq 1$, so we should choose $2^{M_p}$ plaintexts such that

\[
M_p = \frac{p_r + 1}{2}. \tag{4.43}
\]
Example 4.3.11 [Probability of convergence: AES] Let us consider AES and the
differential characteristic $\Delta S_0$ given by

\[
\Delta S_0 = 1000010000100001. \tag{4.44}
\]

We would like to compute the probability that $\Delta S_0$ results in a differential pattern
that converges in round 1, namely

\[
P\left(\mathrm{wt}(\Delta S_1) = 1 \mid \Delta S_0 = 1000010000100001\right).
\]

If we take any plaintext pair that achieves differential characteristic $\Delta S_0$, after
AddRoundKey and SubBytes operations, those 4 active bytes in the main diagonal
will remain active. ShiftRows changes their positions to be all in the first column.
Then after MixColumns and AddRoundKey, any byte in the first column can
be active. Since the string representation lists the state column by column
(see Remark 4.3.6), the first column corresponds to the first four positions. Thus,
all the possible differential characteristics $\Delta S_1$ following the
differential characteristic $\Delta S_0$ are of the form

\[
\Delta S_1 = x_0 x_1 x_2 x_3 000000000000, \tag{4.45}
\]

where $x = (x_0, x_1, x_2, x_3) \in \mathbb{F}_2^4$ and $x \neq 0$. There are in total four possible
differential characteristics $\Delta S_1$ satisfying $\mathrm{wt}(\Delta S_1) = 1$, given by four values of
$x$ that satisfy $\mathrm{wt}(x) = 1$. Those four differential patterns are shown in Fig. 4.57. We
have seen one of them in Fig. 4.55 (see Example 4.3.7).

Furthermore, intermediate values $S_1$ and $S_1'$ that can achieve $\Delta S_1$ in Eq. 4.45
satisfy

\[
S_1 \oplus S_1' = a_0 a_1 a_2 a_3 \underbrace{00 \cdots 00}_{12 \text{ bytes}},
\]

where $a_i \in \mathbb{F}_2^8$ for $i = 0, 1, 2, 3$ and $a_{i_0} \neq 0$ for some $i_0 \in \{0, 1, 2, 3\}$. Then there
are in total

\[
(2^8)^4 - 1 = 2^{32} - 1
\]

possible values for $S_1 \oplus S_1'$. Of these,

\[
4 \times (2^8 - 1) \approx 2^{10}
\]

satisfy $\mathrm{wt}(\Delta S_1) = 1$.

Fig. 4.57 Illustration of how active bytes change for all four differential patterns that start with
.ΔS0 = 1000010000100001 and converge in round 1. Blue squares correspond to active bytes. AK,
SB, SR, and MC stand for AddRoundKey, SubBytes, ShiftRows, and MixColumns, respectively

There are in total $2^{32} - 1$ possible differential values for the 4 active bytes before
MixColumns operation. According to Remark 3.1.3, any value of $S_1 \oplus S_1'$ comes
from exactly one differential value for those 4 active bytes. Suppose differential
values of those 4 active bytes follow a uniform distribution on $\mathbb{F}_2^{32}$. Then the
probability of any value of $S_1 \oplus S_1'$ to occur is $\approx 2^{-32}$. Consequently, we have

\[
P\left(\mathrm{wt}(\Delta S_1) = 1 \mid \Delta S_0 = 1000010000100001\right) \approx \frac{2^{10}}{2^{32}} = 2^{-22}.
\]

In this case, $p_r = 22$. By Eq. 4.43,

\[
M_p = \frac{22 + 1}{2} = 11.5.
\]

Thus, we need $2^{11.5}$ chosen plaintexts to get a differential pattern that starts with
$\Delta S_0$ as given in Eq. 4.44 and converges in round 1.
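The arithmetic of this example can be reproduced with a short Python sketch. The function name `plaintext_exponent` is hypothetical; it simply evaluates Eq. 4.43, and the counts are those derived in the example under the same uniformity assumption.

```python
import math


def plaintext_exponent(p_r):
    """Eq. 4.43: exponent M_p such that 2**M_p chosen plaintexts are needed."""
    return (p_r + 1) / 2


# Counts from Example 4.3.11 (first column of the AES state after round 1).
total_differences = (2 ** 8) ** 4 - 1       # nonzero values of S1 xor S1'
converging_differences = 4 * (2 ** 8 - 1)   # exactly one active byte

probability = converging_differences / total_differences
p_r = round(-math.log2(probability))        # rounded as in the text (about 2**-22)
m_p = plaintext_exponent(p_r)
```

Before rounding, `-math.log2(probability)` is slightly above 22, which is consistent with the approximation $4 \times (2^8 - 1) \approx 2^{10}$ used in the text.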

Example 4.3.12 [Probability of convergence: PRESENT] In this example, we
consider PRESENT and the following differential characteristic:

\[
\Delta S_0 = \texttt{000000000000FFFF}. \tag{4.46}
\]

Let SB denote the PRESENT Sbox. We would like to compute the probability of a
differential pattern that starts with $\Delta S_0$ and converges in round 2, namely

\[
P\left(\mathrm{wt}(\Delta S_2) = 1 \mid \Delta S_0 = \texttt{000000000000FFFF}\right).
\]

Let $\mathrm{SB}^i_j$ denote the jth Sbox in round i. Recall that the 0th Sbox is the right-most
Sbox (see Fig. 3.9). Let $\delta_{\mathrm{SB}^i_j}$ and $\Delta_{\mathrm{SB}^i_j}$ denote the input and output differences of
Sbox $\mathrm{SB}^i_j$, respectively.
For $\Delta S_2$ to have Hamming weight 1, we need to have just one active Sbox in
round 2 with output difference having Hamming weight 1. Let $\mathrm{SB}^2_{j_0}$ be the single
active Sbox in round 2.
By the design of pLayer (see Table 3.12), the four active Sboxes in round 1

\[
\mathrm{SB}^1_0, \quad \mathrm{SB}^1_1, \quad \mathrm{SB}^1_2, \quad \mathrm{SB}^1_3
\]

influence the following four Sboxes in round 2:

\[
\mathrm{SB}^2_0, \quad \mathrm{SB}^2_4, \quad \mathrm{SB}^2_8, \quad \mathrm{SB}^2_{12}.
\]

We also notice that the jth bit of all the four Sboxes in round 1 goes to the $(4j)$th
Sbox in round 2. Since none of the output differences of those four Sboxes in round
1 is equal to 0, to have just one active Sbox in round 2, the output differences of
those four active Sboxes in round 1 should all be the same with Hamming weight
1. This implies that the input difference of the single active Sbox in round 2, $\mathrm{SB}^2_{j_0}$,
is F. Furthermore, by Eq. 4.46, those four active Sboxes in round 1 all have input
difference F.
According to Table 4.1, for input difference F, the possible output differences
with Hamming weight 1 are 1 and 4. By counting the number of elements in each
entry of column F in Table 4.1, we can get that the probability for the output
difference to be 1, given that the input difference is F, is .4/16 = 1/4. The same
result holds for output difference 4. The probability that all output differences of the
four active Sboxes in round 1 are equal to 1 is then given by

⎛ | ⎞ ⎧ 1 ⎫4
|
P ΔSB1 = ΔSB1 = ΔSB1 = ΔSB1 = 1|δSB1 = δSB1 = δSB1 = δSB1 = F =
.
0 1 2 3 0 1 2 3 4
= 2−8 .

Similarly, we have

⎛ | ⎞ ⎧ 1 ⎫4
|
P ΔSB1 = ΔSB1 = ΔSB1 = ΔSB1 = 4|δSB1 = δSB1 = δSB1 = δSB1 = F =
.
0 1 2 3 0 1 2 3 4
= 2−8 .
4.3 Side-Channel Analysis Attacks on Symmetric Block Ciphers 285

The probability for the single active Sbox in round 2 to have output difference with
Hamming weight 1 is given by
⎧ ⎧ ⎫ | ⎫ ⎧ | ⎫
| |
P wt ΔSB2 = 1|δSB2 = F = P ΔSB2 = 1|δSB2 = F
.
j0 j0 j0 j0
⎧ | ⎫
| 1 1
+ P ΔSB2 = 4|δSB2 = F = + = 2−1 .
j0 j0 4 4

When the output differences of

SB10 ,
. SB11 , SB12 , SB13

are all equal to 1 (respectively, 4), the single active Sbox in round 2 is given by .SB20
(respectively, .SB28 ). We have

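\[
\begin{aligned}
P\left(\mathrm{wt}(\Delta S_2) = 1 \,\middle|\, \Delta S_0 = \texttt{000000000000FFFF}\right)
&= 2^{-8}\, P\left(\mathrm{wt}\left(\Delta\mathrm{SB}^2_0\right) = 1 \,\middle|\, \delta\mathrm{SB}^2_0 = \mathrm{F}\right)
 + 2^{-8}\, P\left(\mathrm{wt}\left(\Delta\mathrm{SB}^2_8\right) = 1 \,\middle|\, \delta\mathrm{SB}^2_8 = \mathrm{F}\right) \\
&= 2^{-8} \times 2^{-1} + 2^{-8} \times 2^{-1} = 2^{-9} + 2^{-9} = 2^{-8}.
\end{aligned}
\]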

In this case, we have $pr = 8$. By Eq. 4.43,

\[ M_p = \frac{8 + 1}{2} = 4.5. \]

Thus we need $2^{4.5}$ chosen plaintexts to get a differential pattern that starts with $\Delta S_0$
as given in Eq. 4.46 and converges in round 2.
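The whole computation, from the DDT column to $M_p$, can be reproduced in a few lines. This is a sketch under the same assumptions as the derivation above (independent, uniformly distributed Sbox inputs); it uses the standard PRESENT Sbox, and the names `ddt_prob`, `p_converge`, and `m_p` are illustrative.

```python
from fractions import Fraction
from math import log2

# PRESENT Sbox (values from the cipher specification).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
WEIGHT_ONE = (1, 2, 4, 8)


def ddt_prob(in_diff, out_diff):
    """Exact probability of out_diff given in_diff for a uniform Sbox input."""
    count = sum(1 for x in range(16)
                if SBOX[x] ^ SBOX[x ^ in_diff] == out_diff)
    return Fraction(count, 16)


# Round 2: the single active Sbox has input difference F and must output
# a difference of Hamming weight 1.
p_round2 = sum(ddt_prob(0xF, d) for d in WEIGHT_ONE)

# Round 1: all four active Sboxes (input difference F) must output the same
# weight-1 difference d; then round 2 must succeed as well.
p_converge = sum(ddt_prob(0xF, d) ** 4 for d in WEIGHT_ONE) * p_round2

pr = -log2(float(p_converge))   # convergence probability is 2^(-pr)
m_p = (pr + 1) / 2              # Eq. 4.43 as applied in the text

print(p_converge, pr, m_p)
```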
From the discussions above, we can see that there are in total four such
differential patterns, corresponding to

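\[
\begin{aligned}
\Delta\mathrm{SB}^1_0 = \Delta\mathrm{SB}^1_1 = \Delta\mathrm{SB}^1_2 = \Delta\mathrm{SB}^1_3 = 1, &\quad \Delta\mathrm{SB}^2_0 = 1, \\
\Delta\mathrm{SB}^1_0 = \Delta\mathrm{SB}^1_1 = \Delta\mathrm{SB}^1_2 = \Delta\mathrm{SB}^1_3 = 1, &\quad \Delta\mathrm{SB}^2_0 = 4, \\
\Delta\mathrm{SB}^1_0 = \Delta\mathrm{SB}^1_1 = \Delta\mathrm{SB}^1_2 = \Delta\mathrm{SB}^1_3 = 4, &\quad \Delta\mathrm{SB}^2_8 = 1, \\
\Delta\mathrm{SB}^1_0 = \Delta\mathrm{SB}^1_1 = \Delta\mathrm{SB}^1_2 = \Delta\mathrm{SB}^1_3 = 4, &\quad \Delta\mathrm{SB}^2_8 = 4.
\end{aligned}
\]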

We have seen the first one in Fig. 4.56 (see Example 4.3.8). The remaining three are
shown in Figs. 4.58, 4.59, and 4.60, respectively.
In SCADPA, the attacker queries the encryption with pairs of plaintexts that
achieve a target differential characteristic $\Delta S_0$ and potentially result in a differential
pattern that converges in round $r$. $\Delta S_0$ and the round number $r$ are chosen so that
the probability of convergence is not too small. Then by comparing side-channel
leakages of a middle round from both encryptions for a pair of plaintexts, the
attacker tries to confirm whether the convergence is achieved and to identify the differential
characteristic $\Delta S_r$ when convergence happens. Thus, we need to choose $\Delta S_0$ and $r$
in a way that we can find a point for side-channel observation so that the leakages
can tell us whether the convergence has happened, and if yes, what the value of
$\Delta S_r$ is.

Fig. 4.58 An illustration of how the XOR differences between the cipher states can change after
each round operation for PRESENT such that the pair of plaintexts achieves a differential pattern
starting with $\Delta S_0$ given in Eq. 4.46 and converging in round 2. The output differences of the four
active Sboxes in round 1 are 1. The output difference of the single active Sbox in round 2 is 4

Fig. 4.59 An illustration of how the XOR differences between the cipher states can change after
each round operation for PRESENT such that the pair of plaintexts achieves a differential pattern
starting with $\Delta S_0$ given in Eq. 4.46 and converging in round 2. The output differences of the four
active Sboxes in round 1 are 4. The output difference of the single active Sbox in round 2 is 1

Fig. 4.60 An illustration of how the XOR differences between the cipher states can change after
each round operation for PRESENT such that the pair of plaintexts achieves a differential pattern
starting with $\Delta S_0$ given in Eq. 4.46 and converging in round 2. The output differences of the four
active Sboxes in round 1 are 4. The output difference of the single active Sbox in round 2 is 4

Example 4.3.13 [Point for side-channel observation: AES] Let us consider AES
with $\omega = 8$. As an attacker, we choose the target differential characteristic $\Delta S_0 =$
1000010000100001. Then we query the encryption with plaintext pairs that achieve
this $\Delta S_0$. For each plaintext, we take, say, $N_p$ traces and use the averaged trace as
the leakages for this plaintext. By averaging, the noise can be reduced. Then the
difference between the averaged traces of each pair of plaintexts is computed.
As discussed in Example 4.3.11, there are four differential patterns that start with
$\Delta S_0$ and converge in round 1. They are given by the following four values of $\Delta S_1$:

1000000000000, 0000100000000, 0000000010000, 0000000000001,

corresponding to the single active byte at the end of round 1 being the first, second,
third, and fourth bytes in the first column. Figure 4.61 shows how the active bytes
change from round 1 to round 3 for all four differential patterns. In the second round,
SubBytes does not change the position of this single active byte. ShiftRows changes
its position to a different column unless this active byte is the first byte. Due to
the property of MixColumns operation (see Remark 3.1.3), this single active byte
will influence 4 bytes, leading to 4 active bytes in one single column of the cipher
state. Finally, AddRoundKey in round 2 and the SubBytes operation in round 3 do
not change the position or number of active bytes.
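The propagation argument can be sketched in code by tracking only the positions of active bytes. The sketch below uses (row, column) coordinates, which is our own convention and not the string encoding of $\Delta S_1$ used in the text; the function names are illustrative. It relies on the MixColumns property cited above: a column with exactly one active input byte has four active output bytes.

```python
def shift_rows(active):
    """ShiftRows rotates row r left by r positions: (r, c) -> (r, (c - r) mod 4)."""
    return {(r, (c - r) % 4) for (r, c) in active}


def mix_columns(active):
    """Exact propagation when every column holds at most one active byte:
    each such column becomes fully active."""
    columns = {}
    for r, c in active:
        columns.setdefault(c, []).append(r)
    if any(len(rows) > 1 for rows in columns.values()):
        raise ValueError("needs at most one active byte per column")
    return {(r, c) for c in columns for r in range(4)}


def active_columns_after_round2(active_end_of_round1):
    """SubBytes and AddRoundKey keep positions, so only ShiftRows and
    MixColumns of round 2 matter for which columns are active."""
    state = mix_columns(shift_rows(active_end_of_round1))
    return sorted({c for (_, c) in state})


# The chosen plaintext difference (main diagonal) lands in one column after
# the round-1 ShiftRows, which is what makes convergence in round 1 possible.
diagonal = {(r, r) for r in range(4)}
print(sorted(shift_rows(diagonal)))

# A single active byte in the first column gives exactly one active column,
# and different byte positions give different columns.
for row in range(4):
    print(row, active_columns_after_round2({(row, 0)}))

# Two active bytes at the end of round 1 give two active columns.
print(active_columns_after_round2({(0, 0), (1, 0)}))
```

Because the four single-byte cases map to four distinct columns, observing which column is active is enough to tell them apart.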
As discussed in Example 4.3.11, all possible differential characteristics $\Delta S_1$ are
of the form given in Eq. 4.45. In case $\mathrm{wt}(\Delta S_1) \neq 1$, we will have more than 1
active byte at the end of round 1, and these bytes will be in more than one column after the
SubBytes and ShiftRows operations in round 2. Consequently, there will be at least
two active columns at the end of round 2. We can then conclude that

\[
\begin{aligned}
\Delta S_1 = \texttt{1000000000000} &\iff \Delta S_2 = \texttt{1111000000000000}, \\
\Delta S_1 = \texttt{0000100000000} &\iff \Delta S_2 = \texttt{0000111100000000}, \\
\Delta S_1 = \texttt{0000000010000} &\iff \Delta S_2 = \texttt{0000000011110000}, \\
\Delta S_1 = \texttt{0000000000001} &\iff \Delta S_2 = \texttt{0000000000001111}.
\end{aligned}
\]

Fig. 4.61 Illustration of how active bytes change from round 1 to round 3 of AES computation,
for differential patterns that start with $\Delta S_0 =$ 1000010000100001

Suppose the SubBytes operation is implemented column-wise from the first
column to the fourth column. Then when we take the trace difference for a pair
of plaintexts, we would expect to see peaks around time samples corresponding to
active columns and relatively small differences around time samples corresponding
to columns that are not active during the SubBytes operation in round 3. By
identifying the active columns, we can deduce the value of $\Delta S_1$. In particular, the
point of side-channel observation should be the SubBytes operation in round 3. Note
that we assume the attacker can infer the timing of each operation using SPA or
other methods.
As an example, we use the master key from Eq. 4.40 and the experimental setup
described in Sect. 4.1. We adopted the TinyAES⁸ implementation for AES, which
is widely used for academic purposes. Measurements for the following four pairs of
plaintexts were taken:

\[ \texttt{4C3C3F54C7AAD34E607110C753C5E990},\quad \texttt{033C3F54C725D34E607131C753C5E90F}; \tag{4.47} \]

\[ \texttt{3B06201F5EAA0BD6794C249610FBE927},\quad \texttt{5F06201F5E750BD6794CB79610FBE995}; \tag{4.48} \]

\[ \texttt{2D2A49F26A79655214056A7B5F35A9E9},\quad \texttt{D12A49F26ACC655214052D7B5F35A9C6}; \tag{4.49} \]

\[ \texttt{0EDB19A25C7EF1FDDED31178EE6E7478},\quad \texttt{FADB19A25C06F1FDDED30E78EE6E7415}. \tag{4.50} \]

⁸ https://github.com/kokke/tiny-AES-c

$N_p = 100$ traces were collected for each plaintext. All pairs of plaintexts achieve
the same differential characteristic $\Delta S_0 =$ 1000010000100001. The $\Delta S_1$ values are
given by

1000000000000, 0000100000000, 0000000010000, 0000000000001,

respectively. The changes of the active bytes for the four pairs correspond to the
four rows of Fig. 4.61.
In Fig. 4.62, the difference between the averaged traces of each pair of
plaintexts is in red, blue, green, and yellow, respectively. We have also plotted
the averaged trace for the first plaintext in Eq. 4.47 (in gray), for the purpose of
identifying the round operations. Similar to Fig. 4.3, we can find the rough time
interval for the SubBytes operation in round 3, which is colored in pink. This is the
point for our side-channel observation. After zooming in, we get Fig. 4.63. Recall
that the SubBytes operation was implemented column-wise starting from the first
column. By the choice of the plaintext pairs, the red, blue, green, and yellow traces
correspond to a single active column (see Fig. 4.61) at the first, second, third, and
fourth positions, respectively. This agrees with what we see in Fig. 4.63: the four
colored peaks are in sequential order.

Fig. 4.62 The difference between the averaged traces of plaintext pairs from Eqs. 4.47, 4.48, 4.49,
and 4.50 is in red, blue, green, and yellow, respectively. The averaged trace for the first plaintext
in Eq. 4.47 is in gray. With this gray plot, similar to Fig. 4.3 we can find the rough time interval for
the SubBytes operation in round 3, which is colored in pink

Fig. 4.63 Zoom in to the SubBytes computation (pink area) in Fig. 4.62. The difference between
the averaged traces of plaintext pairs from Eqs. 4.47, 4.48, 4.49, and 4.50 is in red, blue, green,
and yellow, respectively. They correspond to a single active column at the first, second, third, and
fourth positions, respectively, during the SubBytes operation in round 3
Example 4.3.14 [Point for side-channel observation: PRESENT] In this example,
we look at PRESENT encryption and let SB denote the PRESENT Sbox. We take
$\omega = 1$, and we choose

\[ \Delta S_0 = \texttt{000000000000FFFF}. \]

We aim to find a pair of plaintexts $S_0$ and $S_0'$ that achieves a differential pattern
starting with $\Delta S_0$ and converging in round 2. For each plaintext, we take $N_p$ traces
and use the averaged trace as the leakages for this plaintext. Then the difference
between the averaged traces of each pair of plaintexts is computed. We assume that
the sBoxLayer operation is implemented nibble-wise, starting from the 0th nibble
(right-most) to the 15th nibble (left-most).
Convergence in round 2 means that there is just 1 active bit at the end of round
2. Consequently, we will have just one active Sbox before pLayer in round 3.
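On the other hand, suppose there is just one active Sbox in round 3. As discussed
in Example 4.3.12, with $\Delta S_0$, the four active Sboxes in round 1 are

\[ \mathrm{SB}^1_0,\ \mathrm{SB}^1_1,\ \mathrm{SB}^1_2,\ \mathrm{SB}^1_3. \]

They influence four Sboxes in round 2:

\[ \mathrm{SB}^2_0,\ \mathrm{SB}^2_4,\ \mathrm{SB}^2_8,\ \mathrm{SB}^2_{12}. \]

By the design of pLayer, we know each of those four Sboxes from round 2 will affect
four Sboxes in round 3 as shown below:

\[
\begin{aligned}
\mathrm{SB}^2_0 &: \text{influences } \mathrm{SB}^3_0,\ \mathrm{SB}^3_4,\ \mathrm{SB}^3_8,\ \mathrm{SB}^3_{12}, \\
\mathrm{SB}^2_4 &: \text{influences } \mathrm{SB}^3_1,\ \mathrm{SB}^3_5,\ \mathrm{SB}^3_9,\ \mathrm{SB}^3_{13}, \\
\mathrm{SB}^2_8 &: \text{influences } \mathrm{SB}^3_2,\ \mathrm{SB}^3_6,\ \mathrm{SB}^3_{10},\ \mathrm{SB}^3_{14}, \\
\mathrm{SB}^2_{12} &: \text{influences } \mathrm{SB}^3_3,\ \mathrm{SB}^3_7,\ \mathrm{SB}^3_{11},\ \mathrm{SB}^3_{15}.
\end{aligned}
\]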

In particular, they all influence different Sboxes in round 3. Since there is just one
active Sbox in round 3, there is just one active Sbox in round 2. We also note that
different bits of the output of an Sbox in round 2 go to different Sboxes in round
3, and we can then conclude that there is just 1 active bit at the end of round 2.
Moreover, with the position of the active Sbox in round 3, we can further identify
the position of the active bit in round 2 with our knowledge of pLayer.
Thus, by observing the leakages around sBoxLayer in round 3, we will be able
to see if the convergence has happened and identify the value of $\Delta S_2$.
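The pLayer bookkeeping in this argument can be checked mechanically. The following sketch assumes the standard PRESENT bit permutation, which moves bit $i$ to position $16i \bmod 63$ (with bit 63 fixed); the function names are illustrative and not from the text.

```python
def p_layer_bit(i):
    """Destination position of state bit i under the PRESENT pLayer."""
    return 63 if i == 63 else (16 * i) % 63


def influenced_sboxes(sbox_index):
    """Sboxes of the next round that receive an output bit of the given Sbox."""
    return sorted(p_layer_bit(4 * sbox_index + j) // 4 for j in range(4))


def active_bit_end_of_round2(round3_sbox):
    """Position of the single active bit at the end of round 2, given the
    single active Sbox observed in round 3. Only output bits of the round-2
    Sboxes 0, 4, 8, 12 can be active for the chosen Delta S_0."""
    sources = [4 * s + j for s in (0, 4, 8, 12) for j in range(4)]
    hits = [p_layer_bit(b) for b in sources
            if p_layer_bit(b) // 4 == round3_sbox]
    if len(hits) != 1:
        raise ValueError("active bit is not uniquely determined")
    return hits[0]


# Round 1 -> round 2: Sboxes 0..3 all feed Sboxes 0, 4, 8, 12.
print([influenced_sboxes(s) for s in range(4)])

# Round 2 -> round 3: the four candidate Sboxes feed disjoint sets of Sboxes.
print({s: influenced_sboxes(s) for s in (0, 4, 8, 12)})

# From the active round-3 Sbox back to Delta S_2, printed as 16 hex digits.
for sbox in (0, 8):
    bit = active_bit_end_of_round2(sbox)
    print(sbox, bit, f"{1 << bit:016X}")
```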
As an example, let us take the master key to be the one given by Eq. 4.42. We
also take the plaintext pair from Example 4.3.10, namely

\[ S_0 = \texttt{DCFC2D56F32EC070},\quad S_0' = \texttt{DCFC2D56F32E3F8F}. \tag{4.51} \]

The experimental setup is as described in Sect. 4.1, and measurements were done
for three rounds of PRESENT computations. $N_p = 2000$ traces were collected for
each plaintext. Recall that this pair of plaintexts achieves the following differential
pattern:

\[ \Delta S_0 = \texttt{000000000000FFFF},\quad \Delta S_1 = \texttt{000000000000000F},\quad \Delta S_2 = \texttt{0000000000000001}. \]

In particular, there is one single active Sbox $\mathrm{SB}^3_0$ before the pLayer operation of
round 3.
For comparison, we also collected 2000 traces for each of the following four
plaintexts:

\[ \texttt{8F5F8BD2E7CF5989},\quad \texttt{8F5F8BD2E7CFA676} \tag{4.52} \]

and

\[ \texttt{F2DCDC8341D45F79},\quad \texttt{F2DCDC8341D4A086}, \tag{4.53} \]

where the first pair of plaintexts (Eq. 4.52) achieves the same differential
characteristics $\Delta S_0$ and $\Delta S_1$, but at the end of round 2, the differential characteristic is given
by

\[ \texttt{0000000100000000}. \]

In this case, we have one single active Sbox $\mathrm{SB}^3_8$ before the pLayer operation of
round 3.
The second pair of plaintexts (Eq. 4.53) also achieves the same $\Delta S_0$ and $\Delta S_1$,
while the differential characteristic at the end of round 2 is given by

\[ \texttt{0001000100010000}. \]

Then for this pair of plaintexts, there are three active Sboxes ($\mathrm{SB}^3_4$, $\mathrm{SB}^3_8$, $\mathrm{SB}^3_{12}$) before
the pLayer operation of round 3.

Fig. 4.64 The difference between the averaged traces of $S_0$ and $S_0'$ from Eq. 4.51 (in red), plaintext
pair from Eq. 4.52 (in blue), and plaintext pair from Eq. 4.53 (in green). The averaged trace for $S_0$ is
in gray. With this gray plot, similar to Fig. 4.3 we can find the rough time interval for the sBoxLayer
operation in round 3, which is colored in pink
In Fig. 4.64, the differences between the averaged traces of $S_0$ and $S_0'$ (Eq. 4.51),
the plaintext pair from Eq. 4.52, and the plaintext pair from Eq. 4.53 are in red, blue, and
green, respectively. We have also plotted the averaged trace for $S_0$ (in gray) for
the purpose of identifying the round operations. Similar to Fig. 4.3, we can find the
rough time interval for the sBoxLayer operation in round 3, which is colored in
pink. This time interval corresponds to our point of side-channel observation. After
zooming in, we get Fig. 4.65.
Recall that the sBoxLayer is implemented nibble-wise. From the above
discussions, we know that the red, blue, and green traces correspond to active Sboxes

\[ \mathrm{SB}^3_0;\quad \mathrm{SB}^3_8;\quad \mathrm{SB}^3_4,\ \mathrm{SB}^3_8,\ \mathrm{SB}^3_{12} \]

before the round 3 pLayer operation, respectively. This agrees with what we see in
Fig. 4.65. There is a single peak in the red line and the blue line, while the green line
has three peaks. The peak in the red line ($\mathrm{SB}^3_0$) is at the beginning of the sBoxLayer.
The first peak of the green line ($\mathrm{SB}^3_4$) is between the peaks of the red ($\mathrm{SB}^3_0$) and blue
($\mathrm{SB}^3_8$) lines. The peak of the blue line coincides with the second peak of the green
line ($\mathrm{SB}^3_8$). The last peak of the green line ($\mathrm{SB}^3_{12}$) is in the last quarter of the whole
time interval.

Fig. 4.65 Zoom in to the sBoxLayer computation (pink area) in Fig. 4.64. The difference between
the averaged traces of $S_0$ and $S_0'$ from Eq. 4.51 (in red), plaintext pair from Eq. 4.52 (in
blue), and plaintext pair from Eq. 4.53 (in green). They correspond to active Sboxes $\mathrm{SB}^3_0$; $\mathrm{SB}^3_8$;
$\mathrm{SB}^3_4, \mathrm{SB}^3_8, \mathrm{SB}^3_{12}$ before pLayer of round 3

Fig. 4.66 An illustration of differential values for the differential pattern $\Delta S_0 =$
1000010000100001 and $\Delta S_1 =$ 1000000000000000

Our ultimate goal is to recover information about the secret keys. Thus, another
criterion for choosing $\Delta S_0$ and $r$ is that the possible key hypotheses can be reduced
once we find a pair of plaintexts that achieves a converging differential pattern and
we know the value of $\Delta S_r$.
Example 4.3.15 [Reduce key hypotheses: AES] Let us consider AES with
$\omega = 8$. As an attacker, we choose the target differential characteristic $\Delta S_0 =$
1000010000100001. Then we query the encryption with plaintext pairs that achieve
this $\Delta S_0$. Suppose with the help of side-channel leakages, we have identified a pair
of plaintexts $S_0$ and $S_0'$ that gives a differential pattern converging in round 1 with
$\Delta S_1 =$ 1000000000000000. Let $\alpha$ be the differential value of the single active byte
at the end of round 1. Then, using InvMixColumns (see Eq. 3.7), the differential
values of the 4 active bytes right after the SubBytes operation in round 1 are given by

\[ \texttt{0E} \cdot \alpha,\quad \texttt{09} \cdot \alpha,\quad \texttt{0D} \cdot \alpha,\quad \texttt{0B} \cdot \alpha. \]

Let $\beta_1, \beta_2, \beta_3$, and $\beta_4$ be the differential values of the 4 active bytes in the main
diagonal of the plaintexts. An illustration is shown in Fig. 4.66.

We represent the master key of AES (which is also the whitening key used at the
beginning of the encryption) as a matrix:

\[
K = \begin{pmatrix}
k_{00} & k_{01} & k_{02} & k_{03} \\
k_{10} & k_{11} & k_{12} & k_{13} \\
k_{20} & k_{21} & k_{22} & k_{23} \\
k_{30} & k_{31} & k_{32} & k_{33}
\end{pmatrix}.
\]

We represent the plaintext $S_0$ as the following matrix (note that this representation
follows the same notation as in Eq. 3.2, which is different from the notations in
Eq. 4.38):

\[
S_0 = \begin{pmatrix}
s_{00} & s_{01} & s_{02} & s_{03} \\
s_{10} & s_{11} & s_{12} & s_{13} \\
s_{20} & s_{21} & s_{22} & s_{23} \\
s_{30} & s_{31} & s_{32} & s_{33}
\end{pmatrix}.
\]

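Then we have

\[
\begin{aligned}
\mathrm{SB}_{\mathrm{AES}}(s_{00} \oplus k_{00} \oplus \beta_1) \oplus \mathrm{SB}_{\mathrm{AES}}(s_{00} \oplus k_{00}) &= \texttt{0E} \cdot \alpha, \\
\mathrm{SB}_{\mathrm{AES}}(s_{11} \oplus k_{11} \oplus \beta_2) \oplus \mathrm{SB}_{\mathrm{AES}}(s_{11} \oplus k_{11}) &= \texttt{09} \cdot \alpha, \\
\mathrm{SB}_{\mathrm{AES}}(s_{22} \oplus k_{22} \oplus \beta_3) \oplus \mathrm{SB}_{\mathrm{AES}}(s_{22} \oplus k_{22}) &= \texttt{0D} \cdot \alpha, \\
\mathrm{SB}_{\mathrm{AES}}(s_{33} \oplus k_{33} \oplus \beta_4) \oplus \mathrm{SB}_{\mathrm{AES}}(s_{33} \oplus k_{33}) &= \texttt{0B} \cdot \alpha.
\end{aligned}
\]

Thus,

\[ s_{00} \oplus k_{00},\quad s_{11} \oplus k_{11},\quad s_{22} \oplus k_{22},\quad s_{33} \oplus k_{33} \]

are AES Sbox inputs that give output differences

\[ \texttt{0E} \cdot \alpha,\quad \texttt{09} \cdot \alpha,\quad \texttt{0D} \cdot \alpha,\quad \texttt{0B} \cdot \alpha \]

with input differences

\[ \beta_1,\quad \beta_2,\quad \beta_3,\quad \beta_4, \]

respectively. Then, by using the difference distribution table for the AES Sbox and
with the knowledge of the plaintexts, we can reduce the key hypotheses (see
Remark 4.3.5).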
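As an example, let us take the master key to be the one given by Eq. 4.40.
Continuing Example 4.3.13, with side-channel leakages, we have identified the
following pair of plaintexts that achieves the differential pattern mentioned above,
namely,

\[
\begin{aligned}
S_0 &= \texttt{4C3C3F54C7AAD34E607110C753C5E990}, \\
S_0' &= \texttt{033C3F54C725D34E607131C753C5E90F}.
\end{aligned}
\]

In this case, we have

\[
\begin{aligned}
\beta_1 &= \texttt{4C} \oplus \texttt{03} = \texttt{4F}, \\
\beta_2 &= \texttt{AA} \oplus \texttt{25} = \texttt{8F}, \\
\beta_3 &= \texttt{10} \oplus \texttt{31} = \texttt{21}, \\
\beta_4 &= \texttt{90} \oplus \texttt{0F} = \texttt{9F}.
\end{aligned}
\]

And

\[ s_{00} = \texttt{4C},\quad s_{11} = \texttt{AA},\quad s_{22} = \texttt{10},\quad s_{33} = \texttt{90}. \]

Thus,

\[ \texttt{4C} \oplus k_{00},\quad \texttt{AA} \oplus k_{11},\quad \texttt{10} \oplus k_{22},\quad \texttt{90} \oplus k_{33} \]

are the AES Sbox inputs that give output differences

\[ \texttt{0E} \cdot \alpha,\quad \texttt{09} \cdot \alpha,\quad \texttt{0D} \cdot \alpha,\quad \texttt{0B} \cdot \alpha \]

with input differences

\[ \texttt{4F},\quad \texttt{8F},\quad \texttt{21},\quad \texttt{9F}, \]

respectively. To find the possible values of $k_{00}, k_{11}, k_{22}, k_{33}$, we first find values of
$\alpha$ such that the following entries of the AES Sbox DDT are nonempty:

\[ (\texttt{0E} \cdot \alpha, \texttt{4F}),\quad (\texttt{09} \cdot \alpha, \texttt{8F}),\quad (\texttt{0D} \cdot \alpha, \texttt{21}),\quad (\texttt{0B} \cdot \alpha, \texttt{9F}). \]

There are in total 13 of them, as shown in Table 4.2. Each of those values gives a
few hypotheses for $k_{00} \oplus \texttt{4C}$, $k_{11} \oplus \texttt{AA}$, $k_{22} \oplus \texttt{10}$, $k_{33} \oplus \texttt{90}$.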
Consequently, we can find all the possible values for the 4 key bytes, as shown
in Table 4.3. The correct master key in Eq. 4.40 and the corresponding correct value
of $\alpha$ are marked in blue. We note that the remaining number of key hypotheses is
given by

\[ 2^4 \times 12 + 2^3 \times 4 = 224, \]

while the number of all possible key hypotheses for those 4 bytes is

\[ (2^8)^4 = 2^{32}. \]

Table 4.2 In the first column, we list the possible values of $\alpha$ such that the following entries
of AES Sbox DDT are nonempty: (0E · α, 4F), (09 · α, 8F), (0D · α, 21), (0B · α, 9F). The
corresponding hypotheses for $k_{00} \oplus$ 4C, $k_{11} \oplus$ AA, $k_{22} \oplus$ 10, $k_{33} \oplus$ 90 are listed in the
second to fifth columns, respectively. The correct value of $\alpha$ is marked in blue. A detailed
analysis is shown in Example 4.3.15

α     k00 ⊕ 4C    k11 ⊕ AA       k22 ⊕ 10    k33 ⊕ 90
1A    16,59       65,EA          CF,EE       62,FD
29    96,D9       3E,B1          85,A4       78,E7
42    28,67       58,D7          59,78       16,89
5D    AB,E4       40,CF          81,A0       45,DA
66    03,4C       2C,A3          D8,F9       1D,82
71    AF,E0       78,F7          DE,FF       39,A6
74    82,CD       5D,D2          07,26       4E,D1
95    07,48       43,CC          87,A6       65,FA
9C    97,D8       00,3D,8F,B2    44,65       7F,E0
CC    1D,52       37,B8          93,B2       5F,C0
D7    37,78       63,EC          56,77       3E,A1
E7    3A,75       7B,F4          1B,3A       63,FC
EB    BB,F4       34,BB          CD,EC       54,CB

Table 4.3 Possible values of $\alpha$ and the corresponding key hypotheses for $k_{00}, k_{11}, k_{22}, k_{33}$,
the main diagonal of the AES master key. The correct key bytes are marked in blue. A detailed
analysis is shown in Example 4.3.15

α     k00      k11            k22      k33
1A    5A,15    CF,40          DF,FE    F2,6D
29    DA,95    94,1B          95,B4    E8,77
42    64,2B    F2,7D          49,68    86,19
5D    E7,A8    EA,65          91,B0    D5,4A
66    4F,00    86,09          C8,E9    8D,12
71    E3,AC    D2,5D          CE,EF    A9,36
74    CE,81    F7,78          17,36    DE,41
95    4B,04    E9,66          97,B6    F5,6A
9C    DB,94    AA,97,25,18    54,75    EF,70
CC    51,1E    9D,12          83,A2    CF,50
D7    7B,34    C9,46          46,67    AE,31
E7    76,39    D1,5E          0B,2A    F3,6C
EB    F7,B8    9E,11          DD,FC    C4,5B

We can see that the attack can significantly reduce the key hypotheses.
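The search behind Tables 4.2 and 4.3 can be sketched as follows. This is a minimal illustration, not the authors' code: the AES Sbox is generated from its standard definition (inversion in GF(2^8) followed by the affine map) instead of being typed in, and the names `gf_mul`, `key_candidates`, and `surviving` are our own. The plaintext bytes and input differences are those of this example.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return result


def rotl8(value, shift):
    """Rotate an 8-bit value left by shift bits."""
    return ((value << shift) | (value >> (8 - shift))) & 0xFF


def build_sbox():
    """AES Sbox: multiplicative inverse (0 maps to 0) followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = next(y for y in range(1, 256) if gf_mul(x, y) == 1) if x else 0
        sbox.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2)
                    ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return sbox


SBOX = build_sbox()
PLAINTEXT_BYTES = (0x4C, 0xAA, 0x10, 0x90)   # s00, s11, s22, s33
BETAS = (0x4F, 0x8F, 0x21, 0x9F)             # input differences
INV_MC = (0x0E, 0x09, 0x0D, 0x0B)            # InvMixColumns coefficients


def key_candidates(s, beta, out_diff):
    """Key bytes k with SB(s ^ k ^ beta) ^ SB(s ^ k) == out_diff."""
    return [k for k in range(256)
            if SBOX[s ^ k ^ beta] ^ SBOX[s ^ k] == out_diff]


# Keep the values of alpha for which all four DDT entries are nonempty.
surviving = {}
for alpha in range(1, 256):
    cands = [key_candidates(s, b, gf_mul(c, alpha))
             for s, b, c in zip(PLAINTEXT_BYTES, BETAS, INV_MC)]
    if all(cands):
        surviving[alpha] = cands

total = sum(len(a) * len(b) * len(c) * len(d) for a, b, c, d in surviving.values())
print(len(surviving), total)
for alpha, cands in sorted(surviving.items()):
    print(f"{alpha:02X}", [[f"{k:02X}" for k in c] for c in cands])
```

The printed rows can be compared against Table 4.3; for instance, the row for $\alpha =$ 66 should contain 4F and 00 as candidates for $k_{00}$.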
Example 4.3.16 [Reduce key hypotheses: PRESENT] Now we look at PRESENT
encryption. Take $\omega = 1$, and let

\[ \Delta S_0 = \texttt{000000000000FFFF}. \tag{4.54} \]

We aim to find a pair of plaintexts $S_0$ and $S_0'$ that achieves a differential pattern
starting with $\Delta S_0$ and converging in round 2. Suppose by analyzing the
side-channel leakages, we have identified such a pair of plaintexts $S_0$ and $S_0'$ that gives a
differential pattern converging in round 2 with

\[ \Delta S_2 = \texttt{0000000000000001}. \tag{4.55} \]

Since there is only 1 active bit (bit 0) at the end of round 2, we know by the design of
PRESENT that this means there is only one active Sbox in round 2, namely Sbox $\mathrm{SB}^2_0$ (see
Fig. 4.56). By analyzing the pLayer operation, we know that the output differences
of Sboxes $\mathrm{SB}^1_0, \mathrm{SB}^1_1, \mathrm{SB}^1_2, \mathrm{SB}^1_3$ are all equal to 1. By our choice of plaintexts, we
also know that the input differences of those Sboxes are all equal to F. According to
the PRESENT Sbox DDT in Table 4.1, the inputs of those four Sboxes are among 2, 4,
B, and D. In other words, let

\[ S_0 = b_{63} b_{62} \ldots b_1 b_0. \]

And let

\[ K_1 = \kappa^1_{63} \kappa^1_{62} \ldots \kappa^1_0 \]

denote the first round key. Then

\[ b_{j+3} b_{j+2} b_{j+1} b_j \oplus \kappa^1_{j+3} \kappa^1_{j+2} \kappa^1_{j+1} \kappa^1_j \in \{2, 4, \mathrm{B}, \mathrm{D}\}, \quad \text{for } j = 0, 4, 8, 12. \tag{4.56} \]

With the knowledge of the plaintexts, we can reduce the key hypotheses. In
particular, the remaining number of key hypotheses for the 0th–15th bits of $K_1$ is

\[ 4^4 = 2^8 = 256, \]

while the total number of all possible key hypotheses for those 16 bits is $2^{16}$.
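As an example, let us take the master key to be the one given by Eq. 4.42. We can
compute that the first round key is given by

\[ K_1 = \texttt{0000123456781234}. \tag{4.57} \]

Continuing Example 4.3.14, suppose with side-channel leakages, we have identified
the following pair of plaintexts that achieves the differential pattern starting with
$\Delta S_0$ in Eq. 4.54 and converging in round 2 with $\Delta S_2$ from Eq. 4.55:

\[ S_0 = \texttt{DCFC2D56F32EC070},\quad S_0' = \texttt{DCFC2D56F32E3F8F}. \]

In this case, Eq. 4.56 gives

\[
\begin{aligned}
\mathrm{0} \oplus \kappa^1_3 \kappa^1_2 \kappa^1_1 \kappa^1_0 &\in \{2, 4, \mathrm{B}, \mathrm{D}\}, &
\mathrm{7} \oplus \kappa^1_7 \kappa^1_6 \kappa^1_5 \kappa^1_4 &\in \{2, 4, \mathrm{B}, \mathrm{D}\}, \\
\mathrm{0} \oplus \kappa^1_{11} \kappa^1_{10} \kappa^1_9 \kappa^1_8 &\in \{2, 4, \mathrm{B}, \mathrm{D}\}, &
\mathrm{C} \oplus \kappa^1_{15} \kappa^1_{14} \kappa^1_{13} \kappa^1_{12} &\in \{2, 4, \mathrm{B}, \mathrm{D}\}.
\end{aligned}
\]

We can then reduce all the possible key hypotheses for the 0th–15th bits of $K_1$:

\[
\begin{aligned}
\kappa^1_3 \kappa^1_2 \kappa^1_1 \kappa^1_0 &\in \{2, 4, \mathrm{B}, \mathrm{D}\}, &
\kappa^1_7 \kappa^1_6 \kappa^1_5 \kappa^1_4 &\in \{5, 3, \mathrm{C}, \mathrm{A}\}, \\
\kappa^1_{11} \kappa^1_{10} \kappa^1_9 \kappa^1_8 &\in \{2, 4, \mathrm{B}, \mathrm{D}\}, &
\kappa^1_{15} \kappa^1_{14} \kappa^1_{13} \kappa^1_{12} &\in \{\mathrm{E}, 8, 7, 1\},
\end{aligned}
\]

where the correct key nibbles given by Eq. 4.57 (4, 3, 2, and 1, respectively) are marked in blue.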
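The reduction in this example amounts to XORing each plaintext nibble with the four admissible Sbox inputs. A minimal sketch, assuming the standard PRESENT Sbox and using the values of $S_0$ and $K_1$ from the example (the helper `nibble` is our own name):

```python
# PRESENT Sbox (values from the cipher specification).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

S0 = 0xDCFC2D56F32EC070   # first plaintext of the identified pair
K1 = 0x0000123456781234   # first round key (Eq. 4.57), used only as a check


def nibble(value, j):
    """The jth nibble of value, counted from the right-most nibble."""
    return (value >> (4 * j)) & 0xF


# Sbox inputs with input difference F and output difference 1.
valid_inputs = [x for x in range(16) if SBOX[x] ^ SBOX[x ^ 0xF] == 0x1]

# Key nibble candidates: plaintext nibble XOR admissible Sbox input.
candidates = {j: [nibble(S0, j) ^ x for x in valid_inputs] for j in range(4)}

for j in range(4):
    shown = [f"{k:X}" for k in candidates[j]]
    print(j, shown, "contains correct nibble:", nibble(K1, j) in candidates[j])

remaining = 1
for j in range(4):
    remaining *= len(candidates[j])
print(remaining)
```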
Up to now, we have seen how SCADPA can reduce the key hypotheses on 4 bytes
of the AES master key and 4 nibbles of the first round key for PRESENT. In general,
the steps for SCADPA are as follows:
SCADPA Step 1 Choose the target cryptographic implementation. SCADPA
               applies to all SPN ciphers that have been proposed so far. As
               running examples, we will continue to discuss the attacks on
               AES-128 and PRESENT.
SCADPA Step 2 Choose the value $\omega$. Based on our chosen cipher, we need
               to decide the value of $\omega$ for our attack. This value is highly
               dependent on the cipher design. In general, for AES-like ciphers,
               we would choose $\omega$ to be the same as the size of the Sbox. And
               for bit permutation-based ciphers (e.g., PRESENT), we choose $\omega$
               to be 1.
SCADPA Step 3 Identify a target differential characteristic $\Delta S_0$, a round
               number $r$ for convergence, and a point for side-channel
               observation. We would like to look for plaintext pairs that achieve a
               differential pattern starting with $\Delta S_0$ and converging in round
               $r$. We also need to decide on a point for side-channel leakage
               analysis during the computation after round $r$. The choice of $\Delta S_0$,
               $r$, and the point for side-channel observation should satisfy the
               following conditions:
               • The probability of convergence is not too small. In particular,
                 if the probability is $2^{-pr}$, we will need $2^{M_p}$ chosen plaintexts
                 for the attack, where $M_p = 0.5\,pr + 0.5$.
               • Using side-channel leakages at the chosen point of
                 measurement, we should be able to confirm if the convergence
                 has appeared for the differential pattern between a pair of
                 plaintexts. Furthermore, it is possible to identify the value of
                 $\Delta S_r$ in case the convergence appears.
               • The possible key hypotheses can be reduced once we find a pair
                 of plaintexts that achieves a converging differential pattern and
                 obtain the value of $\Delta S_r$.
SCADPA Step 4 Choose plaintexts. We choose $2^{M_p}$ distinct plaintexts so that each
               pair of them achieves the target differential characteristic $\Delta S_0$.
SCADPA Step 5 Side-channel measurement and observation. With each
               plaintext, we measure $N_p$ traces. The average trace of those $N_p$ traces
               is computed for each plaintext. For each pair of plaintexts, we take
               the difference of the corresponding average traces and analyze the
               difference trace at the chosen point of observation. Once we find
               one difference trace that indicates the convergence has occurred,
               we deduce the value of $\Delta S_r$ from the measurements and carry on
               to the next step.
SCADPA Step 6 Reduce key hypotheses. Once we identify a pair of plaintexts
               that achieves a converging differential pattern, we can reduce the
               key hypotheses using the knowledge of $\Delta S_r$ and the plaintexts.
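Step 5 can be sketched on synthetic data as follows. This is only an illustration: traces are plain lists of samples, and the window boundaries and the threshold are hypothetical parameters that an attacker would have to derive from real measurements (for example, from the timing of each column or nibble in the observed Sbox layer). Real traces would also need alignment, which is not shown.

```python
def average_trace(traces):
    """Sample-wise mean of several traces recorded for one plaintext."""
    count = len(traces)
    return [sum(samples) / count for samples in zip(*traces)]


def difference_trace(avg_a, avg_b):
    """Absolute sample-wise difference of two averaged traces."""
    return [abs(a - b) for a, b in zip(avg_a, avg_b)]


def active_windows(diff, windows, threshold):
    """Indices of time windows (start, end) whose peak exceeds the threshold.
    Each window stands for one unit (column or nibble) of the Sbox layer."""
    return [i for i, (start, end) in enumerate(windows)
            if max(diff[start:end]) > threshold]


# Synthetic example: 20 samples, four windows of five samples each.
base = [0.0] * 20
bumped = base[:]
for t in range(5, 10):          # leakage differs only in the second window
    bumped[t] = 1.0

traces_a = [base[:], base[:]]       # N_p = 2 traces for the first plaintext
traces_b = [bumped[:], bumped[:]]   # N_p = 2 traces for the second plaintext
windows = [(0, 5), (5, 10), (10, 15), (15, 20)]

diff = difference_trace(average_trace(traces_a), average_trace(traces_b))
active = active_windows(diff, windows, threshold=0.5)

# A single active window is the signature of convergence in this sketch.
print(active, len(active) == 1)
```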
Example 4.3.17 In summary, a SCADPA attack on AES-128 starts with choosing

\[ \omega = 8,\quad \Delta S_0 = \texttt{1000010000100001},\quad r = 1, \]

and the point for side-channel observation being the SubBytes operation in round 3.
Then we query AES encryption with $2^{11.5}$ (see Example 4.3.11) chosen plaintexts
such that each pair of them achieves the differential characteristic $\Delta S_0$. With
side-channel leakages, we can deduce if convergence has happened, and if yes, we record
the value of $\Delta S_1$ (see Example 4.3.13). Finally, with a similar computation as in
Example 4.3.15, we reduce the key hypotheses for the 4 bytes in the main diagonal
of the master key.
Similar attacks can be carried out on the other “diagonals” of the master key to
reduce the key hypotheses of the whole master key. In particular, the other values of
$\Delta S_0$ can be

\[ \texttt{0100001000011000},\quad \texttt{0010000110000100},\quad \texttt{0001100001000010}. \]

The possible differential patterns for each $\Delta S_0$ are shown in Fig. 4.67, where each
figure represents four different differential patterns starting with the same $\Delta S_0$. The
blue-colored squares represent active bytes, and only one of those 4 colored bytes
is active in the last two cipher states (so that the differential pattern converges in
round 1).

Example 4.3.18 As for the SCADPA attack on PRESENT, we start by choosing

\[ \omega = 1,\quad \Delta S_0 = \texttt{000000000000FFFF},\quad r = 2, \]

and the point for side-channel observation being the sBoxLayer operation in round
3. Then we query PRESENT encryption with $2^{4.5}$ (see Example 4.3.12) chosen
plaintexts such that each pair of them achieves the differential characteristic $\Delta S_0$.
With side-channel leakages, we can deduce if convergence has happened, and if yes,
we record the value of $\Delta S_2$ (see Example 4.3.14). Finally, with a similar computation
as in Example 4.3.16, we reduce the key hypotheses for the 0th–15th bits of the first
round key. We have also computed that the remaining number of key hypotheses
will be $2^8$ instead of the original $2^{16}$.
Similar attacks can be carried out on the other bits of the first round key to reduce
the key hypotheses of the whole round key. In particular, the other values of $\Delta S_0$
can be

\[ \texttt{00000000FFFF0000},\quad \texttt{0000FFFF00000000},\quad \texttt{FFFF000000000000}. \]

The possible differential patterns for each of the three values of $\Delta S_0$ are shown in
Figs. 4.68, 4.69, and 4.70. Each figure shows four differential patterns that converge
in round 2. Each differential pattern has 1 active nibble at the end of round 1 and a
single active bit at the end of round 2.

Fig. 4.67 The possible differential patterns for AES encryption with $\Delta S_0$ equal to
1000010000100001, 0100001000011000, 0010000110000100, 0001100001000010, respectively.
Each figure represents four different differential patterns starting with the same $\Delta S_0$. The
blue-colored squares represent active bytes and only one of those 4 colored bytes is active in the last
two cipher states

Fig. 4.68 The possible differential patterns for PRESENT encryption that start with $\Delta S_0 =$
00000000FFFF0000 and converge in round 2. There are in total four patterns: the single active
bit at the end of round 2 can be the 4th, 6th, 32nd, or 34th bit

Fig. 4.69 The possible differential patterns for PRESENT encryption that start with $\Delta S_0 =$
0000FFFF00000000 and converge in round 2. There are in total four patterns: the single active
bit at the end of round 2 can be the 8th, 10th, 36th, or 38th bit

Fig. 4.70 The possible differential patterns for PRESENT encryption that start with $\Delta S_0 =$
FFFF000000000000 and converge in round 2. There are in total four patterns: the single active
bit at the end of round 2 can be the 12th, 14th, 40th, or 42nd bit

Remark 4.3.7 As mentioned in Example 4.3.13, for the attack on AES, we
assume the SubBytes operation is implemented column-wise from the first column
to the fourth column. We note that a different ordering of the columns in the
implementation is also vulnerable to the attack, provided the attacker knows the
ordering of the columns. Similarly, for our attack on PRESENT, we have mentioned
in Example 4.3.14 that the sBoxLayer operation is implemented nibble-wise from
the 0th nibble to the 15th nibble. A different ordering of the nibbles can still be
attacked as long as the attacker knows the specific ordering.

4.4 Side-Channel Analysis Attacks on RSA and RSA Signatures

In this section, we will discuss one SPA and one DPA attack on implementations of
RSA and RSA signatures.
Following the same notations from Sect. 3.3, let $p, q$ be two distinct odd primes.
$n = pq$ and $e \in \mathbb{Z}^*_{\varphi(n)}$ are the public keys. $d = e^{-1} \bmod \varphi(n)$ is the private key.
Furthermore, let

\[ d_{\ell_d - 1} d_{\ell_d - 2} \ldots d_1 d_0 \]

be the binary representation of $d$.
We will show how SPA and DPA can be used to recover the value of $d$ during the
computation of

\[ a^d \bmod n \tag{4.58} \]

for some $a \in \mathbb{Z}_n$. For both attacks, we focus on one particular method for
implementing the modular exponentiation, namely the left-to-right square and multiply
algorithm (Algorithm 3.8). A similar SPA attack can also be applied to the
right-to-left square and multiply algorithm (Algorithm 3.7). We note that the attacks can be
carried out during either the decryption of RSA or the signature signing procedure
of RSA signatures.
For the experiments, we have set the values of the parameters as given in
Example 3.3.2:⁹

\[ p = 29,\quad q = 41,\quad n = 1189,\quad \varphi(n) = 1120,\quad e = 3,\quad d = 747. \tag{4.59} \]

Then our implementation of Algorithm 3.8 can be described by Algorithm 4.2.

Remark 4.4.1 We note that since $\varphi(n) = (p-1)(q-1)$ is even and $\gcd(d, \varphi(n)) = 1$,
$d$ is odd. In particular, $d_0 = 1$.

4.4.1 Simple Power Analysis

We have seen that DPA exploits the relationship between leakages at specific
time samples and the data being processed in the DUT. SPA, on the other hand,
analyzes leakages along the time axis, exploiting relationships between leakages
and operations. Similar to profiled DPA, SPA requires knowledge of the exact
implementation.

⁹ Note that for easy illustration, the values we choose for p and q are much smaller than practical
values.

Algorithm 4.2: Left-to-right square and multiply algorithm for computing
modular exponentiation (see Algorithm 3.8) with parameters from Eq. 4.59
  Input: a                                    // a ∈ Z_1189
  Output: a^747 mod 1189
  1 n = 1189
  2 dbin = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]      // binary representation of d = 747, d0 = 1, d1 = 1, ...
  3 ℓd = length of dbin                       // bit length of d
  4 t = 1
  5 for i = ℓd − 1, i ≥ 0, i−− do
  6     t = t ∗ t mod n
  7     if di = 1 then                        // ith bit of d is 1
  8         t = a ∗ t mod n
  9 return t
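For readers who want to experiment, Algorithm 4.2 translates almost line by line into Python. The sketch below is our own transcription (the function and variable names are illustrative); it stores the bits of $d$ least significant bit first, as in line 2 of the algorithm, and compares the result with Python's built-in modular exponentiation.

```python
def square_and_multiply(a, d_bits, n):
    """Left-to-right square and multiply. d_bits[i] is the bit d_i,
    so the list is ordered from the least significant bit."""
    t = 1
    for i in range(len(d_bits) - 1, -1, -1):
        t = (t * t) % n          # line 6: modular square
        if d_bits[i] == 1:
            t = (a * t) % n      # line 8: modular multiplication
    return t


N = 1189
D_BITS = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]

# Rebuild d from its bits and check it against the parameters of Eq. 4.59.
d = sum(bit << i for i, bit in enumerate(D_BITS))
print(d, (3 * d) % 1120)

# Compare with the built-in modular exponentiation for one sample input.
a = 2
print(square_and_multiply(a, D_BITS, N) == pow(a, d, N))
```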
We have seen in the analysis of Fig. 4.3 that different operations can be deduced
from observing the power traces. An SPA attack on the square and multiply
algorithm works with a similar method—we examine the traces to figure out if both
square and multiplication are executed in one loop from line 5 (the corresponding
bit of d is 1) or not (the corresponding bit of d is 0). Following Kerckhoffs’ principle
(see Definition 2.1.3), we assume the attacker has the knowledge of Algorithm 4.2
except for the values of bits of d in line 2.
With the experimental setting as described in Sect. 4.1, we measured one power
trace for the computation of Algorithm 4.2 on our DUT. The trace is shown in
Fig. 4.71. We can see ten similar patterns. By examining Algorithm 4.2, we have
two guesses:
Guess a Each pattern corresponds to one modular operation (modular square from
line 6 or modular multiplication from line 8).
Guess b Each pattern corresponds to one loop from line 5.
Let S denote the modular square operation from line 6 and M the modular
multiplication from line 8. We observe that the loop in line 5 contains either one
square operation (S) or one square followed by one multiplication operation (SM).
We also have the following correspondence between operations in loop i and the ith
bit of the secret key d:

\[
\begin{gathered}
\text{loop } i \text{ contains only S} \iff d_i = 0,\\
\text{loop } i \text{ consists of SM} \iff d_i = 1.
\end{gathered}
\]

We further notice that there are mainly two types of patterns in Fig. 4.71, one
with a single cluster of peaks and one with more than one cluster of peaks. They are
colored in green and blue in Fig. 4.72, respectively.
304 4 Side-Channel Analysis Attacks and Countermeasures

Fig. 4.71 One trace corresponding to the computation of Algorithm 4.2. We can see ten similar
patterns

Fig. 4.72 Highlighted two types of patterns from Fig. 4.71. One pattern with a single cluster of
peaks (colored in green) and one with more than one cluster of peaks (colored in blue)

Let us first assume that Guess a is correct. Based on the above observations, we
have two possibilities to consider:
• The (green colored) single peaked patterns correspond to modular square opera-
tion (S), and the (blue colored) multiple peaked patterns correspond to modular
multiplication operation (M).
• The (green colored) single peaked patterns correspond to modular multiplication
operation (M), and the (blue colored) multiple peaked patterns correspond to
modular square operation (S).
We know that $d_0 = 1$ (see Remark 4.4.1), so the computation ends with a modular
multiplication. Then we can deduce that the last blue-colored pattern in Fig. 4.72
does not represent a single modular square operation (S). On the other hand, the
computation always starts with a modular square operation, which indicates that the
first blue-colored pattern corresponds to S.
We have reached a contradiction, and we conclude that Guess a is not correct.
Next, we assume Guess b is correct. Similarly, we have two possibilities to
consider:

• The (green colored) single peaked patterns represent a single modular square
operation (S), i.e., the corresponding bit of d is 0, and the (blue colored) multiple
peaked patterns represent SM and the corresponding bit of d is 1.
• The green-colored patterns correspond to SM, and the blue-colored patterns
correspond to S.
As discussed above, the last loop of the computation corresponds to $d_0 = 1$ and
hence consists of SM, and thus the blue-colored patterns represent SM, i.e., the
corresponding bit of d is 1. Consequently, the green-colored patterns correspond to
loops with the bit of d being 0. We can then read out the values of the bits $d_i$
($i = \ell_d - 1, \ldots, 1, 0$) from Fig. 4.72:

1 0 1 1 1 0 1 0 1 1.

Finally, we recover the secret key

\[ d = 1011101011_2 = 747. \]

One might argue that the first green pattern in Fig. 4.72 may also be a multiple
peaked blue pattern. We note that this pattern is shorter than the other blue patterns.
Hence it is more likely to correspond to one operation instead of two. Nevertheless,
in a realistic attack, one could use brute force to recover this bit.
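The read-out above can be replayed in software. The following is a minimal simulation, not the measurement setup of this section: it runs the left-to-right square and multiply algorithm with the toy parameters of Eq. 4.59, records which operations each loop executes, and maps the loop patterns back to key bits. The function names and the input value 900 are illustrative assumptions.

```python
N = 1189   # modulus n from Eq. 4.59
D = 747    # secret exponent d from Eq. 4.59

def square_and_multiply(a, d, n):
    """Left-to-right square and multiply; also return the operations per loop."""
    t = 1
    loops = []
    for i in range(d.bit_length() - 1, -1, -1):
        t = (t * t) % n          # S: modular square, executed in every loop
        ops = "S"
        if (d >> i) & 1:
            t = (a * t) % n      # M: modular multiplication, only if d_i = 1
            ops += "M"
        loops.append(ops)
    return t, loops

def recover_exponent(loops):
    """Map each loop pattern to a key bit: 'S' -> 0, 'SM' -> 1 (MSB first)."""
    bits = "".join("1" if ops == "SM" else "0" for ops in loops)
    return int(bits, 2), bits

result, loops = square_and_multiply(900, D, N)
recovered, bits = recover_exponent(loops)
print(loops)
print(bits, recovered)
```

In a real attack the list `loops` is not available directly; it has to be inferred from the shapes of the patterns in the trace, as done with Fig. 4.72.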
Remark 4.4.2 By the design of the Montgomery powering ladder (Algorithm 3.9),
there is always a multiplication followed by a square operation, making it safe
against our SPA attack presented above.
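To illustrate Remark 4.4.2, the sketch below implements the textbook form of the Montgomery powering ladder and logs the operation sequence. Algorithm 3.9 is not reproduced in this section, so the details here (register names `r0`, `r1`, the fixed `bit_length` argument) are assumptions and may differ from the book's version.

```python
def montgomery_ladder(a, d, n, bit_length):
    """Compute a^d mod n; every loop performs one multiplication then one square."""
    r0, r1 = 1, a % n
    ops = []
    for i in range(bit_length - 1, -1, -1):
        if (d >> i) & 1:
            r0 = (r0 * r1) % n   # multiplication
            ops.append("M")
            r1 = (r1 * r1) % n   # square
            ops.append("S")
        else:
            r1 = (r0 * r1) % n   # multiplication
            ops.append("M")
            r0 = (r0 * r0) % n   # square
            ops.append("S")
    return r0, "".join(ops)

value_a, ops_a = montgomery_ladder(900, 747, 1189, 10)
value_b, ops_b = montgomery_ladder(900, 512, 1189, 10)
print(ops_a == ops_b, ops_a)
```

Both exponents produce the same operation sequence, so the sequence of squares and multiplications alone no longer reveals the key bits.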

4.4.2 Differential Power Analysis

For DPA attacks on RSA implementations, we focus on Montgomery’s method for
implementing modular multiplication MonPro (see Algorithm 3.17). Following
Eq. 4.59 and Example 3.5.17, we have

\[
\begin{gathered}
p = 29,\quad q = 41,\quad n = 1189,\quad \varphi(n) = 1120,\quad e = 3,\\
d = 747,\quad r = 2048,\quad r^{-1} = 717,\quad \hat{n} = 1235.
\end{gathered}
\tag{4.60}
\]

Then our implementation of the Montgomery left-to-right square and multiply
algorithm (Algorithm 3.20) can be described by Algorithm 4.4. Also, our
implementation of MonPro (Algorithm 3.17) becomes Algorithm 4.3.
We have implemented Algorithm 4.4 in our DUT. With experimental settings as
described in Sect. 4.1, one trace is shown in Fig. 4.73. This trace is different from
Fig. 4.72: we cannot see two distinct types of patterns. If we take a closer look at
the computation of MonPro in Algorithm 4.3, we can see that the main difference
between a square and a multiply is in line 3, which does not involve a reduction
modulo n as compared to lines 6 and 8 in Algorithm 4.2. This missing reduction
modulo n might be the main reason for the missing pattern structure in Fig. 4.73.

Algorithm 4.3: MonPro, Montgomery product algorithm with parameters from Eq. 4.60
Input: a, b // a, b ∈ Z_1189
Output: 717 ∗ a ∗ b mod 1189
1 n̂ = 1235
2 n = 1189
3 t = a ∗ b
4 m = (t ∗ n̂) AND 2047
5 u = (t + m ∗ n) >> 11
6 if u ≥ n then
7     u = u − n
8 return u

Algorithm 4.4: Montgomery left-to-right square and multiply algorithm with
parameters from Eq. 4.60. MonPro is given by Algorithm 4.3
Input: a // a ∈ Z_1189
Output: a^747 mod 1189
1 n = 1189, r = 2048
2 dbin = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1] // binary representation of d = 747, d_0 = 1, d_1 = 1
3 ℓ_d = length of dbin // bit length of d
4 t_r = r mod n
5 a_r = a ∗ r mod n
6 for i = ℓ_d − 1, i ≥ 0, i−− do
7     t_r = MonPro(t_r, t_r) // t_r = t_r ×Mon t_r
8     if dbin[i] = 1 then
9         t_r = MonPro(t_r, a_r) // t_r = t_r ×Mon a_r
10 t = MonPro(t_r, 1) // t = t_r ×Mon 1 = t_r ∗ r^{−1} mod n
11 return t
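As a cross-check of Algorithms 4.3 and 4.4, the following sketch implements both with the parameters of Eq. 4.60 and counts how often MonPro is executed. It is a plain Python model under the assumption that bit i of d is tested with a shift instead of a lookup in dbin; the function names and the input 900 are illustrative.

```python
N = 1189                 # modulus n
R_BITS = 11              # r = 2^11 = 2048
R = 1 << R_BITS
R_INV = 717              # r^{-1} mod n
N_HAT = 1235             # satisfies r * r^{-1} - n * n_hat = 1
D = 747                  # secret exponent d

def monpro(a, b):
    """Montgomery product a * b * r^{-1} mod n, following Algorithm 4.3."""
    t = a * b
    m = (t * N_HAT) & (R - 1)     # t * n_hat mod r
    u = (t + m * N) >> R_BITS     # t + m * n is divisible by r
    return u - N if u >= N else u

def mon_exp(a):
    """Return (a^d mod n, number of MonPro executions), following Algorithm 4.4."""
    t_r = R % N
    a_r = (a * R) % N
    calls = 0
    for i in range(D.bit_length() - 1, -1, -1):
        t_r = monpro(t_r, t_r)
        calls += 1
        if (D >> i) & 1:
            t_r = monpro(t_r, a_r)
            calls += 1
    return monpro(t_r, 1), calls + 1

result, calls = mon_exp(900)
print(result, calls)
```

The count of MonPro executions is the bit length of d plus its Hamming weight plus one for the final conversion, which is the quantity an attacker can read off by counting patterns in a trace.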
Nevertheless, we can still gain important information from the trace. First, we
note that there are 18 similar patterns in Fig. 4.73. By examining Algorithm 4.4,
similar to Guess a and Guess b from Sect. 4.4.1, we can assume each of those 18
patterns corresponds to either one execution of MonPro or one loop from line 6.
Since there is one extra MonPro operation in line 10, we know that the last pattern
will not represent a loop. If each of the other patterns corresponds to one loop, we
will have a secret key of bit length 17, which is longer than the bit length of n
(the bit length of 1189 is 11) and hence impossible. We conclude that there is a high
possibility that each pattern corresponds to one execution of MonPro. Since when
$d_i = 1$ there are two executions of MonPro and when $d_i = 0$ there is one execution
of MonPro, our observations reveal that

\[ \ell_d + \mathrm{wt}(d) = 17. \]

Fig. 4.73 One trace corresponding to the computation of Algorithm 4.4. We can see 18 similar
patterns

We follow similar attack steps as the DPA attacks on symmetric block ciphers
presented in Sect. 4.3.1.1. However, we will only describe one particular attack on
RSA (originally proposed in [AFV07]), while Sect. 4.3.1.1 outlines attack steps for
a generic DPA attack on any symmetric block cipher.
DPA-RSA Step 1 Identify the target cryptographic implementation. As mentioned
above, we focus on the left-to-right square and multiply algorithm with
Montgomery’s method for modular multiplication. In particular, our attack will be
on an implementation of Algorithm 4.4. We remark that, to have a better signal,
most of Algorithm 4.3 was implemented in ARM assembly.
DPA-RSA Step 2 Experimental setup and measure leakages. With the same
experimental setting as in Sect. 4.1, we have measured $M = 10{,}000$ traces, each
for a random input $a \in \mathbb{Z}_{1189}$. Let $a_j$ ($j = 1, 2, \ldots, M$) denote
the jth input with corresponding power trace $\ell_j = (l_1^j, l_2^j, \ldots, l_q^j)$,
where the total number of time samples in one trace is $q = 9500$.
DPA-RSA Step 3 Choose the part of the key to recover. In this attack, we aim to
recover the full secret key d.
DPA-RSA Step 4 Choose the target intermediate value. Our target intermediate
value is the 0th byte of the value $a_r$, defined in line 5 of Algorithm 4.4. We note
that $a_r$ is only used in the algorithm when $d_i = 1$ (line 9), and thus we expect
the correlation between the leakages and information related to $a_r$ to be higher
when line 9 is executed. Consequently, we will know that $d_i = 1$ for the
corresponding loop. Since in practice $a_r$ is a big integer, it is more reasonable to
focus on just part of $a_r$. For our experiments, $a_r \in \mathbb{Z}_{1189}$ has bit
length at most 11. We will focus on the 0th byte (bits $0, 1, 2, \ldots, 7$) of $a_r$.



DPA-RSA Step 5 Compute the hypothetical signal for each target intermediate
value. Our attack does not rely on finding the best key hypothesis that achieves the
highest absolute correlation coefficient as in DPA attacks on symmetric block
ciphers. The information we exploit is that when the absolute correlation coefficient
between leakages and the target intermediate value is high, the corresponding loop
has secret key bit $= 1$. For each of the M inputs $a_j$, we compute the target
intermediate value, denoted $v^j$, as follows:

\[ v^j = \text{bits } 0, 1, 2, \ldots, 7 \text{ of } a_j r \bmod n, \qquad j = 1, 2, \ldots, 10{,}000. \tag{4.61} \]

As we have seen in Sect. 4.3.2.1, the Hamming weight leakage model (Eq. 4.4) is a
good estimate for leakages of our DUT. We compute the hypothetical signal
corresponding to $a_j$, denoted $H_j$, as follows:

\[ H_j = \mathrm{wt}\left(v^j\right), \qquad j = 1, 2, \ldots, 10{,}000. \tag{4.62} \]

DPA-RSA Step 6 Statistical analysis. We view the hypothetical signal as a random
variable $H$ that varies when the input a changes. For a fixed time sample t, we also
consider the leakage at t as a random variable $L_t$. Then our computations from
DPA-RSA Step 5 and our traces from DPA-RSA Step 2 give us a sample for this pair
of random variables $(H, L_t)$:

\[ \left\{ (H_j, l_t^j) \mid j = 1, 2, \ldots, 10{,}000 \right\}. \]

To see at what time samples the leakages are correlated to $H$, the same as in DPA
Step 7, we adopt the notion of correlation coefficient (Definition 1.7.11). For each
time sample t, we compute the sample correlation coefficient (Example 1.8.1),
denoted by $r_t$, of $H$ and $L_t$:

\[
r_t := \frac{\sum_{j=1}^{M} (H_j - \overline{H})(l_t^j - \overline{l_t})}
{\sqrt{\sum_{j=1}^{M} (H_j - \overline{H})^2}\,\sqrt{\sum_{j=1}^{M} (l_t^j - \overline{l_t})^2}},
\qquad M = 10{,}000,\ t = 1, 2, \ldots, 9500. \tag{4.63}
\]
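The steps above can be sketched on simulated data. The code below is not the experiment of this section: it assumes a Hamming weight leakage with Gaussian noise (`NOISE_SIGMA` is an arbitrary choice), uses fewer traces, and models only two time samples, one where $a_r$ is processed and one where it is not. It then evaluates the sample correlation coefficient of Eq. 4.63 at both samples.

```python
import random

N, R = 1189, 2048
NOISE_SIGMA = 1.0      # assumed noise level of the simulated leakage
NUM_TRACES = 2000      # fewer than the 10,000 traces of the experiment

def hamming_weight(x):
    return bin(x).count("1")

def sample_correlation(xs, ys):
    """Sample correlation coefficient of two equally long sequences (Eq. 4.63)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x ** 0.5 * var_y ** 0.5)

rng = random.Random(1)
inputs = [rng.randrange(N) for _ in range(NUM_TRACES)]
# Hypothetical signal H_j: Hamming weight of the low byte of a_j * r mod n.
hyp = [hamming_weight((a * R % N) & 0xFF) for a in inputs]
# Two simulated time samples: one processes a_r, the other carries only noise.
leak_with_ar = [h + rng.gauss(0, NOISE_SIGMA) for h in hyp]
leak_without_ar = [rng.gauss(0, NOISE_SIGMA) for _ in hyp]

r_with = sample_correlation(hyp, leak_with_ar)
r_without = sample_correlation(hyp, leak_without_ar)
print(round(r_with, 3), round(r_without, 3))
```

The correlation is clearly larger at the sample where $a_r$ is processed, which is the effect the attack relies on to tell multiplications from squares.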

Example 4.4.1 For our experiments, we have

\[ a_1 = 900, \quad a_2 = 1083, \quad a_3 = 881, \quad a_4 = 852. \]

Then

\[
\begin{gathered}
a_1 r \bmod n = 900 \times 2048 \bmod 1189 = 250 = \mathtt{FA},\\
a_2 r \bmod n = 1083 \times 2048 \bmod 1189 = 499 = \mathtt{1F3},\\
a_3 r \bmod n = 881 \times 2048 \bmod 1189 = 575 = \mathtt{23F},\\
a_4 r \bmod n = 852 \times 2048 \bmod 1189 = 633 = \mathtt{279}.
\end{gathered}
\]

According to Eq. 4.61, we have

\[ v^1 = \mathtt{FA}, \quad v^2 = \mathtt{F3}, \quad v^3 = \mathtt{3F}, \quad v^4 = \mathtt{79}. \]

Then the hypothetical signals from DPA-RSA Step 5 are given by (Eq. 4.62)

\[
\begin{gathered}
H_1 = \mathrm{wt}(\mathtt{FA}) = \mathrm{wt}(11111010) = 6,\\
H_2 = \mathrm{wt}(\mathtt{F3}) = \mathrm{wt}(11110011) = 6,\\
H_3 = \mathrm{wt}(\mathtt{3F}) = \mathrm{wt}(00111111) = 6,\\
H_4 = \mathrm{wt}(\mathtt{79}) = \mathrm{wt}(01111001) = 5.
\end{gathered}
\]
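The numbers in Example 4.4.1 can be reproduced with a short script. The function names are illustrative; the computation follows Eqs. 4.61 and 4.62 with r = 2048 and n = 1189.

```python
N, R = 1189, 2048

def target_value(a):
    """Bits 0..7 of a * r mod n (Eq. 4.61)."""
    return (a * R % N) & 0xFF

def hypothetical_signal(a):
    """Hamming weight of the target intermediate value (Eq. 4.62)."""
    return bin(target_value(a)).count("1")

for a in (900, 1083, 881, 852):
    print(a, format(a * R % N, "X"), format(target_value(a), "02X"),
          hypothetical_signal(a))
```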

The sample correlation coefficients for all time samples are shown in Fig. 4.74.
We can see a sequence of 18 patterns. To recover the secret key, we need the help of
SPA. We have discussed before that there are 18 patterns in Fig. 4.73, and each of
them most likely corresponds to one execution of MonPro.
If we put Figs. 4.73 and 4.74 together, we get Fig. 4.75. We can see that the 18
patterns corresponding to sample correlation coefficients and those corresponding
to leakages coincide. Thus, we can assume each pattern in Fig. 4.74 represents one
execution of MonPro.
Let us then take a closer look at Fig. 4.74. We can see there are mainly two types
of patterns: one with a lower peak and one with a higher peak and a small high peak
at the end of the pattern. They are highlighted in green and blue, respectively, in
Fig. 4.76.

Fig. 4.74 Sample correlation coefficients $r_t$ (Eq. 4.63) for time samples $t = 1, 2, \ldots, 9500$.
We can see a sequence of 18 patterns

Fig. 4.75 Sample correlation coefficients from Fig. 4.74 (in red) with one power trace from
Fig. 4.73 in gray. We can see that the 18 patterns corresponding to sample correlation coefficients
and those corresponding to leakages coincide

Fig. 4.76 There are mainly two types of patterns in Fig. 4.74: one with a lower peak and one with
a higher peak and a small high peak at the end of the pattern. In this figure, they are highlighted in
green and blue, respectively
We know that the last pattern in Fig. 4.76 corresponds to line 10 in Algorithm 4.4.
Then each of the remaining 17 patterns represents the computation of either line 9
or line 7. Let S and M denote the modular square and modular multiplication
computations in lines 7 and 9, respectively. Since $a_r$ is only used in M, we can
assume that a higher peaked (blue-colored) pattern corresponds to M. Consequently,
a lower peaked (green-colored) pattern corresponds to S. Using Fig. 4.76, we
can deduce the sequence of square and multiply operations in one execution of
Algorithm 4.4:

SMSSMSMSMSSMSSMSM.

A loop in Algorithm 4.4 contains either a single S or SM. We can then map this
sequence of operations into different loops (separated by spaces)

SM S SM SM SM S SM S SM SM.

Furthermore, a loop containing a single S corresponds to the secret bit $= 0$, while a
loop containing SM corresponds to the secret bit $= 1$. We can then read out the bits
of the secret key d:

1 0 1 1 1 0 1 0 1 1,

and we can reconstruct the key

\[ d = 1011101011_2 = 747. \]
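The mapping from the operation sequence to the key can be written as a small parser. This is an illustrative sketch (the function names are not from the book); it relies only on the fact that every loop starts with S and is optionally followed by M.

```python
import re

def split_into_loops(ops):
    """Split a sequence of S/M operations into loops of the form 'S' or 'SM'."""
    loops = re.findall(r"SM|S", ops)
    if "".join(loops) != ops:
        raise ValueError("sequence is not a concatenation of S and SM loops")
    return loops

def exponent_from_ops(ops):
    """Translate loops into key bits (MSB first) and return (bit string, integer)."""
    bits = "".join("1" if loop == "SM" else "0" for loop in split_into_loops(ops))
    return bits, int(bits, 2)

loops = split_into_loops("SMSSMSMSMSSMSSMSM")
bits, d = exponent_from_ops("SMSSMSMSMSSMSSMSM")
print(" ".join(loops))
print(bits, d)
```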

4.5 Countermeasures Against Side-Channel Analysis Attacks

In this section, we will discuss a few implementation-level SCA countermeasures
for both symmetric block cipher and RSA implementations. We have seen how the
dependency of a device’s leakages (power consumption) on data and operations
can be exploited to recover the secret keys of a cryptographic implementation.
The goal of the countermeasures that we will see is to make the leakage of the
DUT independent of the operations or the intermediate values of the executed
cryptographic implementation. We will detail two types of countermeasures—
hiding and masking/blinding.
The goal of a hiding-based countermeasure is to remove the operation or data
dependency of leakages. This can be done by changing the leakage of the DUT in
a way that every operation requires a similar (balance the leakages) or a random
(randomize the leakages) amount of energy. On the other hand, the goal of a
masking/blinding-based countermeasure is to remove the data dependency of the
leakages by randomizing the intermediate values that the DUT is processing. The
rationale is that since the value being processed in the DUT is randomized and
independent of the intermediate value of the cryptographic computation, we cannot
capture information on the actual intermediate value from the leakages. In practice,
both types of countermeasures are used.

4.5.1 Hiding

As mentioned above, a hiding-based countermeasure aims to either randomize or
balance the leakages of the DUT for different operations or data. In this section,
we will discuss two countermeasures that aim to balance the leakages—one for
symmetric block ciphers and one for RSA implementations.

4.5.1.1 Encoding-Based Countermeasure for Symmetric Block Ciphers

In Sect. 4.3.2.2, we have discussed the stochastic leakage model. In this part, we
will show a countermeasure that is based on analyzing the stochastic leakage of the
DUT [MSB16].
Recall that with the stochastic leakage model, we can characterize the leakage at
a single time sample. For the countermeasure, we also focus on one time sample.
The coefficients (see Eq. 4.24) of the stochastic leakage model will be estimated
using the measured traces. Based on the estimated leakage model, we choose a
binary code (Definition 1.6.1) that results in a lower SNR at this particular time
sample and makes the attacks require more effort.
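To make the coefficient estimation concrete, the sketch below simulates leakages at a single time sample and fits the coefficients by ordinary least squares. Several points are assumptions of this sketch and not taken from the book: the linear model without intercept, the noise level `SIGMA`, the fit via normal equations, and the use of simulated instead of measured traces. The profiling steps of Sect. 4.3.2.2 may differ. The ground-truth coefficients are the values reported later in Eq. 4.67.

```python
import random

# Ground-truth coefficients for the simulation (values of Eq. 4.67).
ALPHA = [-0.00245761, -0.00130026, -0.00135884, -0.00122801,
         -0.00131569, -0.00213467, -0.00209748, -0.00221288]
SIGMA = 0.001          # assumed standard deviation of the noise
NUM_TRACES = 10000

def bits_of(x, width=8):
    return [(x >> s) & 1 for s in range(width)]

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    size = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, size):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, size + 1):
                aug[row][k] -= factor * aug[col][k]
    sol = [0.0] * size
    for i in range(size - 1, -1, -1):
        tail = sum(aug[i][k] * sol[k] for k in range(i + 1, size))
        sol[i] = (aug[i][size] - tail) / aug[i][i]
    return sol

def estimate_coefficients(words, leakages, width=8):
    """Least-squares fit of L(x) = sum_s alpha_s * x_s via the normal equations."""
    xtx = [[0.0] * width for _ in range(width)]
    xty = [0.0] * width
    for word, leak in zip(words, leakages):
        x = bits_of(word, width)
        for s in range(width):
            if x[s]:
                xty[s] += leak
                for u in range(width):
                    xtx[s][u] += x[u]
    return solve(xtx, xty)

rng = random.Random(7)
words = [rng.randrange(256) for _ in range(NUM_TRACES)]
leaks = [sum(a * b for a, b in zip(ALPHA, bits_of(w))) + rng.gauss(0, SIGMA)
         for w in words]
alpha_hat = estimate_coefficients(words, leaks)
print([round(a, 5) for a in alpha_hat])
```

With measured traces, `leaks` would hold the leakage of each trace at the chosen time sample and `words` the corresponding inputs.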
In Sects. 4.3.1 and 4.3.2 we have seen attacks based on the Hamming weight
leakage model on PRESENT implementations. Thus, to provide more protection,
we further require the codewords in our code to have the same Hamming weight,
as shown in [HBK23]. In this way, attacks based on the Hamming weight of the
intermediate values will not be possible.
The steps for the countermeasure are as follows:
Code-SCA Step 1 Identify the target instruction and target intermediate value.
As the stochastic leakage model is specific to one time sample, we need to first
decide what is the most vulnerable instruction and which intermediate value needs
to be protected the most. Let $v = v_{m_v - 1} v_{m_v - 2} \ldots v_1 v_0$ denote the
target intermediate value of bit length at most $m_v$. In general, it is recommended
that the implementation is done in assembly to identify the most vulnerable
instruction.
For our illustrations, we will choose the instruction MOV for our microcontroller,
and we focus on the PRESENT Sbox output. Hence $m_v = 4$. The operation we
implemented is then

MOV r0 a, (4.64)

where r0 represents a register and $a$ is the input of the target instruction.
Code-SCA Step 2 Choose the code length $n_C$ and the Hamming weight $w_H$ of
each codeword. We would like to choose a code to represent our secret intermediate
value $v$ such that instead of processing $v$ with our DUT, the corresponding
codewords will be used. Clearly, the size of the binary code will be $2^{m_v}$ in order
to represent all values of $v$. The length of the binary code should be at least
$m_v + 1$ so that it allows us to choose which words to use as our codewords. A
longer length in general not only gives us more freedom but also causes more
overhead. Let $n_C$ denote our chosen code length.
As mentioned before, we would also like the codewords in our binary code to have
the same Hamming weight, making attacks based on the Hamming weight leakage
model impossible. Note that there are in total $\binom{n_C}{w_H}$ words in
$\mathbb{F}_2^{n_C}$ that have Hamming weight $w_H$. One criterion for the choice of
$n_C$ and $w_H$ is then

\[ \binom{n_C}{w_H} > 2^{m_v}, \tag{4.65} \]

so that we will have enough codewords to represent all values of $v$.
In summary, we are looking for an $(n_C, 2^{m_v})$-binary code such that each
codeword has Hamming weight $w_H$, where $n_C$ and $w_H$ satisfy Eq. 4.65.
For our experiments, we choose $n_C = 8$ and $w_H = 6$. We are interested in
$(8, 16)$-binary codes such that each codeword has Hamming weight 6.
Code-SCA Step 3 Experimental setup and trace measurement. In this step, we
will collect two datasets, denoted $T_1$ and $T_2$. Using our DUT, we repeatedly run
the target instruction with random inputs. One trace is measured for each input. We
first take $M_1$ inputs with random values from $\mathbb{F}_2^{m_v}$, which give us the
dataset $T_1$. Then using $M_2$ inputs with values from $\mathbb{F}_2^{n_C}$, we get
$T_2$. Suppose each trace contains q time samples. Note that we assume the traces in
those two datasets are well aligned.
$T_1$ represents how leakages behave when random values of $v$ are being processed
by the DUT. It will be used to identify the POI in Code-SCA Step 4, which is the
time of the computation that is supposed to be the most vulnerable.
To choose a good binary code, we would like to profile leakages when the input is a
codeword. Let $x = x_{n_C - 1} x_{n_C - 2} \ldots x_1 x_0$ be a word from
$\mathbb{F}_2^{n_C}$ of bit length at most $n_C$. Recall from Sect. 4.3.2.2 (Eq. 4.24)
that the stochastic leakage model specifies the leakage is related to the value $x$
being processed in the device as follows:

\[ L(x) = \sum_{s=0}^{n_C - 1} \alpha_s x_s + \text{noise}, \tag{4.66} \]

where $\text{noise} \sim N(0, \sigma^2)$ denotes the noise with mean 0 and variance
$\sigma^2$. Estimations for the coefficients $\alpha_s$ ($s = 0, 1, \ldots, n_C - 1$)
will be computed with profiling traces from $T_2$. We can see that it is important
for traces in $T_1$ and $T_2$ to be aligned so that the profiling of $T_2$ is carried
out with the correct POI.

Fig. 4.77 An example of a trace from dataset $T_1$, obtained in Code-SCA Step 3, which
corresponds to the MOV instruction surrounded by NOPs

Fig. 4.78 SNR values for each time sample computed with dataset $T_1$ obtained in Code-SCA Step
3. The highest point is our POI = 430

For our experiment, we measured $M_1 = 10{,}000$ traces for the MOV instruction with
inputs between 0 and F and $M_2 = 10{,}000$ traces with inputs between 00 and FF.
Each trace contains $q = 600$ time samples. The MOV instruction was surrounded with
NOP operations. Figure 4.77 shows what one trace from $T_1$ looks like. We can see
that the operation MOV happens between time samples 400 and 450. Traces from $T_2$
look very similar.
Code-SCA Step 4 Identify the POI. With the traces from $T_1$ obtained in Code-SCA
Step 3, we compute the SNR for each time sample following similar methods as in
P-DPA Step 6–P-DPA Step 8 from Sect. 4.3.2.1. Our target signal is the exact value
of $v$, thus $M_{signal} = 2^{m_v}$ (P-DPA Step 6). The number of time samples is
$q_{pf} = q$ (P-DPA Step 7). The POI is taken to be the time sample with the highest
SNR.
With our 10,000 traces from $T_1$, the SNR values for each time sample are shown in
Fig. 4.78. Our POI is 430.

Code-SCA Step 5 Estimate coefficients for the stochastic leakage model.
Following SLM Step a–SLM Step c in Sect. 4.3.2.2, we compute the estimations for
$\alpha_s$ in Eq. 4.66 using dataset $T_2$. With the notations from Sect. 4.3.2.2, we
have $M_{pf} = M_2$. The POI was identified in Code-SCA Step 4. Note that the target
intermediate value is not the value $v$ from Code-SCA Step 1, but words from
$\mathbb{F}_2^{n_C}$. Let $\hat{\alpha}_s$ denote the estimated value for $\alpha_s$
($s = 0, 1, \ldots, n_C - 1$).
Using our dataset $T_2$ for the MOV instruction, POI = 430, and $n_C = 8$, we get the
following estimations $\hat{\alpha}_s$ for $\alpha_s$:

\[
\begin{gathered}
\hat{\alpha}_0 \approx -0.00245761,\quad \hat{\alpha}_1 \approx -0.00130026,\quad \hat{\alpha}_2 \approx -0.00135884,\\
\hat{\alpha}_3 \approx -0.00122801,\quad \hat{\alpha}_4 \approx -0.00131569,\quad \hat{\alpha}_5 \approx -0.00213467,\\
\hat{\alpha}_6 \approx -0.00209748,\quad \hat{\alpha}_7 \approx -0.00221288.
\end{gathered}
\tag{4.67}
\]
Code-SCA Step 6 Compute the estimated signal for each word with Hamming weight
$w_H$. Using the estimated values, $\hat{\alpha}_s$, for the coefficients, according
to the stochastic leakage model (Eq. 4.66), we can compute the estimated signal of
each word $x = x_{n_C - 1} x_{n_C - 2} \ldots x_1 x_0$ from $\mathbb{F}_2^{n_C}$,
denoted SG($x$),

\[ \mathrm{SG}(x) = \sum_{s=0}^{n_C - 1} \hat{\alpha}_s x_s. \tag{4.68} \]

We can identify each integer between 0 and $2^{n_C} - 1$ with a unique binary string
from $\mathbb{F}_2^{n_C}$ using the integer’s binary representation and compute its
estimated signal with Eq. 4.68. Let Words be the table of integers between 0 and
$2^{n_C} - 1$ whose binary representation has Hamming weight $w_H$. Then the table
$T_{SG}$ of estimated signals is constructed such that

\[ T_{SG}[a] = \mathrm{SG}(\mathrm{Words}[a]), \qquad a = 0, 1, \ldots, \binom{n_C}{w_H} - 1. \tag{4.69} \]

For our experiments,

\[
\begin{gathered}
\mathrm{Words} = [\mathtt{3F, 5F, 6F, 77, 7B, 7D, 7E, 9F, AF, B7, BB, BD, BE, CF,}\\
\mathtt{D7, DB, DD, DE, E7, EB, ED, EE, F3, F5, F6, F9, FA, FC}].
\end{gathered}
\tag{4.70}
\]

For example, with $\hat{\alpha}_s$ from Eq. 4.67, we have

\[ T_{SG}[0] = \mathrm{SG}(\mathtt{3F}) = \mathrm{SG}(00111111) = \sum_{i=0}^{5} \hat{\alpha}_i \approx -0.009795. \]

The table $T_{SG}$ can be found in Table E.1 (Appendix E).


Code-SCA Step 7 Find the optimal code. Finally, we search for an optimal
$(n_C, 2^{m_v})$-binary code whose codewords all have Hamming weight $w_H$ using
Algorithm 4.5 (see [MSB16, Algorithm 1]).
The input of the algorithm consists of $m_v$, the maximum bit length of the target
intermediate value from Code-SCA Step 1; $n_C$ and $w_H$, the code length and the
Hamming weight for each codeword chosen in Code-SCA Step 2; Words, the table of
integers between 0 and $2^{n_C} - 1$ with Hamming weight $w_H$ obtained in Code-SCA
Step 6; and $T_{SG}$, the table of estimated signals for each integer from Words as
specified in Eq. 4.69. As discussed in Code-SCA Step 2, the total number of
codewords is $2^{m_v}$ (line 1), and the total number of binary strings of length
$n_C$ and Hamming weight $w_H$ is $\binom{n_C}{w_H}$ (line 2). Firstly, we sort the
values in $T_{SG}$ in ascending order and save the sorted values in $T_{sorted}$,
where $T_{sorted}[0]$ contains the lowest value from $T_{SG}$ (line 6). The array I
records the corresponding integer in Words for each estimated signal in $T_{sorted}$
(lines 7 and 8). Next, the difference between the values in
$T_{sorted}[j + \text{code\_size} - 1]$ and $T_{sorted}[j]$ is stored in the jth entry
of the array D, for $j = 0, 1, \ldots, \text{total\_word} - \text{code\_size}$ (lines 9
and 10). The index of the smallest value in D is denoted by ind (line 11). Finally,
the binary code consists of the codewords that correspond to the estimated signals
$T_{sorted}[\text{ind}]$ to $T_{sorted}[\text{ind} + \text{code\_size} - 1]$ (lines 12
and 13).
With $m_v = 4$, $n_C = 8$, $w_H = 6$, Words from Eq. 4.70, and $T_{SG}$ in Table E.1,
we get the following $(8, 16)$-binary code

\[
\begin{gathered}
C_{(8,16)} := [\mathtt{B7, D7, BD, AF, DD, 77, CF, BB,}\\
\mathtt{DB, 7D, 6F, 7B, F6, FC, EE, FA}].
\end{gathered}
\tag{4.71}
\]

The sorted table $T_{sorted}$ (line 6 of Algorithm 4.5) can be found in Table E.2
(Appendix E), where the codewords are highlighted in blue.
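Code-SCA Steps 2, 6 and 7 can be sketched together in a few lines. The code below enumerates the words of Hamming weight 6, computes their estimated signals from the coefficients of Eq. 4.67, and runs the sliding-window search of Algorithm 4.5. Function and variable names are illustrative; the sort by index is an implementation choice standing in for lines 6 to 8 of the algorithm.

```python
from math import comb

M_V, N_C, W_H = 4, 8, 6
# Estimated coefficients alpha_hat_0 .. alpha_hat_7 from Eq. 4.67.
ALPHA_HAT = [-0.00245761, -0.00130026, -0.00135884, -0.00122801,
             -0.00131569, -0.00213467, -0.00209748, -0.00221288]

def estimated_signal(word):
    """SG(x) from Eq. 4.68: sum of alpha_hat_s over the set bits of x."""
    return sum(ALPHA_HAT[s] for s in range(N_C) if (word >> s) & 1)

def find_code(m_v, words, t_sg):
    """Sliding-window search of Algorithm 4.5 over the sorted estimated signals."""
    code_size = 2 ** m_v
    order = sorted(range(len(words)), key=lambda idx: t_sg[idx])
    t_sorted = [t_sg[idx] for idx in order]
    sorted_words = [words[idx] for idx in order]            # the array I
    spreads = [t_sorted[j + code_size - 1] - t_sorted[j]    # the array D
               for j in range(len(words) - code_size + 1)]
    ind = min(range(len(spreads)), key=spreads.__getitem__)
    return sorted_words[ind:ind + code_size]

words = [x for x in range(2 ** N_C) if bin(x).count("1") == W_H]
assert len(words) == comb(N_C, W_H) > 2 ** M_V              # criterion of Eq. 4.65
t_sg = [estimated_signal(w) for w in words]
code = find_code(M_V, words, t_sg)
print(" ".join(format(c, "02X") for c in code))
```

With these coefficients the selected window consists of the words whose two zero bits fall on one small and one large coefficient, which is why their estimated signals lie close together.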
Algorithm 4.5: Finding the optimal code for encoding countermeasure against SCA
Input: m_v, n_C, w_H, Words, T_SG // m_v is the maximum bit length of the target
    intermediate value identified in Code-SCA Step 1; n_C is the code length and w_H is
    the Hamming weight for each codeword chosen in Code-SCA Step 2; Words is the table of
    integers between 0 and 2^{n_C} − 1 with Hamming weight w_H as discussed in Code-SCA
    Step 6; T_SG is the table of estimated signals for each integer from Words as
    specified in Eq. 4.69.
Output: An (n_C, 2^{m_v})-binary code with each codeword having Hamming weight w_H
1 code_size = 2^{m_v} // number of codewords in our code
2 total_word = binom(n_C, w_H) // total number of words of length n_C and Hamming weight w_H
3 array of size total_word − code_size + 1 D
4 array of size total_word I
  // C will store the codewords
5 array of size code_size C
  // T_sorted[0] contains the lowest value from T_SG
6 T_sorted = T_SG sorted in ascending order
7 for j = 0, j < total_word, j++ do
      // I records the corresponding word in Words for each estimated signal in T_sorted
8     I[j] = Words[index of T_sorted[j] in T_SG]
9 for j = 0, j ≤ total_word − code_size, j++ do
      // the jth entry of D is given by the difference between the value in
      // T_sorted[j + code_size − 1] and T_sorted[j]
10     D[j] = T_sorted[j + code_size − 1] − T_sorted[j]
   // ind is the index of the smallest value in D
11 ind = arg min_j D[j]
   // the code consists of codewords that correspond to the estimated signals
   // T_sorted[ind] to T_sorted[ind + code_size − 1]
12 for j = 0, j < code_size, j++ do
13     C[j] = I[ind + j]
14 return C

We will argue that Algorithm 4.5 indeed outputs an optimal $(n_C, 2^{m_v})$-binary
code that achieves a lower SNR, according to the stochastic leakage model with
coefficients $\hat{\alpha}_s$ obtained in Code-SCA Step 5. First, let
$A = \{a_1, a_2, \ldots, a_\beta\}$ be a set of $\beta$ ($\beta \geq 2$) real numbers.
Define

\[ d(A) := \max \left\{ |a_i - a_j| \;\middle|\; a_i, a_j \in A \right\} \]

to be the largest absolute difference between elements in A. We also define the
variance of values in A, denoted Var(A), by (see Eq. 1.35)

\[ \mathrm{Var}(A) := \frac{1}{\beta} \sum_{i=1}^{\beta} (a_i - \overline{a})^2, \]

where $\overline{a}$ is the average of the $a_i$ given by

\[ \overline{a} = \frac{1}{\beta} \sum_{i=1}^{\beta} a_i. \]

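Since $\overline{a}$ lies between the smallest and the largest element of A, it is
easy to see that

\[ (a_i - \overline{a})^2 \leq d(A)^2, \]

and hence

\[ \mathrm{Var}(A) \leq d(A)^2. \]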

Now, let C be an $(n_C, 2^{m_v})$-binary code, and define

\[ A(C) := \{ \mathrm{SG}(c) \mid c \in C \} \]

to be the set of estimated signals for codewords in C. When C is used for encoding
the target intermediate value, the variance of the signal at POI is then given by

\[ \mathrm{Var}(X_{POI}) = \mathrm{Var}(A(C)). \]

The goal of Algorithm 4.5 is to find a C such that $d(A(C))$ is the minimum among
all $(n_C, 2^{m_v})$-binary codes whose codewords have Hamming weight $w_H$. According
to the above discussions, we can conclude that the SNR of the code found by
the algorithm is also relatively small. Even though this code may not be the one
that achieves the lowest SNR, another code with a lower SNR will have a bigger
$d(A(C))$, which might be exploited to improve the attack results.
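For the code of Eq. 4.71, both quantities in the bound can be evaluated directly from the coefficients of Eq. 4.67. This is a small numerical illustration; the helper names `spread` and `variance` are assumptions of this sketch, and the variance is the mean squared deviation from the average.

```python
# Estimated coefficients from Eq. 4.67 and the code C_(8,16) from Eq. 4.71.
ALPHA_HAT = [-0.00245761, -0.00130026, -0.00135884, -0.00122801,
             -0.00131569, -0.00213467, -0.00209748, -0.00221288]
CODE = [0xB7, 0xD7, 0xBD, 0xAF, 0xDD, 0x77, 0xCF, 0xBB,
        0xDB, 0x7D, 0x6F, 0x7B, 0xF6, 0xFC, 0xEE, 0xFA]

def estimated_signal(word):
    """SG(x): sum of alpha_hat_s over the set bits of x (Eq. 4.68)."""
    return sum(ALPHA_HAT[s] for s in range(8) if (word >> s) & 1)

def spread(values):
    """d(A): largest absolute difference between two elements of A."""
    return max(values) - min(values)

def variance(values):
    """Var(A): mean squared deviation from the average."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

signals = [estimated_signal(c) for c in CODE]
d_code = spread(signals)
var_code = variance(signals)
print(d_code, var_code, var_code <= d_code ** 2)
```

Here d(A(C)) is the difference between SG(FA) and SG(B7), the largest and smallest estimated signals in the code.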

To see how effective the countermeasure is, we have simulated template-based
DPA attacks (see Sect. 4.3.2.3) on the 0th Sbox output of PRESENT. The dataset $T_1$
obtained in Code-SCA Step 3 is used as the profiling traces to build templates for
the unprotected implementation. A total of 10,000 traces were collected with random
inputs from $C_{(8,16)}$ (Eq. 4.71) as profiling traces to build templates for attacks
on protected implementations.
To get the attack traces for the unprotected implementation, for each plaintext
nibble p and a fixed key nibble 9 (the same as the 0th nibble of the key in Eq. 4.1),
we precomputed the Sbox output (see Table 3.11)

\[ v = \mathrm{SB}_{\mathrm{PRESENT}}(p \oplus 9). \]

Then we carried out the measurement for the operation described in Eq. 4.64 with $v$
as the input $a$. 100,000 traces were collected for random plaintext nibbles p. Attack
traces for the protected implementation were obtained in a similar manner. Instead
of $v$, we pass the corresponding codeword from $C_{(8,16)}$, $C_{(8,16)}[v]$, as input
$a$ in Eq. 4.64. We have also measured 100,000 traces with random plaintext nibbles.
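The two kinds of inputs passed to the MOV instruction can be tabulated as follows. The sketch uses the standard PRESENT Sbox as a lookup table (the table the text refers to as Table 3.11) and the code of Eq. 4.71; the function names are illustrative.

```python
# PRESENT Sbox as a lookup table (index = input nibble).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
# Codewords of C_(8,16) from Eq. 4.71; CODE[v] encodes the nibble v.
CODE = [0xB7, 0xD7, 0xBD, 0xAF, 0xDD, 0x77, 0xCF, 0xBB,
        0xDB, 0x7D, 0x6F, 0x7B, 0xF6, 0xFC, 0xEE, 0xFA]
KEY_NIBBLE = 0x9

def sbox_output(p):
    """Unprotected input of the MOV instruction: v = SB(p XOR 9)."""
    return SBOX[p ^ KEY_NIBBLE]

def encoded_sbox_output(p):
    """Protected input of the MOV instruction: the codeword C_(8,16)[v]."""
    return CODE[sbox_output(p)]

for p in range(16):
    print(format(p, "X"), format(sbox_output(p), "X"),
          format(encoded_sbox_output(p), "02X"))
```

Every protected input has Hamming weight 6, while the unprotected inputs take Hamming weights from 0 to 4.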

The attacks follow steps from Sect. 4.3.2.3, where we have set the target signal
to be the exact value of $v$ (or the corresponding codeword). Since we only focus on
one POI, according to Template Step b, we have computed a mean leakage for each
value of $v$ for the unprotected implementation and for each codeword in $C_{(8,16)}$
for the protected implementation. The mean leakage values for different $v$ are given
by

{−0.01055, −0.00943, −0.00680, −0.00772, −0.00698, −0.00778, −0.00656,
−0.00748, −0.00677, −0.00764, −0.00641, −0.00732, −0.00649, −0.00732,
−0.00619, −0.00716}.

The mean leakage values for different codewords are

{−0.01064, −0.01054, −0.01036, −0.01037, −0.01032, −0.01058, −0.01038,
−0.01031, −0.01023, −0.01027, −0.01036, −0.01034, −0.01021, −0.01008,
−0.01014, −0.00998}.

It is easy to see that the differences between mean leakages in the first set are
bigger compared to those in the second set. If we compute the variance between
mean leakages in those two sets, we get

\[ 1.2067 \times 10^{-6} \quad \text{and} \quad 2.8459 \times 10^{-8}. \]

This shows that it is more difficult to distinguish between the leakages of codewords
in $C_{(8,16)}$ than those of different values of $v$. Since DPA attacks rely on
exploiting the difference between leakages for different data, we expect the
protected implementation to be more challenging to attack with DPA.
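The two variances can be recomputed from the printed mean leakages. Since the means above are rounded to five decimal places, the results agree with the reported values only approximately; the sketch assumes the variance is the mean squared deviation from the average.

```python
# Mean leakages as printed above (rounded to five decimal places).
UNPROTECTED = [-0.01055, -0.00943, -0.00680, -0.00772, -0.00698, -0.00778,
               -0.00656, -0.00748, -0.00677, -0.00764, -0.00641, -0.00732,
               -0.00649, -0.00732, -0.00619, -0.00716]
PROTECTED = [-0.01064, -0.01054, -0.01036, -0.01037, -0.01032, -0.01058,
             -0.01038, -0.01031, -0.01023, -0.01027, -0.01036, -0.01034,
             -0.01021, -0.01008, -0.01014, -0.00998]

def variance(values):
    """Mean squared deviation from the average."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

var_unprotected = variance(UNPROTECTED)
var_protected = variance(PROTECTED)
print(var_unprotected, var_protected, var_unprotected / var_protected)
```

The ratio of the two variances gives a rough measure of how much the encoding shrinks the data-dependent part of the leakage at the POI.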
The attack results are shown in Figs. 4.79 and 4.80. Computations of estimations
for success rates and guessing entropy followed Algorithm 4.1, where we have set

max_trace = 1000, no_of_attack = 100.

We can see that the unprotected implementation can be broken with about 150
traces, while the protected implementation cannot be broken with even 1000 traces.
We note that the number of traces required for a successful attack on unprotected
implementation is more than what we have obtained in Sect. 4.3.2.3 (see Figs. 4.51
and 4.52). This is expected as the highest SNR we have for MOV instruction
(Fig. 4.78) is much less than that for one round of PRESENT (Fig. 4.20).
For comparison, we have also repeated the same steps for the proposed
countermeasure with different values of $w_H = 2, 3, 4, 5$. The template-based DPA
attack results are shown in Figs. 4.81 and 4.82. We can see that all the codes
increase the number of traces needed for a successful attack. And the code with
$w_H = 4$ behaves the best.
[Plot: success rate against number of traces (100 to 1,000); curves: Unprotected, wH = 6]

Fig. 4.79 Estimations of success rate computed following Algorithm 4.1 for template-based DPA
attack on the MOV instruction taking the PRESENT Sbox output as an input. The black line
corresponds to unprotected intermediate values. The blue line corresponds to encoded intermediate
values with the binary code $C_{(8,16)}$ (Eq. 4.71), where all codewords have Hamming weight 6

[Plot for Fig. 4.80: guessing entropy versus number of traces (100 to 1,000); curves: Unprotected, $w_H = 6$]

Fig. 4.80 Estimations of guessing entropy computed following Algorithm 4.1 for template-based DPA attack on the MOV instruction taking the PRESENT Sbox output as an input. The black line corresponds to unprotected intermediate values. The blue line corresponds to encoded intermediate values with the binary code $C_{(8,16)}$ (Eq. 4.71), where all codewords have Hamming weight 6

We note that the presented countermeasure focuses on one instruction; therefore, the chosen binary code is only optimal for the target instruction and target intermediate value. Nevertheless, the whole cipher state can be encoded, giving a certain level of protection to all the other instructions. A method of encoding

[Plot for Fig. 4.81: success rate versus number of traces (100 to 1,000); curves: Unprotected, $w_H = 2$, $w_H = 3$, $w_H = 4$, $w_H = 5$, $w_H = 6$]

Fig. 4.81 Estimations of success rate computed following Algorithm 4.1 for template-based DPA attack on the MOV instruction taking the PRESENT Sbox output as an input. The black line corresponds to unprotected intermediate values. The other lines correspond to encoded intermediate values with $(8, 16)$-binary codes obtained following Code-SCA Step 1 to Code-SCA Step 7, where we have set $w_H = 2, 3, 4, 5, 6$

[Plot for Fig. 4.82: guessing entropy versus number of traces (100 to 1,000); curves: Unprotected, $w_H = 2$, $w_H = 3$, $w_H = 4$, $w_H = 5$, $w_H = 6$]

Fig. 4.82 Estimations of guessing entropy computed following Algorithm 4.1 for template-based DPA attack on the MOV instruction taking the PRESENT Sbox output as an input. The black line corresponds to unprotected intermediate values. The other lines correspond to encoded intermediate values with $(8, 16)$-binary codes obtained following Code-SCA Step 1 to Code-SCA Step 7, where we have set $w_H = 2, 3, 4, 5, 6$

the whole encryption computation will be discussed in Sect. 5.2.1. Different codes
might work for different devices, but as implementers, we would have access to
the device we want to protect and can choose the best code that is suitable for the
device.[10]

4.5.1.2 Square and Multiply Always

In Sect. 4.4.1 we have seen one SPA attack on RSA implementations that exploits
the part of the square and multiply algorithm where multiplication is carried out only
when the secret key bit is 1. A natural countermeasure is that we always compute
multiplication no matter what the value of the secret key bit is. Such an algorithm is
called the square and multiply always algorithm [Cor99].
We keep the notations from Sect. 3.3. Let $n = pq$ be the product of two distinct odd primes. Let $d \in \mathbb{Z}^*_{\varphi(n)}$ be the secret key of RSA/RSA signatures. We would like to compute

$$a^d \bmod n$$

for some $a \in \mathbb{Z}_n$.
Recall that we have presented the right-to-left (Algorithm 3.7) and left-to-right (Algorithm 3.8) square and multiply algorithms. Correspondingly, we have the right-to-left and left-to-right square and multiply always algorithms, detailed in Algorithms 4.6 and 4.7, respectively. In both algorithms, the modular multiplication is always carried out, and when the secret bit is 0, the result is discarded (line 6 in Algorithm 4.6 and line 7 in Algorithm 4.7).
As an illustration, let us consider our attack presented in Sect. 4.4.1. With the square and multiply always countermeasure, Algorithm 4.2 becomes Algorithm 4.8. With the same experimental setting as described in Sect. 4.1, we have measured one trace for the computation of Algorithm 4.8 with our DUT. To make sure line 10 is executed, we have turned off compiler optimization. The trace is shown in Fig. 4.83. We still observe ten patterns, the same as in Sect. 4.4.1, but in this case all of them have more than one peak cluster. We know from the discussions in Sect. 4.4.1 that this is because each pattern corresponds to one iteration of the loop starting at line 5, and each iteration contains one modular square (line 6) and one modular multiplication (line 8 or line 10). Thus, we cannot repeat the attack presented in Sect. 4.4.1. However, we can deduce that the secret key has bit length 10. In practical settings, the bit length will be much bigger. To the best of our knowledge, this information alone cannot reveal the secret key.
On the other hand, we will show that the square and multiply always algorithm is still vulnerable to the DPA attack presented in Sect. 4.4.2. With the square and multiply always countermeasure, Algorithm 4.4 becomes Algorithm 4.9.

[10] Naturally, creating a different code for every device would be impractical for serial production.

Algorithm 4.6: Right-to-left square and multiply always algorithm for computing modular exponentiation. A hiding-based countermeasure against SCA attacks
Input: n, a, d  // n ∈ Z, n ≥ 2; a ∈ Z_n; d ∈ Z_ϕ(n) has bit length ℓ_d
Output: a^d mod n
1 result = 1, t = a
2 for i = 0, i < ℓ_d, i++ do
      // ith bit of d is 1
3     if d_i = 1 then
          // multiply by a^(2^i)
4         result = result ∗ t mod n
5     else
          // ith bit of d is 0, compute multiplication and discard the result
6         tmp = result ∗ t mod n
      // t = a^(2^(i+1))
7     t = t ∗ t mod n
8 return result

Algorithm 4.7: Left-to-right square and multiply always algorithm for computing modular exponentiation. A hiding-based countermeasure against SCA attacks
Input: n, a, d  // n ∈ Z, n ≥ 2; a ∈ Z_n; d ∈ Z_ϕ(n)
Output: a^d mod n
1 t = 1
2 for i = ℓ_d − 1, i ≥ 0, i−− do
3     t = t ∗ t mod n
      // ith bit of d is 1
4     if d_i = 1 then
5         t = a ∗ t mod n
6     else
          // ith bit of d is 0, compute multiplication and discard the result
7         tmp = a ∗ t mod n
8 return t
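Algorithms 4.6 and 4.7 can be sketched in Python as follows. This is only a functional illustration: the function names are ours, Python integers are not constant time, and an interpreter gives no guarantee about what actually executes on a device, so the sketch says nothing about real leakage. It only shows that one multiplication is computed in every iteration and that the result still equals $a^d \bmod n$.

```python
def sq_mul_always_rtl(a, d, n):
    """Right-to-left square and multiply always (sketch of Algorithm 4.6)."""
    result, t = 1, a % n
    for i in range(d.bit_length()):
        if (d >> i) & 1:
            result = (result * t) % n   # bit is 1: keep the product
        else:
            tmp = (result * t) % n      # bit is 0: same work, product discarded
        t = (t * t) % n                 # t = a^(2^(i+1)) mod n
    return result


def sq_mul_always_ltr(a, d, n):
    """Left-to-right square and multiply always (sketch of Algorithm 4.7)."""
    t = 1
    for i in reversed(range(d.bit_length())):
        t = (t * t) % n
        if (d >> i) & 1:
            t = (a * t) % n             # bit is 1: keep the product
        else:
            tmp = (a * t) % n           # bit is 0: same work, product discarded
    return t


# Parameters used in the illustrations of this section: n = 1189, d = 747.
print(sq_mul_always_rtl(5, 747, 1189) == pow(5, 747, 1189))
print(sq_mul_always_ltr(5, 747, 1189) == pow(5, 747, 1189))
```

In compiled code the discarded multiplication may be removed by the optimizer, which is why such implementations need to be checked at the instruction level.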

With the same experimental setting as in Sects. 4.1 and 4.4.2, one trace for the computation of Algorithm 4.9 is shown in Fig. 4.84. We note that there are 21 similar patterns in the figure. By examining Algorithm 4.9, we can guess that each of them might correspond to one iteration of the loop starting at line 6 or to one execution of MonPro. If the former were true, the private key d would have a bit length bigger than the bit length of n. Thus, we can conclude that most likely each of them corresponds to one execution of MonPro. Then the last one corresponds to line 12, and the remaining 20, two per loop iteration, tell us that $\ell_d = 10$.
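The count of 21 MonPro executions can be checked with a small functional model of Algorithm 4.9. In this sketch, `mon_pro` is a stand-in that directly returns $xy r^{-1} \bmod n$; it is not Algorithm 4.3, and the function name `mont_exp_always` is ours. The three-argument `pow` with exponent `-1` requires Python 3.8 or later.

```python
def mont_exp_always(a, d=747, n=1189, r=2048):
    """Montgomery left-to-right square and multiply always.

    Returns (a^d mod n, number of MonPro calls).
    """
    r_inv = pow(r, -1, n)   # modular inverse of r modulo n
    calls = 0

    def mon_pro(x, y):
        # Stand-in for MonPro: computes x * y * r^(-1) mod n directly.
        nonlocal calls
        calls += 1
        return (x * y * r_inv) % n

    t_r = r % n             # Montgomery form of 1
    a_r = (a * r) % n       # Montgomery form of a
    for i in reversed(range(d.bit_length())):
        t_r = mon_pro(t_r, t_r)
        if (d >> i) & 1:
            t_r = mon_pro(t_r, a_r)
        else:
            tmp = mon_pro(t_r, a_r)   # computed and discarded
    return mon_pro(t_r, 1), calls     # leave Montgomery form


value, calls = mont_exp_always(5)
print(value == pow(5, 747, 1189), calls)
```

Each of the ten loop iterations calls the stand-in twice and the final conversion calls it once, which gives the $2 \cdot 10 + 1 = 21$ executions counted in the trace.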

Algorithm 4.8: Protected implementation of Algorithm 4.2. Left-to-right square and multiply always algorithm for computing modular exponentiation (see Algorithm 3.8) with parameters from Eq. 4.59
Input: a  // a ∈ Z_1189
Output: a^747 mod 1189
1 n = 1189
2 dbin = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]  // binary representation of d = 747, d_0 = 1, d_1 = 1
3 ℓ_d = length of dbin  // bit length of d
4 t = 1
5 for i = ℓ_d − 1, i ≥ 0, i−− do
6     t = t ∗ t mod n
      // ith bit of d is 1
7     if d_i = 1 then
8         t = a ∗ t mod n
9     else
          // ith bit of d is 0, compute multiplication and discard the result
10        tmp = a ∗ t mod n
11 return t

Fig. 4.83 One trace corresponding to the computation of Algorithm 4.8. We can see ten similar
patterns

Following the attack steps as in Sect. 4.4.2, we have collected $M = 10{,}000$ traces, each with $q = 10{,}800$ time samples (11). The sample correlation coefficients (see DPA-RSA Step 6) are shown in Fig. 4.85, where the trace from Fig. 4.84 is in gray in the background. We can see that there are 21 patterns in the sample correlation coefficient plot which coincide with those from the trace plot in Fig. 4.84; each corresponds to one execution of MonPro.
We also see mainly two different patterns, one with a higher positive peak cluster and one with a lower positive peak cluster. They are colored in blue and green, respectively, in Fig. 4.86. Clearly, the one with a higher positive peak cluster corresponds to the computation of line 9 or line 11 since it results in a bigger

Algorithm 4.9: Montgomery left-to-right square and multiply always algorithm with parameters from Eq. 4.59. MonPro is given by Algorithm 4.3
Input: a  // a ∈ Z_n
Output: a^747 mod 1189
1 n = 1189, r = 2048
2 dbin = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]  // binary representation of d = 747, d_0 = 1, d_1 = 1
3 ℓ_d = length of dbin  // bit length of d
4 t_r = r mod n
5 a_r = a ∗ r mod n
6 for i = ℓ_d − 1, i ≥ 0, i−− do
7     t_r = MonPro(t_r, t_r)  // t_r = t_r ×_Mon t_r
8     if dbin[i] = 1 then
9         t_r = MonPro(t_r, a_r)  // t_r = t_r ×_Mon a_r
10    else
          // ith bit of d is 0, compute multiplication and discard the result
11        tmp = MonPro(t_r, a_r)
12 t = MonPro(t_r, 1)  // t = t_r ×_Mon 1 = t_r ∗ r^(−1) mod n
13 return t

Fig. 4.84 One trace corresponding to the computation of Algorithm 4.9. We can see 21 similar
patterns. Each of them corresponds to one execution of MonPro

correlation coefficient with $a_r$. The green-colored patterns correspond to line 7, except for the last one, which corresponds to line 12. Our attack will work if we can distinguish between line 9 and line 11. We take a closer look at the blue-colored patterns. We can see that some of them have a high peak at the end; they are colored in lighter blue. The others do not have this high peak; they are colored in darker blue. We know that the first secret bit $d_{\ell_d - 1} = 1$; thus we deduce that the lighter blue-colored patterns correspond to line 9 and the darker blue-colored ones correspond to line 11. Consequently, each lighter blue-colored pattern indicates a secret bit equal to 1, and each darker blue-colored pattern indicates a secret bit equal to 0. We can write down the bits of the secret key as follows:
[Plot for Fig. 4.85; vertical axis: sample correlation coefficient]

Fig. 4.85 Sample correlation coefficients computed following attack steps from Sect. 4.4.2 with 10,000 traces for the computation of Algorithm 4.9. The trace from Fig. 4.84 is gray in the background. We can see that there are 21 patterns in the sample correlation coefficient plot that coincide with those from Fig. 4.84; each corresponds to one execution of MonPro
[Plot for Fig. 4.86; vertical axis: sample correlation coefficient]

Fig. 4.86 There are mainly two types of patterns in the sample correlation coefficient plot from Fig. 4.85: one with a higher peak cluster (colored in blue) and one with a lower peak cluster (colored in green). Among the blue-colored patterns, we further divide them into two types: one with a high peak at the end (in lighter blue) and one without this peak (in darker blue)

$$1\ 0\ 1\ 1\ 1\ 0\ 1\ 0\ 1\ 1,$$

and the secret key d is given by

$$d = 1011101011_2 = 747.$$

4.5.2 Masking and Blinding

As mentioned before, the goal of masking/blinding is to randomize the intermediate values being processed in the DUT. When such a countermeasure is applied to symmetric block ciphers, we refer to it as masking. When it is applied to public key cryptosystem implementations, it is, by convention, called blinding instead.
Let $v$ be the secret intermediate value that we would like to mask. The masked value, denoted $v_m$, is concealed by a random value $m$, called a mask, with a binary operation $\cdot$ such that

$$v_m = v \cdot m.$$

When the binary operation $\cdot$ is given by bitwise XOR, we have a Boolean masking. When $\cdot$ is a modular addition or modular multiplication, we have an arithmetic masking. We will discuss Boolean masking for symmetric block ciphers in Sects. 4.5.2.1–4.5.2.3 and arithmetic masking for RSA implementations in Sect. 4.5.2.4.
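The three choices of the binary operation can be illustrated with a few lines of Python. This is a sketch with hypothetical helper names; the modulus 257 is an arbitrary prime chosen here so that every nonzero multiplicative mask is invertible, and it is not taken from the text.

```python
import secrets


def boolean_mask(v, m):
    """Boolean masking: v_m = v XOR m (applying it again removes the mask)."""
    return v ^ m


def additive_mask(v, m, modulus):
    """Arithmetic masking with modular addition: v_m = v + m mod modulus."""
    return (v + m) % modulus


def multiplicative_mask(v, m, modulus):
    """Arithmetic masking with modular multiplication: v_m = v * m mod modulus."""
    return (v * m) % modulus


v = 0xA7                                    # example secret intermediate value
m = secrets.randbelow(256)                  # fresh random byte mask
print(boolean_mask(boolean_mask(v, m), m) == v)

modulus = 257                               # illustrative prime modulus (assumption)
m_add = secrets.randbelow(modulus)
print((additive_mask(v, m_add, modulus) - m_add) % modulus == v)

m_mul = 1 + secrets.randbelow(modulus - 1)  # nonzero, hence invertible mod 257
unmasked = (multiplicative_mask(v, m_mul, modulus) * pow(m_mul, -1, modulus)) % modulus
print(unmasked == v)
```

In each case the device only ever handles the masked value and the mask, never $v$ itself.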

4.5.2.1 Introduction to Boolean Masking

As one can imagine, the cryptographic algorithm needs to be changed slightly so that we can carry out computations with the masked intermediate values and keep track of all the masks, and so that, at the end of the encryption, we can remove the masks to output the original ciphertext. In general, a masking scheme specifies how masks are applied to the plaintext and intermediate values, as well as how they are removed from the ciphertext. There are a few principles we follow for a masking scheme design:
• All intermediate values should be masked during the computation. In particular,
we would apply masks to the plaintext (and the key).
• We assume the attacker does not have knowledge of the masks—otherwise, the
attacker can carry out similar DPA attacks by making hypotheses about the key
values as in Sects. 4.3.1 and 4.3.2.
• When some intermediate values are to be XOR-ed with each other (e.g., in AES
MixColumns operation), different masks should be applied to each of them.
Otherwise, the same valued masks will cancel out.
• Each encryption has a different set of randomly generated masks.
For any function f , the mask that is applied to an input of f is called the input
mask of f . The corresponding mask for the output is called the output mask of f .
Definition 4.5.1 Let $f: \mathbb{F}_2^{m_1} \to \mathbb{F}_2^{m_2}$ be a function, where $m_1$ and $m_2$ are positive integers. $f$ is said to be linear (w.r.t. $\oplus$) if for any $x, y \in \mathbb{F}_2^{m_1}$, we have

$$f(x \oplus y) = f(x) \oplus f(y).$$

$f$ is nonlinear if it is not linear.



Example 4.5.1
• AddRoundKey operation in AES (Sect. 3.1.2) round function is a linear function.
In fact, bitwise XOR with a round key is a linear function in general.
• DES (Sect. 3.1.1) Sboxes are nonlinear functions. Any Sbox proposed so far for
symmetric block ciphers is nonlinear.
• pLayer in PRESENT (Sect. 3.1.3) round function is linear.
• MixColumns operation in AES is linear (see Remark 3.1.3).
With Boolean masking, it is easy to keep track of the masks with linear operations. Let $f$ be a linear function, and take any input $v$ of $f$ with a corresponding mask $m$; we have

$$f(v \oplus m) = f(v) \oplus f(m).$$

Thus, when the input mask is $m$, the output mask is given by $f(m)$. One of the main challenges in designing a masking scheme is to find ways to keep track of masks for nonlinear operations.
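The difference between the linear and the nonlinear case can be checked directly on PRESENT. The sketch below uses the standard PRESENT bit permutation (bit $i$ moves to position $16i \bmod 63$, bit 63 stays) and the standard PRESENT Sbox; the function and variable names are ours.

```python
import secrets

# PRESENT Sbox (4-bit input, 4-bit output).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def p_layer(state):
    """PRESENT pLayer on a 64-bit integer state."""
    out = 0
    for i in range(64):
        j = 63 if i == 63 else (16 * i) % 63
        out |= ((state >> i) & 1) << j
    return out


# Linear operation: the output mask is simply p_layer(m).
v = secrets.randbits(64)
m = secrets.randbits(64)
print(p_layer(v ^ m) == p_layer(v) ^ p_layer(m))

# Nonlinear operation: SBOX[x ^ y] != SBOX[x] ^ SBOX[y] for many pairs (x, y),
# so the output mask cannot be obtained by applying the Sbox to the input mask.
counterexamples = [(x, y) for x in range(16) for y in range(16)
                   if SBOX[x ^ y] != SBOX[x] ^ SBOX[y]]
print(len(counterexamples) > 0)
```

This is why the schemes that follow treat the Sbox separately, with precomputed masked tables.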

4.5.2.2 Boolean Masking for AES-128

In this part, we will discuss a masking scheme for AES-128. The scheme was first
proposed in [HOM06], see also [MOP08, Section 9.2.1].
The only nonlinear operation in AES encryption is SubBytes. Let SB denote the AES Sbox. We will consider a table lookup implementation (see Sect. 3.2.1) for the SubBytes operation. We choose an input mask $m_{\mathrm{in,SB}}$ and an output mask $m_{\mathrm{out,SB}}$ for SB. Then we generate a table that implements the masked Sbox, denoted $\mathrm{SB}_m$, such that

$$\mathrm{SB}_m(v \oplus m_{\mathrm{in,SB}}) = \mathrm{SB}(v) \oplus m_{\mathrm{out,SB}}. \tag{4.72}$$

The masking scheme works as follows. Firstly, at the beginning of each encryption:
• We randomly generate six independent masks with values from $\mathbb{F}_2^8$, denoted by

$$m_{\mathrm{in,SB}}, \quad m_{\mathrm{out,SB}}, \quad m_0, \quad m_1, \quad m_2, \quad m_3.$$

$m_{\mathrm{in,SB}}$ and $m_{\mathrm{out,SB}}$ will be the input and output masks for the AES Sbox computation. $m_0, m_1, m_2, m_3$ will be used as input masks for the MixColumns operation.
• Compute the lookup table for the masked Sbox as given in Eq. 4.72.
• Calculate $m'_0, m'_1, m'_2, m'_3$ from $m_0, m_1, m_2, m_3$ using the MixColumns operation (see Eq. 3.6):

$$\begin{pmatrix} m'_0 \\ m'_1 \\ m'_2 \\ m'_3 \end{pmatrix} = \begin{pmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{pmatrix} \begin{pmatrix} m_0 \\ m_1 \\ m_2 \\ m_3 \end{pmatrix}. \tag{4.73}$$
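The precomputation in Eq. 4.73 can be sketched as follows. The helper names `xtime` and `mix_column` are ours; the arithmetic is the usual AES multiplication in $\mathbb{F}_{2^8}$ with the polynomial $x^8 + x^4 + x^3 + x + 1$. The last lines check that a column masked with $m_0, \dots, m_3$ leaves MixColumns masked with $m'_0, \dots, m'_3$.

```python
import secrets


def xtime(b):
    """Multiply a byte by 02 in GF(2^8) with the AES reduction polynomial."""
    b <<= 1
    return (b ^ 0x11B) & 0xFF if b & 0x100 else b


def mix_column(col):
    """AES MixColumns applied to one column [c0, c1, c2, c3]."""
    c0, c1, c2, c3 = col
    mul3 = lambda b: xtime(b) ^ b   # multiplication by 03
    return [
        xtime(c0) ^ mul3(c1) ^ c2 ^ c3,
        c0 ^ xtime(c1) ^ mul3(c2) ^ c3,
        c0 ^ c1 ^ xtime(c2) ^ mul3(c3),
        mul3(c0) ^ c1 ^ c2 ^ xtime(c3),
    ]


# Masks m_0..m_3 and the derived masks m'_0..m'_3 of Eq. 4.73.
m = [secrets.randbelow(256) for _ in range(4)]
m_prime = mix_column(m)

# A state column masked with m_i leaves MixColumns masked with m'_i.
s = [secrets.randbelow(256) for _ in range(4)]
masked_in = [si ^ mi for si, mi in zip(s, m)]
expected = [ci ^ mpi for ci, mpi in zip(mix_column(s), m_prime)]
print(mix_column(masked_in) == expected)
```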

Let us keep the matrix representation of the AES cipher state as in Eq. 3.2. During the encryption, the masking scheme continues as follows. We apply masks $m'_0, m'_1, m'_2, m'_3$ to the plaintext such that the 4 bytes in row $i + 1$ are masked with $m'_i$. Then the cipher state before the initial AddRoundKey is of the format

$$\begin{pmatrix} s_{00} \oplus m'_0 & s_{01} \oplus m'_0 & s_{02} \oplus m'_0 & s_{03} \oplus m'_0 \\ s_{10} \oplus m'_1 & s_{11} \oplus m'_1 & s_{12} \oplus m'_1 & s_{13} \oplus m'_1 \\ s_{20} \oplus m'_2 & s_{21} \oplus m'_2 & s_{22} \oplus m'_2 & s_{23} \oplus m'_2 \\ s_{30} \oplus m'_3 & s_{31} \oplus m'_3 & s_{32} \oplus m'_3 & s_{33} \oplus m'_3 \end{pmatrix}. \tag{4.74}$$

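We will not detail the masking scheme for the key schedule. For a round key $K$, we use the following matrix representation:

$$\begin{pmatrix} k_{00} & k_{01} & k_{02} & k_{03} \\ k_{10} & k_{11} & k_{12} & k_{13} \\ k_{20} & k_{21} & k_{22} & k_{23} \\ k_{30} & k_{31} & k_{32} & k_{33} \end{pmatrix}.$$

We assume that the round keys, except for the last round key, are all masked such that the bytes in row $i + 1$ are masked with $m'_i \oplus m_{\mathrm{in,SB}}$. Then for a round key $K$, the representation of its masked value in the matrix format will be

$$\begin{pmatrix} k_{00} \oplus m'_0 \oplus m_{\mathrm{in,SB}} & k_{01} \oplus m'_0 \oplus m_{\mathrm{in,SB}} & k_{02} \oplus m'_0 \oplus m_{\mathrm{in,SB}} & k_{03} \oplus m'_0 \oplus m_{\mathrm{in,SB}} \\ k_{10} \oplus m'_1 \oplus m_{\mathrm{in,SB}} & k_{11} \oplus m'_1 \oplus m_{\mathrm{in,SB}} & k_{12} \oplus m'_1 \oplus m_{\mathrm{in,SB}} & k_{13} \oplus m'_1 \oplus m_{\mathrm{in,SB}} \\ k_{20} \oplus m'_2 \oplus m_{\mathrm{in,SB}} & k_{21} \oplus m'_2 \oplus m_{\mathrm{in,SB}} & k_{22} \oplus m'_2 \oplus m_{\mathrm{in,SB}} & k_{23} \oplus m'_2 \oplus m_{\mathrm{in,SB}} \\ k_{30} \oplus m'_3 \oplus m_{\mathrm{in,SB}} & k_{31} \oplus m'_3 \oplus m_{\mathrm{in,SB}} & k_{32} \oplus m'_3 \oplus m_{\mathrm{in,SB}} & k_{33} \oplus m'_3 \oplus m_{\mathrm{in,SB}} \end{pmatrix}. \tag{4.75}$$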
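After the initial AddRoundKey, according to Eqs. 4.74 and 4.75, the cipher state becomes

$$\begin{pmatrix} s_{00} \oplus m_{\mathrm{in,SB}} & s_{01} \oplus m_{\mathrm{in,SB}} & s_{02} \oplus m_{\mathrm{in,SB}} & s_{03} \oplus m_{\mathrm{in,SB}} \\ s_{10} \oplus m_{\mathrm{in,SB}} & s_{11} \oplus m_{\mathrm{in,SB}} & s_{12} \oplus m_{\mathrm{in,SB}} & s_{13} \oplus m_{\mathrm{in,SB}} \\ s_{20} \oplus m_{\mathrm{in,SB}} & s_{21} \oplus m_{\mathrm{in,SB}} & s_{22} \oplus m_{\mathrm{in,SB}} & s_{23} \oplus m_{\mathrm{in,SB}} \\ s_{30} \oplus m_{\mathrm{in,SB}} & s_{31} \oplus m_{\mathrm{in,SB}} & s_{32} \oplus m_{\mathrm{in,SB}} & s_{33} \oplus m_{\mathrm{in,SB}} \end{pmatrix}, \tag{4.76}$$

where each byte is masked with $m_{\mathrm{in,SB}}$.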

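For round 1 to round 9, the changes in cipher states after each operation of AES-128 with masked implementation are detailed below:
• SubBytes. The SubBytes operation is performed using the table designed for $\mathrm{SB}_m$. By Eq. 4.72, after the SubBytes operation, each byte of the cipher state is masked by $m_{\mathrm{out,SB}}$:

$$\begin{pmatrix} s_{00} \oplus m_{\mathrm{out,SB}} & s_{01} \oplus m_{\mathrm{out,SB}} & s_{02} \oplus m_{\mathrm{out,SB}} & s_{03} \oplus m_{\mathrm{out,SB}} \\ s_{10} \oplus m_{\mathrm{out,SB}} & s_{11} \oplus m_{\mathrm{out,SB}} & s_{12} \oplus m_{\mathrm{out,SB}} & s_{13} \oplus m_{\mathrm{out,SB}} \\ s_{20} \oplus m_{\mathrm{out,SB}} & s_{21} \oplus m_{\mathrm{out,SB}} & s_{22} \oplus m_{\mathrm{out,SB}} & s_{23} \oplus m_{\mathrm{out,SB}} \\ s_{30} \oplus m_{\mathrm{out,SB}} & s_{31} \oplus m_{\mathrm{out,SB}} & s_{32} \oplus m_{\mathrm{out,SB}} & s_{33} \oplus m_{\mathrm{out,SB}} \end{pmatrix}. \tag{4.77}$$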

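• ShiftRows. ShiftRows does not change the masks; each byte of the cipher state is still masked by $m_{\mathrm{out,SB}}$.
• MixColumns. Before MixColumns, we change the masks of the cipher state by XOR-ing the four bytes in row $i + 1$ with $m_i \oplus m_{\mathrm{out,SB}}$, which replaces the mask $m_{\mathrm{out,SB}}$ by $m_i$. In this way, the input of MixColumns is of the format

$$\begin{pmatrix} s_{00} \oplus m_0 & s_{01} \oplus m_0 & s_{02} \oplus m_0 & s_{03} \oplus m_0 \\ s_{10} \oplus m_1 & s_{11} \oplus m_1 & s_{12} \oplus m_1 & s_{13} \oplus m_1 \\ s_{20} \oplus m_2 & s_{21} \oplus m_2 & s_{22} \oplus m_2 & s_{23} \oplus m_2 \\ s_{30} \oplus m_3 & s_{31} \oplus m_3 & s_{32} \oplus m_3 & s_{33} \oplus m_3 \end{pmatrix}.$$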

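By the choice of $m'_i$ (see Eq. 4.73), the cipher state at the output of MixColumns is of the same format as in Eq. 4.74:

$$\begin{pmatrix} s_{00} \oplus m'_0 & s_{01} \oplus m'_0 & s_{02} \oplus m'_0 & s_{03} \oplus m'_0 \\ s_{10} \oplus m'_1 & s_{11} \oplus m'_1 & s_{12} \oplus m'_1 & s_{13} \oplus m'_1 \\ s_{20} \oplus m'_2 & s_{21} \oplus m'_2 & s_{22} \oplus m'_2 & s_{23} \oplus m'_2 \\ s_{30} \oplus m'_3 & s_{31} \oplus m'_3 & s_{32} \oplus m'_3 & s_{33} \oplus m'_3 \end{pmatrix}.$$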

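• AddRoundKey. After the AddRoundKey of the round, the cipher state has the same format as the input of this round, as given in Eq. 4.76:

$$\begin{pmatrix} s_{00} \oplus m_{\mathrm{in,SB}} & s_{01} \oplus m_{\mathrm{in,SB}} & s_{02} \oplus m_{\mathrm{in,SB}} & s_{03} \oplus m_{\mathrm{in,SB}} \\ s_{10} \oplus m_{\mathrm{in,SB}} & s_{11} \oplus m_{\mathrm{in,SB}} & s_{12} \oplus m_{\mathrm{in,SB}} & s_{13} \oplus m_{\mathrm{in,SB}} \\ s_{20} \oplus m_{\mathrm{in,SB}} & s_{21} \oplus m_{\mathrm{in,SB}} & s_{22} \oplus m_{\mathrm{in,SB}} & s_{23} \oplus m_{\mathrm{in,SB}} \\ s_{30} \oplus m_{\mathrm{in,SB}} & s_{31} \oplus m_{\mathrm{in,SB}} & s_{32} \oplus m_{\mathrm{in,SB}} & s_{33} \oplus m_{\mathrm{in,SB}} \end{pmatrix}.$$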

We can repeat the above for every round from round 1 to round 9. Finally, the input of round 10 is in the form of Eq. 4.76. After SubBytes and ShiftRows in round 10, the cipher state will be the same as in Eq. 4.77. Thus we require that each byte of the last round key is masked by $m_{\mathrm{out,SB}}$. In this way, we will get unmasked ciphertext.
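The mask flow through one middle round (rounds 1 to 9) can be checked with a short simulation. The sketch below is an illustration under stated assumptions: it uses a randomly shuffled byte permutation as a stand-in Sbox rather than the AES Sbox (the mask bookkeeping does not depend on the Sbox values), random bytes as state and round key, and hypothetical helper names. It follows the mask-change step with $m_i \oplus m_{\mathrm{out,SB}}$ before MixColumns.

```python
import random


def xtime(b):
    """Multiply a byte by 02 in GF(2^8) with the AES reduction polynomial."""
    b <<= 1
    return (b ^ 0x11B) & 0xFF if b & 0x100 else b


def mix_column(col):
    c0, c1, c2, c3 = col
    mul3 = lambda b: xtime(b) ^ b
    return [xtime(c0) ^ mul3(c1) ^ c2 ^ c3, c0 ^ xtime(c1) ^ mul3(c2) ^ c3,
            c0 ^ c1 ^ xtime(c2) ^ mul3(c3), mul3(c0) ^ c1 ^ c2 ^ xtime(c3)]


def shift_rows(s):                      # s[row][col]; row r rotates left by r
    return [s[r][r:] + s[r][:r] for r in range(4)]


def mix_columns(s):
    cols = [mix_column([s[r][c] for r in range(4)]) for c in range(4)]
    return [[cols[c][r] for c in range(4)] for r in range(4)]


def xor_rows(s, row_masks):             # XOR every byte of row r with row_masks[r]
    return [[b ^ row_masks[r] for b in s[r]] for r in range(4)]


def xor_state(s, k):
    return [[s[r][c] ^ k[r][c] for c in range(4)] for r in range(4)]


def check_masked_round(seed):
    rng = random.Random(seed)
    sb = list(range(256))
    rng.shuffle(sb)                     # stand-in bijective Sbox, NOT the AES Sbox
    m_in, m_out = rng.randrange(256), rng.randrange(256)
    m = [rng.randrange(256) for _ in range(4)]
    m_prime = mix_column(m)             # Eq. 4.73
    sb_m = [0] * 256
    for v in range(256):
        sb_m[v ^ m_in] = sb[v] ^ m_out  # Eq. 4.72
    state = [[rng.randrange(256) for _ in range(4)] for _ in range(4)]
    key = [[rng.randrange(256) for _ in range(4)] for _ in range(4)]

    # Unmasked reference round.
    ref = [[sb[b] for b in row] for row in state]
    ref = xor_state(mix_columns(shift_rows(ref)), key)

    # Masked round: input bytes carry m_in, key row i carries m'_i XOR m_in.
    s = xor_rows(state, [m_in] * 4)
    key_m = xor_rows(key, [mp ^ m_in for mp in m_prime])
    s = [[sb_m[b] for b in row] for row in s]       # every byte now carries m_out
    s = shift_rows(s)                               # masks unchanged
    s = xor_rows(s, [mi ^ m_out for mi in m])       # row i now carries m_i
    s = mix_columns(s)                              # row i now carries m'_i
    s = xor_state(s, key_m)                         # every byte carries m_in again
    return s == xor_rows(ref, [m_in] * 4)


print(all(check_masked_round(seed) for seed in range(10)))
```

The final comparison confirms that the round output is the unmasked round output with every byte masked by $m_{\mathrm{in,SB}}$, so the round can be iterated.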

Table 4.4 Relation between the output bits of Sboxes from the Quotient group $Q_j^i$ and the input bits of Sboxes from the corresponding Remainder group $R_j^{i+1}$. An entry $(a, b)$ means that input bit $a$ of the row Sbox comes from output bit $b$ of the column Sbox. For example, the 0th input bit of $SB^{i+1}_{j+4}$ in $R_j^{i+1}$ comes from the first output bit of $SB^i_{4j}$ in $Q_j^i$

$R_j^{i+1}$ \ $Q_j^i$     $SB^i_{4j}$    $SB^i_{4j+1}$    $SB^i_{4j+2}$    $SB^i_{4j+3}$
$SB^{i+1}_{j}$            (0, 0)         (1, 0)           (2, 0)           (3, 0)
$SB^{i+1}_{j+4}$          (0, 1)         (1, 1)           (2, 1)           (3, 1)
$SB^{i+1}_{j+8}$          (0, 2)         (1, 2)           (2, 2)           (3, 2)
$SB^{i+1}_{j+12}$         (0, 3)         (1, 3)           (2, 3)           (3, 3)

4.5.2.3 Boolean Masking for PRESENT

We will present two methods for masking PRESENT encryption. Let SB denote the
PRESENT Sbox (Table 3.11) for the rest of this part.
Before we go into details of the masking scheme, we introduce the notion of Quotient group and Remainder group. We number the Sboxes in the $i$th round of PRESENT as $SB^i_0, SB^i_1, \dots, SB^i_{15}$, where $SB^i_0$ is the right-most Sbox in Fig. 3.9. Those Sboxes can be grouped in two different ways, the Quotient group and the Remainder group:

$$Q_j^i := \{SB^i_{4j}, SB^i_{4j+1}, SB^i_{4j+2}, SB^i_{4j+3}\}, \quad R_j^i := \{SB^i_{j}, SB^i_{j+4}, SB^i_{j+8}, SB^i_{j+12}\},$$

where $j = 0, 1, 2, 3$. Such a grouping allows us to relate the bits for each Sbox output in round $i$ to bits of each Sbox input in round $i + 1$ in a certain way through pLayer, as shown in Table 4.4. In particular, we observe that:
• Bits of the 0th Sbox ($SB^i_{4j}$) output in Quotient group $Q_j^i$ are permuted to the 0th bits of Sbox inputs in the corresponding Remainder group $R_j^{i+1}$.
• Bits of the first Sbox ($SB^i_{4j+1}$) output in $Q_j^i$ are permuted to the first bits of Sbox inputs in $R_j^{i+1}$.
• Bits of the second Sbox ($SB^i_{4j+2}$) output in $Q_j^i$ are permuted to the second bits of Sbox inputs in $R_j^{i+1}$.
• Bits of the third Sbox ($SB^i_{4j+3}$) output in $Q_j^i$ are permuted to the third bits of Sbox inputs in $R_j^{i+1}$.
An illustration is shown in Fig. 4.87.
Hence pLayer can be considered as four identical parallel bitwise operations where each is a function $p: \mathbb{F}_2^{16} \to \mathbb{F}_2^{16}$ that takes one Quotient group output and permutes it to the corresponding Remainder group input.
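The Quotient/Remainder structure follows from the PRESENT bit permutation, which sends bit $i$ of the state to position $16i \bmod 63$ (bit 63 stays in place). The sketch below enumerates all 64 bits and checks the relation stated in Table 4.4; the function names are ours.

```python
def p(i):
    """PRESENT bit permutation: bit i of the state moves to position p(i)."""
    return 63 if i == 63 else (16 * i) % 63


def relation_table(j):
    """Map (Sbox index in round i+1, Sbox index in round i) -> (input bit, output bit)."""
    table = {}
    for a in range(4):                       # a-th Sbox of Q_j, i.e. SB_{4j+a}
        for b in range(4):                   # its b-th output bit
            src = 4 * (4 * j + a) + b
            next_sbox, input_bit = divmod(p(src), 4)
            table[(next_sbox, 4 * j + a)] = (input_bit, b)
    return table


for j in range(4):
    for (next_sbox, sbox), (input_bit, output_bit) in relation_table(j).items():
        a = sbox - 4 * j
        assert next_sbox == j + 4 * output_bit   # lands in Remainder group R_j
        assert input_bit == a                    # at the a-th input bit

# Example from the caption of Table 4.4 with j = 0: input bit 0 of SB_4 in
# round i+1 comes from output bit 1 of SB_0 in round i.
print(relation_table(0)[(4, 0)])
```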


The first masking scheme follows a similar methodology as the masking scheme for AES presented in Sect. 4.5.2.2. Given an input mask $m_{\mathrm{in}}$ and an output mask $m_{\mathrm{out}}$ for the PRESENT Sbox, we compute a table $T$ that implements the masked Sbox such that

$$T[v \oplus m_{\mathrm{in}}] = \mathrm{SB}(v) \oplus m_{\mathrm{out}}. \tag{4.78}$$

Fig. 4.87 An illustration of the relation between Sbox outputs in a Quotient group and Sbox inputs in the corresponding Remainder group. Sboxes in Quotient groups $Q_0^i$, $Q_1^i$, $Q_2^i$, $Q_3^i$ and their corresponding Remainder groups $R_0^{i+1}$, $R_1^{i+1}$, $R_2^{i+1}$, $R_3^{i+1}$ are in orange, blue, green, and red colors, respectively
At the beginning of each encryption:


• We randomly generate two independent masks .min , mout with values from .F42 .
• Compute lookup table T (given in Eq. 4.78) for the masked Sbox.
• Calculate .m3 , m2 , m1 , m0 from .mout with the pLayer operation

m3 , m2 , m1 , m0 = p(mout , mout , mout , mout ).


. (4.79)

Let us represent the intermediate values of PRESENT encryption as

$$b_{15}, b_{14}, \dots, b_1, b_0, \tag{4.80}$$

where each $b_j$ denotes a nibble of the cipher state. At the start of the encryption, we mask the $i$th group of four nibbles of the plaintext, $b_{4i+3}, \dots, b_{4i}$, with $m_i, m_i, m_i, m_i$ ($i = 0, 1, 2, 3$). This means the cipher state at the input of round 1 is given by

$$b_{15} \oplus m_3, \dots, b_{12} \oplus m_3, \; b_{11} \oplus m_2, \dots, b_8 \oplus m_2, \; b_7 \oplus m_1, \dots, b_4 \oplus m_1, \; b_3 \oplus m_0, \dots, b_0 \oplus m_0. \tag{4.81}$$
The cipher state changes for each round of PRESENT are as follows:
• addRoundKey. We assume the key schedule is changed so that the $i$th ($i = 0, 1, 2, 3$) group of four nibbles of each round key, except for the last round key, is masked by

$$m_i \oplus m_{\mathrm{in}}, \; m_i \oplus m_{\mathrm{in}}, \; m_i \oplus m_{\mathrm{in}}, \; m_i \oplus m_{\mathrm{in}}.$$

Then after the addRoundKey operation, the cipher state is of the following format:

$$b_{15} \oplus m_{\mathrm{in}}, \; b_{14} \oplus m_{\mathrm{in}}, \dots, b_1 \oplus m_{\mathrm{in}}, \; b_0 \oplus m_{\mathrm{in}},$$

where each nibble is masked by $m_{\mathrm{in}}$.
• sBoxLayer. By our design of the masked Sbox lookup table (Eq. 4.78), after sBoxLayer, each nibble of the cipher state will be masked by $m_{\mathrm{out}}$:

$$b_{15} \oplus m_{\mathrm{out}}, \; b_{14} \oplus m_{\mathrm{out}}, \dots, b_1 \oplus m_{\mathrm{out}}, \; b_0 \oplus m_{\mathrm{out}}.$$

• pLayer. After the pLayer computation, according to our discussion above about Quotient group, Remainder group, and Eq. 4.79, the cipher state will become (see Fig. 4.87)

$$b_{15} \oplus m_3, \dots, b_{12} \oplus m_3, \; b_{11} \oplus m_2, \dots, b_8 \oplus m_2, \; b_7 \oplus m_1, \dots, b_4 \oplus m_1, \; b_3 \oplus m_0, \dots, b_0 \oplus m_0,$$

which is the same as in Eq. 4.81. Thus the above can be repeated for all 31 rounds.
We assume the last round key has the same masks as the plaintext. Then after the final addRoundKey operation, we will get unmasked ciphertext.

Table 4.5 An example of T2, which specifies the output mask $m_{\mathrm{out,SB}}$ for each input mask $m_{\mathrm{in,SB}}$ of the PRESENT Sbox [SBM18] such that all possible values of $m_{\mathrm{in}} \oplus m_{\mathrm{out}}$ appear

$m_{\mathrm{in,SB}}$                                 0 1 2 3 4 5 6 7 8 9 A B C D E F
$m_{\mathrm{out,SB}} = T2[m_{\mathrm{in,SB}}]$       E 4 F 9 0 3 D 5 7 8 A 2 B 1 6 C
$m_{\mathrm{in,SB}} \oplus m_{\mathrm{out,SB}}$      E 5 D A 4 6 B 2 F 1 0 9 7 C 8 3
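One round of the first masking scheme for PRESENT can be simulated as follows. This is a sketch with our own helper names; it uses the standard PRESENT Sbox and bit permutation, represents the 64-bit state as an integer whose nibble $b_j$ occupies bits $4j$ to $4j+3$, and uses random 64-bit values as stand-ins for the state and round key. It checks that a state masked as in Eq. 4.81 is again masked as in Eq. 4.81 after one round.

```python
import random

SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
ALL_NIBBLES = 0x1111111111111111   # m * ALL_NIBBLES repeats nibble m 16 times


def p_layer(state):
    out = 0
    for i in range(64):
        j = 63 if i == 63 else (16 * i) % 63
        out |= ((state >> i) & 1) << j
    return out


def sbox_layer(state, table):
    return sum(table[(state >> (4 * j)) & 0xF] << (4 * j) for j in range(16))


def group_masks(m_out):
    """m_0, m_1, m_2, m_3 of Eq. 4.79, read from pLayer(m_out, ..., m_out)."""
    full = p_layer(m_out * ALL_NIBBLES)
    return [(full >> (16 * i)) & 0xF for i in range(4)]


def check_round(m_in, m_out, state, key):
    table = [0] * 16
    for v in range(16):
        table[v ^ m_in] = SBOX[v] ^ m_out              # Eq. 4.78
    state_mask = p_layer(m_out * ALL_NIBBLES)          # nibbles 4i..4i+3 carry m_i
    key_m = key ^ state_mask ^ (m_in * ALL_NIBBLES)    # key nibbles carry m_i XOR m_in
    ref = p_layer(sbox_layer(state ^ key, SBOX))       # unmasked round
    masked = p_layer(sbox_layer((state ^ state_mask) ^ key_m, table))
    return masked == ref ^ state_mask                  # same format as Eq. 4.81


rng = random.Random(0)
print(all(check_round(mi, mo, rng.getrandbits(64), rng.getrandbits(64))
          for mi in range(16) for mo in range(16)))
# With this bit permutation, each m_i is 0x0 or 0xF, depending on bit i of m_out.
print(group_masks(0b0101))
```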
The second masking scheme for PRESENT is detailed in [SBM18]. Unlike the masked AES Sbox lookup table, this time we compute a lookup table, denoted T1, such that for any $v \in \mathbb{F}_2^4$, any input mask $m_{\mathrm{in}} \in \mathbb{F}_2^4$, and the corresponding output mask $m_{\mathrm{out}} \in \mathbb{F}_2^4$ for the PRESENT Sbox,

$$\mathrm{T1}[v \oplus m_{\mathrm{in}}, m_{\mathrm{in}}] = \mathrm{SB}(v) \oplus m_{\mathrm{out}}. \tag{4.82}$$

We also need another table T2 that helps us to keep track of the masks:

$$\mathrm{T2}[m_{\mathrm{in}}] = m_{\mathrm{out}}, \quad m_{\mathrm{in}} = 0, 1, \dots, \mathrm{F}. \tag{4.83}$$
In this way, we do not need to generate a masked Sbox lookup table whenever the input mask for the Sbox changes. The size of T1 is $8 \times 4$ (an 8-bit index and a 4-bit entry), so the storage required is $2^8 \times 4 = 2^{10}$ bits, or $2^7$ bytes. The table T2 has 16 entries of 4 bits each and requires 64 bits of memory. It is suggested that T2 should be designed such that all possible values of $m_{\mathrm{in}} \oplus m_{\mathrm{out}}$ appear. For example, one possible choice of T2 is given in Table 4.5, originally presented in [SBM18].
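The two tables can be generated and checked with a few lines of Python. This sketch takes T2 from Table 4.5 and the standard PRESENT Sbox; storing T1 as a nested list indexed as `T1[x][m_in]` is our choice for illustration, not a layout prescribed by the text.

```python
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

# T2 from Table 4.5: output mask for each input mask 0, 1, ..., F.
T2 = [0xE, 0x4, 0xF, 0x9, 0x0, 0x3, 0xD, 0x5,
      0x7, 0x8, 0xA, 0x2, 0xB, 0x1, 0x6, 0xC]

# T1[x][m_in] with x = v XOR m_in, so that T1[v ^ m_in][m_in] = SBOX[v] ^ T2[m_in].
T1 = [[SBOX[x ^ m_in] ^ T2[m_in] for m_in in range(16)] for x in range(16)]

# Eq. 4.82 holds for every value v and every input mask m_in.
print(all(T1[v ^ m_in][m_in] == SBOX[v] ^ T2[m_in]
          for v in range(16) for m_in in range(16)))

# Design criterion for T2: all 16 values of m_in XOR m_out appear.
xor_values = sorted(m_in ^ T2[m_in] for m_in in range(16))
print(xor_values == list(range(16)))
```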
In fact, in general, we have the following observations:
Remark 4.5.1 Let $f$ be a function, and let $m_{\mathrm{in},f}$ denote its input mask with corresponding output mask $m_{\mathrm{out},f}$. For any input $x$ of $f$, we have

$$(x \oplus f(x)) \oplus (m_{\mathrm{in},f} \oplus m_{\mathrm{out},f}) = (x \oplus m_{\mathrm{in},f}) \oplus (f(x) \oplus m_{\mathrm{out},f}).$$

Thus, when choosing the input mask $m_{\mathrm{in},f}$ and its corresponding output mask $m_{\mathrm{out},f}$, we need to ensure that all possible values of $m_{\mathrm{in},f} \oplus m_{\mathrm{out},f}$ appear. Otherwise, the distribution induced by $(x \oplus f(x)) \oplus (m_{\mathrm{in},f} \oplus m_{\mathrm{out},f})$ will not be uniform, and the signal corresponding to the value of $x \oplus f(x)$ cannot be properly concealed, making it vulnerable to DPA attacks.
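The remark can be illustrated by comparing two choices of T2 for the PRESENT Sbox. The sketch below counts, for a fixed input $x$, the values of $(x \oplus m_{\mathrm{in}}) \oplus (\mathrm{SB}(x) \oplus m_{\mathrm{out}})$ over all 16 input masks. The table `T2_BAD`, where the output mask equals the input mask, is a deliberately poor choice introduced here only for contrast.

```python
from collections import Counter

SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
T2_GOOD = [0xE, 0x4, 0xF, 0x9, 0x0, 0x3, 0xD, 0x5,
           0x7, 0x8, 0xA, 0x2, 0xB, 0x1, 0x6, 0xC]   # Table 4.5
T2_BAD = list(range(16))                             # m_out = m_in (illustrative)


def xor_distribution(x, t2):
    """Values of (masked input) XOR (masked output) over all input masks."""
    return Counter((x ^ m_in) ^ (SBOX[x] ^ t2[m_in]) for m_in in range(16))


x = 0x3
print(len(xor_distribution(x, T2_GOOD)))   # number of distinct values with Table 4.5
print(len(xor_distribution(x, T2_BAD)))    # number of distinct values with m_out = m_in
```

With Table 4.5, every value of the XOR occurs exactly once for a uniformly random input mask, so it carries no information about $x \oplus \mathrm{SB}(x)$; with the poor table the masks cancel and the XOR always equals $x \oplus \mathrm{SB}(x)$.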

Since the pLayer operation is linear, we can simply apply pLayer to the masks to keep track of their changes. We use the same notation as in Eq. 4.80 for the PRESENT cipher state. At the beginning of one encryption, we randomly generate 16 masks, each of which is applied to one nibble of the plaintext. Suppose the cipher state at the input of round $i$ is of the following format:

$$b_{15} \oplus m^{i-1}_{15,\mathrm{in}}, \; b_{14} \oplus m^{i-1}_{14,\mathrm{in}}, \dots, b_1 \oplus m^{i-1}_{1,\mathrm{in}}, \; b_0 \oplus m^{i-1}_{0,\mathrm{in}}.$$

The changes in cipher states of PRESENT for round $i$ are as follows:
• addRoundKey. We do not apply masks to the round keys. Consequently, after the addRoundKey operation, each nibble of the cipher state still has the same mask:

$$b_{15} \oplus m^{i-1}_{15,\mathrm{in}}, \; b_{14} \oplus m^{i-1}_{14,\mathrm{in}}, \dots, b_1 \oplus m^{i-1}_{1,\mathrm{in}}, \; b_0 \oplus m^{i-1}_{0,\mathrm{in}}.$$

• sBoxLayer. Let

$$m^{i-1}_{j,\mathrm{out}} = \mathrm{T2}\left[m^{i-1}_{j,\mathrm{in}}\right], \quad j = 0, 1, \dots, 15,$$

denote the output mask for the PRESENT Sbox corresponding to the input mask $m^{i-1}_{j,\mathrm{in}}$. Then after sBoxLayer, the cipher state is of the following format:

$$b_{15} \oplus m^{i-1}_{15,\mathrm{out}}, \; b_{14} \oplus m^{i-1}_{14,\mathrm{out}}, \dots, b_1 \oplus m^{i-1}_{1,\mathrm{out}}, \; b_0 \oplus m^{i-1}_{0,\mathrm{out}}.$$

• pLayer. We apply the pLayer operation to both the cipher state and the mask for the whole cipher state. The mask for the whole cipher state is the string obtained by concatenating all 16 masks $m^{i-1}_{j,\mathrm{out}}$:

$$m^{i-1}_{15,\mathrm{out}}, \; m^{i-1}_{14,\mathrm{out}}, \dots, m^{i-1}_{1,\mathrm{out}}, \; m^{i-1}_{0,\mathrm{out}}.$$

After pLayer, masks for each nibble of the cipher state will be changed and the cipher state will become

$$b_{15} \oplus m^{i}_{15,\mathrm{in}}, \; b_{14} \oplus m^{i}_{14,\mathrm{in}}, \dots, b_1 \oplus m^{i}_{1,\mathrm{in}}, \; b_0 \oplus m^{i}_{0,\mathrm{in}},$$

where

$$m^{i}_{15,\mathrm{in}}, \; m^{i}_{14,\mathrm{in}}, \dots, m^{i}_{1,\mathrm{in}}, \; m^{i}_{0,\mathrm{in}} = \mathrm{pLayer}\left(m^{i-1}_{15,\mathrm{out}}, \; m^{i-1}_{14,\mathrm{out}}, \dots, m^{i-1}_{1,\mathrm{out}}, \; m^{i-1}_{0,\mathrm{out}}\right).$$

Consequently, $m^{i}_{j,\mathrm{in}}$ will be the input mask for the $j$th Sbox in round $i + 1$. Finally, after 31 rounds, we have another addRoundKey operation, which does not change the masks of the cipher state since the round keys are not masked. The cipher state will be

$$b_{15} \oplus m^{31}_{15,\mathrm{in}}, \; b_{14} \oplus m^{31}_{14,\mathrm{in}}, \dots, b_1 \oplus m^{31}_{1,\mathrm{in}}, \; b_0 \oplus m^{31}_{0,\mathrm{in}}.$$

To get the unmasked ciphertext, we remove the masks by XOR-ing the cipher state with

$$m^{31}_{15,\mathrm{in}}, \; m^{31}_{14,\mathrm{in}}, \dots, m^{31}_{1,\mathrm{in}}, \; m^{31}_{0,\mathrm{in}}.$$

An algorithmic description for masked PRESENT computation is given in Algorithm 4.10. The changes in masks are recorded in the variable masks, and we remove them at the end of the computation (line 12).

Algorithm 4.10: Masked implementation of PRESENT
Input: p, T1, T2, K_i (i = 1, 2, ..., 32)  // p is the plaintext for encryption; T1 is the table for masked Sbox as given in Eq. 4.82; T2 specifies the output mask given the input mask for PRESENT Sbox as defined in Eq. 4.83; K_i are round keys for PRESENT encryption
Output: ciphertext
1 randomly generate 16 masks m_0, m_1, ..., m_15
2 array of size 16 state = p ⊕ (m_15, m_14, ..., m_1, m_0)  // mask the jth nibble of the plaintext with m_j, each entry of the array is one masked nibble
3 array of size 16 masks = (m_15, m_14, ..., m_1, m_0)
4 for i = 1, i ≤ 31, i++ do
5     state = addRoundKey(state, K_i)
6     for j = 0, j < 16, j++ do
          // for each nibble
7         state[j] = T1[state[j], masks[j]]  // masked Sbox computation
8         masks[j] = T2[masks[j]]  // record the output masks of Sbox computation
9     state = pLayer(state)  // apply pLayer to the cipher state
10    masks = pLayer(masks)  // apply pLayer to the masks
11 state = addRoundKey(state, K_32)
12 state = state ⊕ masks
13 return state
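Algorithm 4.10 can be sketched in Python and compared against an unmasked computation with the same round keys. The sketch makes several assumptions: the PRESENT key schedule is not implemented, so the 32 round keys are random 64-bit stand-ins; the state is kept as a list of 16 nibbles with nibble $b_j$ at index $j$; T2 is taken from Table 4.5; and all function names are ours.

```python
import secrets

SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
T2 = [0xE, 0x4, 0xF, 0x9, 0x0, 0x3, 0xD, 0x5,
      0x7, 0x8, 0xA, 0x2, 0xB, 0x1, 0x6, 0xC]
T1 = [[SBOX[x ^ m] ^ T2[m] for m in range(16)] for x in range(16)]   # Eq. 4.82


def p_layer(state):
    out = 0
    for i in range(64):
        j = 63 if i == 63 else (16 * i) % 63
        out |= ((state >> i) & 1) << j
    return out


def to_nibbles(x):
    return [(x >> (4 * j)) & 0xF for j in range(16)]


def from_nibbles(nibbles):
    return sum(n << (4 * j) for j, n in enumerate(nibbles))


def present_unmasked(p, round_keys):
    """Reference: 31 rounds followed by the final addRoundKey."""
    state = p
    for k in round_keys[:31]:
        state = p_layer(from_nibbles([SBOX[n] for n in to_nibbles(state ^ k)]))
    return state ^ round_keys[31]


def present_masked(p, round_keys):
    """Sketch of Algorithm 4.10 (round keys are not masked)."""
    masks = [secrets.randbits(4) for _ in range(16)]
    state = to_nibbles(p ^ from_nibbles(masks))
    for k in round_keys[:31]:
        state = to_nibbles(from_nibbles(state) ^ k)        # addRoundKey
        for j in range(16):
            state[j] = T1[state[j]][masks[j]]              # masked Sbox computation
            masks[j] = T2[masks[j]]                        # record the output mask
        state = to_nibbles(p_layer(from_nibbles(state)))   # pLayer on the state
        masks = to_nibbles(p_layer(from_nibbles(masks)))   # pLayer on the masks
    final = from_nibbles(state) ^ round_keys[31]           # last addRoundKey
    return final ^ from_nibbles(masks)                     # remove the masks


round_keys = [secrets.randbits(64) for _ in range(32)]     # stand-in round keys
plaintext = 0xABCDEF1234567890
print(present_masked(plaintext, round_keys) == present_unmasked(plaintext, round_keys))
```

Because the masks are drawn fresh on every call, repeated calls process different masked intermediate values while returning the same ciphertext.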

As an illustration, we have implemented masked PRESENT following Algorithm 4.10, where we used Table 4.5 to choose the output mask $m_{\mathrm{out,SB}}$ given the input mask $m_{\mathrm{in,SB}}$ for the PRESENT Sbox. With the experimental setup described in Sect. 4.1, we have collected four datasets, with similar settings as those datasets in Sect. 4.1. All the datasets contain traces that capture one round of masked software implementation of PRESENT encryption.
• Masked fixed dataset A: This dataset contains 100 traces with a fixed round key FEDCBA0123456789 and a fixed plaintext ABCDEF1234567890.
• Masked fixed dataset B: This dataset contains 100 traces with a fixed round key FEDCBA0123456789 and a fixed plaintext 84216BA484216BA4.
• Masked random plaintext dataset: This dataset contains 20,000 traces with a fixed round key

FEDCBA0123456789 (4.84)

and a random plaintext for each trace.
• Masked random dataset: This dataset contains 10,000 traces with a random round key and a random plaintext for each trace.

Fig. 4.88 t-Values (Eq. 4.17) for all time samples $1, 2, \dots, 3600$ computed with 50 traces from Masked fixed dataset A and 50 traces from Masked fixed dataset B. The signal is given by the plaintext value, and the fixed versus fixed setting is chosen. Blue dashed lines correspond to the thresholds $4.5$ and $-4.5$
In each case, the execution of the cipher is surrounded by nop instructions so that the
round operation patterns can be clearly distinguished from the provided plots. While
the raw traces are all 5000 time samples long, for plotting and analysis purposes, we
shorten them to 3600 time samples as the later parts correspond to nop instructions
and do not contain any useful information. We also note that for these datasets, we
reduced the number of collected time samples by a factor of 3.
Following the TVLA steps from Sect. 4.2.3, we have computed the t-values using
50 traces from Masked fixed dataset A and 50 traces from Masked fixed dataset B.
The intermediate value .v (TVLA Step 2) is chosen to be the plaintext value. Fixed
versus fixed setting (TVLA Step 3) is used. The results are shown in Fig. 4.88. We
can see that compared to Fig. 4.13, the t-values are much lower, and there are no
points with very high peaks to stand out from the rest. Even though some time
samples have t-values outside of the threshold, they are not far from it. This indicates
that the implementation should exhibit less leakage as compared to the unprotected
one.
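The fixed versus fixed t-test above can be sketched in a few lines. The sketch below assumes that Eq. 4.17 is the usual Welch t-statistic, and it uses small simulated trace sets instead of the measured datasets; the function names, the noise level, and the leaking time sample are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t-statistic for the leakages of two trace sets at one time sample."""
    var_term = variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    return (mean(sample_a) - mean(sample_b)) / math.sqrt(var_term)


def t_values(traces_a, traces_b):
    """One t-value per time sample; every trace is a list of the same length."""
    columns_a = zip(*traces_a)  # leakages of set A, grouped by time sample
    columns_b = zip(*traces_b)
    return [welch_t(col_a, col_b) for col_a, col_b in zip(columns_a, columns_b)]


def leaky_samples(t_vals, threshold=4.5):
    """Time samples whose t-value lies outside [-threshold, threshold]."""
    return [t for t, value in enumerate(t_vals) if abs(value) > threshold]


# Simulated data (an assumption, not the book's datasets): 50 traces per set,
# 20 time samples, Gaussian noise, and a data-dependent mean shift at sample 7.
rng = random.Random(0)


def simulate(n_traces, n_samples, leaky_at, shift):
    return [[rng.gauss(shift if t == leaky_at else 0.0, 0.1) for t in range(n_samples)]
            for _ in range(n_traces)]


set_a = simulate(50, 20, leaky_at=7, shift=0.0)
set_b = simulate(50, 20, leaky_at=7, shift=1.0)
t_vals = t_values(set_a, set_b)
print(leaky_samples(t_vals))
```

With measured traces, `set_a` and `set_b` would be replaced by the 50 traces taken from each of the two fixed datasets.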
To see how the implementation is resistant to DPA attacks, we have adopted
the template-based DPA as described in Sect. 4.3.2.3. Following steps from
Sect. 4.3.2.1, we take Masked random dataset as our profiling traces. The same
as in Sect. 4.3.2.3, the target part of the key (P-DPA Step 3) is the 0th nibble of
the first round key. The target intermediate value (P-DPA Step 4) is the 0th Sbox
output of the first round. We consider the target signal (P-DPA Step 5) to be the
exact value of $v$, since in Sect. 4.3.2.3 we have seen that this is a better choice than
taking $\mathrm{wt}(v)$ to be the target signal (see Fig. 4.52). Consequently, we group our
profiling traces Masked random dataset into 16 sets (P-DPA Step 6). Using the
methodology from P-DPA Step 7 and P-DPA Step 8, we have computed the SNR
values for all 3600 time samples using Masked random dataset. The results are
shown in Fig. 4.89.

Fig. 4.89 SNR computed with Masked random dataset. The signal is given by the exact value of
the 0th Sbox output
The time sample achieving the highest SNR is $t = 1929$, which will be
our POI (Template Step a from Sect. 4.3.2.3). Following Template Step b from
Sect. 4.3.2.3, we have built the template for this POI. In particular, the mean values
$\mu_s$ for $s = 1, 2, \dots, 16$ (corresponding to $v = 0, 1, \dots, 15$) are as follows:

$$\{-0.04463, -0.04415, -0.04443, -0.04401, -0.03993, -0.03977, -0.03977, -0.03959,$$
$$-0.04437, -0.04397, -0.04419, -0.04374, -0.03958, -0.03948, -0.03947, -0.03932\}.$$

We take the Masked random plaintext dataset as our attack traces (P-DPA Step
10). There are 16 key hypotheses $\hat{k}_i = i - 1$ ($i = 1, 2, \dots, 16$). Based on our
implementation, the hypothetical intermediate value should be given by

$$\hat{v}_{ij} = \mathrm{SB}_{\mathrm{PRESENT}}(\hat{k}_i \oplus p_j \oplus m_{0,j}), \quad i = 1, 2, \dots, 16, \quad j = 1, 2, \dots, \hat{M}_p,$$

where $p_j$ is the 0th nibble of the plaintext corresponding to the $j$th trace, $m_{0,j}$ is
the input mask applied to this nibble, and $\hat{M}_p \leq 20000$ is the number of traces used
for the attack. However, as mask values are unknown to the attacker, we will only
compute the unmasked hypothetical intermediate value (P-DPA Step 11)

$$\hat{v}_{ij} = \mathrm{SB}_{\mathrm{PRESENT}}(\hat{k}_i \oplus p_j), \quad i = 1, 2, \dots, 16, \quad j = 1, 2, \dots, \hat{M}_p.$$
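To illustrate why the missing mask matters, the following sketch builds the unmasked hypothesis table and counts how often the unmasked hypothesis for the correct key nibble coincides with the value that is actually processed, $\mathrm{SB}_{\mathrm{PRESENT}}(k \oplus p \oplus m)$. The helper names are hypothetical, and rows are indexed by the key value $0, \dots, 15$, i.e., by $i - 1$.

```python
# PRESENT Sbox as a lookup table.
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def unmasked_hypotheses(plaintext_nibbles):
    """Row k holds SB(k XOR p_j) for every trace j (P-DPA Step 11)."""
    return [[PRESENT_SBOX[k ^ p] for p in plaintext_nibbles] for k in range(16)]


def processed_value(key_nibble, plaintext_nibble, mask):
    """Value handled by the masked implementation, including the input mask."""
    return PRESENT_SBOX[key_nibble ^ plaintext_nibble ^ mask]


def count_matches(key_nibble):
    """Number of (plaintext, mask) pairs for which the unmasked hypothesis
    of the correct key equals the processed value."""
    return sum(
        1
        for p in range(16)
        for m in range(16)
        if PRESENT_SBOX[key_nibble ^ p] == processed_value(key_nibble, p, m)
    )


print(count_matches(0x9), "matches out of", 16 * 16)
```

Since the Sbox is a bijection, the two values agree only when the mask is zero, so for uniformly random masks the correct hypothesis matches in 16 of the 256 cases.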

Fig. 4.90 Estimations of guessing entropy computed following Algorithm 4.1 for template-based
DPA attacks on the Masked random plaintext dataset (in black) and on the Random plaintext
dataset (in red)

Fig. 4.91 Estimations of success rate computed following Algorithm 4.1 for template-based
DPA attacks on the Masked random plaintext dataset (in black) and on the Random plaintext
dataset (in red)

Since we chose the signal to be the exact value of $v$, our leakage model will be the
identity leakage model. The hypothetical signals are given by (P-DPA Step 12)

$$H_{ij} = \hat{v}_{ij}, \quad i = 1, 2, \dots, 16, \quad j = 1, 2, \dots, \hat{M}_p.$$

Following Template Step c, we can compute the probability score for each key
hypothesis. Then with Algorithm 4.1, we can calculate the estimations for guessing
entropy and success rate of the attack. We have set

$$\texttt{max\_trace} = 60, \quad \texttt{no\_of\_attack} = 100.$$

The results are shown in Figs. 4.90 and 4.91.


For comparison, we have also plotted the results for template-based DPA attack
on unprotected implementations with one POI (in red in both figures). We can see
that compared to the attacks on unprotected implementations, the same attack on
the masked PRESENT requires more traces for the attack to be successful.

In practice, more than one mask will be applied to provide better protection,
leading us to higher order masking. We will briefly introduce this notion in Sect. 4.6.

4.5.2.4 Blinding for RSA and RSA Signatures

As mentioned before, the application of masking in the context of a public key
cryptosystem is called blinding. Normally an arithmetic mask is applied.
Let $p, q$ be two distinct odd primes, $n = pq$ be the RSA modulus, $d \in \mathbb{Z}^*_{\varphi(n)}$ be
the private key for RSA, and $e = d^{-1} \bmod \varphi(n)$ be the public key. The attacks we
have seen in Sect. 4.4 exploit leakages during the computation of

$$a^d \bmod n$$

for some $a \in \mathbb{Z}_n$. Those attacks can be carried out during the RSA signature signing
process or RSA decryption. More attacks will be discussed in Sect. 4.6.
Given those attacks, it is recommended to blind the secret values during the
computation. It is also required that the masks and blinded values should be updated
frequently or even during the computations. In this case, it will be difficult for the
attacker to combine partial information obtained from the leakages of a
previously blinded value with newly leaked information.
In this part, we will discuss a few methods, including exponent blinding, message
blinding, and modulus blinding. The countermeasures are mostly designed against
DPA attacks. In particular, the message blinding method will be effective against the
DPA attack we have presented in Sect. 4.4.2. The original proposals can be found
in [BCDG10, KJJR11].
Exponent blinding First, we consider how we can randomize the secret exponent
$d$. One method is to generate a random number $\lambda \in [0, 2^{\ell} - 1]$. Then instead
of computing

$$a^d \bmod n,$$

we compute

$$a^{d+\lambda\varphi(n)} \bmod n. \tag{4.85}$$

Note that it follows from Corollary 1.4.5 that

$$a^{d+\lambda\varphi(n)} \equiv a^d \bmod n.$$

Example 4.5.2 Let $p = 3$, $q = 5$, then $n = 15$ and $\varphi(n) = 2 \times 4 = 8$. The same
as in Example 3.3.1, we choose $e = 3$, and we get $d = 3$. Take $a = 8$ and $\lambda = 2$.
We have computed in Example 3.3.1 that

$$a^d \bmod n = 8^3 \bmod 15 = 2.$$

With the above countermeasure (Eq. 4.85), we have

$$a^{d+\lambda\varphi(n)} \bmod n = 8^{3+2\times 8} \bmod 15 = 8^{19} \bmod 15 = 8 \times (8^2)^9 \bmod 15$$
$$= 8 \times 4^9 \bmod 15 = 8 \times 4 \times (4^2)^4 \bmod 15 = 32 \bmod 15 = 2.$$

Typically, for an RSA modulus of bit length 1024, we take $\ell = 20$ or $30$ to guarantee a
reasonable overhead [BCDG10].
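A minimal sketch of this first method is given below, using Python's built-in modular exponentiation. The function names and the default $\ell = 20$ are illustrative assumptions; a real implementation would also need a side-channel-resistant exponentiation routine, which `pow` is not.

```python
import secrets


def blinded_exponent(d, phi_n, ell=20):
    """Return d + lambda * phi(n) for a fresh random lambda in [0, 2^ell - 1]."""
    lam = secrets.randbelow(2 ** ell)
    return d + lam * phi_n


def rsa_exp_blinded(a, d, n, phi_n, ell=20):
    """Compute a^d mod n through a freshly blinded exponent (Eq. 4.85)."""
    return pow(a, blinded_exponent(d, phi_n, ell), n)


# Toy parameters of Example 4.5.2: n = 15, phi(n) = 8, d = 3, a = 8.
print(rsa_exp_blinded(8, 3, 15, 8))  # same value as 8^3 mod 15
print(pow(8, 3 + 2 * 8, 15))         # the example's choice lambda = 2
```

Each call draws a new $\lambda$, so the exponent that is actually processed differs between executions while the result stays the same.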
The next method takes a random number $\lambda$ and calculates

$$a^{\lambda} \times a^{d-\lambda} \bmod n.$$

It is easy to see that

$$a^{\lambda} \times a^{d-\lambda} \bmod n = a^d \bmod n.$$
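The splitting of the exponent into $\lambda$ and $d - \lambda$ can be sketched as follows. The text does not fix the range of $\lambda$; the sketch assumes $0 \leq \lambda \leq d$ so that both partial exponents are nonnegative, and the function name is hypothetical.

```python
import secrets


def rsa_exp_split(a, d, n):
    """Compute a^d mod n as a^lambda * a^(d - lambda) mod n."""
    lam = secrets.randbelow(d + 1)  # assumption: 0 <= lambda <= d
    return (pow(a, lam, n) * pow(a, d - lam, n)) % n


# Toy parameters of Example 4.5.2: n = 15, d = 3, a = 8.
print(rsa_exp_split(8, 3, 15))
```

Note that this variant needs two exponentiations, but it does not require knowledge of $\varphi(n)$.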

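A third method generates a random number $\lambda$ such that $\gcd(\lambda, \varphi(n)) = 1$ and
calculates

$$(a^{\lambda})^{d(\lambda^{-1} \bmod \varphi(n))} \bmod n. \tag{4.86}$$

Since

$$\lambda(d(\lambda^{-1} \bmod \varphi(n))) \equiv d \bmod \varphi(n),$$

it follows from Corollary 1.4.5 that

$$(a^{\lambda})^{d(\lambda^{-1} \bmod \varphi(n))} \bmod n = a^d \bmod n.$$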

Example 4.5.3 The same as in Example 4.5.2, let $p = 3$, $q = 5$, $n = 15$, $d = 3$,
$a = 8$. We have computed that

$$a^d \bmod n = 2.$$

Choose $\lambda = 3$, which is coprime to $\varphi(n) = 8$. Then by the extended Euclidean
algorithm,

$$8 = 3 \times 2 + 2, \quad 3 = 2 + 1 \implies 1 = 3 - 2 = 3 - (8 - 3 \times 2) = 3 \times 3 - 8,$$

we have

$$\lambda^{-1} \bmod \varphi(n) = 3^{-1} \bmod 8 = 3.$$

According to Eq. 4.86,

$$(a^{\lambda})^{d(\lambda^{-1} \bmod \varphi(n))} \bmod n = (8^3)^{3\times 3} \bmod 15 = (512)^9 \bmod 15 = 2^9 \bmod 15 = 512 \bmod 15 = 2.$$

Another exponent blinding method considers a CRT-based RSA implementation.
Recall from Sect. 3.5.1.3 that to compute

$$a^d \bmod n,$$

following CRT-based RSA, we can first calculate

$$a_p := a^{d \bmod (p-1)} \bmod p, \quad a_q := a^{d \bmod (q-1)} \bmod q. \tag{4.87}$$

Then $a^d \bmod n$ is given by

$$a_p y_q q + a_q y_p p \bmod n, \quad \text{or equivalently} \quad a_p + ((a_q - a_p) y_p \bmod q)\,p,$$

where

$$y_q = q^{-1} \bmod p, \quad y_p = p^{-1} \bmod q.$$

The countermeasure takes two random numbers $\lambda_1, \lambda_2$, and instead of computing
$a_p$, $a_q$ with Eq. 4.87, we calculate

$$a_p = a^{d+\lambda_1(p-1)} \bmod p, \quad a_q = a^{d+\lambda_2(q-1)} \bmod q.$$

It follows from Corollary 1.4.3 that

$$a^{d+\lambda_1(p-1)} \bmod p = a^{d \bmod (p-1)} \bmod p, \quad a^{d+\lambda_2(q-1)} \bmod q = a^{d \bmod (q-1)} \bmod q.$$

Example 4.5.4 The same as in Example 4.5.2, let

$$p = 3, \quad q = 5, \quad n = 15, \quad d = 3, \quad a = 8.$$

We have computed in Example 3.5.5 that

$$a_p = a^{d \bmod (p-1)} \bmod p = 8^{3 \bmod 2} \bmod 3 = 2,$$
$$a_q = a^{d \bmod (q-1)} \bmod q = 8^{3 \bmod 4} \bmod 5 = 2.$$

Take $\lambda_1 = 2$, $\lambda_2 = 3$, with our countermeasure, we have

$$a_p = a^{d+\lambda_1(p-1)} \bmod p = 8^{3+2\times 2} \bmod 3 = 8^7 \bmod 3 = 2^7 \bmod 3 = 128 \bmod 3 = 2,$$
$$a_q = a^{d+\lambda_2(q-1)} \bmod q = 8^{3+3\times 4} \bmod 5 = 8^{15} \bmod 5 = 3^{15} \bmod 5 = 3 \times (3^2)^7 \bmod 5$$
$$= 3 \times 4^7 \bmod 5 = 3 \times 4 \times (4^2)^3 \bmod 5 = 12 \bmod 5 = 2.$$
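The CRT-based variant can be sketched as follows, combining the two blinded half-exponentiations with the second recombination formula given above. The function name and the choice $\ell = 20$ for both random numbers are assumptions; the text does not specify how $\lambda_1, \lambda_2$ are sampled.

```python
import secrets


def rsa_crt_blinded(a, d, p, q, ell=20):
    """Compute a^d mod pq via CRT with independently blinded exponents."""
    lam1 = secrets.randbelow(2 ** ell)
    lam2 = secrets.randbelow(2 ** ell)
    a_p = pow(a, d + lam1 * (p - 1), p)  # equals a^(d mod (p-1)) mod p
    a_q = pow(a, d + lam2 * (q - 1), q)  # equals a^(d mod (q-1)) mod q
    y_p = pow(p, -1, q)                  # p^-1 mod q
    # Recombination: a_p + ((a_q - a_p) * y_p mod q) * p
    return a_p + (((a_q - a_p) * y_p) % q) * p


# Example 4.5.4: p = 3, q = 5, d = 3, a = 8.
print(rsa_crt_blinded(8, 3, 3, 5))
```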

Our SCA attacks from Sect. 4.4 rely on exploiting the leakages to get the
value of each bit of the secret exponent $d$. We can see that for all the exponent
blinding methods above, assuming an attack on one RSA decryption (or RSA
signature signing) execution, with the same methods, we can only recover the
value of $d$ plus a random number or $d$ times a random number, so the real value of $d$
remains concealed from the attacker. On the other hand, if two computations with different
masks are attacked, the secret key can be recovered. For example, with the first
countermeasure, if we know the values for

$$d + \lambda_1\varphi(n), \quad d + \lambda_2\varphi(n),$$

then we can get

$$(\lambda_1 - \lambda_2)\varphi(n).$$

Since $\lambda_1$ and $\lambda_2$ have bit length 20 or 30, we can factorize $(\lambda_1 - \lambda_2)\varphi(n)$ by trying
all possible values of $\lambda_1$ and $\lambda_2$.
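One possible way to carry out this search is sketched below. It is not taken from the text: it assumes the attacker also knows the public modulus $n$ and tests each candidate for $\varphi(n)$ by checking whether it yields a factorization of $n$, using $p + q = n + 1 - \varphi(n)$. The function name and the toy primes are hypothetical.

```python
import math


def recover_from_two_blinded(n, blinded_1, blinded_2, max_delta):
    """Given d + l1*phi(n) and d + l2*phi(n), try |l1 - l2| = 1..max_delta.

    Returns (d, p, q) on success and None otherwise.
    """
    diff = abs(blinded_1 - blinded_2)  # equals |l1 - l2| * phi(n)
    for delta in range(1, max_delta + 1):
        if diff % delta:
            continue
        phi = diff // delta            # candidate for phi(n)
        s = n + 1 - phi                # candidate for p + q
        disc = s * s - 4 * n           # (p - q)^2 if the candidate is right
        if s <= 0 or disc < 0:
            continue
        root = math.isqrt(disc)
        if root * root != disc or (s + root) % 2:
            continue
        p, q = (s + root) // 2, (s - root) // 2
        if q > 1 and p * q == n:
            return blinded_1 % phi, p, q  # d < phi(n), so the remainder is d
    return None


# Toy setting (assumption): p = 1009, q = 1013, d = 65537, masks 5 and 12.
p0, q0, d0 = 1009, 1013, 65537
phi0 = (p0 - 1) * (q0 - 1)
result = recover_from_two_blinded(p0 * q0, d0 + 5 * phi0, d0 + 12 * phi0, 2 ** 10)
print(result)
```

For masks of bit length $\ell$, `max_delta` would be $2^{\ell}$, which is a feasible search for $\ell = 20$ or $30$.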
Message blinding We can also mask the value a. In this way, DPA attacks (e.g.,
the attack in Sect. 4.4.2) that rely on knowing certain intermediate values related to
a cannot be carried out.
Take a random number $\lambda$ such that $\gcd(\lambda, n) = 1$, and compute

$$a_1 = \lambda^e \bmod n, \quad a_2 = \lambda^{-1} \bmod n.$$

To get

$$a^d \bmod n,$$

we calculate

$$(((a a_1)^d \bmod n)\,a_2) \bmod n.$$

Since

$$ed \equiv 1 \bmod \varphi(n),$$

by Corollary 1.4.5,

$$\lambda^{ed} \equiv \lambda \bmod n \implies \lambda^{ed-1} \bmod n = 1.$$

Then

$$(((a a_1)^d \bmod n)\,a_2) \bmod n = (((a\lambda^e \bmod n)^d \bmod n)(\lambda^{-1} \bmod n)) \bmod n$$
$$= ((a^d \bmod n)(\lambda^{ed-1} \bmod n)) \bmod n = a^d \bmod n.$$

The first mask $a_1$ randomizes the input of the computation, and the second mask $a_2$
corrects the output to the expected result.
Example 4.5.5 Keep the same parameters as in Example 4.5.2:

$$p = 3, \quad q = 5, \quad n = 15, \quad e = 3, \quad d = 3, \quad a = 8, \quad \varphi(n) = 8.$$

We know that $a^d \bmod n = 2$. Take $\lambda = 4$, which is coprime with $n$. Then with the
message blinding countermeasure above, we have

$$a_1 = \lambda^e \bmod n = 4^3 \bmod 15 = 64 \bmod 15 = 4.$$

By the extended Euclidean algorithm,

$$15 = 4 \times 3 + 3, \quad 4 = 3 + 1 \implies 1 = 4 - 3 = 4 - (15 - 4 \times 3) = 4 \times 4 - 15$$

and

$$a_2 = \lambda^{-1} \bmod n = 4^{-1} \bmod 15 = 4.$$

Finally,

$$(((a a_1)^d \bmod n)\,a_2) \bmod n = (((8 \times 4)^3 \bmod 15) \times 4) \bmod 15$$
$$= ((2^3 \bmod 15) \times 4) \bmod 15 = 32 \bmod 15 = 2.$$
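Message blinding can be sketched in the same style. The function name, the optional `lam` argument, and the sampling range for $\lambda$ are assumptions made for illustration.

```python
import math
import secrets


def rsa_message_blinded(a, d, e, n, lam=None):
    """Compute a^d mod n as ((a * a1)^d * a2) mod n with a1 = lam^e, a2 = lam^-1."""
    if lam is None:
        # Draw lambda in [2, n - 1] until it is coprime to n.
        while True:
            lam = secrets.randbelow(n - 2) + 2
            if math.gcd(lam, n) == 1:
                break
    a1 = pow(lam, e, n)   # input mask
    a2 = pow(lam, -1, n)  # output correction
    return (pow((a * a1) % n, d, n) * a2) % n


# Example 4.5.5: n = 15, e = d = 3, a = 8, lambda = 4.
print(rsa_message_blinded(8, 3, 3, 15, lam=4))
```

The exponentiation itself only ever sees the blinded base $a \cdot a_1 \bmod n$, which is what prevents the attacker from predicting intermediate values from $a$.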

Modulus blinding When the modulus is random during the computations, similar
to random values of $a$, DPA attacks such as the one in Sect. 4.4.2 cannot be carried
out as the attacker does not know the modulus to derive the target intermediate
values.
For blinding the modulus $n$, we generate a random number $\lambda$ and compute

$$(a^d \bmod (\lambda n)) \bmod n.$$

It is easy to see that

$$(a^d \bmod (\lambda n)) \bmod n = a^d \bmod n.$$

Example 4.5.6 Keep the same parameters as in Example 4.5.2:

$$p = 3, \quad q = 5, \quad n = 15, \quad e = 3, \quad d = 3, \quad a = 8, \quad \varphi(n) = 8.$$

We know that $a^d \bmod n = 2$. Let $\lambda = 4$, then we have

$$(a^d \bmod (\lambda n)) \bmod n = (8^3 \bmod (4 \times 15)) \bmod 15 = (512 \bmod 60) \bmod 15 = 32 \bmod 15 = 2.$$
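A corresponding sketch for modulus blinding is shown below. The function name and the range of $\lambda$ (a nonzero value below $2^{\ell}$ with $\ell = 20$) are assumptions, since the text only asks for a random number.

```python
import secrets


def rsa_modulus_blinded(a, d, n, ell=20):
    """Compute a^d mod n as (a^d mod (lambda * n)) mod n for a random lambda >= 1."""
    lam = secrets.randbelow(2 ** ell - 1) + 1
    return pow(a, d, lam * n) % n


# Example 4.5.6: n = 15, d = 3, a = 8; the example uses lambda = 4.
print(pow(8, 3, 4 * 15) % 15)
print(rsa_modulus_blinded(8, 3, 15))
```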

Remark 4.5.2 Note that the message blinding and the modulus blinding methods
we have presented can also be used in a similar way to protect the computation of

$$a^d \bmod p, \quad a^d \bmod q,$$

in CRT-based RSA implementations.

4.6 Further Reading

Leakage model We note that the Hamming distance, Hamming weight, identity
(Sect. 4.2.1), and stochastic leakage (Sect. 4.3.2.2) models all assume there are no
differences in the leakage when the value in a bit switches from 0 to 1 or from 1 to
0. Improved models can be found in, e.g., [PSQ07, GHP04].
Leakage assessment TVLA (see Sect. 4.2.3) was first proposed in 2011
[GGJR+11]. More discussions on how to set the threshold 4.5 can be found
in [DZD+18]. Another prominent leakage assessment method is Pearson's $\chi^2$-test
[SM15], which is normally used as a replacement for TVLA when analyzing
multivariate and horizontal leakages.
Simple power analysis We have seen that by visual inspection of the power traces,
the attacker can gain information about the operations being executed on the device.
SPA was first introduced in [KJJ99], which is also the very first proposal of power
analysis attacks. The authors mentioned that programs involving conditional branch
operations depending on secret parameters are at risk. Later this idea was applied to
develop an SPA attack on RSA [MDS99b] (see Sect. 4.4.1).
[Nov02] (see also [KJJR11, Section 3.3]) proposes an attack that exploits
vulnerability in Garner’s algorithm for CRT-based RSA. The authors demonstrate
that with SPA, we can identify if $a \bmod p > a \bmod q$. Then with adaptive chosen
ciphertext and binary search, the value of $p$ can be recovered. [FMP03] shows
that with only known messages, assuming $p$ and $q$ have different lengths, in case
$q < p/2^{\ell}$, $p$ and $q$ can be recovered by performing $60 \times 2^{\ell}$ signatures on average.
A lower bound of $\ell$ is specified in the paper.
SPA has also been used to obtain the Hamming weight of operands [MS00] or
attack AES key schedule [Man03]. Similar to profiled DPA, we can carry out a
profiled SPA attack, see, e.g., [Man03, Section 5.3].
Differential power analysis A DPA attack on DES can be found in, e.g.,
[MDS99a]. For AES, detailed descriptions are given in [MOP08, Chapter 6].
For DPA attacks on RSA, [MDS99b] lists different variants of DPA on RSA,
where some can be considered as extended SPA attacks. [dBLW03] proposes a
DPA attack on CRT-based implementation using Garner’s algorithm. The target
intermediate value is the remainder after the modular reduction with one of the
primes. [AFV07] studies more attacks on other intermediate values of CRT-based
RSA. We have elaborated one of the methods in Sect. 4.4.2.
We also refer the readers to [MOP08, Sta10, KJJR11] for more discussions on
SPA and DPA.
Template attacks The idea of template attacks was first introduced in [CRR03]. In
Sect. 4.3.2.3, we discussed how templates can be used for DPA on symmetric block
ciphers. In a similar manner, template-based attacks can also be applied to SPA
on symmetric block ciphers [MOP08, Section 5.3], and SCA on RSA [VEW12,
XLZ+18].
We note that the template attacks we have described used normal distributions
to approximate the distributions induced by leakages. One might refer to this
as a Gaussian template attack. A more generic method, MIA, can be found
in [GBTP08], where the authors aim to approximate the mutual information
between the hypothetical leakages and the actual measured leakages without making
assumptions on the leakage distribution.
SCADPA Side-channel assisted differential plaintext attack (SCADPA) was first
proposed in [BJB18] for PRESENT and in [BJHB19] for GIFT implementations.
It was later generalized to all SPN block ciphers in [BBH+20]. The attack presented
in Sect. 4.3.3 is based on this generalized attack. We refer the readers to the original
paper [BBH+20] for attacks on more ciphers and analysis of attack complexity.
More attacks Other side-channel attack methods exist for symmetric block
ciphers. For example, collision attacks [SWP03] identify the collision of
intermediate values between two encryptions using power traces to recover
the secret key. Algebraic side-channel attacks [RS09] express both the target
algorithm and its leakages as equations to achieve successful attacks with unknown
plaintext/ciphertext. Soft analytical SCA [VCGS14] constructs a graph for the
implementation and uses the belief propagation algorithm on this graph to efficiently
combine the information of all leakage points. DCSCA (differential ciphertext
SCA) [HBB21] targets the GIFT cipher. The attack analyzes the statistical distribution
of intermediate values with the help of side-channel leakages to recover the last
round key. The authors also demonstrated the extension of the attack to GIFT-based
AEAD schemes.
Preprocessing of traces During the measurements, it can happen that the traces
contain too much noise. Or if there are certain countermeasures in place, the traces
can also be misaligned. There are various classical methods for preprocessing
traces. For example, moving average computes the average of the leakages from
a few time samples to smooth the signal. Principal component analysis [BHvW12]
aims to reduce the noise in the traces by projecting high-dimensional data to a
lower dimensional subspace while preserving the data variance. Elastic alignment
[vWWB11] aligns the traces by focusing on the synchronization of trace shape and
generating artificial samples. It can be used to counter jitter-based countermeasures.
The method is based on the dynamic time warping algorithm designed for speech
recognition [SC78].
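Of the preprocessing methods above, the moving average is the simplest to sketch. The function below is an illustrative assumption about how it could be applied to a single trace; the window length is a free parameter that has to be chosen for the measurement setup at hand.

```python
def moving_average(trace, window):
    """Replace each run of `window` consecutive samples by their average."""
    if window < 1 or window > len(trace):
        raise ValueError("window must be between 1 and the trace length")
    running = sum(trace[:window])
    smoothed = [running / window]
    for i in range(window, len(trace)):
        running += trace[i] - trace[i - window]  # slide the window by one sample
        smoothed.append(running / window)
    return smoothed


print(moving_average([1, 2, 3, 4, 5], 3))  # [2.0, 3.0, 4.0]
```

The smoothed trace is shorter than the original by `window - 1` samples, so time indices of points of interest shift accordingly.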
Hiding-based countermeasure A hiding-based countermeasure aims to make the
leakage random or constant independent of the operation/data.
To randomize the leakage, we can insert random delays (jitters) [CK09] or
shuffle the execution order of independent operations. For example, we can shuffle
the Sbox computations in AES implementations [HOM06] or randomize the sequence of
square and multiply operations in RSA [Wal02]. Another approach to randomizing the leakage proposes
to use residue number systems to allow randomizing the representation of finite field
elements for computing exponentiation [BILT04].
To make the leakage constant, different methodologies have been proposed on
different levels. For the cell level (or logic design level), we have, for example,
dual-rail precharge logic (DPL) [TV06] and dynamic and differential logic styles
[TAV02]. DPL has two phases: in the precharge phase, values in the wires are set to
a precharge value (either 0 or 1); then during the evaluation phase, one wire carries
the signal 0, and the other wire carries the signal 1. We note that this is equivalent
to using the binary code {01, 10} for encoding 0, 1. For the software level, we have
seen encoding-based countermeasures for symmetric block ciphers in Sect. 4.5.1.1
and square and multiply always algorithm for RSA in Sect. 4.5.1.2. The original
proposal of the square and multiply always algorithm can be found in [Cor99]. More
on encoding-based countermeasures can be found in, e.g., [CG16]. [CG16] uses
linear complementary dual codes: a code $C$ is a complementary dual code if
$C \cap C^{\perp} = \{0\}$ (see Definition 1.6.9). Another example of software level countermeasure
can be found in [HDD11], where the authors propose to use DPL in software for
symmetric block ciphers. See also [RGN13] for a DPL in software countermeasure
with provable security for bitsliced implementation of PRESENT.
Masking-based countermeasures Those countermeasures are designed to make
the leakage dependent on some random value. Masking was first proposed by
Goubin and Patarin [GP99] and Chari et al. [CJRR99] independently. It has
been proven that masking-based countermeasure is secure given that the source of
randomness is truly random [PR13]. Due to this sound mathematical basis, it has
become the most adopted countermeasure for symmetric block ciphers.
Instead of a naive lookup table implementation of a masked Sbox as presented in
Sects. 4.5.2.2 and 4.5.2.3, many other methods have also been proposed. We refer
the readers to [DCRB+16, OS05] for masked AES Sbox and [PMK+11, SBM18]
for masked PRESENT Sbox.
In Sects. 4.5.2.1–4.5.2.3 we have seen how Boolean masking, i.e., the
intermediate value is concealed by $\oplus$ with the mask(s), can be implemented for
AES and PRESENT. Section 4.5.2.4 focuses on arithmetic masking, where the
intermediate value is concealed by an arithmetic operation (modular addition or
modular multiplication). There are many other ways of applying the masks, for
example, affine masking [VW01], polynomial masking [GM11], and inner product
masking [BFGV12]. The exact operation for applying a mask is typically
chosen depending on the operations that are used in the cryptographic algorithm
that we would like to protect. Some cryptographic algorithms (e.g., AES) contain
both Boolean and arithmetic operations,¹¹ and they can be protected using both
Boolean and arithmetic masking. But switching from one type of masking to another
is not a trivial task. We refer the readers to [Mes00, Gou01, CT03, AG01] for more
discussions on this topic.
Masking-based countermeasures can also be implemented on the hardware level,
for example, masking buses [BGM+03], Boolean masking of DPL [PM05], and random
precharging [BGLT04]. However, it has been shown that masked gates in hardware
are vulnerable to DPA attacks due to glitches in CMOS circuits [MPG05]. This
led to the development of threshold implementations [NRS11], which are based
on multiparty computation [CD+15], as a way to realize secure Boolean masking
in hardware.
Higher order masking In Sects. 4.5.2.1–4.5.2.3 we focused on masking schemes
with one mask only. In particular, the masked value $v_m$ is related to the original value
$v$ and the mask $m$ through a binary operation $v_m = v \cdot m$. In the language of secret
sharing [Bei11], we can say that the secret value $v$ is represented by two shares,
$v_m$ and $m$. We can see that given only one of the two shares, no information about $v$
can be revealed. Instead of two shares (or one mask), we can also use several shares,
resulting in a higher order masking. In particular, a $d$th-order masking applies $d - 1$
masks to the secret value $v$.
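For Boolean masking, the representation of a secret by several shares can be sketched as follows. The function names are hypothetical, and the sketch only shows how shares are generated and recombined; it says nothing about how to compute on shares securely.

```python
import secrets
from functools import reduce
from operator import xor


def share(v, n_shares, bit_length):
    """Split v into n_shares Boolean shares using n_shares - 1 random masks."""
    masks = [secrets.randbits(bit_length) for _ in range(n_shares - 1)]
    masked_value = reduce(xor, masks, v)  # v XOR m_1 XOR ... XOR m_{n-1}
    return [masked_value] + masks


def unshare(shares):
    """Recombine the secret by XORing all shares."""
    return reduce(xor, shares, 0)


shares = share(0xA, n_shares=3, bit_length=4)
print(shares, hex(unshare(shares)))
```

With `n_shares=2` this reduces to the single-mask case $v_m = v \oplus m$ discussed above.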
Similarly, in Sects. 4.3.1 and 4.3.2, we only focused on one target intermediate
value. Such a DPA attack is also called a first-order DPA. When leakages of several
different intermediate values (e.g., at different time samples) are analyzed, we have
a higher order DPA [CJRR99]. The number of traces needed for a higher order
DPA to succeed is exponential in the standard deviation of the noise. The exponent
is given by d + 1, where d + 1 is the order of the masking (i.e., d masks are applied)
[PR13].
¹¹ AES Sbox is based on modular computations in a specific field; see Eq. 3.3.

For the security of Boolean masking, let us take a secret value $v$ of bit length
at most $m_v$. The masked value is given by $v \oplus m$, where $m \in \mathbb{F}_2^{m_v}$. We can consider
the value of $m$ as a discrete random variable. In case the distribution induced by this
random variable is uniform on $\mathbb{F}_2^{m_v}$, the distribution induced by the value of $v \oplus m$
is also uniform on $\mathbb{F}_2^{m_v}$, regardless of the value of $v$. Thus, we expect the leakage
to be independent of v when only first-order DPA is carried out. The security proof
for first-order Boolean masking against first-order DPA can be found in [BGK04].
Results for higher order Boolean masking are given in [RP10]. However, the proofs
rely on the masks to be truly random, which is not easy to achieve in practice. For
example, our masked implementation from Sect. 4.5.2.3 can still be attacked with
first-order DPA (see Figs. 4.90 and 4.91). We also note that the choice of masks
should follow certain rules so that the masking scheme is more secure (see, e.g.,
[BGN+15]).
Blinding Blinding was first suggested in [Koc96]. It was then later formalized by
J. S. Coron [Cor99]. It is worth noting that several patents have been published
about masking [KJJ10] and blinding [KJ01].
Various attacks on blinding have also been published. For example, [FV03]
proposes an attack on the left-to-right square and multiply algorithm that recovers a
blinded secret exponent with SPA. [WvWM11] discusses a DPA attack on the
square and multiply always algorithm and message blinding. [FRVD08] exploits
the leakage during the computation of the random exponent.
More about countermeasures Besides the SCA countermeasures introduced in this
chapter, there are also many other techniques. In general, we can divide
them according to the levels of protection.
Protocol level countermeasures aim to design cryptographic protocols to survive
leakage analysis. For example, by limiting the number of communications that can
be performed with any given key, or by rekeying [MSGR10], fewer measurements can
be done by the attacker for the same key.
Cryptographic primitive level countermeasures are proposals of new cipher
designs that are resistant to side-channel attacks.
Implementation level countermeasures were the focus of this chapter, where
we discussed some hiding and masking/blinding techniques in Sect. 4.5. There are
also other implementation-level countermeasures, for example, time randomization
[MMS01b] and encryption of the buses [BHT01].
Architecture level countermeasures refer to techniques that modify the
architecture of the computation device. For example, [MMS01a] proposes to use
a nondeterministic processor to randomly change the sequence of the executed
program during each execution; [SVK+03] integrates secure instructions into a
nonsecure processor.
Hardware level countermeasures protect the implementations through external
means, for example, conforming glues [AK96], protective coating [TSS+06], and
detachable power supplies [Sha00].
Attacks on post-quantum cryptographic implementations Several papers
propose SCA on post-quantum cryptosystems.
For example, in [UXT+22], side-channel leakage during the execution of
a pseudorandom function in the re-encryption of key encapsulation mechanism
decapsulation is exploited. With the leakage, the attacker gains information on
whether the public key decryption result is equivalent to the reference plaintext.
The authors in [GJJ22] propose to submit special ciphertexts to the decryption
oracle that correspond to cases of single errors. Through leakage in the additive
Fast Fourier Transform step used to evaluate the error locator polynomial, a single
entry of the secret key can be determined. A survey on SCA and FIA on Kyber
and Dilithium post-quantum schemes with novel countermeasures was presented
in [RCDB22].
Attacks on neural networks SCA techniques have also been adopted for attacking
neural network implementations. In such cases, normally a black-box scenario is
assumed and the attacker’s goal is to recover the secret parameters of the target
neural network. [BBJP19] demonstrated how a timing-based attack can recover
the architecture information. Then with DPA techniques, weights for each neuron
can be recovered. The provided experiments were done on the ARM Cortex-M3
microcontroller. [YMY+20] experimented on a hardware implementation of neural
networks on an FPGA. For a more comprehensive overview of the topic, we refer
the interested reader to [BBB+22].
Correspondingly, SCA countermeasures for cryptographic implementations have
also been adopted for protecting neural network implementations. For example,
masking methods have been utilized in [DCA20, DAP+22, DCSA22]. Threshold
implementation was proposed in [MBFC22] with a Trivium stream cipher to
generate the randomness. From the area of hiding countermeasures, shuffling
was implemented in [NY21] and desynchronization by adding a random jitter
in [BJHB23].

4.6.1 AI-Assisted SCA

AI-based methods have been applied for side-channel analysis in the past few years.
If we look at DPA (Sects. 4.3.1, 4.3.2, and 4.4.2), the key recovery is essentially
a classification problem. In particular, in a profiled setting, the profiling phase
corresponds to the training phase of an AI-based algorithm. During the attack phase,
the analysis of the leakage traces can be seen as a classification problem where
the goal of an attacker is to classify those traces based on the related data (e.g.,
a specific Sbox output value). Various AI-based techniques have been adopted for
SCA, e.g., k-nearest neighbor algorithm [MZMM16], random forest [LBM15],
support vector machines [HZ12], multilayer perceptron (MLP) [GHO15], and
convolutional neural networks (CNNs) [ZBHV20]. It has also been shown that, with
neural networks, protected implementations can be broken. For example, [WP20]
used an autoencoder to break hiding countermeasures, while in [MPP16], the authors
successfully broke masking countermeasures with deep learning techniques.

As an example, let us consider the case of a neural network used for the
classification problem in a DPA attack on AES implementations (see Sect. 4.3.1).
The input of the network will then be (part of) the traces. The output layer will have
a softmax activation function, and each class corresponds to one possible value of
the target Sbox output, hence leading to one key byte hypothesis with the knowledge
of the plaintext. Then during the inference, for each input data, the network output
indicates the probabilities of the 256 values for the Sbox output, which gives a
probability for each of the corresponding key byte hypotheses.
Success rate and guessing entropy Given a few, say $\hat{M}_p$, traces, we can
compute a score for each key hypothesis by summing up the corresponding
probabilities predicted using each trace. Then we can rank the key hypotheses
according to their scores, with the one ranked first having the highest score. Let
us denote the rank of the correct key hypothesis by $\mathrm{rk}^{\mathrm{AI}}_{\hat{M}_p}$. It is easy to see that we
can consider $\mathrm{rk}^{\mathrm{AI}}_{\hat{M}_p}$ as a random variable whose randomness comes from different
plaintexts/measurements.
Recall that in Eq. 4.36, we have defined the success rate for a DPA attack. For
AI-based SCA attacks, we have an equivalent definition of success rate, namely
the probability that $\mathrm{rk}^{\mathrm{AI}}_{\hat{M}_p} = 1$. Similarly, we can also define the guessing entropy
(see Eq. 4.37) to be the expectation of the random variable $\mathrm{rk}^{\mathrm{AI}}_{\hat{M}_p}$. Same as for DPA
attacks, we can estimate success rate with the frequency of successful attacks among
a number of trials and estimate guessing entropy using the sample mean of $\mathrm{rk}^{\mathrm{AI}}_{\hat{M}_p}$.
In particular, for a fixed $\hat{M}_p$, we randomly select $\hat{M}_p$ traces from the test set and
carry out an attack with $\hat{M}_p$ traces, and then we compute $\mathrm{rk}^{\mathrm{AI}}_{\hat{M}_p}$. We repeat this
procedure, e.g., 100 times, which gives us a sample of $\mathrm{rk}^{\mathrm{AI}}_{\hat{M}_p}$. Its mean is then
an estimation for the guessing entropy. An estimation for the success rate is the
frequency of $\mathrm{rk}^{\mathrm{AI}}_{\hat{M}_p} = 1$ among those 100 simulated attacks.
In most cases, the goal of AI-based SCA is to achieve a low guessing entropy or
a high success rate with as few traces as possible after training.
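The estimation procedure described above can be sketched as follows. The sketch assumes that the network outputs have already been converted into one probability per key hypothesis for every test trace; the function names, the tie-breaking rule, and the synthetic probabilities in the usage example are illustrative assumptions.

```python
import random


def rank_of_correct_key(prob_rows, correct_key):
    """Rank (1 = best) of the correct key after summing per-trace probabilities."""
    n_keys = len(prob_rows[0])
    scores = [sum(row[k] for row in prob_rows) for k in range(n_keys)]
    # Ties are resolved in favor of the correct key (an assumption).
    return 1 + sum(1 for k in range(n_keys) if scores[k] > scores[correct_key])


def estimate_ge_sr(test_probs, correct_key, n_traces, n_attacks=100, rng=None):
    """Estimate guessing entropy and success rate from simulated attacks."""
    rng = rng or random.Random()
    ranks = [
        rank_of_correct_key(rng.sample(test_probs, n_traces), correct_key)
        for _ in range(n_attacks)
    ]
    guessing_entropy = sum(ranks) / n_attacks
    success_rate = sum(1 for r in ranks if r == 1) / n_attacks
    return guessing_entropy, success_rate


# Synthetic test set: 200 traces, 16 key hypotheses, key 5 always favored.
synthetic = [[0.5 if k == 5 else 0.5 / 15 for k in range(16)] for _ in range(200)]
print(estimate_ge_sr(synthetic, correct_key=5, n_traces=10, rng=random.Random(1)))
```

Repeating the call for increasing values of `n_traces` gives the curves that are usually plotted to compare attacks.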
Different research topics in AI-assisted SCA Many different aspects of AI-assisted
SCA have been analyzed by researchers.
Firstly, there are a few publications on public datasets, which are used to evaluate
novel proposals of AI-based techniques. To name a few, the ASCAD dataset [BPS+20,
BPS+21] contains power traces for software implementations of AES with masking
countermeasures and artificially introduced random jitters. The AES_HD dataset [BJP20]
consists of EM traces corresponding to an unprotected AES hardware implementation on
an FPGA. The AES_RD dataset [CK09, CK10, CK18] consists of power traces of software
implementations of AES with random delay.
The most studied direction is of course to achieve high success rates or low
guessing entropy. By examining the similarity of side-channel traces to time series
data (e.g., audio signals), [KPH+19] proposed a VGG15-like network together
with a regularization method achieved by adding noise to the traces. Zaid et
al. [ZBHV20] introduced a methodology for the design of CNNs in the SCA
context. The paper analyzed several datasets and constructed an optimal CNN
for each dataset. [WJB20] showed an improvement of [ZBHV20] using data
oversampling. [PCP20] used ensemble models to achieve good generalization from
the training set to the validation set for a given dataset. On the other hand, Won
et al. [WHJ+21] utilized Multi-scale Convolutional Neural Networks for SCA to
achieve the goal of integrating classical trace preprocessing techniques and attacking
several datasets without changing the network architectures.
Hyperparameter tuning, an important problem in AI algorithm development in
general, naturally attracted attention in the domain of SCA. Various methods have
been proposed, for example, Bayesian optimization and random search [WPP22],
reinforcement learning [RWPP21], and genetic algorithm for choosing architec-
tures [MPP16] or for choosing all hyperparameters [AGF21].
It has been shown that test accuracy in machine learning cannot properly assess
SCA performance [PHJ+ 19]. Because of this observation, many training strategies
are studied, for example, stopping criteria based on success rate [RZC+ 21] or based
on mutual information [PBP21].
Recently, non-profiled AI-based SCA has also gained attention in the research
community. For example, in [Tim19], the authors propose to train a neural network
for each key hypothesis. To do this, the attacker splits the traces based on the key
hypothesis, just like when carrying out DPA. The network that achieves the best
training metrics then reveals the actual key byte. This method was titled Differential
Deep Learning Analysis (DDLA).
Stream ciphers were targeted by a combination of machine learning, mixed
integer linear programming, and satisfiability modulo theory methods [KDB+ 22].
Furthermore, AI-based methods have also been adopted for the identification of
points of interest [LZC+ 21] and leakage assessment [MWM21].
Chapter 5
Fault Attacks and Countermeasures

Fault attacks are active attacks where the attacker tries to perturb the internal
computations by external means. Such attacks exploit a scenario where the attacker
has access to the device and can tamper with it.
Fault attacks can be achieved with different techniques, ranging from simple
clock/voltage glitches to sophisticated optical fault injections (see Sect. 6.2 for more
details).
The attacker’s goal is to recover the secret master key of the cryptographic
algorithm. The attack methodologies are normally developed on the algorithmic
level. But implementation-specific vulnerabilities also exist (Sect. 5.1.4).
There are different effects that a fault injection can achieve. Instruction skip and instruction change perturb the instruction being executed by modifying its opcode. Bit flip flips bits in the data. The number of bits affected is normally limited by the register size (although it is technically possible to affect a few registers at once). We use the notation m-bit flip to indicate that m bits are flipped by the fault. This notion is consistent with our previous definition of bit flip (see Definition 1.2.17). Bit set/reset fixes the bit value to 1 (set) or 0 (reset). Random byte fault changes the byte value to a random number. Stuck-at fault permanently changes the value of one bit to 0 (stuck-at-0) or 1 (stuck-at-1). We refer to those different effects as fault models.
If the fault injected in an intermediate value $x$ results in a faulty value $x'$, we refer to $\varepsilon := x \oplus x'$ as the fault mask, which represents the change caused by the fault.
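The data fault models and the fault mask can be written down as small Python functions. This is a minimal sketch: the function names are illustrative, values are assumed to be single bytes, and bit position 0 is taken to be the least significant bit.

```python
import random

def bit_flip(x, mask):
    """m-bit flip: XOR with a mask whose Hamming weight is m."""
    return x ^ mask

def bit_set(x, pos):
    """Force the bit at position pos to 1."""
    return x | (1 << pos)

def bit_reset(x, pos):
    """Force the bit at position pos to 0."""
    return x & ~(1 << pos)

def random_byte(x, rng):
    """Replace the byte with a uniformly random value."""
    return rng.randrange(256)

def fault_mask(x, x_faulty):
    """Fault mask: XOR of the correct and the faulty value."""
    return x ^ x_faulty

x = 0b10110010
print(f"{fault_mask(x, bit_flip(x, 0b00000101)):08b}")   # mask of a 2-bit flip
print(f"{fault_mask(x, bit_reset(x, 1)):08b}")           # bit 1 was 1, so it changes
print(f"{fault_mask(x, bit_set(x, 1)):08b}")             # bit 1 was already 1: no change
```

The last call illustrates that a bit set or reset may leave the value unchanged, which is the ineffective-fault scenario.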
We can divide the faults into two types depending on how long the effects last. A permanent fault is a destructive fault that changes the value of a memory cell permanently and hence affects data during the computations. In contrast, when a transient fault is injected, the circuit recovers its original behavior after the fault stimulus ceases (usually after just one instruction) or after the device is reset. A transient fault can perturb both data and instructions. In this chapter, we only consider transient faults.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 353
X. Hou, J. Breier, Cryptography and Embedded Systems Security,
https://doi.org/10.1007/978-3-031-62205-2_5
After the fault injection, there are two possible scenarios. The output (ciphertext)
is faulty, or the fault is ineffective and the ciphertext is not changed. We will see that
both scenarios can be exploited.
In the rest of this chapter, we will discuss fault attacks and countermeasures
for symmetric block ciphers (Sects. 5.1 and 5.2) and for RSA and RSA signatures
(Sects. 5.3 and 5.4).

5.1 Fault Attacks on Symmetric Block Ciphers

This section presents a few fault attack methods on symmetric block ciphers. By
convention (see Kerckhoffs’ principle in Definition 2.1.3), we assume that the
specifications of round functions and key schedules are public. The master key, and
hence also the round keys, is secret. We also assume that throughout the attack,
the same master key is used, and the goal of the attacker is normally to recover
certain round key(s). The methodologies presented can be applied to an unprotected
implementation of any symmetric block cipher proposed up to now.
Fault attacks normally aim to recover the last/first round key(s) and then use the
inverse key schedule to find the master key. As mentioned in Remarks 3.1.1, 3.1.4,
and 3.1.5, for DES and PRESENT-80, the knowledge of any round key gives 48 and
64 bits of the master key, respectively, and the rest of the key bits can be brute forced,
while for AES, the value of any round key reveals the value of the full master key.

5.1.1 Differential Fault Analysis

Differential Fault Analysis (DFA) was first introduced by Biham and Shamir [BS97] in 1997. It has been studied by numerous researchers in different settings and is one of the most popular fault analysis methods for symmetric block ciphers.
DFA considers a fault injection into the intermediate state of the cipher, normally
in the last few rounds. Then the difference between correct and faulty ciphertexts is
analyzed to recover the round key(s).
Before going into details of DFA, we recall the notion of differential distribution
table of an Sbox from Definition 4.3.1.
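Example 5.1.1 Let us consider one of the DES Sboxes, $\mathrm{SB}^1_{\mathrm{DES}} : \mathbb{F}_2^6 \to \mathbb{F}_2^4$ (Table 3.3). We note that since the bit length of the input, 6, is larger than that of the output, 4, the output difference can be zero in some cases. The table therefore has $2^4$ rows (output differences) and $2^6 - 1$ columns (nonzero input differences). Part of it can be found in Table 5.1.
For example, in hexadecimal notation,

$$\mathrm{SB}^1_{\mathrm{DES}}(5 \oplus 8) \oplus \mathrm{SB}^1_{\mathrm{DES}}(5) = \mathrm{SB}^1_{\mathrm{DES}}(001101) \oplus \mathrm{SB}^1_{\mathrm{DES}}(000101) = \mathrm{D} \oplus 7 = \mathrm{A}.$$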
Table 5.1 Part of the difference distribution table for $\mathrm{SB}^1_{\mathrm{DES}}$ (Table 3.3). Rows are indexed by the output difference Δ and columns by the input difference δ; all values are hexadecimal

Δ \ δ | 1 | 2 | ... | 7 | 8 | ...
0 | | | ... | 13,14 | | ...
1 | | | ... | 1,6,30,37 | | ...
2 | | | ... | 3,4,A,D,23,24,31,33,34,36 | | ...
3 | 1C,1D,2C,2D,3C,3D | 5,7,C,E,21,23,30,32 | ... | 2,5,8,F | 11,12,13,17,19,1A,1B,1F,26,2E,37,3F | ...
4 | | | ... | | | ...
5 | 6,7 | 20,22,3C,3E | ... | 1B,1C,3B,3C | 7,F,23,27,2B,2F,35,3D | ...
6 | C,D,24,25 | 24,26,2D,2F | ... | 9,E,11,12,15,16,20,27 | 4,C,10,14,18,1C,32,3A | ...
7 | 16,17,32,33 | 15,17,1C,1E | ... | 21,26,2A,2D | 22,2A,36,3E | ...
8 | | | ... | 10,17 | | ...
9 | E,F,10,11,28,29,36,37,38,39 | 10,12,2C,2E,38,3A | ... | B,C,22,25 | 6,E,20,25,28,2D | ...
A | 4,5,14,15,26,27,30,31,34,35,3A,3B | 0,2,14,16,25,27,39,3B | ... | 0,7,1A,1D,28,2F,39,3E | 5,D | ...
... | ... | ... | ... | ... | ... | ...

Fig. 5.1 An illustration of DFA

An illustration of DFA is shown in Fig. 5.1. The attacker injects a fault in a


chosen round of the algorithm to get the desired fault propagation at the end of the
encryption. By examining the differences between a correct and a faulty ciphertext,
the possible values of the secret key can be narrowed down, and we also say that
the key hypotheses are reduced. Another important concept needed for DFA is
nonlinear functions (see Definition 4.5.1). The fault is usually injected at the input
of a nonlinear function of the algorithm.
Example 5.1.2 (How DFA works on a simple example) Let us consider the AND operation (see Example 1.2.14) that takes inputs $a, b \in \mathbb{F}_2$ and outputs

$$c = a \mathbin{\&} b.$$

All possible values of $a, b, c$ are given by

a | b | c = a & b
0 | 0 | 0
0 | 1 | 0
1 | 0 | 0
1 | 1 | 1

Suppose the output c can be observed by the attacker and $a, b$ are unknown. The goal of the attacker is to recover the value of a. This can be achieved by DFA: during the computation, the attacker injects a fault in b by flipping it. Knowing both the faulty and the correct outputs, the attacker can easily recover the value of a: if the output stays the same, then $a = 0$; otherwise $a = 1$.
Next, we will detail how DFA works on an Sbox. Let $\mathrm{SB} : \mathbb{F}_2^{\omega_1} \to \mathbb{F}_2^{\omega_2}$ be an Sbox, and let $a \in \mathbb{F}_2^{\omega_1}$, $b \in \mathbb{F}_2^{\omega_2}$ be fixed secret values. Define

$$f : \mathbb{F}_2^{\omega_1} \to \mathbb{F}_2^{\omega_2}, \qquad x \mapsto \mathrm{SB}(x \oplus a) \oplus b. \tag{5.1}$$

We will show how to recover the values of $a$ and $b$ with DFA.
Let us consider faults injected in the input of f. We use $x'$ to denote the faulty value of $x$. As in Example 5.1.2, we assume a bit-flip fault model. Let $\varepsilon$ denote the fault mask, i.e., $\varepsilon = x \oplus x'$.
Suppose the attacker has the knowledge of the Sbox design, inputs and outputs of f, and the fault mask $\varepsilon$. Furthermore, the attacker can repeat the computation with the same input (not chosen by the attacker). With details of the Sbox, the attacker can compute the DDT, denoted by T, of SB.
Let $\Delta$ denote the difference between the correct and the faulty output; we have

$$\Delta = (\mathrm{SB}(x \oplus a) \oplus b) \oplus (\mathrm{SB}(x' \oplus a) \oplus b) = \mathrm{SB}(x \oplus a) \oplus \mathrm{SB}(x' \oplus a) = \mathrm{SB}(x \oplus a) \oplus \mathrm{SB}(x \oplus a \oplus \varepsilon). \tag{5.2}$$

Then the value $x \oplus a$ is in the entry of T corresponding to input difference $\delta = \varepsilon$ and output difference $\Delta$. Thus, the possible values for $x \oplus a$ can be reduced to those in $T[\Delta, \varepsilon]$. With the knowledge of $x$, the attacker can narrow down the possible values of $a$. With the knowledge of the input and output of f, each value of $a$ gives a unique value of $b$. The attacker can repeat the attack until the value of $a$ (and hence $b$) is recovered or until the remaining values can be tried by brute force.
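The procedure above can be sketched as follows. This is an illustration only: the PRESENT Sbox is used as a convenient 4-bit stand-in for SB, the secret values a and b are arbitrary choices, and the fault is assumed to produce a known, uniformly random nonzero mask on each injection.

```python
from collections import defaultdict
import random

PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def build_ddt(sbox):
    """T[(Delta, delta)] = set of inputs v with S(v) ^ S(v ^ delta) == Delta."""
    table = defaultdict(set)
    for delta in range(1, len(sbox)):
        for v in range(len(sbox)):
            table[(sbox[v] ^ sbox[v ^ delta], delta)].add(v)
    return table

def recover_a_b(f, x, sbox, rng):
    """Recover (a, b) of f(x) = S(x ^ a) ^ b from faults on the input x."""
    ddt = build_ddt(sbox)
    correct = f(x)
    candidates = set(range(len(sbox)))        # hypotheses for x ^ a
    faults = 0
    while len(candidates) > 1:
        eps = rng.randrange(1, len(sbox))     # known, nonzero fault mask
        out_diff = correct ^ f(x ^ eps)       # Delta as in Eq. 5.2
        candidates &= ddt[(out_diff, eps)]    # keep only inputs consistent with the DDT
        faults += 1
    a = candidates.pop() ^ x
    b = correct ^ sbox[x ^ a]                 # each a gives a unique b
    return a, b, faults

secret_a, secret_b = 0x9, 0xE                 # illustrative secrets
f = lambda v: PRESENT_SBOX[v ^ secret_a] ^ secret_b
a, b, faults = recover_a_b(f, 0x0, PRESENT_SBOX, random.Random(1))
print(f"a = {a:X}, b = {b:X}, recovered with {faults} faults")
```

The intersection of DDT entries always contains the true value of x ⊕ a, so the loop can only shrink the candidate set toward the correct answer.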

Example 5.1.3 (How DFA works on PRESENT Sbox) Let us consider the case when the Sbox in the definition of f (Eq. 5.1) is the PRESENT Sbox (Table 3.11). Suppose the attacker fixes the input to be $x = 0$, and they know that the correct output of f is 0.
When the attacker injects a fault in $x$ with fault mask $\varepsilon_1 = 3$, they get a faulty output 1. By Eq. 5.2, we have

$$\Delta_1 = 0 \oplus 1 = 1.$$

Thus $x \oplus a$ is in the entry of the DDT of the PRESENT Sbox corresponding to input difference 3 and output difference 1. By Table 4.1, the possible values for $x \oplus a$ are 9 and A.
When the attacker injects another fault with fault mask $\varepsilon_2 = 2$, they get a faulty output 6. We have $\Delta_2 = 6$. Again by Table 4.1, the possible values for $x \oplus a$ are 9 and B.
Thus, the attacker can conclude that

$$x \oplus a = 9.$$

Since $x = 0$, we know $a = 9$. With the knowledge that the correct output is 0, we have

$$\mathrm{SB}_{\mathrm{PRESENT}}(0 \oplus 9) \oplus b = 0 \implies b = \mathrm{SB}_{\mathrm{PRESENT}}(9).$$

Table 3.11 gives $b = \mathrm{E}$.
We can check that, since $f(0) = 0$, for $\varepsilon_1 = 3$,

$$\Delta_1 = f(0) \oplus f(0 \oplus 3) = \mathrm{SB}_{\mathrm{PRESENT}}(3 \oplus 9) \oplus \mathrm{E} = \mathrm{SB}_{\mathrm{PRESENT}}(\mathrm{A}) \oplus \mathrm{E} = \mathrm{F} \oplus \mathrm{E} = 1,$$

and for $\varepsilon_2 = 2$,

$$\Delta_2 = f(0) \oplus f(0 \oplus 2) = \mathrm{SB}_{\mathrm{PRESENT}}(2 \oplus 9) \oplus \mathrm{E} = \mathrm{SB}_{\mathrm{PRESENT}}(\mathrm{B}) \oplus \mathrm{E} = 8 \oplus \mathrm{E} = 6.$$


One might ask how many faults are needed to recover the values of $a$ and $b$. If we take a closer look at Table 4.1, we can see that if the attacker can choose the fault mask, they only need two faults. For example, fault masks 3 and 5 uniquely determine the Sbox input: any two distinct elements that appear in the same entry in column $\delta = 3$ are in two different entries in column $\delta = 5$. When a random fault mask is considered, a brute-force analysis shows that at most four different fault masks are needed.
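The numbers in Example 5.1.3 and the claim about fault masks 3 and 5 can be checked with a short script. The helper names below are illustrative; only the PRESENT Sbox values are taken from the cipher specification.

```python
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def ddt_entry(sbox, delta, out_diff):
    """Inputs v with S(v) ^ S(v ^ delta) == out_diff."""
    return {v for v in range(len(sbox)) if sbox[v] ^ sbox[v ^ delta] == out_diff}

def masks_identify_input(sbox, masks):
    """True if the output differences under the given masks are distinct for every input."""
    signatures = {tuple(sbox[v] ^ sbox[v ^ m] for m in masks) for v in range(len(sbox))}
    return len(signatures) == len(sbox)

# Replay of Example 5.1.3: x = 0, correct output 0.
first = ddt_entry(PRESENT_SBOX, 0x3, 0x1)     # fault mask 3, output difference 1
second = ddt_entry(PRESENT_SBOX, 0x2, 0x6)    # fault mask 2, output difference 6
(sbox_input,) = first & second                # single remaining candidate for x ^ a
a = sbox_input ^ 0x0
b = PRESENT_SBOX[sbox_input] ^ 0x0
print(sorted(first), sorted(second), hex(a), hex(b))

# Two chosen masks are enough, a single mask is not.
print(masks_identify_input(PRESENT_SBOX, (0x3, 0x5)))
print(masks_identify_input(PRESENT_SBOX, (0x3,)))
```

Comparing the tuple of output differences per input is equivalent to checking that no two inputs share the same DDT entry in both columns.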

5.1.1.1 DFA on DES

Now, we will discuss how DFA can break implementations of DES (Sect. 3.1.1).
Recall that DES is a Feistel cipher. Its cipher state at the end of round i can be denoted as $L_i$ and $R_i$, where L stands for left and R stands for right. The DES round function F satisfies

$$(L_i, R_i) = F(L_{i-1}, R_{i-1}), \quad \text{where } L_i = R_{i-1}, \quad R_i = L_{i-1} \oplus f(R_{i-1}, K_i). \tag{5.3}$$

Before the first round function, the encryption starts with an initial permutation (IP). The inverse of IP, called the final permutation ($\mathrm{IP}^{-1}$), is applied to the cipher state after the last round before outputting the ciphertext. In our analysis, we ignore the final permutation and consider the value before it as the ciphertext, since the attacker can easily obtain this value by applying IP to the ciphertext.
At the ith round, the function f in the round function (Eq. 5.3) of DES takes input $R_{i-1} \in \mathbb{F}_2^{32}$ and round key $K_i \in \mathbb{F}_2^{48}$ and outputs a 32-bit intermediate value as follows:

$$f(R_{i-1}, K_i) = P_{\mathrm{DES}}(\mathrm{Sboxes}(E_{\mathrm{DES}}(R_{i-1}) \oplus K_i)). \tag{5.4}$$

First, $R_{i-1}$ is passed to an expansion function $E_{\mathrm{DES}} : \mathbb{F}_2^{32} \to \mathbb{F}_2^{48}$ (Table 3.2). Then the output $E_{\mathrm{DES}}(R_{i-1})$ is XOR-ed with the round key $K_i$, producing a 48-bit intermediate value. This 48-bit value is divided into eight 6-bit subblocks. Eight distinct Sboxes, $\mathrm{SB}^j_{\mathrm{DES}} : \mathbb{F}_2^6 \to \mathbb{F}_2^4$ ($1 \le j \le 8$), are applied, one to each 6-bit subblock. Finally, the resulting 32-bit intermediate value goes through a permutation function $P_{\mathrm{DES}} : \mathbb{F}_2^{32} \to \mathbb{F}_2^{32}$ (Table 3.4).
For $j = 1, 2, \dots, 8$, let $E_{\mathrm{DES}}(R_i)_j$ denote the jth 6 bits of $E_{\mathrm{DES}}(R_i)$. For example, $E_{\mathrm{DES}}(R_i)_1$ are the bits at positions $1, 2, 3, 4, 5, 6$ of $E_{\mathrm{DES}}(R_i)$ (see also Note in Sect. 3.1.1). Similarly, let $K_i^j$ denote the jth 6 bits of $K_i$ and $P^{-1}_{\mathrm{DES}}(R_i \oplus L_{i-1})_j$ be the jth 4 bits of $P^{-1}_{\mathrm{DES}}(R_i \oplus L_{i-1})$. By Eqs. 5.3 and 5.4, we have

$$P^{-1}_{\mathrm{DES}}(R_i \oplus L_{i-1})_j = \mathrm{SB}^j_{\mathrm{DES}}(E_{\mathrm{DES}}(R_{i-1})_j \oplus K_i^j). \tag{5.5}$$


We consider a fault injection at the right half of the cipher state at the beginning of the 16th round, i.e., a fault in $R_{15}$. Suppose the fault model is 1-bit flip. In other words, the fault mask $\varepsilon \in \mathbb{F}_2^{32}$ satisfies $\mathrm{wt}(\varepsilon) = 1$ and

$$R'_{15} = R_{15} \oplus \varepsilon.$$

We assume the attacker has the knowledge of the output of DES (correct
and faulty ciphertexts), fault model, and fault location. They can also repeat the
computation with the same plaintext, not chosen by the attacker. The attacker’s goal
is to recover .K16 , the last round key.
Let $L'_{16}$ and $R'_{16}$ denote the left and right parts of the faulty ciphertext, respectively. By our assumption, the attacker has the knowledge of $L'_{16}$ and $L_{16}$. Since $R_{15} = L_{16}$, we have

$$L'_{16} \oplus L_{16} = R'_{15} \oplus R_{15} = \varepsilon \quad \text{(fault mask)}. \tag{5.6}$$

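Define

$$\Delta R_{16} := R'_{16} \oplus R_{16}. \tag{5.7}$$

By Eq. 5.5,

$$P^{-1}_{\mathrm{DES}}(R_{16} \oplus L_{15})_j = \mathrm{SB}^j_{\mathrm{DES}}(E_{\mathrm{DES}}(L_{16})_j \oplus K_{16}^j),$$

$$P^{-1}_{\mathrm{DES}}(R'_{16} \oplus L_{15})_j = \mathrm{SB}^j_{\mathrm{DES}}(E_{\mathrm{DES}}(L'_{16})_j \oplus K_{16}^j) = \mathrm{SB}^j_{\mathrm{DES}}(E_{\mathrm{DES}}(L_{16} \oplus \varepsilon)_j \oplus K_{16}^j).$$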

Since $P_{\mathrm{DES}}$ and $E_{\mathrm{DES}}$ are linear, we have

$$P^{-1}_{\mathrm{DES}}(\Delta R_{16})_j = \mathrm{SB}^j_{\mathrm{DES}}(E_{\mathrm{DES}}(L_{16})_j \oplus K_{16}^j \oplus E_{\mathrm{DES}}(\varepsilon)_j) \oplus \mathrm{SB}^j_{\mathrm{DES}}(E_{\mathrm{DES}}(L_{16})_j \oplus K_{16}^j).$$

Thus, $E_{\mathrm{DES}}(L_{16})_j \oplus K_{16}^j$ is an input for the jth DES Sbox such that with input difference $E_{\mathrm{DES}}(\varepsilon)_j$, the output difference is $P^{-1}_{\mathrm{DES}}(\Delta R_{16})_j$. With the knowledge of $\varepsilon$, $\Delta R_{16}$, and $L_{16}$, the attacker can reduce the key hypotheses for $K_{16}^j$.
We note that if $E_{\mathrm{DES}}(\varepsilon)_j = 0$, the input for the jth Sbox is not changed, and the output will also not change. In this case, we say that this Sbox is inactive. Otherwise, we say the Sbox is active. For an inactive Sbox, a different fault mask will be needed to activate this Sbox. Since we consider a 1-bit flip and, by the design of $E_{\mathrm{DES}}$ (Table 3.2), 16 bits of the input are repeated in the output, only one or two Sboxes will be active for one fault mask.
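The statement about active Sboxes can be checked by enumerating all 32 single-bit fault masks. The sketch below assumes the standard DES expansion table and the convention that bit 1 is the most significant bit of a 32-bit word; the function names are illustrative.

```python
# Standard DES expansion table (entry = index of the input bit, bit 1 = leftmost).
E_TABLE = [32, 1, 2, 3, 4, 5,
           4, 5, 6, 7, 8, 9,
           8, 9, 10, 11, 12, 13,
           12, 13, 14, 15, 16, 17,
           16, 17, 18, 19, 20, 21,
           20, 21, 22, 23, 24, 25,
           24, 25, 26, 27, 28, 29,
           28, 29, 30, 31, 32, 1]

def expand(word32):
    """Apply the expansion to a 32-bit word and return the 48-bit result."""
    out = 0
    for pos in E_TABLE:
        out = (out << 1) | ((word32 >> (32 - pos)) & 1)
    return out

def subblocks(word48):
    """Split a 48-bit value into eight 6-bit subblocks, leftmost first."""
    return [(word48 >> (42 - 6 * j)) & 0x3F for j in range(8)]

def active_sboxes(eps):
    """Indices j (1 to 8) of the Sboxes whose input difference is nonzero."""
    return [j + 1 for j, diff in enumerate(subblocks(expand(eps))) if diff]

counts = [len(active_sboxes(1 << bit)) for bit in range(32)]
print("masks activating one Sbox: ", counts.count(1))
print("masks activating two Sboxes:", counts.count(2))
print(active_sboxes(0x40000000), hex(subblocks(expand(0x40000000))[0]))
```

Because the expansion is linear, expanding the fault mask alone gives the input difference seen by each Sbox.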


Example 5.1.4 Let

$$L_{15} = \mathtt{00000000}, \quad R_{15} = \mathtt{00000000}, \quad K_{16} = \mathtt{14D8F55DAA7A}.$$

By Eq. 5.4,

$$f(R_{15}, K_{16}) = P_{\mathrm{DES}}(\mathrm{Sboxes}(E_{\mathrm{DES}}(R_{15}) \oplus K_{16})) = \mathtt{832ABB8E}.$$

By Eq. 5.3,

$$L_{16} = R_{15} = \mathtt{00000000}, \quad R_{16} = L_{15} \oplus \mathtt{832ABB8E} = \mathtt{832ABB8E}.$$

Suppose the fault mask is $\varepsilon = \mathtt{40000000}$; then

$$R'_{15} = \mathtt{40000000},$$

and

$$L'_{16} = R'_{15} = \mathtt{40000000}, \quad R'_{16} = L_{15} \oplus f(R'_{15}, K_{16}) = \mathtt{83AAB98E}.$$

We note that the values agree with Eq. 5.6:

$$\varepsilon = L'_{16} \oplus L_{16} = \mathtt{40000000}.$$

By Eq. 5.7,

$$\Delta R_{16} = R'_{16} \oplus R_{16} = \mathtt{00800200}.$$

Since the bit flip is in the second bit of the input of $E_{\mathrm{DES}}$, according to Table 3.2, the third bit of the output of $E_{\mathrm{DES}}$ will be changed. Consequently, $\mathrm{SB}^j_{\mathrm{DES}}$ is active for $j = 1$ and inactive otherwise. We have

$$E_{\mathrm{DES}}(\varepsilon)_1 = 8, \quad E_{\mathrm{DES}}(\varepsilon)_j = 0 \text{ for } j \neq 1.$$

By Table 3.4, the first 4 bits of the input of $P_{\mathrm{DES}}$ are mapped to the 9th, 17th, 23rd, and 31st bits of its output, hence

$$P^{-1}_{\mathrm{DES}}(\Delta R_{16})_1 = 1010 = \mathrm{A}.$$

Consequently, the input of $\mathrm{SB}^1_{\mathrm{DES}}$, which is

$$E_{\mathrm{DES}}(R_{15})_1 \oplus K_{16}^1 = K_{16}^1,$$

gives output difference A when the input difference is 8. By Table 5.1, $K_{16}^1$ is equal to one of two possible values, 5 and D, where 5 agrees with the first 6 bits of $K_{16}$.
In [BS97], the authors reported that, with exhaustive search, they found that on average four possible 6-bit key hypotheses remain for each active Sbox. An improved attack that considers fault injection in the earlier rounds can be found in [Riv09].
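The Sbox-level step of Example 5.1.4 can be reproduced with the first DES Sbox alone. The sketch below assumes the standard table of that Sbox, with the row selected by the first and last input bits and the column by the middle four bits; the helper names are illustrative.

```python
# First DES Sbox (standard table, rows 0 to 3).
S1 = [[14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
      [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
      [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
      [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13]]

def sb1(v):
    """Evaluate the first DES Sbox on a 6-bit input."""
    row = ((v >> 5) << 1) | (v & 1)     # first and last bit
    col = (v >> 1) & 0xF                # middle four bits
    return S1[row][col]

def ddt_entry(delta, out_diff):
    """Inputs v with SB1(v) ^ SB1(v ^ delta) == out_diff."""
    return sorted(v for v in range(64) if sb1(v) ^ sb1(v ^ delta) == out_diff)

K16 = 0x14D8F55DAA7A
k16_1 = K16 >> 42                       # first 6 bits of the 48-bit round key

# Input difference 8 and output difference A, as in Example 5.1.4.
hypotheses = ddt_entry(0x8, 0xA)
print([hex(v) for v in hypotheses], hex(k16_1))
```

The correct subkey value is one of the two remaining hypotheses, so a second fault with a different mask would be needed to single it out.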

5.1.1.2 Diagonal DFA on AES-128

In this part, we discuss a DFA attack on AES-128 implementations. Recall that the AES cipher state can be represented as a 4 × 4 matrix of bytes (see Eq. 3.2):

$$\begin{pmatrix} s_{00} & s_{01} & s_{02} & s_{03} \\ s_{10} & s_{11} & s_{12} & s_{13} \\ s_{20} & s_{21} & s_{22} & s_{23} \\ s_{30} & s_{31} & s_{32} & s_{33} \end{pmatrix}. \tag{5.8}$$

Let us represent those bytes by squares as in Fig. 3.6 for visual illustration. Suppose a fault is injected at the beginning of one round (except for the last round) in byte $s_{00}$. Then the fault propagation in this round can be represented by Fig. 5.2, where blue squares correspond to bytes that might be affected by the fault. Since SubBytes operates on each byte independently and ShiftRows leaves the first row unchanged, in the first three states, the blue square stays in the same position. MixColumns takes one column as input and outputs one column. AddRoundKey does not change the fault effects. Hence in the last state, the whole first column can be affected by the fault. Similarly, if the fault is injected at the beginning of one round in any combination of bytes $s_{00}, s_{11}, s_{22}, s_{33}$, at the end of this round, the whole first column might be affected by the fault. Some cases are shown in Fig. 5.3.

Fig. 5.2 Visual illustration of how the fault propagates when a fault is injected at the beginning of one AES round (not the last round) in byte $s_{00}$. Blue squares correspond to bytes that can be affected by the fault

Fig. 5.3 Visual illustration of how the fault propagates when a fault is injected at the beginning of one AES round in bytes: (a) $s_{00}, s_{11}$, (b) $s_{00}, s_{11}, s_{22}$, and (c) $s_{00}, s_{11}, s_{22}, s_{33}$. Blue squares correspond to bytes that can be affected by the fault

Let us refer to the bytes $s_{00}, s_{11}, s_{22}, s_{33}$ as a diagonal of the AES state. We consider a fault attack where a random byte fault is injected in this diagonal of the AES state at the end of round 7. By the above discussion, we know that at the end of round 8, the whole first column might be affected by the fault. Similarly, we can study the fault propagation in round 9. Let $\delta_i$ ($i = 1, 2, 3, 4$) denote the differences between the four correct and faulty bytes in the first column of the cipher state after SubBytes in round 9. An illustration is shown in Fig. 5.4, where $S_8$ (respectively, $S_9$) denotes the cipher state at the end of round 8 (respectively, round 9). After ShiftRows, those four $\delta_i$s move to different positions as shown in the third cipher state in the figure.

Fig. 5.4 Visual illustration of fault propagation in the 9th round of AES when the fault was injected in the diagonal $s_{00}, s_{11}, s_{22}, s_{33}$ of the AES cipher state at the end of round 7

Recall that MixColumns multiplies one column by the following matrix (see Eq. 3.6):

$$\begin{pmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{pmatrix}.$$

Since this is a linear operation, the differences will also be multiplied by the corresponding coefficients in the matrix. Consequently, we get the last state $S_9$ as shown in Fig. 5.4.
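The linearity argument can be checked numerically. The following sketch implements MixColumns on a single column over the AES field (reduction polynomial 0x11B) and confirms that a difference confined to the first byte of a column spreads with coefficients 2, 1, 1, 3. The column values are taken from the AES-128 example vector of FIPS-197 Appendix C.1; the function names are illustrative.

```python
def xtime(a):
    """Multiply a byte by 02 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return (a ^ 0x11B) & 0xFF if a & 0x100 else a

def mul3(a):
    """Multiply a byte by 03, i.e., 02 * a XOR a."""
    return xtime(a) ^ a

def mix_column(col):
    """Apply the MixColumns matrix to one 4-byte column."""
    a0, a1, a2, a3 = col
    return [xtime(a0) ^ mul3(a1) ^ a2 ^ a3,
            a0 ^ xtime(a1) ^ mul3(a2) ^ a3,
            a0 ^ a1 ^ xtime(a2) ^ mul3(a3),
            mul3(a0) ^ a1 ^ a2 ^ xtime(a3)]

correct = [0x54, 0xD9, 0x90, 0xA1]       # first column before MixColumns in round 9
delta = 0xB8                             # difference in the first byte
faulty = [correct[0] ^ delta] + correct[1:]

out_diff = [c ^ f for c, f in zip(mix_column(correct), mix_column(faulty))]
print([f"{d:02X}" for d in out_diff])    # expected pattern: 2*delta, delta, delta, 3*delta
```

The output difference depends only on the input difference, not on the column values, which is what allows the attacker to reason about differences without knowing the state.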
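Let us represent the cipher state at the end of round 9, $S_9$, the correct ciphertext c, and the last round key $K_{10}$ with the following matrices:

$$S_9 = \begin{pmatrix} a_{00} & a_{01} & a_{02} & a_{03} \\ a_{10} & a_{11} & a_{12} & a_{13} \\ a_{20} & a_{21} & a_{22} & a_{23} \\ a_{30} & a_{31} & a_{32} & a_{33} \end{pmatrix}, \quad c = \begin{pmatrix} c_{00} & c_{01} & c_{02} & c_{03} \\ c_{10} & c_{11} & c_{12} & c_{13} \\ c_{20} & c_{21} & c_{22} & c_{23} \\ c_{30} & c_{31} & c_{32} & c_{33} \end{pmatrix}, \quad K_{10} = \begin{pmatrix} k_{00} & k_{01} & k_{02} & k_{03} \\ k_{10} & k_{11} & k_{12} & k_{13} \\ k_{20} & k_{21} & k_{22} & k_{23} \\ k_{30} & k_{31} & k_{32} & k_{33} \end{pmatrix}.$$

In round 10, we have SubBytes, ShiftRows, and AddRoundKey operations. In particular, we have

$$c_{00} = \mathrm{SB}_{\mathrm{AES}}(a_{00}) \oplus k_{00}, \quad c_{13} = \mathrm{SB}_{\mathrm{AES}}(a_{10}) \oplus k_{13}, \quad c_{22} = \mathrm{SB}_{\mathrm{AES}}(a_{20}) \oplus k_{22}, \quad c_{31} = \mathrm{SB}_{\mathrm{AES}}(a_{30}) \oplus k_{31},$$

which gives

$$\begin{aligned} a_{00} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{00} \oplus k_{00}), \\ a_{10} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{13} \oplus k_{13}), \\ a_{20} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{22} \oplus k_{22}), \\ a_{30} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{31} \oplus k_{31}). \end{aligned}$$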

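Let us denote the faulty ciphertext by $c'$ and write

$$c' = \begin{pmatrix} c'_{00} & c'_{01} & c'_{02} & c'_{03} \\ c'_{10} & c'_{11} & c'_{12} & c'_{13} \\ c'_{20} & c'_{21} & c'_{22} & c'_{23} \\ c'_{30} & c'_{31} & c'_{32} & c'_{33} \end{pmatrix}.$$

Then

$$\begin{aligned} a'_{00} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(c'_{00} \oplus k_{00}), \\ a'_{10} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(c'_{13} \oplus k_{13}), \\ a'_{20} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(c'_{22} \oplus k_{22}), \\ a'_{30} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(c'_{31} \oplus k_{31}). \end{aligned}$$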

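Let $\delta = \delta_1$. By observing the first column of $S_9$ in Fig. 5.4, we have

$$\begin{aligned} 2\delta &= a_{00} \oplus a'_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{00} \oplus k_{00}) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(c'_{00} \oplus k_{00}), \\ \delta &= a_{10} \oplus a'_{10} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{13} \oplus k_{13}) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(c'_{13} \oplus k_{13}), \\ \delta &= a_{20} \oplus a'_{20} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{22} \oplus k_{22}) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(c'_{22} \oplus k_{22}), \\ 3\delta &= a_{30} \oplus a'_{30} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{31} \oplus k_{31}) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(c'_{31} \oplus k_{31}). \end{aligned}$$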

Then for each value of $\delta$, the possible values for $k_{00}, k_{13}, k_{22}, k_{31}$ will be restricted by the above four equations. In particular,

$$a_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{00} \oplus k_{00})$$

can be considered as an AES Sbox input that corresponds to input difference $2\delta$ and output difference $c_{00} \oplus c'_{00}$. Similarly,

$$a_{10} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{13} \oplus k_{13}), \quad a_{20} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{22} \oplus k_{22}), \quad a_{30} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{31} \oplus k_{31})$$

are AES Sbox inputs that give output differences

$$c_{13} \oplus c'_{13}, \quad c_{22} \oplus c'_{22}, \quad c_{31} \oplus c'_{31},$$

when the input differences are

$$\delta, \quad \delta, \quad 3\delta,$$

respectively. It was shown [SMR09] that, on average, the key hypotheses for $(k_{00}, k_{13}, k_{22}, k_{31})$ can be reduced to $2^8$.


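Example 5.1.5 Suppose the master key is

$$\begin{pmatrix} 00 & 04 & 08 & 0C \\ 01 & 05 & 09 & 0D \\ 02 & 06 & 0A & 0E \\ 03 & 07 & 0B & 0F \end{pmatrix}$$

and the plaintext is

$$\begin{pmatrix} 00 & 44 & 88 & CC \\ 11 & 55 & 99 & DD \\ 22 & 66 & AA & EE \\ 33 & 77 & BB & FF \end{pmatrix}.$$

By AES encryption and key schedule (Sect. 3.1.2), we can find that (see [NIS01] Appendix C)

$$S_7 = \begin{pmatrix} D1 & 79 & B4 & D6 \\ 87 & C4 & 55 & 6F \\ 6C & 30 & 94 & F4 \\ 0F & 0A & AD & 1F \end{pmatrix},$$

$$K_8 = \begin{pmatrix} 47 & A4 & E0 & AE \\ 43 & 1C & 16 & BF \\ 87 & 65 & BA & 7A \\ 35 & B9 & F4 & D2 \end{pmatrix}, \quad K_9 = \begin{pmatrix} 54 & F0 & 10 & BE \\ 99 & 85 & 93 & 2C \\ 32 & 57 & ED & 97 \\ D1 & 68 & 9C & 4E \end{pmatrix}, \quad K_{10} = \begin{pmatrix} 13 & E3 & F3 & 4D \\ 11 & 94 & 07 & 2B \\ 1D & 4A & A7 & 30 \\ 7F & 17 & 8B & C5 \end{pmatrix}.$$

Then the intermediate values in round 8 are as follows:

$$S_7 \xrightarrow{\mathrm{SB}} \begin{pmatrix} 3E & B6 & 8D & F6 \\ 17 & 1C & FC & A8 \\ 50 & 04 & 22 & BF \\ 76 & 67 & 95 & C0 \end{pmatrix} \xrightarrow{\mathrm{SR}} \begin{pmatrix} 3E & B6 & 8D & F6 \\ 1C & FC & A8 & 17 \\ 22 & BF & 50 & 04 \\ C0 & 76 & 67 & 95 \end{pmatrix} \xrightarrow{\mathrm{MC}} \begin{pmatrix} BA & A1 & D5 & 5F \\ A0 & F9 & 51 & 41 \\ 3D & B5 & 2C & 4D \\ E7 & 6E & BA & 23 \end{pmatrix} \xrightarrow{\mathrm{AK}} \begin{pmatrix} FD & 05 & 35 & F1 \\ E3 & E5 & 47 & FE \\ BA & D0 & 96 & 37 \\ D2 & D7 & 4E & F1 \end{pmatrix} = S_8,$$

where SB, SR, MC, and AK stand for SubBytes (Table 3.9), ShiftRows, MixColumns, and AddRoundKey, respectively. The operations in round 9 compute

$$S_8 \xrightarrow{\mathrm{SB}} \begin{pmatrix} 54 & 6B & 96 & A1 \\ 11 & D9 & A0 & BB \\ F4 & 70 & 90 & 9A \\ B5 & 0E & 2F & A1 \end{pmatrix} \xrightarrow{\mathrm{SR}} \begin{pmatrix} 54 & 6B & 96 & A1 \\ D9 & A0 & BB & 11 \\ 90 & 9A & F4 & 70 \\ A1 & B5 & 0E & 2F \end{pmatrix} \xrightarrow{\mathrm{MC}} \begin{pmatrix} E9 & 02 & 1B & 35 \\ F7 & 30 & F2 & 3C \\ 4E & 20 & CC & 21 \\ EC & F6 & F2 & C7 \end{pmatrix} \xrightarrow{\mathrm{AK}} \begin{pmatrix} BD & F2 & 0B & 8B \\ 6E & B5 & 61 & 10 \\ 7C & 77 & 21 & B6 \\ 3D & 9E & 6E & 89 \end{pmatrix} = S_9.$$

In round 10 we have

$$S_9 \xrightarrow{\mathrm{SB}} \begin{pmatrix} 7A & 89 & 2B & 3D \\ 9F & D5 & EF & CA \\ 10 & F5 & FD & 4E \\ 27 & 0B & 9F & A7 \end{pmatrix} \xrightarrow{\mathrm{SR}} \begin{pmatrix} 7A & 89 & 2B & 3D \\ D5 & EF & CA & 9F \\ FD & 4E & 10 & F5 \\ A7 & 27 & 0B & 9F \end{pmatrix} \xrightarrow{\mathrm{AK}} \begin{pmatrix} 69 & 6A & D8 & 70 \\ C4 & 7B & CD & B4 \\ E0 & 04 & B7 & C5 \\ D8 & 30 & 80 & 5A \end{pmatrix} = c.$$

Suppose a fault is injected in byte $s_{00}$ of $S_7$ with fault mask D8. We have

$$s'_{00} = \mathrm{D1} \oplus \mathrm{D8} = 09.$$

The computations in round 8 become

$$S'_7 = \begin{pmatrix} 09 & 79 & B4 & D6 \\ 87 & C4 & 55 & 6F \\ 6C & 30 & 94 & F4 \\ 0F & 0A & AD & 1F \end{pmatrix} \xrightarrow{\mathrm{SB}} \begin{pmatrix} 01 & B6 & 8D & F6 \\ 17 & 1C & FC & A8 \\ 50 & 04 & 22 & BF \\ 76 & 67 & 95 & C0 \end{pmatrix} \xrightarrow{\mathrm{SR}} \begin{pmatrix} 01 & B6 & 8D & F6 \\ 1C & FC & A8 & 17 \\ 22 & BF & 50 & 04 \\ C0 & 76 & 67 & 95 \end{pmatrix}$$

$$\xrightarrow{\mathrm{MC}} \begin{pmatrix} C4 & A1 & D5 & 5F \\ 9F & F9 & 51 & 41 \\ 02 & B5 & 2C & 4D \\ A6 & 6E & BA & 23 \end{pmatrix} \xrightarrow{\mathrm{AK}} \begin{pmatrix} 83 & 05 & 35 & F1 \\ DC & E5 & 47 & FE \\ 85 & D0 & 96 & 37 \\ 93 & D7 & 4E & F1 \end{pmatrix} = S'_8.$$

Round 9 then calculates

$$S'_8 \xrightarrow{\mathrm{SB}} \begin{pmatrix} EC & 6B & 96 & A1 \\ 86 & D9 & A0 & BB \\ 97 & 70 & 90 & 9A \\ DC & 0E & 2F & A1 \end{pmatrix} \xrightarrow{\mathrm{SR}} \begin{pmatrix} EC & 6B & 96 & A1 \\ D9 & A0 & BB & 86 \\ 90 & 9A & 97 & 70 \\ A1 & DC & 0E & 2F \end{pmatrix} \xrightarrow{\mathrm{MC}} \begin{pmatrix} 82 & 6B & 78 & 97 \\ 4F & 59 & 57 & 09 \\ F6 & 9B & 0A & B6 \\ 3F & 24 & 91 & 50 \end{pmatrix} \xrightarrow{\mathrm{AK}} \begin{pmatrix} D6 & 9B & 68 & 29 \\ D6 & DC & C4 & 25 \\ C4 & CC & E7 & 21 \\ EE & 4C & 0D & 1E \end{pmatrix} = S'_9.$$

And, in round 10, we have

$$S'_9 \xrightarrow{\mathrm{SB}} \begin{pmatrix} F6 & 14 & 45 & A5 \\ F6 & 86 & 1C & 3F \\ 1C & 4B & 94 & FD \\ 28 & 29 & D7 & 72 \end{pmatrix} \xrightarrow{\mathrm{SR}} \begin{pmatrix} F6 & 14 & 45 & A5 \\ 86 & 1C & 3F & F6 \\ 94 & FD & 1C & 4B \\ 72 & 28 & 29 & D7 \end{pmatrix} \xrightarrow{\mathrm{AK}} \begin{pmatrix} E5 & F7 & B6 & E8 \\ 97 & 88 & 38 & DD \\ 89 & B7 & BB & 7B \\ 0D & 3F & A2 & 12 \end{pmatrix} = c'.$$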

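The attacker obtains the following equations:

$$\begin{aligned} 2\delta &= \mathrm{SB}^{-1}_{\mathrm{AES}}(69 \oplus k_{00}) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{E5} \oplus k_{00}), \\ \delta &= \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{B4} \oplus k_{13}) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{DD} \oplus k_{13}), \\ \delta &= \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{B7} \oplus k_{22}) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{BB} \oplus k_{22}), \\ 3\delta &= \mathrm{SB}^{-1}_{\mathrm{AES}}(30 \oplus k_{31}) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{3F} \oplus k_{31}). \end{aligned}$$

Thus the possible values of

$$\mathrm{SB}^{-1}_{\mathrm{AES}}(69 \oplus k_{00}), \quad \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{B4} \oplus k_{13}), \quad \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{B7} \oplus k_{22}), \quad \mathrm{SB}^{-1}_{\mathrm{AES}}(30 \oplus k_{31})$$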

are inputs of AES Sbox that with input differences

2δ,
. δ, δ, 3δ

produce output differences

69 ⊕ E5 = 8C,
. B4 ⊕ DD = 69, B7 ⊕ BB = 0C, 30 ⊕ DD = ED,
Table 5.2 Part of the difference distribution table for AES Sbox (Table 3.9) corresponding to output differences 0C, 69, 8C, and ED. Rows are indexed by the output difference Δ and columns by the input difference δ

Δ \ δ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ...
0C | 2,3 | 35,37 | 7D,7E | | E2,E7 | 0,6,A3,A5 | | 4,C | D2,DB | ...
69 | 48,49 | | | | | 70,76 | | | | ...
8C | | D9,DB | | | 42,47 | | | E5,ED | | ...
ED | 52,53 | | 49,4A | 41,45 | | | 68,6F | 70,78 | C5,CC | ...

respectively.
Part of the AES Sbox difference distribution table corresponding to output differences 8C, 69, 0C, and ED is shown in Table 5.2. We can see that $\delta \neq 02$ since for input difference 02, the entry for row 69 is empty. In other words, there are no inputs that have output difference 69 for input difference 02. Thus $\delta$ can only take values that give nonempty entries in the DDT for columns $2\delta$, $\delta$, $\delta$, $3\delta$ and corresponding rows 8C, 69, 0C, ED. By searching the rows 8C, 69, 0C, ED, we can find all possible values of $\delta$:

01, 06, 0B, 28, 3D, 49, 6B, 76, 8F, 90, A6, B2, B8, D0, EE,

in total 15 choices. In most of the entries of the AES Sbox DDT, there are only two values; thus the remaining number of key hypotheses is roughly

$$2^4 \times 15 \approx 2^8.$$

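We can also check that for the correct values $k_{00} = 13$, $k_{13} = \mathrm{2B}$, $k_{22} = \mathrm{A7}$, $k_{31} = 17$ (see Table 3.10 for $\mathrm{SB}^{-1}_{\mathrm{AES}}$),

$$\begin{aligned} 2\delta &= \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{7A}) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{F6}) = \mathrm{BD} \oplus \mathrm{D6} = \mathrm{6B}, \\ \delta &= \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{9F}) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{F6}) = \mathrm{6E} \oplus \mathrm{D6} = \mathrm{B8}, \\ \delta &= \mathrm{SB}^{-1}_{\mathrm{AES}}(10) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{1C}) = \mathrm{7C} \oplus \mathrm{C4} = \mathrm{B8}, \\ 3\delta &= \mathrm{SB}^{-1}_{\mathrm{AES}}(27) \oplus \mathrm{SB}^{-1}_{\mathrm{AES}}(28) = \mathrm{3D} \oplus \mathrm{EE} = \mathrm{D3}. \end{aligned}$$

According to Examples 1.5.17 and 1.5.18,

$$\mathrm{B8} \times 02 = 10111000 \times 02 = 01110000 \oplus \mathrm{1B} = \mathrm{6B}, \qquad \mathrm{B8} \times 03 = \mathrm{6B} \oplus \mathrm{B8} = \mathrm{D3}.$$

The other three columns of $S_9$ in Fig. 5.4 can provide similar results, reducing the key hypotheses for other key bytes of $K_{10}$. Consequently, with just one pair of correct and faulty ciphertexts, the key hypotheses for $K_{10}$ can be reduced to $2^{32}$ as opposed to the original $2^{128}$.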
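The filtering of key hypotheses for one column can be sketched in Python. The sketch makes several assumptions that should be stated plainly: the AES Sbox is generated from its algebraic definition rather than copied from Table 3.9, the helper names are illustrative, and the four ciphertext byte pairs are read directly from the matrices $c$ and $c'$ of Example 5.1.5, so the fourth pair is (30, 3F). The number of surviving hypotheses is printed rather than claimed.

```python
from collections import defaultdict

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def build_sbox():
    """Generate the AES Sbox: field inversion followed by the affine map."""
    rotl = lambda v, n: ((v << n) | (v >> (8 - n))) & 0xFF
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):              # x^254 is the inverse of x
                inv = gf_mul(inv, x)
        sbox.append(inv ^ rotl(inv, 1) ^ rotl(inv, 2) ^ rotl(inv, 3) ^ rotl(inv, 4) ^ 0x63)
    return sbox

SBOX = build_sbox()
INV_SBOX = [0] * 256
for value, image in enumerate(SBOX):
    INV_SBOX[image] = value

def keys_by_input_difference(c, c_faulty):
    """Map each Sbox input difference to the key bytes that produce it."""
    table = defaultdict(list)
    for k in range(256):
        table[INV_SBOX[c ^ k] ^ INV_SBOX[c_faulty ^ k]].append(k)
    return table

# (correct, faulty) bytes at positions 00, 13, 22, 31 and the MixColumns coefficients.
PAIRS = [(0x69, 0xE5), (0xB4, 0xDD), (0xB7, 0xBB), (0x30, 0x3F)]
COEFFS = [2, 1, 1, 3]
TABLES = [keys_by_input_difference(c, cf) for c, cf in PAIRS]

candidates = {}                               # delta -> four lists of key bytes
for delta in range(1, 256):
    key_lists = [TABLES[i].get(gf_mul(COEFFS[i], delta), []) for i in range(4)]
    if all(key_lists):                        # all four equations must be satisfiable
        candidates[delta] = key_lists

total = sum(len(a) * len(b) * len(c) * len(d) for a, b, c, d in candidates.values())
print(f"{len(candidates)} values of delta survive, {total} key hypotheses remain")
```

Repeating the same filtering for the other three columns of $S_9$ would cover the remaining twelve bytes of $K_{10}$.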
Fig. 5.5 Fault propagation for random byte fault injected in the “diagonals” of the cipher state at the end of round 7 (the figure shows the states $S_7$, $S_8$, and $S_9$)

We note that in this attack, we assume the attacker has the knowledge of the fault location (diagonal of the cipher state at the end of round 7), fault model (random byte), and output of AES (correct and faulty ciphertexts). Since the attack is on the diagonal of the cipher state, it is also called the diagonal DFA. Similar attacks can be carried out if the fault is injected in the other three “diagonals” of the cipher state at the end of round 7. The corresponding fault propagations are depicted in Fig. 5.5, where $S_i$ denotes the cipher state at the end of round i.

5.1.2 Statistical Fault Analysis

Statistical Fault Analysis (SFA) [FJLT13] assumes no knowledge of plaintext or correct ciphertext for the attacker. Only knowledge of faulty ciphertexts and a nonuniform fault model is required.
We will provide more details on the definition of a nonuniform fault model. We consider fault models that change an intermediate value x to $x'$. We can model these two intermediate values as random variables X and $X'$. Based on the fault properties, we can draw a table with probabilities for the value x to be changed to $x'$, i.e., $P(X' = x' \mid X = x)$. Such a table is called a fault distribution table. We say that the fault model is nonuniform if

$$P(X' = x' \mid X = x) \neq \frac{1}{2^b}$$

for some x and $x'$, where b is the bit length of x.


Example 5.1.6 Let us consider the case when x is just 1 bit. A stuck-at-0 fault changes x to 0 with probability 1. A bit-flip fault model changes x to $x \oplus 1$ with probability 1. A random fault changes x to $x \oplus 1$ with probability 0.5. The fault distribution tables for those three fault models are shown in Table 5.3. In this case, the bit length of x is 1, and a fault model is nonuniform if

$$P(X' = x' \mid X = x) \neq \frac{1}{2}$$

for some x and $x'$. Thus both stuck-at-0 and bit-flip fault models are nonuniform.

Table 5.3 Fault distribution tables for fault models: (a) stuck-at-0, (b) bit flip, and (c) random fault

x \ x' | (a) 0 | (a) 1 | (b) 0 | (b) 1 | (c) 0 | (c) 1
0 | 1 | 0 | 0 | 1 | 0.5 | 0.5
1 | 1 | 0 | 1 | 0 | 0.5 | 0.5

Example 5.1.7 We again consider the case when x is 1 bit. We discuss two more complicated nonuniform fault models. Stuck-at-0 with probability 0.5 changes x to 0 with probability 0.5. The corresponding fault distribution table is shown in Table 5.4 (a). Random-AND with $\delta$, where $\delta$ follows a uniform distribution, has the same fault distribution table. For example,

$$P(x' = 1 \mid x = 1) = P(\delta = 1) = 0.5.$$

Table 5.4 Fault distribution tables for fault models: (a) stuck-at-0 with probability 0.5 and (b) random-AND with $\delta$, where $\delta$ follows a uniform distribution

x \ x' | (a) 0 | (a) 1 | (b) 0 | (b) 1
0 | 1 | 0 | 1 | 0
1 | 0.5 | 0.5 | 0.5 | 0.5
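The 1-bit fault distribution tables of Examples 5.1.6 and 5.1.7 are small enough to write out directly. In the sketch below, a table is a list of rows indexed by x, each row holding the probabilities of x' = 0 and x' = 1; the names are illustrative.

```python
def is_nonuniform(table, bits=1):
    """A fault model is nonuniform if some entry differs from 1 / 2^bits."""
    uniform = 1 / 2 ** bits
    return any(p != uniform for row in table for p in row)

# table[x][x'] = P(X' = x' | X = x)
STUCK_AT_0 = [[1, 0], [1, 0]]
BIT_FLIP = [[0, 1], [1, 0]]
RANDOM_FAULT = [[0.5, 0.5], [0.5, 0.5]]
STUCK_AT_0_HALF = [[1, 0], [0.5, 0.5]]      # stuck-at-0 with probability 0.5

def random_and_table():
    """Random-AND: x' = x AND delta with delta uniform on {0, 1}."""
    table = [[0, 0], [0, 0]]
    for x in (0, 1):
        for delta in (0, 1):
            table[x][x & delta] += 0.5
    return table

for name, table in [("stuck-at-0", STUCK_AT_0), ("bit flip", BIT_FLIP),
                    ("random", RANDOM_FAULT), ("random-AND", random_and_table())]:
    print(f"{name}: nonuniform = {is_nonuniform(table)}")
```

Deriving the random-AND table by enumeration shows why it coincides with the table for stuck-at-0 with probability 0.5.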

5.1.2.1 SFA Attack on AES-128 Round 9

In this part, we will discuss an SFA attack on AES-128. We represent the cipher state at the end of round 9, $S_9$, the correct ciphertext c, and the last round key $K_{10}$ with the following matrices:

$$S_9 = \begin{pmatrix} s_{00} & s_{01} & s_{02} & s_{03} \\ s_{10} & s_{11} & s_{12} & s_{13} \\ s_{20} & s_{21} & s_{22} & s_{23} \\ s_{30} & s_{31} & s_{32} & s_{33} \end{pmatrix}, \quad c = \begin{pmatrix} c_{00} & c_{01} & c_{02} & c_{03} \\ c_{10} & c_{11} & c_{12} & c_{13} \\ c_{20} & c_{21} & c_{22} & c_{23} \\ c_{30} & c_{31} & c_{32} & c_{33} \end{pmatrix}, \quad K_{10} = \begin{pmatrix} k_{00} & k_{01} & k_{02} & k_{03} \\ k_{10} & k_{11} & k_{12} & k_{13} \\ k_{20} & k_{21} & k_{22} & k_{23} \\ k_{30} & k_{31} & k_{32} & k_{33} \end{pmatrix}.$$

According to the round operations, SubBytes, ShiftRows, and AddRoundKey, in round 10, we have

$$c_{00} = \mathrm{SB}_{\mathrm{AES}}(s_{00}) \oplus k_{00} \implies s_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{00} \oplus k_{00}). \tag{5.9}$$

We consider a fault in $s_{00}$ with a nonuniform fault model. Let $S_{00}$ and $S'_{00}$ denote the random variables corresponding to $s_{00}$ and its faulty value $s'_{00}$, respectively. Suppose the attacker has the knowledge of the fault location and the fault distribution table, i.e., the probabilities

$$P(S'_{00} = s'_{00} \mid S_{00} = s_{00})$$

for all $s_{00}$ and $s'_{00}$ from $\mathbb{F}_2^8$. The goal of the attacker is to recover $k_{00}$.
We assume $S_{00}$ follows a uniform distribution, i.e.,

$$P(S_{00} = s_{00}) = \frac{1}{256}, \quad \forall s_{00} \in \mathbb{F}_2^8.$$

Then by Lemma 1.7.2,

$$P(S'_{00} = s'_{00}) = \sum_{s_{00}=0}^{255} P(S'_{00} = s'_{00} \mid S_{00} = s_{00}) P(S_{00} = s_{00}) = \frac{1}{256} \sum_{s_{00}=0}^{255} P(S'_{00} = s'_{00} \mid S_{00} = s_{00}). \tag{5.10}$$

To carry out the attack,


{ the attacker injects } fault in .s00 above and collects a set
of m faulty ciphertexts . c'1 , c'2 , . . . , c'm . Let .k̂00 denote a key hypothesis for .k00 .
Then for each .c'i , we can compute a hypothetical value for .s00 ' using Eq. 5.9, denoted
i
.ŝ , as follows:
00

i
ŝ00
. = SB−1 'i
AES (c00 ⊕ k̂00 ). (5.11)

The probability that the faulty value of $s_{00}$ in the $i$th encryption equals $\hat{s}^{i}_{00}$ can be
found using the fault distribution table with Eq. 5.10:

$$P(S'_{00} = \hat{s}^{i}_{00}) = \frac{1}{256} \sum_{s_{00}=0}^{255} P(S'_{00} = \hat{s}^{i}_{00} \mid S_{00} = s_{00}).$$

Define $\ell(\hat{k}_{00})$ to be the probability that, for every $i$, the faulty value of $s_{00}$ in the
$i$th encryption equals the hypothetical value $\hat{s}^{i}_{00}$, i.e.,

$$\ell(\hat{k}_{00}) := \prod_{i=1}^{m} P(S'_{00} = \hat{s}^{i}_{00}). \tag{5.12}$$

Then the correct key can be found using the maximum likelihood approach:

$$k_{00} = \arg\max_{\hat{k}_{00}} \ell(\hat{k}_{00}).$$

Example 5.1.8 Let us consider a stuck-at-0 fault model, i.e.,

$$P(S'_{00} = s'_{00} \mid S_{00} = s_{00}) = \begin{cases} 1 & s'_{00} = 00 \\ 0 & \text{otherwise,} \end{cases}$$

for all $s_{00} \in \mathbb{F}_2^8$.

In this case, one faulty ciphertext is enough to recover $k_{00}$. Since the attacker
knows that the faulty value $s'_{00}$ is always 00, they can recover $k_{00}$ by computing

$$k_{00} = c'_{00} \oplus \mathrm{SB}_{\mathrm{AES}}(00) = c'_{00} \oplus 63.$$

Example 5.1.9 In this example, we consider a random-AND fault model such that

$$P(S'_{00} = s'_{00} \mid S_{00} = s_{00}, \delta) = \begin{cases} 1 & s'_{00} = s_{00} \text{ AND } \delta \\ 0 & \text{otherwise,} \end{cases}$$

where

$$P(\delta = x) = \frac{1}{256}, \quad \forall x \in \mathbb{F}_2^8.$$

By Eq. 5.10,

$$\begin{aligned}
P(S'_{00} = s'_{00}) &= \frac{1}{256} \sum_{s_{00}=0}^{255} P(S'_{00} = s'_{00} \mid S_{00} = s_{00}) \\
&= \frac{1}{256} \sum_{s_{00}=0}^{255} \left( \sum_{x=0}^{255} P(S'_{00} = s'_{00} \mid S_{00} = s_{00}, \delta = x)\, P(\delta = x) \right) \\
&= \frac{1}{256^2} \sum_{s_{00}=0}^{255} \left| \{ \delta \mid s'_{00} = \delta \text{ AND } s_{00} \} \right| \\
&= \frac{\left| \{ (\delta, s_{00}) \mid s'_{00} = \delta \text{ AND } s_{00},\ s_{00} \in \mathbb{F}_2^8,\ \delta \in \mathbb{F}_2^8 \} \right|}{256^2}
= \frac{3^{8 - \mathrm{wt}(s'_{00})}}{256^2},
\end{aligned} \tag{5.13}$$

where $\mathrm{wt}(s'_{00})$ denotes the Hamming weight of $s'_{00}$ (see Definition 1.6.10). To
derive the last equality, we note that if a bit of $s'_{00}$ is 0, then the corresponding
bits of $\delta$ and $s_{00}$ can each be 0 or 1 but not both 1, giving us three choices. If a bit
of $s'_{00}$ is 1, then the corresponding bits of $\delta$ and $s_{00}$ must both be 1.
Let $s_{00} = \mathrm{AB}$, $k_{00} = 00$. Then by Eq. 5.9 (see Table 3.9 for $\mathrm{SB}_{\mathrm{AES}}$)

$$c_{00} = \mathrm{SB}_{\mathrm{AES}}(s_{00}) \oplus k_{00} = 62.$$

Suppose five injected faults result in values of $\delta$ = 0F, F0, FF, 54, CD, respectively.
Then the corresponding faulty values $s^{\prime i}_{00}$ are 0B, A0, AB, 00, 89, and the
faulty ciphertext bytes $c^{\prime i}_{00}$ are 2B, E0, 62, 63, A7.
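Take $\hat{k}_{00} = \mathrm{1A}$, and by Eq. 5.11, we have (see Table 3.10 for $\mathrm{SB}^{-1}_{\mathrm{AES}}$)

$$\begin{aligned}
\hat{s}^{1}_{00} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{2B} \oplus \mathrm{1A}) = \mathrm{SB}^{-1}_{\mathrm{AES}}(31) = \mathrm{2E}, \\
\hat{s}^{2}_{00} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{E0} \oplus \mathrm{1A}) = \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{FA}) = 14, \\
\hat{s}^{3}_{00} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(62 \oplus \mathrm{1A}) = \mathrm{SB}^{-1}_{\mathrm{AES}}(78) = \mathrm{C1}, \\
\hat{s}^{4}_{00} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(63 \oplus \mathrm{1A}) = \mathrm{SB}^{-1}_{\mathrm{AES}}(79) = \mathrm{AF}, \\
\hat{s}^{5}_{00} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{A7} \oplus \mathrm{1A}) = \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{BD}) = \mathrm{CD}.
\end{aligned}$$

By Eqs. 5.12 and 5.13,

$$\begin{aligned}
\ell(\mathrm{1A}) &= \prod_{i=1}^{5} P(S'_{00} = \hat{s}^{i}_{00}) \\
&= P(S'_{00} = \mathrm{2E})\, P(S'_{00} = 14)\, P(S'_{00} = \mathrm{C1})\, P(S'_{00} = \mathrm{AF})\, P(S'_{00} = \mathrm{CD}) \\
&= \frac{1}{256^{10}} \times 3^{8 \times 5 - \mathrm{wt}(\mathrm{2E}) - \mathrm{wt}(14) - \mathrm{wt}(\mathrm{C1}) - \mathrm{wt}(\mathrm{AF}) - \mathrm{wt}(\mathrm{CD})} \\
&= \frac{1}{256^{10}} \times 3^{40 - 4 - 2 - 3 - 6 - 5} = \frac{3^{20}}{256^{10}}.
\end{aligned}$$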

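Taking $\hat{k}_{00} = 00$, we have

$$\begin{aligned}
\hat{s}^{1}_{00} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{2B}) = \mathrm{0B}, \\
\hat{s}^{2}_{00} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{E0}) = \mathrm{A0}, \\
\hat{s}^{3}_{00} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(62) = \mathrm{AB}, \\
\hat{s}^{4}_{00} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(63) = 00, \\
\hat{s}^{5}_{00} &= \mathrm{SB}^{-1}_{\mathrm{AES}}(\mathrm{A7}) = 89.
\end{aligned}$$

And

$$\begin{aligned}
\ell(00) &= \prod_{i=1}^{5} P(S'_{00} = \hat{s}^{i}_{00}) \\
&= P(S'_{00} = \mathrm{0B})\, P(S'_{00} = \mathrm{A0})\, P(S'_{00} = \mathrm{AB})\, P(S'_{00} = 00)\, P(S'_{00} = 89) \\
&= \frac{1}{256^{10}} \times 3^{8 \times 5 - \mathrm{wt}(\mathrm{0B}) - \mathrm{wt}(\mathrm{A0}) - \mathrm{wt}(\mathrm{AB}) - \mathrm{wt}(00) - \mathrm{wt}(89)} \\
&= \frac{1}{256^{10}} \times 3^{40 - 3 - 2 - 5 - 0 - 3} = \frac{3^{27}}{256^{10}}.
\end{aligned}$$

We can see that $\ell(00) > \ell(\mathrm{1A})$.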
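The likelihood computation of Example 5.1.9 can be sketched in Python as follows. This is a minimal illustration, not the implementation from [FJLT13]: the helper names (`gf_mul`, `build_sbox`, `log3_likelihood`) are hypothetical, and the AES Sbox is regenerated from its algebraic definition rather than copied from Table 3.9. Since every likelihood has the form $3^e / 256^{2m}$ under Eq. 5.13, the sketch compares only the exponent $e$.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox():
    """Generate the AES Sbox: field inversion followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):  # x^254 is the inverse of x in GF(2^8)
                inv = gf_mul(inv, x)
        y = inv
        for shift in range(1, 5):  # XOR of the four left rotations
            y ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(y ^ 0x63)
    return sbox


SBOX = build_sbox()
INV_SBOX = [0] * 256
for value, image in enumerate(SBOX):
    INV_SBOX[image] = value


def hamming_weight(x):
    return bin(x).count("1")


def log3_likelihood(key_guess, faulty_bytes):
    """Exponent e with l(key_guess) = 3^e / 256^(2m) for the random-AND model."""
    return sum(8 - hamming_weight(INV_SBOX[c ^ key_guess]) for c in faulty_bytes)


faulty_c00 = [0x2B, 0xE0, 0x62, 0x63, 0xA7]  # faulty ciphertext bytes of the example
ranking = sorted(range(256), key=lambda k: -log3_likelihood(k, faulty_c00))
print(log3_likelihood(0x00, faulty_c00), log3_likelihood(0x1A, faulty_c00))
```

The list `ranking` orders all 256 hypotheses by likelihood. With only five faulty bytes, other hypotheses may score as high as the correct one, so more faults may be needed before the top entry is reliable.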
It was shown in [FJLT13] that with high probability, the correct key byte can be
found with only a few faults. The same method can recover other bytes of .K10 . We
note that each byte can be recovered in parallel; hence the number of faults required
to get the full round key depends on the number of bytes that can be faulted with
one fault injection.
In case the attacker only knows that the fault model is nonuniform, without the
knowledge of its fault distribution table, a metric based on the Squared Euclidean
Imbalance (SEI) can be used. Define

$$\mathrm{SEI}(\hat{k}_{00}) := \sum_{j=0}^{255} \left( \frac{\left| \{ i \mid \hat{s}^{i}_{00} = j \} \right|}{m} - \frac{1}{256} \right)^2.$$

We can see that by definition, SEI measures a certain distance between the obtained
hypothetical distribution of $S'_{00}$ and the uniform distribution. Since we know that
the fault model is nonuniform, we expect the distribution induced by $S'_{00}$ to be far
from the uniform distribution. Thus we take the correct key to be

$$k_{00} = \arg\max_{\hat{k}_{00}} \mathrm{SEI}(\hat{k}_{00}).$$
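As a small illustration of the SEI metric, the following Python sketch computes it for a list of hypothetical byte values. The function name `sei` and the parameter `alphabet_size` are assumptions made for this example only.

```python
from collections import Counter


def sei(hypothetical_values, alphabet_size=256):
    """Squared Euclidean Imbalance of the empirical distribution of the values."""
    m = len(hypothetical_values)
    counts = Counter(hypothetical_values)
    return sum(
        (counts.get(j, 0) / m - 1 / alphabet_size) ** 2
        for j in range(alphabet_size)
    )


# A perfectly flat sample has SEI 0; a constant sample is maximally imbalanced.
print(sei(list(range(256))), sei([0, 0, 0, 0]))
```

In an attack, one would evaluate `sei` on the values $\hat{s}^{i}_{00}$ obtained for each key hypothesis and keep the hypothesis with the largest result.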

5.1.2.2 SFA on AES-128 Round 8

In this part, we consider the fault to be injected in the output of round 8, $S_8$. Similar
to before, we represent the cipher state at the end of round 8, $S_8$, the correct ciphertext
$c$, and the last round key $K_{10}$ with the following matrices:

$$S_8 = \begin{pmatrix} s_{00} & s_{01} & s_{02} & s_{03} \\ s_{10} & s_{11} & s_{12} & s_{13} \\ s_{20} & s_{21} & s_{22} & s_{23} \\ s_{30} & s_{31} & s_{32} & s_{33} \end{pmatrix}, \quad
c = \begin{pmatrix} c_{00} & c_{01} & c_{02} & c_{03} \\ c_{10} & c_{11} & c_{12} & c_{13} \\ c_{20} & c_{21} & c_{22} & c_{23} \\ c_{30} & c_{31} & c_{32} & c_{33} \end{pmatrix}, \quad
K_{10} = \begin{pmatrix} k_{00} & k_{01} & k_{02} & k_{03} \\ k_{10} & k_{11} & k_{12} & k_{13} \\ k_{20} & k_{21} & k_{22} & k_{23} \\ k_{30} & k_{31} & k_{32} & k_{33} \end{pmatrix}.$$

Fig. 5.6 Illustration of fault propagation for a fault injected in the first byte of $S_8$ (the cipher state
at the end of round 8)

We further represent the output of the InvMixColumns operation on the second last
round key, $K_9$, as follows:

$$\mathrm{InvMixColumns}(K_9) = \begin{pmatrix} a_{00} & a_{01} & a_{02} & a_{03} \\ a_{10} & a_{11} & a_{12} & a_{13} \\ a_{20} & a_{21} & a_{22} & a_{23} \\ a_{30} & a_{31} & a_{32} & a_{33} \end{pmatrix}.$$

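Suppose a fault is injected in $s_{00}$ with a nonuniform fault model. The fault
propagation is shown in Fig. 5.6.
We can see that $s_{00}$ is related to $c_{00}, c_{13}, c_{22}, c_{31}$ and $k_{00}, k_{13}, k_{22}, k_{31}$ as follows:

$$s_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}\Big(a_{00} \oplus \big[\mathrm{InvMixColumns}\big(\mathrm{SB}^{-1}_{\mathrm{AES}}(c_{00} \oplus k_{00}),\ \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{13} \oplus k_{13}),\ \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{22} \oplus k_{22}),\ \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{31} \oplus k_{31})\big)\big]_{0}\Big),$$

where InvMixColumns is applied to the first column and $[\cdot]_0$ denotes the first byte of the
resulting column. As discussed in Sect. 3.1.2, InvMixColumns computation is equivalent to
multiplication with the following matrix:

$$\begin{pmatrix} \mathrm{0E} & \mathrm{0B} & \mathrm{0D} & 09 \\ 09 & \mathrm{0E} & \mathrm{0B} & \mathrm{0D} \\ \mathrm{0D} & 09 & \mathrm{0E} & \mathrm{0B} \\ \mathrm{0B} & \mathrm{0D} & 09 & \mathrm{0E} \end{pmatrix}.$$

We have

$$s_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}\big(a_{00} \oplus \mathrm{0E} \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{00} \oplus k_{00}) \oplus \mathrm{0B} \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{13} \oplus k_{13})
\oplus \mathrm{0D} \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{22} \oplus k_{22}) \oplus 09 \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c_{31} \oplus k_{31})\big).$$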
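With a set of $m$ faulty ciphertexts $\{c^{\prime 1}, c^{\prime 2}, \ldots, c^{\prime m}\}$, the attacker can make
hypotheses on the values of $k_{00}, k_{13}, k_{22}, k_{31}$, and $a_{00}$, denoted by $\hat{k}_{00}, \hat{k}_{13}, \hat{k}_{22}, \hat{k}_{31}$,
and $\hat{a}_{00}$. Then they can compute the corresponding hypothetical values for $s'_{00}$,
denoted $\hat{s}^{i}_{00}$, as follows:

$$\hat{s}^{i}_{00} = \mathrm{SB}^{-1}_{\mathrm{AES}}\big(\hat{a}_{00} \oplus \mathrm{0E} \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c^{\prime i}_{00} \oplus \hat{k}_{00}) \oplus \mathrm{0B} \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c^{\prime i}_{13} \oplus \hat{k}_{13})
\oplus \mathrm{0D} \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c^{\prime i}_{22} \oplus \hat{k}_{22}) \oplus 09 \cdot \mathrm{SB}^{-1}_{\mathrm{AES}}(c^{\prime i}_{31} \oplus \hat{k}_{31})\big).$$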

The correct key bytes can be recovered with either maximum likelihood (when
the fault distribution table is known) or SEI (when the fault distribution table is
unknown) as discussed above.
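The computation of a hypothetical value for the faulted byte of $S_8$ can be sketched as follows. This is an illustrative sketch under stated assumptions: the function `hypothetical_s00` and its argument layout (tuples ordered as $(c_{00}, c_{13}, c_{22}, c_{31})$ and $(k_{00}, k_{13}, k_{22}, k_{31})$) are hypothetical, and the Sbox is regenerated algebraically instead of being read from Table 3.9.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox():
    """Generate the AES Sbox: field inversion followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):
                inv = gf_mul(inv, x)
        y = inv
        for shift in range(1, 5):
            y ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(y ^ 0x63)
    return sbox


SBOX = build_sbox()
INV_SBOX = [0] * 256
for value, image in enumerate(SBOX):
    INV_SBOX[image] = value

INV_MIX_ROW0 = (0x0E, 0x0B, 0x0D, 0x09)  # first row of the InvMixColumns matrix


def hypothetical_s00(c_bytes, key_guess, a00_guess):
    """Undo round 10 and round 9 for one byte, given the five hypotheses."""
    # Undo AddRoundKey and SubBytes of round 10 for the four ciphertext bytes.
    t = [INV_SBOX[c ^ k] for c, k in zip(c_bytes, key_guess)]
    # First byte of InvMixColumns applied to the recovered column.
    mixed = 0
    for coefficient, byte in zip(INV_MIX_ROW0, t):
        mixed ^= gf_mul(coefficient, byte)
    # Remove the key contribution a00 and undo SubBytes of round 9.
    return INV_SBOX[a00_guess ^ mixed]
```

Since five bytes are guessed jointly, an exhaustive search covers $2^{40}$ hypotheses; the sketch only shows the evaluation of a single hypothesis.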
We refer the reader to the original paper [FJLT13] for other methods of obtaining
the correct key hypothesis and attacks in even earlier rounds of AES.
The main advantage of SFA is that only faulty ciphertexts are required, and
there is no need for repeated plaintexts. However, the attack assumes that each fault
injection is successful.

5.1.3 Persistent Fault Analysis

Persistent Fault Analysis (PFA) [ZLZ+ 18] considers a fault in the memory, normally
where the Sbox lookup table is stored. As we do not expect the table to be rewritten
during the computation, the fault would stay until the device is reset; hence the name
“persistent” is used in the attack method.
We will use AES-128 as a running example to show how the attack works. The
methodology also applies to other block ciphers.
We consider a random byte fault model. Suppose the fault location is in the first
byte of the Sbox lookup table. Then the output $\mathrm{SB}_{\mathrm{AES}}(00) = v$ is changed to $v'$, and
the fault mask $\varepsilon \in \mathbb{F}_2^8$ is given by

$$\varepsilon = v \oplus v'.$$

By Table 3.9, we know that .v = 63. We assume the attacker has the knowledge of
the output of AES (correct and faulty ciphertexts), fault model, and fault location.
However, the attacker does not know the fault mask. The attacker aims to recover
the last round key .K10 .
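Recall that in round 10, the operations for AES encryption include SubBytes,
ShiftRows, and AddRoundKey. We represent the cipher state right before
AddRoundKey in round 10, denoted $S$, the ciphertext $c$, and $K_{10}$ with the following
matrices:

$$S = \begin{pmatrix} s_{00} & s_{01} & s_{02} & s_{03} \\ s_{10} & s_{11} & s_{12} & s_{13} \\ s_{20} & s_{21} & s_{22} & s_{23} \\ s_{30} & s_{31} & s_{32} & s_{33} \end{pmatrix}, \quad
c = \begin{pmatrix} c_{00} & c_{01} & c_{02} & c_{03} \\ c_{10} & c_{11} & c_{12} & c_{13} \\ c_{20} & c_{21} & c_{22} & c_{23} \\ c_{30} & c_{31} & c_{32} & c_{33} \end{pmatrix}, \quad
K_{10} = \begin{pmatrix} k_{00} & k_{01} & k_{02} & k_{03} \\ k_{10} & k_{11} & k_{12} & k_{13} \\ k_{20} & k_{21} & k_{22} & k_{23} \\ k_{30} & k_{31} & k_{32} & k_{33} \end{pmatrix}.$$

We have $c_{00} = s_{00} \oplus k_{00}$.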


Since the fault affects the encryption starting from the first round, here we only
consider the values of S and c with fault present in the Sbox lookup table and do not
look into the original values of S and c. We omit the superscript .' in .S ' and .c' which
were used to indicate the faulty value in our previous discussions. We note that it
is possible that for some encryptions the fault does not affect the result, e.g. when
00 is never used as an input for the Sbox computations. However, such cases do not
affect the attack method.
Let $S_{00}$ denote the random variable corresponding to the value of $s_{00}$. Due to the
diffusion and confusion layers in AES, it is reasonable to assume

$$P(S_{00} = s_{00}) \begin{cases} \approx \frac{2}{256} & s_{00} = v \oplus \varepsilon \\ = 0 & s_{00} = v \\ \approx \frac{1}{256} & \text{otherwise.} \end{cases}$$

Since $c_{00} = s_{00} \oplus k_{00}$, given $c_{00}$, we know

$$k_{00} \neq c_{00} \oplus v.$$

Thus the attacker can collect a set of $m$ (faulty) ciphertexts $\{c^{1}, c^{2}, \ldots, c^{m}\}$ and
eliminate key hypotheses for $k_{00}$ that are equal to

$$c^{i}_{00} \oplus v.$$

Example 5.1.10 Let the key byte $k_{00} = 45$ and the fault mask $\varepsilon = 12$. Then the
faulty output of the AES Sbox for input 00 becomes

$$63 \oplus 12 = 71.$$

In this case, no matter what input the AES Sbox gets during the computations, the
output will never be 63. In particular, we have $s_{00} \neq 63$. Equivalently,

$$k_{00} \oplus 63 = 45 \oplus 63 = 26$$

would never appear in the first byte of the ciphertexts. Otherwise, we would have

$$s_{00} = 26 \oplus k_{00} = 26 \oplus 45 = 63,$$

which is not possible.

Suppose the attacker collects the following values for $c_{00}$ with the fault present
in the Sbox lookup table: 00, 12, FE. Then the attacker can eliminate

$$00 \oplus 63 = 63, \quad 12 \oplus 63 = 71, \quad \mathrm{FE} \oplus 63 = \mathrm{9D}$$

from the key hypotheses.
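The elimination step of Example 5.1.10 can be sketched in a few lines of Python. The function name `eliminate_hypotheses` is an assumption for illustration; the constant 63 is the original Sbox output $v$ for input 00 stated above.

```python
V = 0x63  # original output of the AES Sbox for input 00


def eliminate_hypotheses(c00_values, v=V):
    """Return the key hypotheses for k00 that survive the observed ciphertext bytes."""
    candidates = set(range(256))
    for c00 in c00_values:
        candidates.discard(c00 ^ v)  # k00 = c00 XOR v is impossible
    return candidates


remaining = eliminate_hypotheses([0x00, 0x12, 0xFE])
print(len(remaining))  # three hypotheses are removed
```

Each distinct ciphertext byte removes one hypothesis, so the key byte is determined once 255 distinct values of $c_{00}$ have been observed.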


Next, we consider the case when the attacker has the knowledge of the fault mask
$\varepsilon$. Let $Y$ denote the random variable corresponding to the value of $c_{00}$, and we have

$$P(Y = c_{00}) \begin{cases} \approx \frac{2}{256} & c_{00} = v \oplus \varepsilon \oplus k_{00} \\ = 0 & c_{00} = v \oplus k_{00} \\ \approx \frac{1}{256} & \text{otherwise.} \end{cases}$$

Given a set of ciphertexts $\{c^{1}, c^{2}, \ldots, c^{m}\}$, if we look at the first byte of those
ciphertexts, we expect $v \oplus \varepsilon \oplus k_{00}$ to appear with the highest frequency. Thus the
attacker computes

$$y_{\max} := \arg\max_{y} \left| \{ i \mid c^{i}_{00} = y \} \right|.$$

Then a candidate for the correct key is given by $y_{\max} \oplus v \oplus \varepsilon$.


Alternatively, we can also calculate the empirical probabilities for each $y$,
denoted $\hat{P}(Y = y)$, as follows:

$$\hat{P}(Y = y) = \frac{\left| \{ i \mid c^{i}_{00} = y \} \right|}{m}.$$

For a large enough sample, we expect

$$\hat{P}(Y = y_{\max}) \approx \frac{2}{256}.$$

The simulated results in [ZLZ+ 18] show that with about 4000 faulty ciphertexts, the
empirical probability of $y_{\max}$ is high enough to be distinguished from that of other
values of $y$.
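When the fault mask is known, the frequency-based recovery can be sketched as below. The function names are hypothetical, and the sample in the usage lines is an idealized one in which every Sbox input occurs exactly once, which is an assumption made so that the outcome is deterministic.

```python
from collections import Counter


def empirical_probabilities(c00_values):
    """Empirical distribution of the first ciphertext byte."""
    m = len(c00_values)
    return {y: count / m for y, count in Counter(c00_values).items()}


def recover_key_byte(c00_values, eps, v=0x63):
    """Key candidate y_max XOR v XOR eps, where y_max is the most frequent byte."""
    y_max, _ = Counter(c00_values).most_common(1)[0]
    return y_max ^ v ^ eps


# Idealized sample: faulty Sbox outputs for all 256 inputs, masked with k00 = 45.
k00, eps = 0x45, 0x12
sample = [x ^ k00 for x in range(256) if x != 0x63] + [0x63 ^ eps ^ k00]
print(hex(recover_key_byte(sample, eps)))
```

With real measurements the counts are noisy, which is why a few thousand ciphertexts are reported to be needed before the most frequent value stands out.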

5.1.4 Implementation-Specific Fault Attack

In this subsection, we discuss an implementation-specific DFA attack on
PRESENT [BHL18]. We consider the implementation of PRESENT with the
second method discussed in Sect. 3.2.2.1 (Algorithm 3.5). Recall that the method
combines sBoxLayer and pLayer by using four $8 \times 8$ tables: Table 1, Table
2, Table 3, and Table 4. Each table extracts certain bits of the cipher state, and
the final output will be obtained by combining those bits using bitwise OR. The
particular implementation we target is from [PV13, AV13], which was written in
AVR assembly (see [Atm16] for 8-bit AVR Instruction Set Manual). Part of the
implementation is listed in Algorithm 5.1.

Algorithm 5.1: Part of an implementation for PRESENT encryption that combines
sBoxLayer and pLayer in AVR assembly [PV13, AV13]. A pseudocode
can be found in Algorithm 3.5

 1  ...
 2  ldi ZH, 0x06     // load Table one address
 3  mov ZL, r0       // r0 contains input to Table one
    // Look up program memory at address Z and store the result in r21. This is
    // equivalent to storing the Table one output corresponding to the input
    // from r0 in r21. Thus in this step, we look up bits
    // 0, 1, 16, 17, 32, 33, 48, 49.
 4  lpm r21, Z
 5  andi r21, 0xC0   // extract output bits 0, 1
 6  ...              // load Table two address
 7  lpm r23, Z       // look up Table two (bits 50, 51, 2, 3, 18, 19, 34, 35) and store in r23
 8  andi r23, 0x30   // extract output bits 2, 3
 9  or r21, r23      // combine bits 0, 1, 2, 3
10  ...              // look up Table three (bits 36, 37, 52, 53, 4, 5, 20, 21) and store in r23
11  andi r23, 0x0C   // extract output bits 4, 5
12  or r21, r23      // combine bits 0, 1, 2, 3, 4, 5
13  ...              // look up Table four (bits 22, 23, 38, 39, 54, 55, 6, 7) and store in r23
14  andi r23, 0x03   // extract output bits 6, 7
15  or r21, r23      // combine bits 0, 1, 2, 3, 4, 5, 6, 7
16  ...

Line 2 loads the address of Table 1, line 3 places the table input from r0 into the
address register, and line 4 looks up the table and stores the output in register r21.
These three lines implement line 1 in Algorithm 3.5. Afterward, in line 5, the
leftmost 2 bits are extracted from r21 and stored in r21. This corresponds to line 5
in Algorithm 3.5. As we have explained in Sect. 3.2.2.1, these 2 bits correspond to
bits at positions 0 and 1 of pLayer output. Similarly, lines 6–8 extract the bits at
positions 2 and 3 of pLayer output using Table 2 and store them in r23. Then line 9
combines bits 0, 1, 2, and 3 with bitwise OR. The implementation then continues to
extract pLayer output bits at positions 4 and 5 with Table 3 and at positions 6 and 7
with Table 4. Those bits are all combined through bitwise OR into register r21
(lines 12 and 15).
The fault attack on this implementation injects a fault in register r23 between
lines 14 and 15 in the final round of PRESENT. The fault model used is a bit flip.
Recall that in this round of PRESENT, we have operations addRoundKey,
sBoxLayer, and pLayer, which will be followed by another addRoundKey (see
Sect. 3.1.3). Let $\kappa_7 \kappa_6 \ldots \kappa_1 \kappa_0$ denote the 0th byte of the round key $K_{32}$, which is
the round key used right before outputting the ciphertext. Let $b_7 b_6 b_5 \ldots b_0$ denote
the intermediate value contained in register r21 right after line 15. Then the 0th byte
of the correct ciphertext, denoted $c_7 c_6 c_5 \ldots c_0$, is given by

$$c_7 c_6 c_5 \ldots c_0 = b_7 b_6 b_5 \ldots b_0 \oplus \kappa_7 \kappa_6 \ldots \kappa_1 \kappa_0.$$

And the value in register r23 between lines 14 and 15 is $000000b_1 b_0$.

We assume the attacker has the knowledge of the outputs of PRESENT (ciphertext)
and can repeat the computation with the same plaintext (not chosen by
the attacker). Furthermore, we consider a relatively strong attacker model where
the attacker can choose the fault mask $\varepsilon$. In particular, we let $\varepsilon = 11111100$.
Consequently, we inject a 6-bit flip in register r23 between lines 14 and 15. The
faulty value in register r23 will be

$$000000b_1 b_0 \oplus \varepsilon = 000000b_1 b_0 \oplus 11111100 = 111111b_1 b_0.$$

After line 15, the value in register r21 will then become $111111b_1 b_0$. And the
faulty ciphertext byte $c'_7 c'_6 c'_5 \ldots c'_0$ is given by

$$c'_7 c'_6 c'_5 \ldots c'_0 = 111111b_1 b_0 \oplus \kappa_7 \kappa_6 \ldots \kappa_1 \kappa_0.$$

Since the faulty ciphertext byte $c'_7 c'_6 c'_5 \ldots c'_0$ is known, the attacker can recover 6
bits of $K_{32}$ by computing

$$\kappa_7 \kappa_6 \kappa_5 \kappa_4 \kappa_3 \kappa_2 = c'_7 c'_6 c'_5 c'_4 c'_3 c'_2 \oplus 111111.$$

Similar methods can be used to recover all other bits of $K_{32}$.
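The effect of the 6-bit flip can be simulated with a small Python sketch. It models only line 15 of Algorithm 5.1 and the final addRoundKey on one byte; the function names and the example register values are assumptions chosen for illustration.

```python
FAULT_MASK = 0b11111100  # epsilon chosen by the attacker


def faulty_ciphertext_byte(r21_before_or, r23, key_byte, fault_mask=FAULT_MASK):
    """Model line 15 (or r21, r23) with r23 faulted, followed by addRoundKey."""
    r23 ^= fault_mask            # bit flips injected between lines 14 and 15
    r21 = r21_before_or | r23    # upper six bits are forced to 1
    return r21 ^ key_byte


def recover_upper_key_bits(faulty_byte):
    """Recover kappa_7 ... kappa_2 from the faulty ciphertext byte."""
    return (faulty_byte >> 2) ^ 0b111111


# Hypothetical register contents: r21 holds b7..b2, r23 holds 000000 b1 b0.
c_faulty = faulty_ciphertext_byte(0b10100100, 0b00000010, key_byte=0x5B)
print(bin(recover_upper_key_bits(c_faulty)))
```

The recovered value equals the six most significant bits of the key byte regardless of the intermediate value, because the OR with the faulted register erases $b_7 \ldots b_2$.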


We note that this attack is specific to the implementation considered. It shows
that even with theoretically secure countermeasures in place, the programmer should
verify its implementation, for example, by using an automated tool (see [BHL18,
HBZL19, HSP20] for automated evaluation of SW implementations and [BGE+ 17,
PGP+ 19] for HW implementations).

5.2 Fault Countermeasures for Symmetric Block Ciphers

A simple countermeasure one might consider to protect against certain fault attacks
would be to repeat the encryption, compare the two outputs, and only return the
ciphertext if those two outputs are equal [BECN+ 06]. For example, for DFA
attacks described in Sect. 5.1.1, such a countermeasure will be successful since those
attacks require the knowledge of the faulty ciphertext. However, an easy attack on
this countermeasure would be fault injection on both encryption computations or
skipping the instruction for checking the outputs [SHS16].
In this section, we will discuss in detail two more sophisticated countermeasures
against fault attacks on symmetric block ciphers.

5.2.1 Encoding-Based Countermeasure

We recall from Definition 1.6.3 that the (minimum) distance of a binary code $C$,
denoted $\mathrm{dis}(C)$, is given by

$$\mathrm{dis}(C) = \min \{ \mathrm{dis}(c_1, c_2) \mid c_1, c_2 \in C,\ c_1 \neq c_2 \}.$$

We have seen that a binary code with minimum distance $\mathrm{dis}(C)$ can detect up to
$\mathrm{dis}(C) - 1$ bit flips (see Definition 1.6.5 and Theorem 1.6.1). Thus a natural choice
for fault countermeasure is to consider encoding the intermediate values during the
computation. The question is, which code to choose and how to implement it?
As an example of what kind of code to use, we will discuss one proposal of
using anticode (see Definition 1.6.12) for the countermeasure against bit flips and
instruction skips [BHL19]. Recall that a binary $(n, M, d, \delta)$-anticode has length $n$,
cardinality $M$, minimum distance $d$, and maximum distance $\delta$, where the maximum
distance of a binary code $C$ (see Definition 1.6.12) is given by

$$\mathrm{maxdis}(C) = \max \{ \mathrm{dis}(c_1, c_2) \mid c_1, c_2 \in C \}.$$

For example, $\{10, 01\}$ is a binary $(2, 2, 2, 2)$-anticode.


Following Kerckhoffs’ principle (see Definition 2.1.3), we assume the code used
for the countermeasure is public. In particular, the attacker has the knowledge of all
the codewords and how the information is encoded.
Intuitively, if the minimum distance of the code is too small, the
code cannot detect a large number of bit flips. On the other hand, let us consider a
code of length $n$ and size $M$ that contains at least two codewords, say $c_1, c_2$, with
$\mathrm{dis}(c_1, c_2) = n$. If an $n$-bit flip is injected when $c_1$ or $c_2$ is used for the computation,
then the resulting faulty value is still a codeword and cannot be detected. Since there
are in total $M$ codewords, the probability for the fault to go undetected is at least
$2/M$. Thus, a very big maximum distance is also not desirable.
We refer the reader to the original paper [BHL19] for the formalization of
encoding-based countermeasures for symmetric block ciphers and calculations of
the probability of detecting any $m$-bit flips and instruction skips given a binary code.
The authors also provide a theoretical analysis which concludes that to have overall
good protection against all possible bit flips, it is better to use a code whose
minimum distance is not too small and whose maximum distance is not too big.
Such an observation leads us to the notion of anticode (see Definition 1.6.12).
The paper [BHL19] also demonstrated the effectiveness of using anticodes with
simulated results.
In the rest of this part, we would like to focus on how encoding countermeasures
can be implemented in software for PRESENT encryption. The implementation we
present has the following properties:
• Each operation is implemented as a table lookup from memory.
• Before the table lookup, the destination register of an operation is precharged to 0.
• When any of the inputs is 0, the output is 0.
• When an error is detected, the output is 0 (error message).
Furthermore, we also assume the registers are precharged to 0 before the program
starts, and this process cannot be faulted. Such a design can protect the
implementation from single instruction skips.
For example, Algorithm 5.2 implements the computation of a binary operation
through a table lookup. The two inputs a and b are loaded to registers r0 and r1
in instructions 1 and 2. The binary operation is computed by table lookup, and the
result is stored in register r2 (instruction 4). Note that instruction 3 puts the error
message 0 in r2 before it is used. Since the registers are supposed to be precharged
to 0, skipping instruction 1 or 2 will result in the input of the table lookup being 0,
and by our design, the final output will be 0. Skipping instruction 3 will not change
the output or the program flow. Skipping instruction 4 will make the final output
0. Thus a single instruction skip of any instruction of Algorithm 5.2 will either
make no change to the output or result in outputting 0, which is the error message.

Algorithm 5.2: A simple program to demonstrate protection against single
instruction skip attacks

1  LDI r0 a       // load input a
2  LDI r1 b       // load input b
3  EOR r2 r2      // precharge register r2 to zero
4  LPM r2 r0 r1   // execution of an operation by table lookup
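The claim that any single instruction skip in Algorithm 5.2 either leaves the output unchanged or yields the error message 0 can be checked with a short simulation. The sketch below is an assumption-laden model, not AVR code: registers are a Python dictionary, the operation is an encoded XOR over a toy 2-bit encoding (01 for 0, 10 for 1), and any lookup with a non-codeword input returns 0.

```python
CODE = {0: 0b01, 1: 0b10}  # toy encoding used only for this illustration
XOR_TABLE = {(CODE[a], CODE[b]): CODE[a ^ b] for a in (0, 1) for b in (0, 1)}


def run(a, b, skip=None):
    """Execute the four instructions, optionally skipping instruction `skip` (1-4)."""
    regs = {"r0": 0, "r1": 0, "r2": 0}  # registers are precharged to 0
    program = [
        lambda: regs.update(r0=a),   # 1: load input a
        lambda: regs.update(r1=b),   # 2: load input b
        lambda: regs.update(r2=0),   # 3: precharge the destination register
        # 4: table lookup; inputs that are not codewords give the error value 0
        lambda: regs.update(r2=XOR_TABLE.get((regs["r0"], regs["r1"]), 0)),
    ]
    for index, instruction in enumerate(program, start=1):
        if index != skip:
            instruction()
    return regs["r2"]


for skipped in (1, 2, 3, 4):
    print(skipped, run(CODE[0], CODE[1], skip=skipped))
```

Skipping instruction 3 has no visible effect here precisely because the registers start at 0; without that assumption, a stale value in r2 could survive a skipped lookup.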

Clearly, we need to choose codes that do not contain 0 as a codeword. Of course,
the error message can be changed to a different value, allowing the usage of codes
containing 0, e.g., linear codes (see Definition 1.6.7). But for the implementation
technique we are going to discuss, the structure of linear codes is not important.
In case the fault changes some encoded intermediate value to a word that is
not a codeword, the table lookup will produce 0, which indicates an error. In the
subsequent instructions, when the input of a table is 0, the output will always be 0
since 0 is not a codeword. In such cases, we say that the fault is detected. Otherwise,
when a successful fault injection does not result in 0 output, we say the fault is
undetected.
Table 5.5 Lookup table for carrying out XOR between $a, b$ ($a, b \in \mathbb{F}_2$) using 01 as the
codeword for 0 and 10 as the codeword for 1

        00   01   10   11
  00    00   00   00   00
  01    00   01   10   00
  10    00   10   01   00
  11    00   00   00   00

Example 5.2.1 As a simple example, let us consider $\{01, 10\}$, a binary $(2, 2, 2, 2)$-anticode.
Since there are two codewords, it can be used to encode 1 bit of
information. Let 01 be the codeword for 0 and 10 be the codeword for 1. The
lookup table for carrying out XOR between $a, b$ ($a, b \in \mathbb{F}_2$) is shown in Table 5.5.
As mentioned before, 00 indicates an error. Thus the table outputs 00 if either input is
not a codeword.
Example 5.2.2 Let us consider bit flip attacks on the inputs of XOR operation from
Example 5.2.1. We can see that any 1-bit flip will be detected: If the fault is injected
in input 01, with 1-bit flip, we get either 00 or 11; both will give output 00. Similarly,
if 1-bit flip is injected in input 10, we will have 00 or 11, and the output will again
be 00.
On the other hand, a 2-bit flip will be undetected. For example, suppose we would
like to compute .0 ⊕ 0. Then the inputs for the table lookup will be 01 and 01, and
the output will be 01, which corresponds to 0. If a 2-bit flip is injected in the first
input, we get 10 and 01 for the table lookup. The result will be 10. Such a fault will
not be detected and can successfully change the output of the operation.
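The observations of Example 5.2.2 can be reproduced with the following sketch, which rebuilds Table 5.5 as a function and enumerates bit flips on the first input. The names `encoded_xor` and `undetected` are hypothetical.

```python
from itertools import product

CODEWORDS = {0b01: 0, 0b10: 1}                      # codeword -> encoded bit
ENCODE = {bit: word for word, bit in CODEWORDS.items()}


def encoded_xor(a, b):
    """Table 5.5: XOR on codewords, 00 if either input is not a codeword."""
    if a not in CODEWORDS or b not in CODEWORDS:
        return 0b00
    return ENCODE[CODEWORDS[a] ^ CODEWORDS[b]]


def undetected(flip_mask):
    """Valid input pairs for which flipping `flip_mask` in the first input goes unnoticed."""
    return [
        (a, b)
        for a, b in product(CODEWORDS, repeat=2)
        if encoded_xor(a ^ flip_mask, b) != 0b00
    ]


print(undetected(0b01), undetected(0b10), undetected(0b11))
```

The two 1-bit masks produce empty lists, while the 2-bit mask 11 is undetected for all four input pairs, in line with the maximum distance 2 of the code.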
We recall the notion of Quotient group and Remainder group for PRESENT
Sboxes from Sect. 4.5.2.3. We have discussed that pLayer can be considered as four
identical parallel bitwise operations where each is a function $p : \mathbb{F}_2^{16} \to \mathbb{F}_2^{16}$ that
takes one Quotient group output and permutes it to the corresponding Remainder
group input. Furthermore, we have seen in Sect. 3.1.3 that addRoundKey is a
function $\mathbb{F}_2^{64} \to \mathbb{F}_2^{64}$. Each Sbox in the sBoxLayer is a function $\mathrm{SB} : \mathbb{F}_2^{4} \to \mathbb{F}_2^{4}$.
Thus, one convenient code choice would be those with cardinality 16, encoding 4
bits of information. In particular, we are looking for a binary $(n, 16, d, \delta)$-anticode,
where $d$ is the minimum distance of the code and $\delta$ is the maximum distance of the
code.
We refer the readers to [BHL19] for an algorithm for finding anticodes that
achieve a low probability of undetected faults with given length, minimum distance,
and maximum distance. In the rest of this subsection, we will use the following
binary $(8, 16, 2, 7)$-anticode as a running example

$$\{01, 08, 02, \mathrm{0B}, 04, \mathrm{1D}, \mathrm{1E}, 30, 07, 65, \mathrm{6A}, \mathrm{AD}, \mathrm{B3}, \mathrm{CE}, \mathrm{D9}, \mathrm{F6}\}. \tag{5.14}$$

In particular, 01 is the codeword for 0000, 08 is the codeword for 0001, etc. And
we write

$$01 = \mathrm{encode}(0000).$$
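The parameters of the anticode in Eq. 5.14 can be verified by computing all pairwise Hamming distances. The sketch below assumes the codewords are listed in the order of the values they encode (0000, 0001, and so on), as stated in the text; the function names are illustrative.

```python
from itertools import combinations

# Codewords of Eq. 5.14; the list index is the 4-bit value being encoded.
ANTICODE = [0x01, 0x08, 0x02, 0x0B, 0x04, 0x1D, 0x1E, 0x30,
            0x07, 0x65, 0x6A, 0xAD, 0xB3, 0xCE, 0xD9, 0xF6]


def distance(a, b):
    """Hamming distance between two words."""
    return bin(a ^ b).count("1")


def encode(nibble):
    return ANTICODE[nibble]


distances = [distance(a, b) for a, b in combinations(ANTICODE, 2)]
print(len(ANTICODE), min(distances), max(distances))
```

Because no two codewords are at distance 8, an 8-bit flip always turns a codeword into a non-codeword and is therefore detected.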

Given an anticode $C$, the addRoundKey operation can be implemented using an
XOR table similar to the one shown in Example 5.2.1. The size of the table will be
$2^8 \times 2^8$. Let $\tilde{\oplus}$ denote this table lookup operation.

Example 5.2.3 Using the anticode given in Eq. 5.14, the table entry corresponding
to 01 and 08 will be

$$\mathrm{encode}(0000 \oplus 0001) = \mathrm{encode}(0001) = 08.$$

And we write

$$01 \mathbin{\tilde{\oplus}} 08 = 08.$$

The implementation of sBoxLayer and pLayer is based on four $16 \times 64$ lookup
tables, $T0, T1, T2, T3$. Let $x = x_3 x_2 x_1 x_0$ be an element in $\mathbb{F}_2^4$. We write

$$\mathrm{SB}(x_3 x_2 x_1 x_0) = x_3^s x_2^s x_1^s x_0^s.$$

Example 5.2.4 Take $\mathrm{D} = 1101$, then

$$x_3 = 1, \quad x_2 = 1, \quad x_1 = 0, \quad x_0 = 1.$$

Since $\mathrm{SB}(\mathrm{D}) = 7 = 0111$ (see Table 3.11), we have

$$x_3^s = 0, \quad x_2^s = 1, \quad x_1^s = 1, \quad x_0^s = 1.$$

The design of tables $T0$, $T1$, $T2$, and $T3$ is as follows:

$$\begin{aligned}
T0 : C &\to C \times C \times C \times C \\
\mathrm{encode}(x_3 x_2 x_1 x_0) &\mapsto \mathrm{encode}(000x_3^s),\ \mathrm{encode}(000x_2^s),\ \mathrm{encode}(000x_1^s),\ \mathrm{encode}(000x_0^s) \\[4pt]
T1 : C &\to C \times C \times C \times C \\
\mathrm{encode}(x_3 x_2 x_1 x_0) &\mapsto \mathrm{encode}(00x_3^s 0),\ \mathrm{encode}(00x_2^s 0),\ \mathrm{encode}(00x_1^s 0),\ \mathrm{encode}(00x_0^s 0) \\[4pt]
T2 : C &\to C \times C \times C \times C \\
\mathrm{encode}(x_3 x_2 x_1 x_0) &\mapsto \mathrm{encode}(0x_3^s 00),\ \mathrm{encode}(0x_2^s 00),\ \mathrm{encode}(0x_1^s 00),\ \mathrm{encode}(0x_0^s 00) \\[4pt]
T3 : C &\to C \times C \times C \times C \\
\mathrm{encode}(x_3 x_2 x_1 x_0) &\mapsto \mathrm{encode}(x_3^s 000),\ \mathrm{encode}(x_2^s 000),\ \mathrm{encode}(x_1^s 000),\ \mathrm{encode}(x_0^s 000).
\end{aligned}$$

Thus, each table extracts the bits of the Sbox output, permutes them, and outputs
the corresponding codeword. It is easy to see that each entry of the output of a
table is either encode(0000) or one other codeword: encode(0001) for $T0$,
encode(0010) for $T1$, encode(0100) for $T2$, and encode(1000) for $T3$.
Example 5.2.5 Suppose the input is $01 = \mathrm{encode}(0000)$. The corresponding Sbox
output would be $\mathrm{C} = 1100$ (see Table 3.11), i.e., $x_3^s x_2^s x_1^s x_0^s = 1100$. Using the anticode
given in Eq. 5.14, the output of $T0$ will be

encode(0001) = 08, encode(0001) = 08, encode(0000) = 01, encode(0000) = 01.

The output of $T1$ is

encode(0010) = 02, encode(0010) = 02, encode(0000) = 01, encode(0000) = 01.

$T2$ gives

encode(0100) = 04, encode(0100) = 04, encode(0000) = 01, encode(0000) = 01.

Finally, $T3$ produces

encode(1000) = 07, encode(1000) = 07, encode(0000) = 01, encode(0000) = 01.

Example 5.2.6 Suppose the input is $08 = \mathrm{encode}(0001)$. The corresponding Sbox
output would be $5 = 0101$, i.e., $x_3^s x_2^s x_1^s x_0^s = 0101$. Using the anticode given in Eq. 5.14,
the output of $T0$ will be

encode(0000) = 01, encode(0001) = 08, encode(0000) = 01, encode(0001) = 08.

The output of $T1$ is

encode(0000) = 01, encode(0010) = 02, encode(0000) = 01, encode(0010) = 02.

$T2$ gives

encode(0000) = 01, encode(0100) = 04, encode(0000) = 01, encode(0100) = 04.

Finally, $T3$ produces

encode(0000) = 01, encode(1000) = 07, encode(0000) = 01, encode(1000) = 07.

Now, let the original cipher state at sBoxLayer input be $b_{63} b_{62} \ldots b_0$. For the
encoding-based implementation, the corresponding cipher state will be

$$\mathrm{encode}(b_{63} b_{62} b_{61} b_{60})\ \mathrm{encode}(b_{59} b_{58} b_{57} b_{56})\ \ldots\ \mathrm{encode}(b_7 b_6 b_5 b_4)\ \mathrm{encode}(b_3 b_2 b_1 b_0).$$

Each codeword in this cipher state will be passed to tables $T0, T1, T2, T3$, and the
outputs will be recorded. Then the output of pLayer will be computed by combining
those table outputs through $\tilde{\oplus}$.
Example 5.2.7 By Table 3.12, the pLayer output bits at positions 0, 1, 2, 3 come
from the bits at positions 0, 4, 8, 12 of the input of pLayer. Thus, we first get
$\mathrm{encode}(000b_0^s)$ from $T0$ output, $\mathrm{encode}(00b_4^s 0)$ from $T1$, $\mathrm{encode}(0b_8^s 00)$ from $T2$,
and $\mathrm{encode}(b_{12}^s 000)$ from $T3$, and then the 0th nibble of pLayer output will be

$$\mathrm{encode}(000b_0^s) \mathbin{\tilde{\oplus}} \mathrm{encode}(00b_4^s 0) \mathbin{\tilde{\oplus}} \mathrm{encode}(0b_8^s 00) \mathbin{\tilde{\oplus}} \mathrm{encode}(b_{12}^s 000).$$

As another example, the 4th nibble (bits 16, 17, 18, 19) of pLayer output is given by

$$\mathrm{encode}(000b_1^s) \mathbin{\tilde{\oplus}} \mathrm{encode}(00b_5^s 0) \mathbin{\tilde{\oplus}} \mathrm{encode}(0b_9^s 00) \mathbin{\tilde{\oplus}} \mathrm{encode}(b_{13}^s 000).$$
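The tables $T0$ to $T3$ and the combination step of Example 5.2.7 can be sketched as follows. This is a minimal illustration under stated assumptions: the tables are Python dictionaries rather than memory arrays, non-codeword inputs map to the error value 0, and the helper names (`build_table`, `lookup`, `player_nibble0`) are hypothetical. The PRESENT Sbox values are the standard ones (Table 3.11).

```python
ANTICODE = [0x01, 0x08, 0x02, 0x0B, 0x04, 0x1D, 0x1E, 0x30,
            0x07, 0x65, 0x6A, 0xAD, 0xB3, 0xCE, 0xD9, 0xF6]
DECODE = {word: nibble for nibble, word in enumerate(ANTICODE)}
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def encode(nibble):
    return ANTICODE[nibble]


def encoded_xor(a, b):
    """Encoded XOR table lookup; returns the error value 0 for non-codewords."""
    if a not in DECODE or b not in DECODE:
        return 0
    return encode(DECODE[a] ^ DECODE[b])


def build_table(j):
    """T_j maps encode(x) to the four codewords encode(x_i^s << j), i = 3, 2, 1, 0."""
    table = {}
    for x in range(16):
        y = PRESENT_SBOX[x]
        bits = [(y >> i) & 1 for i in (3, 2, 1, 0)]
        table[encode(x)] = tuple(encode(bit << j) for bit in bits)
    return table


TABLES = [build_table(j) for j in range(4)]


def lookup(j, word):
    return TABLES[j].get(word, (0, 0, 0, 0))


def player_nibble0(n0, n1, n2, n3):
    """Encoded 0th nibble of pLayer output from the encoded state nibbles 0 to 3."""
    # Output bits 0, 1, 2, 3 come from the lowest Sbox output bit of nibbles 0 to 3.
    parts = [lookup(j, word)[3] for j, word in enumerate((n0, n1, n2, n3))]
    result = parts[0]
    for part in parts[1:]:
        result = encoded_xor(result, part)
    return result


print(hex(player_nibble0(encode(0), encode(1), encode(2), encode(3))))
```

If any of the four input words is not a codeword, the corresponding table returns 0 and the error value propagates through every subsequent encoded XOR.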

Remark 5.2.1 By the design of our implementation, when the faulty intermediate value
is not a codeword, the table lookup returns 0, and the attacker will not be able to tell what
the original faulty ciphertext is. Since both DFA and SFA require analysis of the faulty
ciphertexts, they can be prevented when the fault model is bit flip, and the number of bit
flips is lower than the minimum distance of the binary code.
We have also seen that binary codes can correct errors. According to Theorem 1.6.2,
if $m$ bits are flipped during the computation, a binary code $C$ used for
encoding-based countermeasure can correct this fault as long as $m \le \lfloor (d - 1)/2 \rfloor$,
where $d$ is the minimum distance of $C$. Note that to realize the incomplete decoding
rule, we need an error message to indicate more than one codeword is at the same
smallest distance from the input word.
For example, let us consider the 3-repetition code $C_{[3,1,3]} = \{000, 111\}$, which is
a $[3, 1, 3]$-linear code (see Example 1.6.8). Since $C_{[3,1,3]}$ contains two codewords,
it can be used to encode 1 bit of information. As 000 is a codeword of $C_{[3,1,3]}$, we
cannot use it as the error message. On the other hand, we note that no word in $\mathbb{F}_2^3$ is
at the same distance from 000 and 111, which means we will always be able to find a
codeword using the minimum distance decoding rule. $C_{[3,1,3]}$ has minimum distance
3. Then we know that an implementation based on encoding countermeasure with
$C_{[3,1,3]}$ will be able to correct errors caused by 1-bit flip attacks.

Table 5.6 Lookup table for error-correcting code based computation of AND between
$a, b$ ($a, b \in \mathbb{F}_2$), using the 3-repetition code $\{000, 111\}$. 000 is the codeword for 0,
and 111 is the codeword for 1

   &    000  001  010  011  100  101  110  111
  000   000  000  000  000  000  000  000  000
  001   000  000  000  000  000  000  000  000
  010   000  000  000  000  000  000  000  000
  011   000  000  000  111  000  111  111  111
  100   000  000  000  000  000  000  000  000
  101   000  000  000  111  000  111  111  111
  110   000  000  000  111  000  111  111  111
  111   000  000  000  111  000  111  111  111

Let 000 be the codeword for 0 and 111 be the codeword for 1. The lookup table
for computation of AND between $a, b$ ($a, b \in \mathbb{F}_2$) with error correction is shown in
Table 5.6. For example, if the inputs are 0 (000) and 1 (111), the correct output
should be 0, which corresponds to codeword 000.
We can also see that if there are more bit flips, the faulty output might be
corrected to a wrong codeword. For example, if the inputs are 111 and 111, but the
second 111 is faulted to 001 with a 2-bit flip attack, then the table lookup gives output
000. However, since $1 \mathbin{\&} 1 = 1$, the output should be 111. Thus, it is better to only
use error-correcting code-based countermeasure when we know at most $\lfloor (d - 1)/2 \rfloor$
bits can be flipped, where $d$ is the minimum distance of the binary code.
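The error-correcting AND of Table 5.6 amounts to decoding each 3-bit input by majority vote and re-encoding the result. The sketch below illustrates this with hypothetical function names; a table-based implementation would precompute the same 64 entries.

```python
def majority_decode(word):
    """Minimum distance decoding for the 3-repetition code {000, 111}."""
    return 1 if bin(word & 0b111).count("1") >= 2 else 0


def corrected_and(a, b):
    """Entry of Table 5.6 for the 3-bit inputs a and b."""
    return 0b111 if majority_decode(a) & majority_decode(b) else 0b000


# A 1-bit flip in an input is corrected, a 2-bit flip is miscorrected.
print(format(corrected_and(0b111, 0b110), "03b"))
print(format(corrected_and(0b111, 0b001), "03b"))
```

The second call reproduces the miscorrection described in the text: the faulted input 001 decodes to 0, so the result is 000 although the fault-free output is 111.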
We refer the readers to [BKHL20] for an encoding-based hardware implementation
of PRESENT using the 3-repetition code $C_{[3,1,3]}$.

5.2.2 Infective Countermeasure

The idea of infective countermeasure is to process the ciphertext in a way that
the output becomes useless for an attacker when faults are injected during the
computations. We will take the proposal from [TBM14] and only focus on the
case for AES-128. The protection for other AES variants can be done in a similar
way; we refer the readers to [TBM14] for more details.
The main methodology of the countermeasure is to compute each round of
AES encryption twice before moving to the next round. The results of the two
computations of the same round are compared. If a fault is detected, the rest
of the computation should produce random values. Computations of dummy rounds
are also randomly added in between the AES rounds so that the attacker would not
know where the fault was actually injected.
As mentioned in Sect. 3.1.2, AES-128 has key size 128 bits and round number
Nr = 10. We define $F_i$ ($i = 0, 1, 2, \ldots, 10$) as follows: $F_0$ denotes the initial
AddRoundKey operation in AES; for $i = 1, 2, \ldots, 9$, $F_i$ denotes the AES round
function, in particular, $F_i$ consists of the following operations: SubBytes, ShiftRows,
MixColumns, and AddRoundKey; $F_{10}$ denotes the AES round function for the
last round. It consists of SubBytes, ShiftRows, and AddRoundKey. Let $K_i$ ($i =
0, 1, 2, \ldots, 10$) denote the round keys for AES. Each $F_i$ takes as input the cipher state
at the end of round $i - 1$ and $K_i$, and outputs the cipher state at the end of round $i$.

Algorithm 5.3: Infective Countermeasure for AES-128

Input: p, β, keys, t   // p is a plaintext block; β is a random number; keys
                       // contains the AES round keys K_i and the dummy round keys κ_i
                       // (Eq. 5.15) for i = 0, 1, 2, . . . , 10, see Eq. 5.16; t is a
                       // user-specified security parameter.
Output: ciphertext or infected ciphertext
 1  R0 = p    // cipher state
 2  R1 = p    // redundant cipher state
 3  R2 = β    // dummy round state
 4  Generate rstr ∈ F_2^t    // contains 22 of 1s corresponding to AES rounds and
                             // t − 22 of 0s corresponding to dummy rounds
 5  j = 0
 6  idx = 1
 7  while idx ≤ t do
 8      i = ⌊j/2⌋    // i is the round counter
 9      λ = rstr[idx]    // λ is given by the idxth bit of rstr, λ = 0 implies a dummy round
10      a = ((LSB of j) & λ) ⊕ 2(¬λ)    // LSB stands for the least significant bit, & is
                                        // bitwise AND (see Definition 1.3.6), ¬ is logical negation
11      R_a = F_i(R_a, keys[λ][i])
12      γ = λ & (LSB of j) & (¬1_0(R0 ⊕ R1))    // if j is odd and λ = 1, detect fault injection in AES
13      δ = (¬λ) & (¬1_0(R2 ⊕ β))    // detect fault injection in dummy round when λ = 0
14      R0 = (¬(γ ∨ δ) · R0) ⊕ ((γ ∨ δ) · R2)
15      j = j + λ
16      idx = idx + 1
17  return R0
Correspondingly, we also generate a random number $\beta$ and the round keys for the dummy rounds, denoted $\kappa_i$ ($i = 0, 1, 2, \dots, 10$), such that

\[ F_i(\beta, \kappa_i) = \beta \tag{5.15} \]

for $i = 0, 1, 2, \dots, 10$. We note that since $F_0$ is an AddRoundKey operation,

\[ \kappa_0 = 0000000000000000. \]

Furthermore,

\[ \kappa_i = \beta \oplus \mathrm{MixColumns}(\mathrm{ShiftRows}(\mathrm{SubBytes}(\beta))), \quad \text{for } i = 1, 2, \dots, 9, \]

and

\[ \kappa_{10} = \beta \oplus \mathrm{ShiftRows}(\mathrm{SubBytes}(\beta)). \]

We define an array of keys of size $2 \times 11$, denoted keys, by

\[ \mathrm{keys}[0][i] = \kappa_i, \quad \mathrm{keys}[1][i] = K_i. \tag{5.16} \]

The details of the countermeasure are shown in Algorithm 5.3. As mentioned before, each AES round is computed twice. The user-specified number $t$ determines how many dummy rounds are added during the computation. The cipher state for the first AES computation is stored in $R_0$, and the cipher state for the redundant AES computation is stored in $R_1$. Both are initialized to the plaintext (lines 1 and 2). The dummy round state is stored in $R_2$ and initialized to the random number $\beta$ (line 3). The variable $j$ (line 5) counts the total number of AES rounds computed (including the redundant ones), and $i = \lfloor j/2 \rfloor$ (line 8) is the actual round counter. The random string rstr contains 22 bits equal to 1, corresponding to two computations of each $F_i$ for $i = 0, 1, \dots, 10$, and $t - 22$ bits equal to 0, corresponding to dummy rounds. In each loop iteration, we read the idx-th bit of rstr (line 9) and store its value in $\lambda$. The value of idx is increased by 1 (line 16) at the end of each iteration so that the next iteration reads the next bit of rstr.
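In line 10, we note that if $j$ is even (respectively, odd), the least significant bit (LSB) of $j$ is 0 (respectively, 1); thus

\[
a = ((\text{LSB of } j) \,\&\, \lambda) \oplus 2(\neg\lambda) =
\begin{cases}
0 \oplus 2 = 2, & \text{if } \lambda = 0 \\
(0 \,\&\, 1) \oplus 0 = 0, & \text{if } \lambda = 1 \text{ and } j \text{ is even} \\
(1 \,\&\, 1) \oplus 0 = 1, & \text{if } \lambda = 1 \text{ and } j \text{ is odd.}
\end{cases}
\]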

Then, in line 11, when $\lambda = 0$ and $a = 2$, we compute a dummy round $i$ with

\[ R_2 = F_i(R_2, \mathrm{keys}[0][i]) = F_i(R_2, \kappa_i). \]

When $\lambda = 1$, $j$ is even, and $a = 0$, we compute AES round $i$ with

\[ R_0 = F_i(R_0, \mathrm{keys}[1][i]) = F_i(R_0, K_i). \]

When $\lambda = 1$, $j$ is odd, and $a = 1$, we compute a redundant AES round $i$ with

\[ R_1 = F_i(R_1, \mathrm{keys}[1][i]) = F_i(R_1, K_i). \]

The total round counter $j$ is increased by $\lambda$ at the end of each loop iteration (line 15). When $\lambda = 0$, only a dummy round is computed, and $j$ does not change.
Up to now, we have seen how the AES rounds and dummy rounds are computed. Next, we discuss how faults are handled in the algorithm.

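First, we recall the notion of indicator function from Definition 3.2.2. We consider the indicator function for $\mathbf{0}$ with domain $\mathbb{F}_2^{128}$:

\[ 1_{\mathbf{0}} : \mathbb{F}_2^{128} \to \mathbb{F}_2, \qquad x \mapsto \prod_{i} (1 - x_i). \]

In other words,

\[ 1_{\mathbf{0}}(x) = \begin{cases} 1 & \text{if } x = \mathbf{0} \\ 0 & \text{otherwise.} \end{cases} \]

Then, with logical negation, $\neg 1_{\mathbf{0}} : \mathbb{F}_2^{128} \to \mathbb{F}_2$ and

\[ \neg 1_{\mathbf{0}}(x) = \begin{cases} 0 & \text{if } x = \mathbf{0} \\ 1 & \text{otherwise.} \end{cases} \]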

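Consequently, in line 12, we have

\[
\gamma = \lambda \,\&\, (\text{LSB of } j) \,\&\, (\neg 1_{\mathbf{0}}(R_0 \oplus R_1)) =
\begin{cases}
0 & \text{if } \lambda = 0 \text{ or } j \text{ is even} \\
\neg 1_{\mathbf{0}}(R_0 \oplus R_1) & \text{otherwise}
\end{cases}
=
\begin{cases}
0 & \text{if } \lambda = 0 \text{ or } j \text{ is even or } R_0 = R_1 \\
1 & \text{if } \lambda = 1,\ j \text{ is odd, and } R_0 \neq R_1.
\end{cases}
\]

Thus, when $j$ is odd and $\lambda = 1$ (i.e., in the loop iteration in which the redundant AES round is computed), $\gamma$ indicates whether the cipher state of the AES round computation, $R_0$, differs from the redundant cipher state, $R_1$, or equivalently, whether a fault occurred in the AES round or in the redundant round computation. If there was no fault, $\gamma = 0$; otherwise, $\gamma = 1$.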

Algorithm 5.4: Computation of AES round in the infective countermeasure for AES-128 from Algorithm 5.3
1 j is even
2 i = ⌊j/2⌋ // i is the round counter
3 λ = 1
4 a = 0
5 R_0 = F_i(R_0, keys[1][i]) // keys[1][i] = K_i is the ith round key for AES
6 γ = 0
7 δ = 0
8 R_0 = R_0

Algorithm 5.5: Computation of redundant AES round in the infective countermeasure for AES-128 from Algorithm 5.3
1 j is odd
2 i = ⌊j/2⌋ // i is the round counter
3 λ = 1
4 a = 1
5 R_1 = F_i(R_1, keys[1][i]) // keys[1][i] = K_i is the ith round key for AES
6 γ = ¬1_0(R_0 ⊕ R_1) // detect fault injection in AES
7 δ = 0
8 R_0 = ((¬γ) · R_0) ⊕ (γ · R_2) // if there is a fault in the AES computation, R_0 = R_2 becomes a random number

Algorithm 5.6: Computation of the dummy round in the infective countermeasure for AES-128 from Algorithm 5.3
1 λ = 0
2 a = 2
3 R_2 = F_i(R_2, keys[0][i]) // i is the round counter, keys[0][i] = κ_i is the ith round key for the dummy rounds
4 γ = 0
5 δ = ¬1_0(R_2 ⊕ β) // detect fault injection in dummy round
6 R_0 = ((¬δ) · R_0) ⊕ (δ · R_2) // if there is a fault in the dummy round computation, R_0 = R_2 becomes a random number

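Similarly, in line 13, we have

\[
\delta =
\begin{cases}
0 & \text{if } \lambda = 1 \\
\neg 1_{\mathbf{0}}(R_2 \oplus \beta) & \text{if } \lambda = 0
\end{cases}
=
\begin{cases}
0 & \text{if } \lambda = 1 \text{ or } R_2 = \beta \\
1 & \text{if } \lambda = 0 \text{ and } R_2 \neq \beta.
\end{cases}
\]

Thus, when $\lambda = 0$, i.e., in the loop iteration in which the dummy round is computed, $\delta$ indicates whether a fault was injected in the computation of the dummy round state $R_2$. By the design of the dummy round keys and $\beta$ (see Eq. 5.15), if there are no faults, $R_2 = \beta$ and $\delta = 0$. Otherwise, $R_2 \neq \beta$ and $\delta = 1$.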

Finally, in line 14, we have

\[
R_0 = (\neg(\gamma \lor \delta) \cdot R_0) \oplus ((\gamma \lor \delta) \cdot R_2) =
\begin{cases}
R_0 & \text{if } \gamma = 0 \text{ and } \delta = 0 \\
R_2 & \text{otherwise.}
\end{cases}
\]

This line guarantees that $R_0$ will be changed to a random number $R_2$ if a fault is detected in any of the computations. Consequently, the output will be a random number or infected ciphertext.
The computations for the AES round, the redundant round, and the dummy round
are shown in Algorithms 5.4, 5.5, and 5.6.
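The control flow of Algorithm 5.3 can be exercised with the following Python sketch. It is only an illustration under loud assumptions: the round function `toy_perm` is a made-up 32-bit bijection, not AES; the fault is modelled as an XOR into the state just computed; and the last bit of rstr is forced to 1 so that the round index never exceeds 10, which is a choice of this sketch rather than something stated in the text.

```python
import random

MASK = 0xFFFFFFFF  # toy 32-bit state; AES-128 uses a 128-bit state

def toy_perm(x):
    """Stand-in for SubBytes/ShiftRows/MixColumns. NOT AES, only a 32-bit bijection."""
    x = (x * 0x9E3779B1) & MASK           # multiplying by an odd constant is invertible mod 2^32
    return ((x << 7) | (x >> 25)) & MASK  # rotate left by 7

def F(i, x, k):
    """Toy round function: F_0 only adds the key; F_1..F_10 apply toy_perm, then add the key."""
    return x ^ k if i == 0 else toy_perm(x) ^ k

def nonzero(x):
    """The negated indicator function of the text: 0 if x = 0, and 1 otherwise."""
    return int(x != 0)

def reference_encrypt(p, K):
    """Unprotected toy cipher, used only to check the fault-free output."""
    x = p
    for i in range(11):
        x = F(i, x, K[i])
    return x

def infective_encrypt(p, K, beta, t, rng, fault=None):
    """Control flow of Algorithm 5.3; fault = (0-based loop index, XOR mask) or None."""
    kappa = [0] + [beta ^ toy_perm(beta)] * 10   # F_i(beta, kappa_i) = beta (Eq. 5.15)
    keys = [kappa, K]                            # Eq. 5.16
    R = [p, p, beta]
    head = [1] * 21 + [0] * (t - 22)
    rng.shuffle(head)
    rstr = head + [1]                            # assumption of this sketch: last bit is 1
    j = 0
    for idx in range(t):
        i = j // 2
        lam = rstr[idx]
        a = ((j & 1) & lam) ^ (2 * (1 - lam))    # line 10
        R[a] = F(i, R[a], keys[lam][i])          # line 11
        if fault is not None and fault[0] == idx:
            R[a] ^= fault[1]                     # simulated fault
        gamma = lam & (j & 1) & nonzero(R[0] ^ R[1])
        delta = (1 - lam) & nonzero(R[2] ^ beta)
        infect = gamma | delta
        R[0] = ((1 - infect) * R[0]) ^ (infect * R[2])   # line 14
        j += lam
    return R[0]

key_rng = random.Random(1)
K = [key_rng.getrandbits(32) for _ in range(11)]
beta = key_rng.getrandbits(32)
p = 0x01234567
good = infective_encrypt(p, K, beta, 30, random.Random(2))
bad = infective_encrypt(p, K, beta, 30, random.Random(2), fault=(5, 0x10))
print(good == reference_encrypt(p, K), bad != good)
```

Without a fault the sketch reproduces the unprotected toy encryption; with a fault, the returned value is derived from $R_2$ and no longer depends on the faulty cipher state.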

5.3 Fault Attacks on RSA and RSA Signatures

As discussed in Sect. 2.1.2, a public key cryptosystem has a public key and a private
key. For fault attacks that will be discussed in this section, the attacker’s goal will
be the recovery of the secret key.
Unlike fault attacks on symmetric block ciphers, attacks on public key ciphers
depend on the underlying intractable problem, and we do not have a systematic
methodology. However, the general attack concept can be applied to ciphers
based on similar intractable problems. This section will focus on fault attacks on
implementations of RSA signatures. We will discuss a few fault attacks during the
signature signing procedure to recover the private key.

Note
We note that the attacks on RSA signature signing procedure can also be applied
to RSA decryption process.

For the rest of this section, let $p$ and $q$ be two distinct odd primes. Let $n = pq$ and $e \in \mathbb{Z}^*_{\varphi(n)}$ be the public key for RSA signatures, and let $d = e^{-1} \bmod \varphi(n)$ denote the private key. The goal of the attacker is to recover $d$. As discussed in Sect. 3.5.1.3, the signature is computed on the hash value, $h(m)$, of the intended message $m$, where $h$ is a fast public hash function (see Sect. 2.1.1). For simplicity, we will use $m$ to denote the hash value $h(m)$.
Let $\ell_d$ and $\ell_n$ denote the bit length of $d$ and $n$, respectively. We have the following binary representation (see Theorem 1.1.1) of $d$:

\[ d = \sum_{i=0}^{\ell_d - 1} d_i 2^i. \]

We recap here the CRT-based implementation for RSA signatures. Following the discussions in Sect. 3.5.1.3, to sign $m$, the owner of the private key, say Alice, computes

\[ s_p := m^{d \bmod (p-1)} \bmod p, \qquad s_q := m^{d \bmod (q-1)} \bmod q, \tag{5.17} \]

and the signature $s$ is given by Gauss's algorithm,

\[ s = s_p y_q q + s_q y_p p \bmod n, \]

or by Garner's algorithm,

\[ s = s_p + ((s_q - s_p) y_p \bmod q) p, \]

where

\[ y_q = q^{-1} \bmod p, \qquad y_p = p^{-1} \bmod q. \tag{5.18} \]

Alice sends $s$ and $m$ to Bob. To verify the signature, Bob checks whether

\[ s^e \bmod n = m. \]

If the equality holds, Bob considers the signature valid.
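As a quick sanity check of Eqs. 5.17 and 5.18, the following Python sketch computes the signature with both Gauss's and Garner's recombination and verifies it. The toy parameters $p = 5$, $q = 7$, $e = d = 5$, $m = 6$ and the function name are chosen for illustration; `pow` with a negative exponent and a modulus requires Python 3.8 or later.

```python
def crt_sign(m, d, p, q):
    """Return the CRT signature computed with Gauss's and with Garner's algorithm."""
    sp = pow(m, d % (p - 1), p)          # Eq. 5.17
    sq = pow(m, d % (q - 1), q)
    yq = pow(q, -1, p)                   # Eq. 5.18
    yp = pow(p, -1, q)
    gauss = (sp * yq * q + sq * yp * p) % (p * q)
    garner = sp + ((sq - sp) * yp % q) * p
    return gauss, garner

p, q, e, d, m = 5, 7, 5, 5, 6            # toy parameters, far too small for real use
gauss, garner = crt_sign(m, d, p, q)
print(gauss, garner, pow(garner, e, p * q) == m)   # prints: 6 6 True
```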

5.3.1 Bellcore Attack

We first describe an attack that recovers the private key of RSA signatures
by exploiting a faulty signature. The attack was first introduced by Boneh et
al. [BDL97]. The name “Bellcore” comes from the company the authors were
working for at the time of the publication. This paper is also the very first paper
that introduced fault attacks to cryptographic implementations.
As mentioned in Sect. 3.5.1.3, $y_q$ and $y_p$ (Eq. 5.18) can be precomputed. We assume there are no faults in their computations.
By the design of $s_p$, $s_q$, $y_p$, and $y_q$, we have

\[ s \equiv s_q \bmod q, \qquad s \equiv s_p \bmod p, \tag{5.19} \]

which gives

\[ m \equiv s^e \equiv s_q^e \bmod q, \qquad m \equiv s^e \equiv s_p^e \bmod p. \tag{5.20} \]

Suppose a fault is injected during signing so that the computation of exactly one of $s_p$ and $s_q$ (Eq. 5.17) is corrupted. Let us assume that $s_p$ is faulty and $s_q$ is computed correctly. A similar attack applies if $s_q$ is faulty and $s_p$ is correct. Let $s'$ denote the faulty signature. By Eq. 5.19,

\[ s' \equiv s \equiv s_q \bmod q, \qquad s' \not\equiv s \bmod p. \]

In other words,

\[ q \mid (s' - s), \qquad p \nmid (s' - s). \]

Recall that $n$ and $e$ are public. If the attacker further knows $s$ and $s'$, then they can compute

\[ q = \gcd(s' - s, n), \qquad p = \frac{n}{q}. \]

As mentioned in Sect. 3.3, after factorizing $n$, the attacker can compute

\[ \varphi(n) = (p - 1)(q - 1) \]

and eventually recover the private key

\[ d = e^{-1} \bmod \varphi(n) \]

by the extended Euclidean algorithm (Algorithm 1.2).


For a different attack [Len96], we assume the attacker does not know the correct signature $s$. Instead, the attacker can obtain the faulty signature $s'$ and the original message hash value $m$. For example, the attacker can ask Alice to sign a chosen message. By Eq. 5.20,

\[ s'^e \equiv m \bmod q, \qquad s'^e \not\equiv m \bmod p, \]

i.e.,

\[ q \mid (s'^e - m), \qquad p \nmid (s'^e - m). \]

Thus the attacker can factorize $n$ by computing

\[ q = \gcd(s'^e - m, n), \qquad p = \frac{n}{q}. \]
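Both variants of the attack fit in a few lines of Python. The sketch below is a simulation under assumptions: the fault is modelled by overriding $s_p$ with a chosen wrong value (the parameter `faulty_sp` is hypothetical), and the toy parameters $p = 11$, $q = 13$, $e = 11$, $m = 2$ are far too small for real use.

```python
from math import gcd

def crt_sign(m, d, p, q, faulty_sp=None):
    """CRT signature with Garner's recombination; faulty_sp overrides s_p to simulate a fault."""
    sp = pow(m, d % (p - 1), p) if faulty_sp is None else faulty_sp
    sq = pow(m, d % (q - 1), q)
    yp = pow(p, -1, q)
    return sp + ((sq - sp) * yp % q) * p

p, q, e, m = 11, 13, 11, 2
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))

s = crt_sign(m, d, p, q)                    # correct signature
s_faulty = crt_sign(m, d, p, q, faulty_sp=7)

# Variant 1: the attacker knows s and s'.
q_from_pair = gcd(s_faulty - s, n)
# Variant 2: the attacker knows only s' and m.
q_from_message = gcd(pow(s_faulty, e, n) - m, n)

# With the factorization, the private key follows from phi(n).
p_rec = n // q_from_pair
d_rec = pow(e, -1, (p_rec - 1) * (q_from_pair - 1))
print(q_from_pair, q_from_message, d_rec)   # prints: 13 13 11
```

Reducing $s'^e$ modulo $n$ before taking the gcd does not change the result and keeps the numbers small.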

Example 5.3.1 Let $p = 5$, $q = 7$, and $e = 5$, so that $n = 35$. We have calculated that $d = 5$ in Example 3.4.1 and $y_q = 3$, $y_p = 3$ in Example 3.5.8. Suppose $m = 6$. By Eq. 5.17, to calculate the signature, Alice computes

\[ s_p = m^{d \bmod (p-1)} \bmod p = 6^{5 \bmod 4} \bmod 5 = 1, \]
\[ s_q = m^{d \bmod (q-1)} \bmod q = 6^{5 \bmod 6} \bmod 7 = 6. \]

And the signature is

\[ s = s_p + ((s_q - s_p) y_p \bmod q) p = 1 + ((6 - 1) \times 3 \bmod 7) \times 5 = 6. \]

We can verify that

\[ s^e \bmod n = 6^5 \bmod 35 = 6 = m. \]

Now suppose the computation of $s_p$ is faulty and $s_p' = 3$. Then we have

\[ s' = s_p' + ((s_q - s_p') y_p \bmod q) p = 3 + ((6 - 3) \times 3 \bmod 7) \times 5 = 3 + 2 \times 5 = 13. \]

If the attacker knows $s = 6$ and $s' = 13$, they can compute

\[ q = \gcd(s' - s, n) = \gcd(13 - 6, 35) = \gcd(7, 35) = 7. \]

If the attacker knows $s' = 13$ and $m = 6$, they can compute

\[ q = \gcd(s'^e - m, n) = \gcd(13^5 - 6, 35) = \gcd(371287, 35). \]

By the Euclidean algorithm,

\[ 371287 = 35 \times 10608 + 7, \qquad \gcd(371287, 35) = \gcd(35, 7), \]
\[ 35 = 7 \times 5, \qquad \gcd(35, 7) = 7, \]

and $q = 7$.
Similarly, suppose the computation of $s_q$ is faulty and $s_q' = 2$. Then

\[ s' = s_p + ((s_q' - s_p) y_p \bmod q) p = 1 + ((2 - 1) \times 3 \bmod 7) \times 5 = 16. \]

If the attacker knows $s = 6$ and $s' = 16$, they can compute

\[ p = \gcd(s' - s, n) = \gcd(16 - 6, 35) = \gcd(10, 35) = 5. \]

If the attacker knows $s' = 16$ and $m = 6$, they can compute

\[ p = \gcd(s'^e - m, n) = \gcd(16^5 - 6, 35) = \gcd(1048570, 35). \]

By the Euclidean algorithm,

\[ 1048570 = 35 \times 29959 + 5, \qquad \gcd(1048570, 35) = \gcd(35, 5), \]
\[ 35 = 5 \times 7, \qquad \gcd(35, 5) = 5. \]

Hence $p = 5$.
Example 5.3.2 Let $p = 11$, $q = 13$. Then $n = 143$,

\[ \varphi(n) = 10 \times 12 = 120. \]

Choose $e = 11$, which is coprime with $\varphi(n)$. By the extended Euclidean algorithm,

\[ 120 = 11 \times 10 + 10, \quad 11 = 10 \times 1 + 1 \implies 1 = 11 - (120 - 11 \times 10) = 11 \times 11 - 120, \]

and we have $d = 11^{-1} \bmod 120 = 11$. Again, by the extended Euclidean algorithm,

\[ 13 = 11 \times 1 + 2, \quad 11 = 2 \times 5 + 1 \implies 1 = 11 - 2 \times 5 = 11 - 5 \times (13 - 11) = 11 \times 6 - 13 \times 5, \]

and we have

\[ y_q = q^{-1} \bmod p = 13^{-1} \bmod 11 = -5 \bmod 11 = 6, \]
\[ y_p = p^{-1} \bmod q = 11^{-1} \bmod 13 = 6. \]

Let $m = 2$. By Eq. 5.17, to calculate the signature, Alice computes

\[ s_p = m^{d \bmod (p-1)} \bmod p = 2^{11 \bmod 10} \bmod 11 = 2 \bmod 11 = 2, \]
\[ s_q = m^{d \bmod (q-1)} \bmod q = 2^{11 \bmod 12} \bmod 13 = 2048 \bmod 13 = 7. \]

Using Garner's algorithm, the signature is

\[ s = s_p + ((s_q - s_p) y_p \bmod q) p = 2 + ((7 - 2) \times 6 \bmod 13) \times 11 = 2 + 4 \times 11 = 46. \]

We have

\[ 46^2 \bmod 143 = 114, \qquad 46^3 \bmod 143 = 114 \times 46 \bmod 143 = 96, \]
\[ 46^5 \bmod 143 = 114 \times 96 \bmod 143 = 76, \qquad 46^{10} \bmod 143 = 76^2 \bmod 143 = 56. \]

We can then verify that

\[ s^e \bmod n = 46^{11} \bmod 143 = 56 \times 46 \bmod 143 = 2 = m. \]

Now suppose the computation of $s_p$ is faulty and $s_p' = 7$. Then we have

\[ s' = s_p' + ((s_q - s_p') y_p \bmod q) p = 7 + ((7 - 7) \times 6 \bmod 13) \times 11 = 7. \]

If the attacker knows $s = 46$ and $s' = 7$, they can compute

\[ q = \gcd(s' - s, n) = \gcd(7 - 46, 143) = \gcd(-39, 143) = \gcd(39, 143). \]

By the Euclidean algorithm,

\[ 143 = 39 \times 3 + 26, \qquad \gcd(39, 143) = \gcd(39, 26), \]
\[ 39 = 26 + 13, \qquad \gcd(39, 26) = \gcd(26, 13), \]
\[ 26 = 13 \times 2, \qquad \gcd(26, 13) = 13. \]

Hence $q = 13$.
If the attacker knows $s' = 7$ and $m = 2$, they can compute

\[ q = \gcd(s'^e - m, n) = \gcd(7^{11} - 2, 143) = \gcd(1977326741, 143). \]

By the Euclidean algorithm,

\[ 1977326741 = 143 \times 13827459 + 104, \qquad \gcd(1977326741, 143) = \gcd(143, 104), \]
\[ 143 = 104 + 39, \qquad \gcd(143, 104) = \gcd(104, 39), \]
\[ 104 = 39 \times 2 + 26, \qquad \gcd(104, 39) = \gcd(39, 26), \]
\[ 39 = 26 + 13, \qquad \gcd(39, 26) = \gcd(26, 13), \]
\[ 26 = 13 \times 2, \qquad q = \gcd(26, 13) = 13. \]

Similarly, suppose the computation of $s_q$ is faulty and $s_q' = 2$. Then

\[ s' = s_p + ((s_q' - s_p) y_p \bmod q) p = 2 + ((2 - 2) \times 6 \bmod 13) \times 11 = 2. \]

If the attacker knows $s = 46$ and $s' = 2$, they can compute

\[ p = \gcd(s' - s, n) = \gcd(2 - 46, 143) = \gcd(-44, 143) = \gcd(44, 143). \]

By the Euclidean algorithm,

\[ 143 = 44 \times 3 + 11, \qquad \gcd(44, 143) = \gcd(44, 11), \]
\[ 44 = 11 \times 4, \qquad p = \gcd(44, 11) = 11. \]

If the attacker knows $s' = 2$ and $m = 2$, they can compute

\[ p = \gcd(s'^e - m, n) = \gcd(2^{11} - 2, 143) = \gcd(2046, 143). \]

By the Euclidean algorithm,

\[ 2046 = 143 \times 14 + 44, \qquad \gcd(2046, 143) = \gcd(143, 44), \]
\[ 143 = 44 \times 3 + 11, \qquad \gcd(143, 44) = \gcd(44, 11), \]
\[ 44 = 11 \times 4, \qquad p = \gcd(44, 11) = 11. \]

5.3.2 Attack on the Square and Multiply Algorithm

In this subsection, we look at fault attacks on the square and multiply algorithm. We first detail the bit-flip attack proposed in [BDH+97], and then we discuss an improved version proposed in [JQBD97].
Instead of a CRT-based implementation, we assume the implementation computes the signature with the right-to-left square and multiply algorithm. Following Algorithm 3.7, to compute $m^d \bmod n$, we have Algorithm 5.7, where $\ell_d$ is the bit length of $d$.

Algorithm 5.7: Computing RSA signature with the right-to-left square and multiply algorithm
Input: n, m, d // n is the RSA modulus; m is the hash value of the message; d is the private key of bit length ℓ_d
Output: s = m^d mod n
1 s = 1
2 t = m
3 for i = 0, i < ℓ_d, i++ do
      // ith bit of d is 1
4     if d_i = 1 then
          // multiply by m^(2^i)
5         s = s ∗ t mod n
      // t = m^(2^(i+1))
6     t = t ∗ t mod n
7 return s

For the attack, we assume a bit-flip fault model in which 1 bit of $d$, say $d_i$, is flipped. Let $d'$ denote the faulty value of $d$. Then the faulty signature is given by $s' = m^{d'} \bmod n$. From Algorithm 5.7, lines 4 and 5, we can see that the computations of $s$ and $s'$ differ by one multiplication by $m^{2^i}$. In particular, we have

\[
\frac{s'}{s} \equiv
\begin{cases}
m^{-2^i} \bmod n, & \text{if } d_i = 1,\ d_i' = 0 \\
m^{2^i} \bmod n, & \text{if } d_i = 0,\ d_i' = 1.
\end{cases}
\tag{5.21}
\]

Suppose the attacker knows $s$, $s'$, and $m$; then they can compute

\[ \frac{s'}{s} \bmod n \]

and compare it with

\[ m^{2^i} \bmod n \]

to recover the value of $d_i$.


To improve the attack, we loosen the assumption on the attacker and assume that they only know $s'$ and $m$ (and not $s$). In this case, we note that

\[ \frac{s'^e}{s^e} \equiv \frac{s'^e}{m} \bmod n. \]

Then it follows from Eq. 5.21 that

\[
\frac{s'^e}{m} \equiv
\begin{cases}
m^{-e 2^i} \bmod n, & \text{if } d_i = 1,\ d_i' = 0 \\
m^{e 2^i} \bmod n, & \text{if } d_i = 0,\ d_i' = 1.
\end{cases}
\tag{5.22}
\]

Thus the attacker can compute

\[ \frac{s'^e}{m} \bmod n \]

and compare it with

\[ m^{e 2^i} \bmod n \]

to recover the value of $d_i$.
Both attacks can be repeated for different bits of $d$ to recover the whole private key.
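The improved attack (Eq. 5.22) can be simulated as follows. This Python sketch makes several assumptions that are not part of the text: the attacker does not know which bit was flipped and therefore tries every position $i$; the toy parameters $n = 143$, $e = d = 11$, $m = 2$ are chosen so that all candidate values $m^{\pm e 2^i} \bmod n$ are distinct; and the helper names are hypothetical.

```python
def sign(m, d, n, nbits):
    """Right-to-left square and multiply (Algorithm 5.7)."""
    s, t = 1, m % n
    for i in range(nbits):
        if (d >> i) & 1:
            s = s * t % n
        t = t * t % n
    return s

def recover_flipped_bit(m, e, n, s_faulty, nbits):
    """Return (i, d_i) from one faulty signature via Eq. 5.22, or None if nothing matches."""
    ratio = pow(s_faulty, e, n) * pow(m, -1, n) % n     # s'^e / m mod n
    for i in range(nbits):
        plus = pow(m, e << i, n)                        # m^(e 2^i) mod n
        if ratio == plus:
            return i, 0                                 # d_i = 0 was flipped to 1
        if ratio == pow(plus, -1, n):
            return i, 1                                 # d_i = 1 was flipped to 0
    return None

n, e, d, m, nbits = 143, 11, 11, 2, 4                   # toy parameters
recovered = 0
for i in range(nbits):
    s_faulty = sign(m, d ^ (1 << i), n, nbits)          # simulate a flip of bit i of d
    pos, bit = recover_flipped_bit(m, e, n, s_faulty, nbits)
    recovered |= bit << pos
print(recovered)                                        # prints: 11
```

With such a small modulus, other choices of $m$ may produce colliding candidates; a realistic modulus makes such collisions very unlikely.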
Example 5.3.3 Let $p = 3$, $q = 5$. We have $n = 15$ and $\varphi(n) = 2 \times 4 = 8$. Suppose $d = 3 = 11_2$ and $m = 2$. We have computed in Example 3.5.1 that

\[ s = m^d \bmod n = 8. \]

The intermediate values for the computation with Algorithm 5.7 will be

i   d_i   t   result
0   1     4   2
1   1     1   8

By the extended Euclidean algorithm, we get

\[ e = d^{-1} \bmod \varphi(n) = 3^{-1} \bmod 8 = 3. \]

Suppose $d_0$ is flipped; then $d' = 2 = 10_2$. The resulting computation following Algorithm 5.7 will then have the intermediate values as follows:

i   d_i   t   result
0   0     4   1
1   1     1   4

Thus $s' = 4$.
With the knowledge of $s = 8$, $s' = 4$, and $m = 2$, the attacker computes, for $i = 0$,

\[ \frac{s'}{s} \equiv \frac{4}{8} \equiv 2^{-1} \bmod 15, \qquad m^{2^i} \equiv 2^{1} \equiv 2 \bmod 15 \implies \frac{s'}{s} \equiv m^{-2^i} \bmod n. \]

By Eq. 5.21, $d_0 = 1$.
In case the attacker does not know $s$, they can compute

\[ \frac{s'^e}{m} \equiv \frac{4^3}{2} \equiv 32 \equiv 2 \bmod 15. \]

By the extended Euclidean algorithm,

\[ 15 = 2 \times 7 + 1 \implies 2^{-1} \bmod 15 = -7 \bmod 15 = 8. \]

And we have

\[ m^{e 2^i} \equiv 2^{3 \times 2^0} \equiv 2^3 \equiv 8 \bmod 15, \qquad m^{-e 2^i} \equiv 2^{-3 \times 2^0} \equiv 2^{-3} \equiv 8^3 \equiv 512 \equiv 2 \bmod 15. \]

Thus

\[ \frac{s'^e}{m} \equiv m^{-e 2^i} \bmod n. \]

By Eq. 5.22, $d_0 = 1$.

5.3.3 Attack on the Public Key

In this subsection, we discuss an attack [BCG08] that injects faults into the RSA public key $n$ during signing and recovers the private key $d$. Since the value $n$ is big, it is stored in several registers. The fault can be injected while $n$ is being loaded or prepared. The attack is specific to the right-to-left square and multiply algorithm.
The RSA signature computation with the right-to-left square and multiply algorithm is detailed in Algorithm 5.7. Let $n'$ denote the faulty RSA modulus and

\[ \varepsilon := n \oplus n' \]

be the fault mask. Suppose the fault is injected in round $j$ ($1 \le j \le \ell_d - 2$), resulting in a faulty square computation in line 6

\[ t = t * t \bmod n', \]

and this faulty $n'$ is also used for the rest of the computation. Then the faulty signature is given by

\[
s' = \left[ \left( \prod_{i=0}^{j-1} m^{2^i d_i} \bmod n \right) \prod_{i=j}^{\ell_d - 1} \left( m^{2^{j-1}} \bmod n \right)^{2^{i-j+1} d_i} \right] \bmod n'. \tag{5.23}
\]

We note that if the fault is injected in round $j = \ell_d - 1$ during the computation of the square, the output will not be affected, and hence the faulty signature will not be useful for recovering the secret key. If the fault is injected in round 0, since $m \in \mathbb{Z}_n$, the computation result will be $m^d \bmod n'$, and the attacker would need to brute force all possible values of $d$ to find out which one gives the faulty signature. Hence we assume $j \ge 1$.
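Recall that the correct signature is given by

\[ s = \prod_{i=0}^{\ell_d - 1} m^{2^i d_i} \bmod n. \]

Define

\[ d_{(j)} := d_{\ell_d - 1} \dots d_{j+1} d_j 0 0 \dots 0 0, \]

then

\[ m^{d_{(j)}} = \prod_{i=j}^{\ell_d - 1} m^{2^i d_i} \bmod n, \]

and

\[
s' = \left[ \left( s\, m^{-d_{(j)}} \bmod n \right) \prod_{i=j}^{\ell_d - 1} \left( m^{2^{j-1}} \bmod n \right)^{2^{i-j+1} d_i} \right] \bmod n'.
\]

There are $2^{\ell_d - j}$ possible values for $d_{(j)}$.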


Suppose the attacker knows $\varepsilon$ (hence $n'$), the message hash value $m$, the correct signature $s$, and the faulty signature $s'$. For each guessed value of $d_{(j)}$, denoted

\[ \hat{d}_{(j)} = \hat{d}_{\ell_d - 1} \dots \hat{d}_{j+1} \hat{d}_j 0 0 \dots 0, \]

the attacker computes

\[
\hat{s}' = \left[ \left( s\, m^{-\hat{d}_{(j)}} \bmod n \right) \prod_{i=j}^{\ell_d - 1} \left( m^{2^{j-1}} \bmod n \right)^{2^{i-j+1} \hat{d}_i} \right] \bmod n' \tag{5.24}
\]

and compares it with $s'$. Then they record the values of $\hat{d}_{(j)}$ that satisfy

\[ \hat{s}' = s', \]

which reduces the hypotheses for bits $j$ through $\ell_d - 1$ of $d$.
The attack can be repeated for other bits of $d$ to reduce the key hypotheses further.
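The candidate search can be sketched in Python as follows. The function `faulty_sign` simulates Algorithm 5.7 with the modulus replaced from the square in iteration $j$ onwards, and `candidates` evaluates the prediction of Eq. 5.24 for every guess; both names, and the toy parameters $n = 15$, $n' = 13$, $m = 2$, $d = 101_2$, $j = 1$, are assumptions made for illustration.

```python
def faulty_sign(m, d, n, n_faulty, nbits, j):
    """Algorithm 5.7 where the square of iteration j and all later steps use n_faulty."""
    s, t, mod = 1, m % n, n
    for i in range(nbits):
        if (d >> i) & 1:
            s = s * t % mod
        if i == j:
            mod = n_faulty                       # the fault hits the modulus before the square
        t = t * t % mod
    return s

def candidates(m, s, s_faulty, n, n_faulty, nbits, j):
    """Guesses of d_(j), as integers, whose prediction from Eq. 5.24 equals s_faulty."""
    base = pow(m, 1 << (j - 1), n)               # m^(2^(j-1)) mod n
    hits = []
    for high in range(1 << (nbits - j)):
        guess = high << j                        # bits j..nbits-1 guessed, low j bits zero
        pred = s * pow(m, -guess, n) % n         # s * m^(-guess) mod n
        for i in range(j, nbits):
            if (guess >> i) & 1:
                pred *= base ** (1 << (i - j + 1))
        if pred % n_faulty == s_faulty:
            hits.append(guess)
    return hits

n, n_faulty, m, d, nbits, j = 15, 13, 2, 0b101, 3, 1
s = pow(m, d, n)
s_faulty = faulty_sign(m, d, n, n_faulty, nbits, j)
print(s, s_faulty, candidates(m, s, s_faulty, n, n_faulty, nbits, j))   # prints: 2 6 [2, 4]
```

The two surviving guesses are $010_2$ and $100_2$; the true value $d_{(1)} = 100_2$ is among them.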
Example 5.3.4 Let $n = 15$, $m = 2$. Then $\varphi(n) = 8$. Let $d = 5 = 101_2$. Computing

\[ s = m^d \bmod n = 2^5 \bmod 15 \]

with Algorithm 5.7, we have the following intermediate values in each loop:

i   d_i   t   s
0   1     4   2
1   0     1   2
2   1     1   2

and the correct signature $s = 2$.
Suppose a fault is injected in $n$ when line 6 is executed in the iteration $i = 1$, resulting in $n' = 13$. The intermediate values will be

i   d_i   t   s
0   1     4   2
1   0     3   2
2   1     9   6

and the faulty signature $s' = 6$, which agrees with Eq. 5.23:

\[
\begin{aligned}
s' &= \left[ \left( \prod_{i=0}^{j-1} m^{2^i d_i} \bmod n \right) \prod_{i=j}^{\ell_d - 1} \left( m^{2^{j-1}} \bmod n \right)^{2^{i-j+1} d_i} \right] \bmod n' \\
&= \left[ \left( m^{2^0 d_0} \bmod n \right) \prod_{i=1}^{2} \left( m^{2^0} \bmod n \right)^{2^i d_i} \right] \bmod n' \\
&= \left[ (m^{d_0} \bmod n)(m \bmod n)^{2 d_1 + 2^2 d_2} \right] \bmod n' \\
&= (2 \bmod 15)(2 \bmod 15)^4 \bmod 13 = 2^5 \bmod 13 = 6.
\end{aligned}
\]

To recover the secret key $d$, the attacker takes all possible values for $d_{(1)} = d_2 d_1 0$ and computes the corresponding possible faulty signatures with Eq. 5.24:

\[
\begin{aligned}
\hat{s}' &= \left[ \left( s\, m^{-\hat{d}_{(j)}} \bmod n \right) \prod_{i=j}^{\ell_d - 1} \left( m^{2^{j-1}} \bmod n \right)^{2^{i-j+1} \hat{d}_i} \right] \bmod n' \\
&= \left[ (2 m^{-\hat{d}_{(1)}} \bmod n)(m \bmod n)^{2 \hat{d}_1 + 2^2 \hat{d}_2} \right] \bmod n' \\
&= \left[ (2^{1 - \hat{d}_{(1)}} \bmod 15) \times 2^{2 \hat{d}_1 + 2^2 \hat{d}_2} \right] \bmod n'.
\end{aligned}
\]

For $\hat{d}_{(1)} = 000$, we have

\[ \hat{s}' = \left[ (2^{1} \bmod 15) \times 2^{0} \right] \bmod n' = 2 \times 1 \bmod 13 = 2. \]

For $\hat{d}_{(1)} = 010$,

\[ \hat{s}' = \left[ (2^{-1} \bmod 15) \times 2^{2} \right] \bmod n' = 8 \times 4 \bmod 13 = 6. \]

For $\hat{d}_{(1)} = 100$,

\[ \hat{s}' = \left[ (2^{-3} \bmod 15) \times 2^{4} \right] \bmod n' = 2 \times 16 \bmod 13 = 6. \]

For $\hat{d}_{(1)} = 110$,

\[ \hat{s}' = \left[ (2^{-5} \bmod 15) \times 2^{6} \right] \bmod n' = 8 \times 64 \bmod 13 = 5. \]

Thus the attacker can conclude that $d_{(1)} = 010$ or $100$, i.e., $d_2 d_1 = 01$ or $10$.
If the attacker does not know the exact fault mask $\varepsilon$ (and hence $n'$) but knows the range of $\varepsilon$, they can brute force all possible values of $\varepsilon$ and $\hat{d}_{(j)}$ to reduce the number of key candidates. We refer the readers to [BCG08] for more details.

5.3.4 Safe Error Attack

This part looks into implementations that are based either on the right-to-left square and multiply algorithm (Sect. 3.5.1.1) or on the Montgomery powering ladder (Sect. 3.5.1.2). We further require that the modular multiplication is implemented with Blakley's method (Sect. 3.5.2.1). We will discuss a fault attack that is specific to such a setting.
The attack exploits the knowledge of whether an intermediate faulty value is used or not by observing whether the final output is changed, thus the name safe error attack [YJ00]. Since it is enough to know whether the output has changed, the safe error attack still applies if we implement a countermeasure that repeats the computation, compares the final results, and outputs an error when a fault is detected.
Let $\omega$ be the computer's word size (see Sect. 2.1.2). Take $\kappa = \lceil \ell_n / \omega \rceil$, i.e.,

\[ (\kappa - 1)\omega < \ell_n \le \kappa\omega, \]

where $\ell_n$ is the bit length of $n$.

5.3.4.1 Safe Error Attack on the Montgomery Powering Ladder

With the Montgomery powering ladder and Blakley's method, the signature $s = m^d \bmod n$ is computed with Algorithm 5.8 (see Algorithm 3.15).
Since $\ell_n$ is the bit length of $n$, the bit lengths of the variables $R_0$ and $R_1$ are at most $\ell_n$. We can write

\[ R_0 = \sum_{i=0}^{\kappa - 1} R_{0i} (2^\omega)^i, \qquad R_1 = \sum_{i=0}^{\kappa - 1} R_{1i} (2^\omega)^i. \]

We can also assume each of $R_{0i}$ and $R_{1i}$ is stored in one register.
Suppose $d_j = 0$, and a fault is injected in the variable $R_{0i_0}$, for some $i_0$ such that $0 \le i_0 \le \kappa - 1$, during the $j$th iteration of the outer loop, when $i < i_0$ in the loop starting from line 6. Then the value in $R_1$ in line 9 will not be affected since $R_{0i_0}$ is used when $i = i_0$. However, the value in $R_0$ in line 14 will be faulty. Hence the final output will be faulty.
On the other hand, suppose $d_j = 1$, and a fault is injected in the variable $R_{0i_0}$, for some $i_0$ such that $0 \le i_0 \le \kappa - 1$, during the $j$th iteration of the outer loop, when $i < i_0$ in the loop starting from line 17. Then the fault will go unnoticed since $R_{0i_0}$ is used when $i = i_0$ and the value in $R_0$ is overwritten in line 20. Thus the final output will be correct.
We assume the attacker knows the correct signature, and they can rerun the algorithm with the same inputs, inject a fault, and observe the final output. To recover the value of $d_j$, the attacker fixes an $i_0$, estimates the time at which $i$ is less than $i_0$ in iteration $j$ of the outer loop, and injects a fault in $R_{0i_0}$. If the signature is faulty, then $d_j = 0$, and if the signature is correct, then $d_j = 1$. We note that the computation times for one loop starting from line 6 and one loop starting from line 17 are similar since they both involve two multiplications and one modular reduction. The attack can be repeated for different bits of $d$ to recover the full private key.
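The following Python sketch simulates this attack on a word-level model of Algorithm 5.8. It rests on assumptions that go beyond the text: registers are modelled as lists of $\omega$-bit words, the fault is an XOR with a chosen mask, it is injected right after word $i_0$ of $R_0$ has been read in the first inner loop (which has the same effect as injecting it at any later time with $i < i_0$), and all function names are hypothetical.

```python
def to_words(x, kappa, w):
    """Split x into kappa words of w bits, least significant word first."""
    return [(x >> (w * i)) & ((1 << w) - 1) for i in range(kappa)]

def value(words, w):
    """Integer held by a list of w-bit words."""
    return sum(v << (w * i) for i, v in enumerate(words))

def ladder(m, d, n, ld, w, fault=None):
    """Algorithm 5.8 on word registers; fault = (j, i0, mask) hits word i0 of R0 in iteration j."""
    kappa = -(-n.bit_length() // w)                      # ceil(l_n / w)
    R0, R1 = to_words(1, kappa, w), to_words(m, kappa, w)
    for j in range(ld - 1, -1, -1):
        bit = (d >> j) & 1
        acc = 0                                          # first product: R0 * R1 mod n
        for i in range(kappa - 1, -1, -1):
            acc = ((acc << w) + R0[i] * value(R1, w)) % n
            if fault is not None and fault[:2] == (j, i):
                R0[i] ^= fault[2]                        # word i0 was already read
        prod = to_words(acc, kappa, w)
        if bit == 0:
            R1, src = prod, R0                           # line 9; the faulty R0 is squared next
        else:
            R0, src = prod, R1                           # line 20 overwrites the faulty word
        acc = 0                                          # second product: square of src
        for i in range(kappa - 1, -1, -1):
            acc = ((acc << w) + src[i] * value(src, w)) % n
        if bit == 0:
            R0 = to_words(acc, kappa, w)                 # line 14
        else:
            R1 = to_words(acc, kappa, w)                 # line 25
    return value(R0, w)

def safe_error_attack(m, d, n, ld, w, i0=1, mask=1):
    """Recover d bit by bit by checking whether each fault changes the output."""
    correct = ladder(m, d, n, ld, w)
    recovered = 0
    for j in range(ld):
        if ladder(m, d, n, ld, w, fault=(j, i0, mask)) == correct:
            recovered |= 1 << j                          # safe error: d_j = 1
    return recovered

print(safe_error_attack(2, 3, 15, 2, 2), safe_error_attack(2, 2, 15, 2, 2))   # prints: 3 2
```

The simulation uses $d$ only to play the role of the device; the attacker's decision is based solely on comparing outputs.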
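Example 5.3.5 Let us repeat the computations in Examples 5.3.3 and 3.5.1 with Algorithm 5.8. We have

\[ p = 3, \quad q = 5, \quad n = 15, \quad \varphi(n) = 2 \times 4 = 8, \quad d = 3 = 11_2, \quad m = 2. \]

And $\ell_n = 4$, $\ell_d = 2$. Suppose $\omega = 2$, then

\[ \kappa = \left\lceil \frac{\ell_n}{\omega} \right\rceil = \left\lceil \frac{4}{2} \right\rceil = 2. \]

With Algorithm 5.8, lines 1 and 2 give

\[ R_0 = 1, \quad R_{00} = 01, \quad R_{01} = 00, \qquad R_1 = 2, \quad R_{10} = 10, \quad R_{11} = 00. \]

The intermediate values are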

Algorithm 5.8: RSA signature computation with Montgomery powering ladder and Blakley's method
Input: n, m, d // n is the RSA modulus of bit length ℓ_n; m is the hash value of the message; d is the private key of bit length ℓ_d
Output: m^d mod n
1 R_0 = 1
2 R_1 = m
3 for j = ℓ_d − 1, j ≥ 0, j−− do
4     if d_j = 0 then
          // lines 5 - 9 implement R_1 = R_0 R_1 mod n
5         R = 0
6         for i = κ − 1, i ≥ 0, i−− do
              // κ = ⌈ℓ_n/ω⌉, where ω is the word size of the computer
7             R = 2^ω R + R_0i R_1
8             R = R mod n
9         R_1 = R
          // lines 10 - 14 implement R_0 = R_0^2 mod n
10        R = 0
11        for i = κ − 1, i ≥ 0, i−− do
12            R = 2^ω R + R_0i R_0
13            R = R mod n
14        R_0 = R
15    else
          // lines 16 - 20 implement R_0 = R_0 R_1 mod n
16        R = 0
17        for i = κ − 1, i ≥ 0, i−− do
18            R = 2^ω R + R_0i R_1
19            R = R mod n
20        R_0 = R
          // lines 21 - 25 implement R_1 = R_1^2 mod n
21        R = 0
22        for i = κ − 1, i ≥ 0, i−− do
23            R = 2^ω R + R_1i R_1
24            R = R mod n
25        R_1 = R
26 return R_0

j = 1, d_1 = 1
  loop at line 17   i = 1   R = 2^ω R + R_01 R_1 mod n = 0
                    i = 0   R = 2^ω R + R_00 R_1 mod n = 2 mod 15 = 2
  line 20           R_0 = 2, R_00 = 10, R_01 = 00
  loop at line 22   i = 1   R = 2^ω R + R_11 R_1 mod n = 0
                    i = 0   R = 2^ω R + R_10 R_1 mod n = 2 × 2 mod 15 = 4
  line 25           R_1 = 4, R_10 = 00, R_11 = 01
j = 0, d_0 = 1
  loop at line 17   i = 1   R = 2^ω R + R_01 R_1 mod n = 0
                    i = 0   R = 2^ω R + R_00 R_1 mod n = 2 × 4 mod 15 = 8
  line 20           R_0 = 8, R_00 = 00, R_01 = 10

Hence the output is 8.
Suppose the attacker would like to find out the value of $d_0$. They estimate the time for $j = 0$ in the outer loop and $i = 0$ in the loop starting from either line 6 or line 17. Then they inject a fault into $R_{01}$ at this time. We note that, for $j = 0$, $R_{01}$ is used at $i = 1$ of the loop starting from line 17, i.e., before $i = 0$, and is assigned a new value in line 20. Thus the computations are not affected, and the signature is correct. The attacker can conclude that $d_0 = 1$.
Example 5.3.6 Let $d = 2 = 10_2$, and keep the other parameters the same as in Example 5.3.5. Then

\[ s = m^d \bmod n = 2^2 \bmod 15 = 4. \]

With Algorithm 5.8, lines 1 and 2 give

\[ R_0 = 1, \quad R_{00} = 01, \quad R_{01} = 00, \qquad R_1 = 2, \quad R_{10} = 10, \quad R_{11} = 00. \]

The intermediate values are

j = 1, d_1 = 1
  loop at line 17   i = 1   R = 2^ω R + R_01 R_1 mod n = 0
                    i = 0   R = 2^ω R + R_00 R_1 mod n = 2 mod 15 = 2
  line 20           R_0 = 2, R_00 = 10, R_01 = 00
  loop at line 22   i = 1   R = 2^ω R + R_11 R_1 mod n = 0
                    i = 0   R = 2^ω R + R_10 R_1 mod n = 2 × 2 mod 15 = 4
  line 25           R_1 = 4, R_10 = 00, R_11 = 01
j = 0, d_0 = 0
  loop at line 6    i = 1   R = 2^ω R + R_01 R_1 mod n = 0
                    i = 0   R = 2^ω R + R_00 R_1 mod n = 2 × 4 mod 15 = 8
  line 9            R_1 = 8, R_10 = 00, R_11 = 10
  loop at line 11   i = 1   R = 2^ω R + R_01 R_0 mod n = 0
                    i = 0   R = 2^ω R + R_00 R_0 mod n = 2 × 2 mod 15 = 4
  line 14           R_0 = 4

Hence the output is 4.


Suppose the attacker would like to find out the value of $d_0$. They estimate the time for $j = 0$ in the outer loop and $i = 0$ in the loop starting from either line 6 or line 17. Then they inject a fault into $R_{01}$ at this time. Suppose the faulty $R_{01}$ has value 01, so that the registers of $R_0$ now hold $0110_2 = 6$. The intermediate values will be as follows:

j = 1, d_1 = 1
  loop at line 17   i = 1   R = 0
                    i = 0   R = R_00 R_1 mod n = 2 mod 15 = 2
  line 20           R_0 = 2, R_00 = 10, R_01 = 00
  loop at line 22   i = 1   R = 0
                    i = 0   R = R_10 R_1 mod n = 2 × 2 mod 15 = 4
  line 25           R_1 = 4, R_10 = 00, R_11 = 01
j = 0, d_0 = 0
  loop at line 6    i = 1   R = 2^ω R + R_01 R_1 mod n = 0
                    i = 0   R = 2^ω R + R_00 R_1 mod n = 2 × 4 mod 15 = 8
  line 9            R_1 = 8, R_10 = 00, R_11 = 10
  loop at line 11   i = 1   R = 2^ω R + R_01 R_0 mod n = 1 × 6 mod 15 = 6
                    i = 0   R = 2^ω R + R_00 R_0 mod n = 2^2 × 6 + 2 × 6 mod 15 = 6
  line 14           R_0 = 6

Here the loop starting from line 6 reads $R_{01}$ at $i = 1$, before the fault injection, whereas the loop starting from line 11 reads the faulty value of $R_{01}$. Thus the final result is changed (6 instead of 4), and the attacker can conclude $d_0 = 0$.

5.3.4.2 Safe Error Attack on the Square and Multiply Algorithm

Before detailing the safe error attack on the square and multiply algorithm, we first consider a fault attack on Algorithm 5.9, where Blakley's method (Algorithm 3.11) is used for computing modular multiplication.
Let $a, b \in \mathbb{Z}_n$ be two integers. Since $\ell_n$ is the bit length of $n$, the bit length of $a$ is at most $\ell_n$. Recall that $\kappa = \lceil \ell_n / \omega \rceil$. We can store $a$ in $\kappa$ registers, each containing one $a_i$, where (see also Eq. 3.22)

\[ a = \sum_{i=0}^{\kappa - 1} a_i (2^\omega)^i. \tag{5.25} \]

We assume the attacker knows the correct output for a pair of $a$ and $b$, and they can rerun the algorithm with the same input, inject a fault, and observe the output. Suppose $c = 1$, and a fault is injected during the loop starting from line 3 in the register containing $a_{i_0}$ ($0 \le i_0 \le \kappa - 1$), when $i < i_0$. In this case, the fault in $a_{i_0}$ will not affect the output since $a_{i_0}$ is used when $i$ is equal to $i_0$, and $a$ is overwritten by $R$ in line 6. On the other hand, if $c = 0$ and a fault is injected in the register containing $a_{i_0}$ during the computation, then the final result will be faulty since the faulty value in $a$ will be returned.
Now, if the attacker does not know the value of $c$ and would like to recover it by fault injection attacks, they can assume that $c = 1$ and that the loop in line 3 is executed. Then they inject a fault in $a_{i_0}$ at the time when $i$ is less than $i_0$. Finally, they compare
Algorithm 5.9: An algorithm involving computing modular multiplication with Blakley's method
Input: n, a, b, c // n ∈ Z, n ≥ 2 has bit length ℓ_n; a, b ∈ Z_n; c = 0, 1
Output: ab mod n if c = 1 and a otherwise
1 if c = 1 then
2     R = 0
      // κ = ⌈ℓ_n/ω⌉, where ω is the computer's word size
3     for i = κ − 1, i ≥ 0, i−− do
4         R = 2^ω R + a_i b
5         R = R mod n
6     a = R
7 return a

the output with the correct one and recover the value of $c$: if the output is correct, $c = 1$; otherwise $c = 0$.
The same attack idea can be applied to the square and multiply algorithm to
recover the secret key. With the right-to-left square and multiply algorithm and
Blakley's method, the signature $s = m^d \bmod n$ is computed with Algorithm 5.10
(see Algorithms 3.13 and 5.7). Since $\ell_n$ is the bit length of n, the bit lengths of the
variables s and t are at most $\ell_n$. We can write

\[ s = \sum_{j=0}^{\kappa-1} s_j (2^{\omega})^j, \qquad t = \sum_{j=0}^{\kappa-1} t_j (2^{\omega})^j. \]

Then, in Algorithm 5.10, lines 5–9 implement $s = s * t \bmod n$ (line 5 of
Algorithm 5.7), and lines 10–14 implement $t = t * t \bmod n$ (line 6 of Algorithm 5.7).
Similar to before, we consider fault injections in the variables in Algorithm 5.10
at a certain time.
Suppose $d_i = 1$, and a fault is injected during the ith iteration of the outer loop,
at the time when j is less than $j_0$ during the loop starting from line 6, in the
register containing $s_{j_0}$, where $0 \le j_0 \le \kappa - 1$. The fault in $s_{j_0}$ will not affect the
output since $s_{j_0}$ is used when j is equal to $j_0$, and the value in s is replaced by R in
line 9.
Suppose $d_i = 0$, and a fault is injected during the ith iteration of the outer loop
in the register containing $s_{j_0}$ ($0 \le j_0 \le \kappa - 1$); then the value in s will be changed,
and the final result will be different.
From these observations, similarly to the attack on Algorithm 5.9, the attacker
first assumes $d_i = 1$ and injects a fault in $s_{j_0}$ at the time corresponding to $j < j_0$.
If the final result is not changed, the attacker can conclude that $d_i = 1$; otherwise,
$d_i = 0$. The attacker can then repeat the attack for different values of i to recover
the entire private key.
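The bit-by-bit recovery can be simulated as follows. This is a minimal sketch, not a model of real hardware: the helper names, the fault model (a single bit flip in word $s_{j_0}$ injected once the inner loop has moved past index $j_0$), and the assumption that the attacker knows the bit length of d are introduced here for illustration only. The squaring in lines 10–14 is done without the word-level detail since the fault never targets t.

```python
def words_of(x, omega, kappa):
    mask = (1 << omega) - 1
    return [(x >> (omega * j)) & mask for j in range(kappa)]


def sign_alg_5_10(n, m, d, omega, fault=None):
    """Algorithm 5.10; fault = (i, j0) flips the lowest bit of word s_{j0}
    during outer iteration i, at the time the inner index j0 - 1 would be
    processed (a hypothetical fault model, j0 >= 1)."""
    kappa = -(-n.bit_length() // omega)
    s, t = words_of(1, omega, kappa), m
    for i in range(d.bit_length()):
        hit = fault is not None and fault[0] == i
        if (d >> i) & 1:                          # line 4
            R = 0
            for j in range(kappa - 1, -1, -1):    # lines 6-8
                if hit and j == fault[1] - 1:
                    s[fault[1]] ^= 1              # s_{j0} was already used
                R = ((R << omega) + s[j] * t) % n
            s = words_of(R, omega, kappa)         # line 9 overwrites the fault
        elif hit:
            s[fault[1]] ^= 1                      # s is idle, the fault stays
        t = (t * t) % n                           # lines 10-14, word level omitted
    return sum(w << (omega * j) for j, w in enumerate(s))


def recover_d(oracle, ell_d, j0=1):
    """oracle(fault) returns the (possibly faulty) signature of a fixed m."""
    correct = oracle(None)
    d = 0
    for i in range(ell_d):
        if oracle((i, j0)) == correct:            # safe error, so d_i = 1
            d |= 1 << i
    return d


n, m, omega = 15, 2, 2
for secret_d in (3, 2):
    oracle = lambda fault, d=secret_d: sign_alg_5_10(n, m, d, omega, fault)
    print(secret_d, recover_d(oracle, ell_d=2))
```

With these toy parameters the recovered value matches the secret exponent in both runs. The simulation assumes $\gcd(m, n) = 1$ and an odd n, so that a changed s always leads to a changed signature.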



Algorithm 5.10: RSA signature signing computation with the right-to-left square and multiply algorithm and Blakley's method
  Input: n, m, d   // n is the RSA modulus of bit length ℓ_n; m is the hash value of the message; d is the private key of bit length ℓ_d
  Output: m^d mod n
   1 s = 1
   2 t = m
   3 for i = 0, i < ℓ_d, i++ do
         // ith bit of d is 1
   4     if d_i = 1 then
             // lines 5 - 9 implement s = s ∗ t mod n
   5         R = 0
             // κ = ⌈ℓ_n/ω⌉, where ω is the computer's word size
   6         for j = κ − 1, j ≥ 0, j−− do
   7             R = 2^ω R + s_j t
   8             R = R mod n
   9         s = R
         // lines 10 - 14 implement t = t ∗ t mod n
  10     R = 0
  11     for j = κ − 1, j ≥ 0, j−− do
  12         R = 2^ω R + t_j t
  13         R = R mod n
  14     t = R
  15 return s

Similar techniques can also be applied to attack the left-to-right square and
multiply algorithm with Blakley's method. We refer the interested reader to [YJ00].
Example 5.3.7 Let us repeat the computations in Example 5.3.5 with Algorithm 5.10.
We have

\[ p = 3,\ q = 5,\ n = 15,\ d = 3 = 11_2,\ m = 2,\ \ell_n = 4,\ \ell_d = 2,\ \omega = 2,\ \kappa = 2. \]

With Algorithm 5.10, lines 1 and 2 give

\[ s = 1,\ s_0 = 01,\ s_1 = 00, \qquad t = 2,\ t_0 = 10,\ t_1 = 00. \]

The intermediate values during the computation are

i = 0   d_0 = 1
    loop line 6     j = 1    R = 2^ω R + s_1 t mod n = 0
                    j = 0    R = 2^ω R + s_0 t mod n = 0 + 2 mod 15 = 2
    line 9          s = 2    s_0 = 10, s_1 = 00
    loop line 11    j = 1    R = 2^ω R + t_1 t mod n = 0
                    j = 0    R = 2^ω R + t_0 t mod n = 0 + 2 × 2 mod 15 = 4
    line 14         t = 4    t_0 = 00, t_1 = 01
i = 1   d_1 = 1
    loop line 6     j = 1    R = 2^ω R + s_1 t mod n = 0
                    j = 0    R = 2^ω R + s_0 t mod n = 0 + 2 × 4 mod 15 = 8
    line 9          s = 8

Hence the correct output is 8.


Suppose the attacker would like to find out the value of $d_0$. They make the guess that
$d_0 = 1$ and inject faults into $s_1$ when $i = 0$ for the outer loop and $j = 0$ in the loop
starting from line 6. We note that $s_1$ is used at $j = 1$, that is, before $j = 0$, and is
reassigned in line 9 (see the rows for $i = 0$ above). Thus the computations
are not affected, and the final result is unchanged. The attacker can conclude that $d_0 = 1$.
Example 5.3.8 Let $d = 2 = 10_2$, and keep the rest of the parameters as in
Example 5.3.7. Then

\[ s = m^d \bmod n = 2^2 \bmod 15 = 4. \]

With Algorithm 5.10, lines 1 and 2 give

\[ s = 1,\ s_0 = 01,\ s_1 = 00, \qquad t = 2,\ t_0 = 10,\ t_1 = 00. \]

And the intermediate values are

i = 0   d_0 = 0
    loop line 11    j = 1    R = 2^ω R + t_1 t mod n = 0
                    j = 0    R = 2^ω R + t_0 t mod n = 0 + 2 × 2 mod 15 = 4
    line 14         t = 4    t_0 = 00, t_1 = 01
i = 1   d_1 = 1
    loop line 6     j = 1    R = 2^ω R + s_1 t mod n = 0
                    j = 0    R = 2^ω R + s_0 t mod n = 0 + 1 × 4 mod 15 = 4
    line 9          s = 4

Hence the correct output is 4.


Now we consider an attacker who would like to recover the value of $d_0$. They make
the guess that $d_0 = 1$, estimate the time for $i = 0$ in the outer loop and $j = 0$ in the
loop starting from line 6, and inject faults into $s_1$ at this point of time. Since $d_0 = 0$, s is
not used in the iteration for $i = 0$. We can assume the fault is injected before the start of
the next iteration as the computation time for lines 10–14 is similar to that for lines 5–9.
Suppose the faulty $s_1$ has a value 01. The intermediate values will be as follows:

i = 0   d_0 = 0
    loop line 11    j = 1    R = 2^ω R + t_1 t mod n = 0
                    j = 0    R = 2^ω R + t_0 t mod n = 0 + 2 × 2 mod 15 = 4
    line 14         t = 4    t_0 = 00, t_1 = 01
i = 1   d_1 = 1
    loop line 6     j = 1    R = 2^ω R + s_1 t mod n = 0 + 1 × 4 mod 15 = 4
                    j = 0    R = 2^ω R + s_0 t mod n = 2^2 × 4 + 1 × 4 mod 15 = 5
    line 9          s = 5

where the value $s_1 = 01$ used at $i = 1$, $j = 1$ is the faulty $s_1$. The final result is changed
from 4 to 5, and the attacker can conclude $d_0 = 0$.

5.4 Fault Countermeasures for RSA and RSA Signatures

In this section, we will discuss a few countermeasures for the attacks presented in
Sect. 5.3. We keep the same notations as before. p and q are two distinct odd primes,
and $n = pq$. $d \in \mathbb{Z}^*_{\varphi(n)}$ is the private key for RSA signatures and $e = d^{-1} \bmod \varphi(n)$.
n has bit length $\ell_n$. d has bit length $\ell_d$ with the following binary representation (see
Theorem 1.1.1):

\[ d = \sum_{i=0}^{\ell_d - 1} d_i 2^i. \]

m denotes the hash value of the message. $s = m^d \bmod n$ is the corresponding
signature.
CRT-based implementation of RSA signatures computes

\[ s_p := m^{d \bmod (p-1)} \bmod p, \qquad s_q := m^{d \bmod (q-1)} \bmod q, \tag{5.26} \]

and s is given by Gauss's algorithm

\[ s = s_p y_q q + s_q y_p p \bmod n \]

or by Garner's algorithm

\[ s = s_p + ((s_q - s_p) y_p \bmod q)\, p, \]

where

\[ y_q = q^{-1} \bmod p, \qquad y_p = p^{-1} \bmod q. \tag{5.27} \]

5.4.1 Shamir’s Countermeasure

A simple countermeasure proposed by A. Shamir [Sha97] for the Bellcore attack
(Sect. 5.3.1) is to use an extended modulus. More specifically, let r be a random $\ell_r$-bit
prime number. Typically $\ell_r = 32$ [KQ07]. Instead of computing $s_p$ and $s_q$ as given
in Eq. 5.26, we compute

\[ s_p^* = m^{d \bmod (p-1)(r-1)} \bmod pr, \qquad s_q^* = m^{d \bmod (q-1)(r-1)} \bmod qr. \tag{5.28} \]

Then we check if

\[ s_p^* \equiv s_q^* \bmod r. \tag{5.29} \]

If yes, the signature s is given by

\[ s = s_p^* y_q q + s_q^* y_p p \bmod n. \tag{5.30} \]

Firstly, we note that when there is no fault, by Eq. 5.28,

\[ s_p^* \equiv m^{d \bmod (p-1)(r-1)} \bmod p. \]

Let

\[ a = d \bmod (p-1)(r-1), \]

then we can write

\[ d = a + b(p-1)(r-1) \]

for some integer b. We have

\[ d \equiv a \equiv (d \bmod (p-1)(r-1)) \bmod (p-1). \]

By Corollary 1.4.3,

\[ s_p^* \equiv m^{d \bmod (p-1)} \bmod p. \tag{5.31} \]

Hence,

\[ s_p^* \equiv m^{d \bmod (p-1)} \equiv s_p \bmod p. \]

Similarly,

\[ s_q^* \equiv m^{d \bmod (q-1)} \equiv s_q \bmod q. \]

Consequently, s given by Eq. 5.30 satisfies

\[ s \equiv s_p^* y_q q \equiv s_p^* \equiv s_p \bmod p, \qquad s \equiv s_q^* y_p p \equiv s_q^* \equiv s_q \bmod q \]

and is indeed the signature $m^d \bmod n$.


Furthermore, since r is prime, by a similar argument for Eq. 5.31, we have

\[ s_p^* \equiv m^{d \bmod (r-1)} \bmod r, \qquad s_q^* \equiv m^{d \bmod (r-1)} \bmod r, \]

which gives

\[ s_p^* \equiv s_q^* \bmod r. \]

Suppose the Bellcore attack is to be carried out, and a malicious fault is injected
during the computation (Eq. 5.28) of $s_p^*$ or $s_q^*$, but not both. Without loss of
generality, let us assume $s_p^*$ is faulty and $s_q^*$ is computed correctly. Let $s_p^{*\prime}$ denote
the faulty $s_p^*$. The fault will be detected if

\[ s_p^{*\prime} \not\equiv s_q^* \bmod r, \]

which means the probability of injecting an undetectable fault is the probability of
producing $s_p^{*\prime}$ such that

\[ s_p^{*\prime} \equiv s_q^* \bmod r. \tag{5.32} \]

If we assume the fault is injected so that the resulting value of $s_p^{*\prime}$ is random and
follows a uniform distribution in $\mathbb{Z}_{pr}$, then the probability for $s_p^{*\prime}$ to satisfy Eq. 5.32
is $1/r$. Thus, with Shamir's countermeasure, the Bellcore attack will be successful
with probability $1/r$. When the bit length of r is around 32 bits, this probability is
about $2^{-32}$.
Example 5.4.1 Let us compute the signature from Example 5.3.1 with Shamir's
countermeasure. We have

\[ p = 5,\ q = 7,\ n = 35,\ d = 5,\ m = 6. \]

Suppose $r = 3$. By Eq. 5.28,

\[ s_p^* = m^{d \bmod (p-1)(r-1)} \bmod pr = 6^{5 \bmod (4 \times 2)} \bmod 15 = 6^5 \bmod 15 = 6, \]
\[ s_q^* = m^{d \bmod (q-1)(r-1)} \bmod qr = 6^{5 \bmod (6 \times 2)} \bmod 21 = 6^5 \bmod 21 = 6. \]

We can check that

\[ s_p^* \equiv s_q^* \equiv 0 \bmod 3. \]

We have shown in Example 3.5.8 that $y_q = 3$ and $y_p = 3$. By Eq. 5.30, the signature is
given by

\[ s = s_p^* y_q q + s_q^* y_p p \bmod n = 6 \times 3 \times 7 + 6 \times 3 \times 5 \bmod 35 = 6, \]

which agrees with the computations in Example 5.3.1.
Suppose an error occurred during the computation of $s_p^*$ and the faulty value $s_p^{*\prime} = 4$.
Then we would have

\[ s_p^{*\prime} \not\equiv s_q^* \bmod r. \]

However, in case $s_p^{*\prime} = 9$, we have

\[ s_p^{*\prime} \equiv s_q^* \equiv 0 \bmod 3, \]

and the faulty signature will be

\[ s' = s_p^{*\prime} y_q q + s_q^* y_p p \bmod n = 9 \times 3 \times 7 + 6 \times 3 \times 5 \bmod 35 = 34. \]

In this case, the attacker can repeat the Bellcore attack by computing

\[ q = \gcd(s' - s, n) = \gcd(34 - 6, 35) = \gcd(28, 35) = 7. \]
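The numbers of Example 5.4.1 can be reproduced with a short Python sketch of Eqs. 5.28–5.30. The function name and the `fault_sp` parameter (which simply replaces $s_p^*$ by a chosen faulty value) are hypothetical conveniences; `pow(x, -1, n)` for the modular inverse needs Python 3.8 or later.

```python
from math import gcd


def shamir_crt_sign(m, d, p, q, r, fault_sp=None):
    """CRT-RSA signing with Shamir's check (Eqs. 5.28-5.30).

    fault_sp, if given, replaces s_p^* to model a fault (assumption).
    Returns None when the check of Eq. 5.29 fails.
    """
    sp_star = pow(m, d % ((p - 1) * (r - 1)), p * r)
    sq_star = pow(m, d % ((q - 1) * (r - 1)), q * r)
    if fault_sp is not None:
        sp_star = fault_sp
    if sp_star % r != sq_star % r:          # Eq. 5.29
        return None
    yq, yp = pow(q, -1, p), pow(p, -1, q)   # Eq. 5.27
    return (sp_star * yq * q + sq_star * yp * p) % (p * q)   # Eq. 5.30


p, q, d, m, r = 5, 7, 5, 6, 3
s = shamir_crt_sign(m, d, p, q, r)
print(s)                                         # correct signature
print(shamir_crt_sign(m, d, p, q, r, fault_sp=4))  # detected fault
s_faulty = shamir_crt_sign(m, d, p, q, r, fault_sp=9)
print(s_faulty, gcd(s_faulty - s, p * q))        # undetected fault leaks q
```

With a toy value such as r = 3, one faulty value in three passes the check, which is why r is typically chosen with about 32 bits.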

Example 5.4.2 Let us compute the signature from Example 5.3.2 by Shamir's
countermeasure. We have

\[ p = 11,\ q = 13,\ n = 143,\ d = 11,\ m = 2. \]

Suppose $r = 5$. By Eq. 5.28,

\[ s_p^* = m^{d \bmod (p-1)(r-1)} \bmod pr = 2^{11} \bmod 55 = 13, \]
\[ s_q^* = m^{d \bmod (q-1)(r-1)} \bmod qr = 2^{11} \bmod 65 = 33. \]

We can check that

\[ s_p^* \equiv s_q^* \equiv 3 \bmod 5. \]

We have shown in Example 5.3.2 that $y_q = 6$ and $y_p = 6$. By Eq. 5.30, the signature is
given by

\[ s = s_p^* y_q q + s_q^* y_p p \bmod n = 13 \times 6 \times 13 + 33 \times 6 \times 11 \bmod 143 = 46, \]

which agrees with the computations in Example 5.3.2.
Suppose an error occurred during the computation of $s_p^*$ and the faulty value $s_p^{*\prime} = 10$.
Then we would have

\[ s_p^{*\prime} \equiv 0 \not\equiv s_q^* \bmod r. \]

However, in case $s_p^{*\prime} = 3$, we have

\[ s_p^{*\prime} \equiv s_q^* \equiv 3 \bmod 5, \]

and the faulty signature will be

\[ s' = s_p^{*\prime} y_q q + s_q^* y_p p \bmod n = 3 \times 6 \times 13 + 33 \times 6 \times 11 \bmod 143 = 124. \]

In this case, the attacker can repeat the Bellcore attack by computing

\[ q = \gcd(s' - s, n) = \gcd(124 - 46, 143) = \gcd(78, 143). \]

By the Euclidean algorithm,

\[
\begin{aligned}
143 &= 78 \times 1 + 65, & \gcd(78, 143) &= \gcd(78, 65),\\
78 &= 65 \times 1 + 13, & \gcd(78, 65) &= \gcd(65, 13),\\
65 &= 13 \times 5, & q = \gcd(65, 13) &= 13.
\end{aligned}
\]

5.4.2 Infective Countermeasure

Although Shamir's countermeasure can effectively protect RSA signature computations
against the Bellcore attack (Sect. 5.3.1), a simple improved attack is to bypass
the check of Eq. 5.29 using an instruction skip.
In this subsection, we will discuss a more sophisticated countermeasure against
the Bellcore attack, an infective countermeasure, proposed by Sung-Ming et al.
[SMKLM02]. The main goal of the countermeasure is to make $s_q$ (Eq. 5.26)
faulty if $s_p$ is faulty, hence the name "infective." We have discussed an infective
countermeasure for AES in Sect. 5.2.2. We remark that the infective countermeasure
was first proposed for RSA signatures.
As before, let p and q be distinct odd primes, $n = pq$. d is the private
key for RSA signatures, $e = d^{-1} \bmod \varphi(n)$. m is the hash value of the message.

Recall that

\[ y_q = q^{-1} \bmod p, \qquad y_p = p^{-1} \bmod q. \]

We select a random integer r such that $\gcd(d_r, \varphi(n)) = 1$ and $e_r$ is a small integer,
where

\[ d_r = d - r, \]

and

\[ e_r = d_r^{-1} \bmod \varphi(n). \tag{5.33} \]

Let

\[ k_p = \left\lfloor \frac{m}{p} \right\rfloor, \qquad k_q = \left\lfloor \frac{m}{q} \right\rfloor. \]

The signature s is then computed using Eqs. 5.34–5.39.

\[ s_p = m^{d_r} \bmod p, \tag{5.34} \]
\[ \hat{m} = ((s_p^{e_r} \bmod p) + k_p p) \bmod q, \tag{5.35} \]
\[ s_q = \hat{m}^{d_r} \bmod q, \tag{5.36} \]
\[ s_{d_r} = s_p y_q q + s_q y_p p \bmod n, \tag{5.37} \]
\[ \tilde{m} = (s_q^{e_r} \bmod q) + k_q q, \tag{5.38} \]
\[ s = s_{d_r} \tilde{m}^{r} \bmod n. \tag{5.39} \]

In Lemma 5.4.1, we will show that the signature s computed above is indeed
equal to the signature given by $m^d \bmod n$.
Lemma 5.4.1

\[ s \equiv m^d \bmod n. \tag{5.40} \]

Proof By definition of $e_r$ (Eq. 5.33),

\[ d_r e_r \equiv 1 \bmod \varphi(n). \]

Since $\varphi(n) = (p-1)(q-1)$, we have

\[ d_r e_r \equiv 1 \bmod (p-1). \]

By Corollary 1.4.3,

\[ m^{d_r e_r} \bmod p = m \bmod p. \]

Furthermore,

\[ \left\lfloor \frac{m}{p} \right\rfloor p = m - (m \bmod p). \]

Hence,

\[ (m^{d_r e_r} \bmod p) + \left\lfloor \frac{m}{p} \right\rfloor p = m. \tag{5.41} \]

By Eqs. 5.34 and 5.35, we have

\[ \hat{m} = m \bmod q. \]

By Eq. 5.36,

\[ s_q \equiv m^{d_r} \bmod q. \]

Together with Eq. 5.34, it follows from the Chinese Remainder Theorem (see Theorem 1.4.7
and Example 1.4.19) that

\[ s_{d_r} \equiv m^{d_r} \bmod n. \]

Following a similar argument that leads to Eq. 5.41, we can show

\[ \tilde{m} = (s_q^{e_r} \bmod q) + k_q q = (m^{d_r e_r} \bmod q) + \left\lfloor \frac{m}{q} \right\rfloor q = m. \]

Finally, by Eq. 5.39,

\[ s = s_{d_r} \tilde{m}^{r} \bmod n = m^{d_r} m^{r} \bmod n = m^{d - r + r} \bmod n = m^d \bmod n. \qquad \square \]



Next, we show that the Bellcore attack cannot succeed if s is calculated using
Eqs. 5.34–5.39.
Proposition 5.4.1 Suppose $p < q$. If $s_p$ is faulty, then $s_q$ is also faulty.
Proof Let $s_p'$ denote the faulty value of $s_p$, then $s_p \ne s_p'$. By Corollary 1.4.4,

\[ s_p^{e_r} \not\equiv s_p'^{\,e_r} \bmod p. \]

Since $p < q$,

\[ (s_p^{e_r} \bmod p) \bmod q \ne (s_p'^{\,e_r} \bmod p) \bmod q. \]

By Eq. 5.35, $\hat{m}$ is faulty. Thus $s_q$ is also faulty by Eq. 5.36. □

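Lemma 5.4.2 Suppose $p > q$. The cardinality of the set

\[ \{ (a, b) \mid a, b \in \mathbb{Z}_p,\ a \ne b,\ a \equiv b \bmod q \} \]

is given by

\[ E := 2 (p \bmod q) \left\lfloor \frac{p}{q} \right\rfloor + q \left\lfloor \frac{p}{q} \right\rfloor \left( \left\lfloor \frac{p}{q} \right\rfloor - 1 \right). \]

Proof There are

\[ (p \bmod q) \left( \left\lfloor \frac{p}{q} \right\rfloor + 1 \right) \]

many $a \in \mathbb{Z}_p$ such that

\[ 0 \le a \bmod q \le (p \bmod q) - 1. \]

In this case, there are $\lfloor p/q \rfloor$ values of $b \in \mathbb{Z}_p$, $b \ne a$, such that $b \equiv a \bmod q$.
There are

\[ p - (p \bmod q) \left( \left\lfloor \frac{p}{q} \right\rfloor + 1 \right) = (q - p \bmod q) \left\lfloor \frac{p}{q} \right\rfloor \]

many $a \in \mathbb{Z}_p$ such that

\[ p \bmod q \le a \bmod q \le q - 1. \]

In this case, there are $\lfloor p/q \rfloor - 1$ values of $b \in \mathbb{Z}_p$, $b \ne a$, such that $b \equiv a \bmod q$. We have

\[
\begin{aligned}
E &= (p \bmod q) \left( \left\lfloor \frac{p}{q} \right\rfloor + 1 \right) \left\lfloor \frac{p}{q} \right\rfloor + (q - p \bmod q) \left\lfloor \frac{p}{q} \right\rfloor \left( \left\lfloor \frac{p}{q} \right\rfloor - 1 \right) \\
&= (p \bmod q) \left\lfloor \frac{p}{q} \right\rfloor \left\lfloor \frac{p}{q} \right\rfloor + (p \bmod q) \left\lfloor \frac{p}{q} \right\rfloor + q \left\lfloor \frac{p}{q} \right\rfloor \left( \left\lfloor \frac{p}{q} \right\rfloor - 1 \right) \\
&\quad - (p \bmod q) \left\lfloor \frac{p}{q} \right\rfloor \left\lfloor \frac{p}{q} \right\rfloor + (p \bmod q) \left\lfloor \frac{p}{q} \right\rfloor \\
&= 2 (p \bmod q) \left\lfloor \frac{p}{q} \right\rfloor + q \left\lfloor \frac{p}{q} \right\rfloor \left( \left\lfloor \frac{p}{q} \right\rfloor - 1 \right). \qquad \square
\end{aligned}
\]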




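Example 5.4.3 Let $p = 7$, $q = 5$. There are

\[ (p \bmod q) \left( \left\lfloor \frac{p}{q} \right\rfloor + 1 \right) = 2 \times (1 + 1) = 4 \]

many $a \in \mathbb{Z}_7$ such that

\[ 0 \le a \bmod q \le (p \bmod q) - 1, \quad \text{i.e.,} \quad 0 \le a \bmod 5 \le 1. \]

Those values of a are given by $\{0, 1, 5, 6\}$. In this case, there are

\[ \left\lfloor \frac{p}{q} \right\rfloor = \left\lfloor \frac{7}{5} \right\rfloor = 1 \]

many $b \in \mathbb{Z}_7$, $b \ne a$, such that $b \equiv a \bmod q$. In particular, all possible values of $(a, b)$ are given
by

\[ (0, 5),\ (5, 0),\ (1, 6),\ (6, 1). \]

There are

\[ (q - p \bmod q) \left\lfloor \frac{p}{q} \right\rfloor = (5 - 2) \times 1 = 3 \]

many $a \in \mathbb{Z}_7$ such that

\[ p \bmod q \le a \bmod q \le q - 1, \quad \text{i.e.,} \quad 2 \le a \bmod 5 \le 4. \]

The values of a are given by $\{2, 3, 4\}$. In this case, there are

\[ \left\lfloor \frac{p}{q} \right\rfloor - 1 = 0 \]

many $b \in \mathbb{Z}_7$, $b \ne a$, such that $b \equiv a \bmod q$. For example, there is no other number except for
2 in $\mathbb{Z}_7$ that is congruent to $2 \bmod 5$.
Thus the total number of pairs $(a, b)$ is 4. We can check that

\[ E = 2 (p \bmod q) \left\lfloor \frac{p}{q} \right\rfloor + q \left\lfloor \frac{p}{q} \right\rfloor \left( \left\lfloor \frac{p}{q} \right\rfloor - 1 \right) = 2 \times 2 \times 1 + 5 \times 1 \times 0 = 4. \]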
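The closed form of Lemma 5.4.2 is easy to cross-check by exhaustive counting for small moduli. The following Python sketch (function names are illustrative only) counts the ordered pairs directly and compares the count with the formula for E.

```python
def count_congruent_pairs(p, q):
    """Brute-force count of ordered pairs (a, b) in Z_p x Z_p with
    a != b and a congruent to b modulo q."""
    return sum(1 for a in range(p) for b in range(p)
               if a != b and (a - b) % q == 0)


def E(p, q):
    """Closed form from Lemma 5.4.2 (assumes p > q)."""
    k, rem = divmod(p, q)          # k = floor(p/q), rem = p mod q
    return 2 * rem * k + q * k * (k - 1)


for p, q in [(7, 5), (13, 11), (17, 5)]:
    print(p, q, count_congruent_pairs(p, q), E(p, q))
```

For each pair of moduli the two counts agree; the case (17, 5) also exercises the second term of the formula, which vanishes whenever $\lfloor p/q \rfloor = 1$.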

Proposition 5.4.2 Suppose $p > q$. If $s_p$ is faulty, the probability for $s_q$ to be also faulty
is

\[ 1 - \frac{E}{p(p-1)}. \]

Proof Let $s_p'$ denote the faulty value of $s_p$, then $s_p \ne s_p'$. By Corollary 1.4.4,

\[ s_p^{e_r} \not\equiv s_p'^{\,e_r} \bmod p. \]

There are $p(p-1)$ distinct pairs $(s_p, s_p')$. By Eqs. 5.35 and 5.36 and Lemma 5.4.2, there
are E possible pairs $(s_p, s_p')$ that produce the same $\hat{m}$, hence the same $s_q$.
Thus the probability for $s_q$ to be faulty is

\[ 1 - \frac{E}{p(p-1)}. \qquad \square \]



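We note that, in practice, p is large, and p and q are of similar bit lengths. Then E
will be small compared to $p(p-1)$.
Example 5.4.4 Let $p = 421$, $q = 419$, then

\[ E = 2 (p \bmod q) \left\lfloor \frac{p}{q} \right\rfloor + q \left\lfloor \frac{p}{q} \right\rfloor \left( \left\lfloor \frac{p}{q} \right\rfloor - 1 \right) = 2 \times 2 \times 1 + 419 \times 1 \times 0 = 4 \]

and

\[ 1 - \frac{E}{p(p-1)} = 1 - \frac{4}{421 \times 420} \approx 0.99998. \]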

Proposition 5.4.3 If $s_q$ is faulty and $s_p$ is computed correctly, the attacker cannot
compute $p = \gcd(s_{d_r}'^{\,e_r} - m, n)$ without brute force.
Proof Suppose $s_q$ is faulty, and $s_p$ is computed correctly. Let $s_{d_r}'$, $\tilde{m}'$, $s_q'$, and $s'$ denote
the faulty values of $s_{d_r}$, $\tilde{m}$, $s_q$, and s, respectively.
To carry out the Bellcore attack, the attacker needs to compute

\[ p = \gcd(s_{d_r}'^{\,e_r} - m, n) \]

(the gcd reveals p because $s_{d_r}' \equiv m^{d_r} \bmod p$ is still correct, compare Eq. 5.42).
However, the attacker does not have the knowledge of $s_{d_r}'$. Instead, we can assume that
the attacker knows $s'$. To get $s_{d_r}'$ from $s'$, the attacker needs to compute $\tilde{m}'^{\,r}$.
We note that there are $q - 1$ possible values for $s_q'$. By Corollary 1.4.4 and Eq. 5.38,
there are $q - 1$ possible values for $\tilde{m}'$. And by Corollary 1.4.6, there are $q - 1$ possible
values for $\tilde{m}'^{\,r} \bmod n$. Thus the attacker cannot tell which value in $\mathbb{Z}_n$ is $\tilde{m}'^{\,r} \bmod n$ even
with the knowledge of m and r because of the unknown $\tilde{m}'$. In conclusion, the attacker
needs to brute force all possible values for $\tilde{m}'^{\,r} \bmod n$ in $\mathbb{Z}_n$. □

In summary, the Bellcore attack assumes one of $s_p$ and $s_q$ is faulty, but not both. For
the infective countermeasure, we have shown:
• When $p < q$, if $s_p$ is faulty, $s_q$ will also be faulty.
• When $p > q$, if $s_p$ is faulty, then $s_q$ has a high probability to be faulty.
• If $s_q$ is faulty and $s_p$ is not faulty, the attacker cannot repeat the attack without
  brute force.
Example 5.4.5 Let $p = 3$, $q = 5$, and $m = 3$. Then, $n = 15$, $\varphi(n) = 8$. As discussed
in Example 3.5.5, $y_p = 2$, $y_q = 2$. Suppose $d = 5$.
To compute the signature with the infective countermeasure, choose $r = 2$, and we
have

\[ d_r = d - r = 5 - 2 = 3, \qquad e_r = 3^{-1} \bmod 8 = 3, \]
\[ k_p = \left\lfloor \frac{m}{p} \right\rfloor = \left\lfloor \frac{3}{3} \right\rfloor = 1, \qquad k_q = \left\lfloor \frac{m}{q} \right\rfloor = \left\lfloor \frac{3}{5} \right\rfloor = 0. \]

And

\[
\begin{aligned}
s_p &= m^{d_r} \bmod p = 3^3 \bmod 3 = 0,\\
\hat{m} &= ((s_p^{e_r} \bmod p) + k_p p) \bmod q = (0 + 3) \bmod 5 = 3,\\
s_q &= \hat{m}^{d_r} \bmod q = 3^3 \bmod 5 = 27 \bmod 5 = 2,\\
s_{d_r} &= s_p y_q q + s_q y_p p \bmod n = 0 + 2 \times 2 \times 3 \bmod 15 = 12,\\
\tilde{m} &= (s_q^{e_r} \bmod q) + k_q q = (2^3 \bmod 5) + 0 = 8 \bmod 5 = 3,\\
s &= s_{d_r} \tilde{m}^{r} \bmod n = 12 \times 3^2 \bmod 15 = 108 \bmod 15 = 3.
\end{aligned}
\]

We can verify that

\[ s = m^d \bmod n = 3^5 \bmod 15 = 243 \bmod 15 = 3. \]

If $s_p$ is faulty and $s_p' = 1$, then

\[
\begin{aligned}
\hat{m}' &= ((s_p'^{\,e_r} \bmod p) + k_p p) \bmod q = (1 + 3) \bmod 5 = 4,\\
s_q' &= \hat{m}'^{\,d_r} \bmod q = 4^3 \bmod 5 = 64 \bmod 5 = 4.
\end{aligned}
\]

Thus $s_q$ is also faulty, as has been shown in Proposition 5.4.1.
If $s_q$ is faulty with $s_q' = 1$ and $s_p$ is computed correctly, then

\[ s_{d_r}' = s_p y_q q + s_q' y_p p \bmod n = 0 + 1 \times 2 \times 3 \bmod 15 = 6. \]

We note that

\[ p = \gcd(s_{d_r}'^{\,e_r} - m, n) = \gcd(6^3 - 3, 15) = \gcd(213, 15). \tag{5.42} \]

By the Euclidean algorithm,

\[
\begin{aligned}
213 &= 15 \times 14 + 3, & \gcd(213, 15) &= \gcd(15, 3),\\
15 &= 3 \times 5, & p = \gcd(15, 3) &= 3.
\end{aligned}
\]

However, the attacker does not have the knowledge of $\tilde{m}'^{\,r}$ to get the value of $s_{d_r}'$ from
$s'$. Thus from their point of view, any value in $\mathbb{Z}_n = \mathbb{Z}_{15}$ might be $\tilde{m}'^{\,r}$. And they cannot
compute p as in Eq. 5.42.
Example 5.4.6 Let us compute the signature from Example 5.3.2 with the infective
countermeasure. We have

\[ p = 11,\ q = 13,\ n = 143,\ m = 2,\ \varphi(n) = 120,\ d = 11,\ y_p = 6,\ y_q = 6. \]

Choose $r = 4$, then

\[ d_r = d - r = 11 - 4 = 7. \]

By the extended Euclidean algorithm,

\[ 120 = 7 \times 17 + 1 \implies 1 = 120 - 7 \times 17, \]

hence

\[ e_r = d_r^{-1} \bmod \varphi(n) = -17 \bmod 120 = 103. \]

We also have

\[ k_p = \left\lfloor \frac{m}{p} \right\rfloor = \left\lfloor \frac{2}{11} \right\rfloor = 0, \qquad k_q = \left\lfloor \frac{m}{q} \right\rfloor = \left\lfloor \frac{2}{13} \right\rfloor = 0. \]

And

\[
\begin{aligned}
s_p &= m^{d_r} \bmod p = 2^7 \bmod 11 = 128 \bmod 11 = 7,\\
\hat{m} &= ((s_p^{e_r} \bmod p) + k_p p) \bmod q = (7^{103} \bmod 11 + 0) \bmod 13 = (7^{103 \bmod 10} \bmod 11) \bmod 13\\
&= (7^3 \bmod 11) \bmod 13 = (343 \bmod 11) \bmod 13 = 2,\\
s_q &= \hat{m}^{d_r} \bmod q = 2^7 \bmod 13 = 128 \bmod 13 = 11,\\
s_{d_r} &= s_p y_q q + s_q y_p p \bmod n = 7 \times 6 \times 13 + 11 \times 6 \times 11 \bmod 143 = 128,\\
\tilde{m} &= (s_q^{e_r} \bmod q) + k_q q = (11^{103} \bmod 13) + 0 = 11^{103 \bmod 12} \bmod 13 = 11^7 \bmod 13 = 2,\\
s &= s_{d_r} \tilde{m}^{r} \bmod n = 128 \times 2^4 \bmod 143 = 2048 \bmod 143 = 46.
\end{aligned}
\]

Suppose $s_p$ is faulty and $s_p' = 2$. Then

\[
\begin{aligned}
\hat{m}' &= ((s_p'^{\,e_r} \bmod p) + k_p p) \bmod q = (2^{103} \bmod 11 + 0) \bmod 13 = (2^3 \bmod 11) \bmod 13 = 8,\\
s_q' &= \hat{m}'^{\,d_r} \bmod q = 8^7 \bmod 13 = 5.
\end{aligned}
\]

Thus $s_q'$ is also faulty, as has been shown in Proposition 5.4.1.
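The values in Examples 5.4.5 and 5.4.6 can be checked with a direct transcription of Eqs. 5.33–5.39 into Python. The sketch below is illustrative only: the function name, the returned dictionary, and the `fault_sp`/`fault_sq` parameters (which overwrite $s_p$ or $s_q$ to model a fault) are assumptions, and `pow(x, -1, n)` requires Python 3.8 or later.

```python
def infective_sign(m, d, p, q, r, fault_sp=None, fault_sq=None):
    """Infective CRT-RSA signing following Eqs. 5.33-5.39.

    fault_sp / fault_sq overwrite s_p / s_q to model a fault (assumption).
    Returns all intermediate values for inspection.
    """
    n, phi = p * q, (p - 1) * (q - 1)
    d_r = d - r
    e_r = pow(d_r, -1, phi)                   # Eq. 5.33, needs gcd(d_r, phi) = 1
    yq, yp = pow(q, -1, p), pow(p, -1, q)
    kp, kq = m // p, m // q
    sp = pow(m, d_r, p) if fault_sp is None else fault_sp        # Eq. 5.34
    m_hat = (pow(sp, e_r, p) + kp * p) % q                       # Eq. 5.35
    sq = pow(m_hat, d_r, q) if fault_sq is None else fault_sq    # Eq. 5.36
    s_dr = (sp * yq * q + sq * yp * p) % n                       # Eq. 5.37
    m_tilde = pow(sq, e_r, q) + kq * q                           # Eq. 5.38
    s = (s_dr * pow(m_tilde, r, n)) % n                          # Eq. 5.39
    return {"sp": sp, "m_hat": m_hat, "sq": sq,
            "s_dr": s_dr, "m_tilde": m_tilde, "s": s}


# Example 5.4.5: p = 3, q = 5, m = 3, d = 5, r = 2
print(infective_sign(3, 5, 3, 5, 2))
print(infective_sign(3, 5, 3, 5, 2, fault_sp=1)["sq"])   # infected s_q

# Example 5.4.6: p = 11, q = 13, m = 2, d = 11, r = 4
print(infective_sign(2, 11, 11, 13, 4))
print(infective_sign(2, 11, 11, 13, 4, fault_sp=2)["sq"])
```

In the fault-free runs, `m_tilde` equals m and `s` equals $m^d \bmod n$, as Lemma 5.4.1 states; in the faulty runs, the returned $s_q$ differs from the fault-free one.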


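Example 5.4.7 Now let us assume $p = 13$, $q = 11$. Let $d = 11$ and $r = 4$ as in
Example 5.4.6. We have

\[ n = 143,\ \varphi(n) = 120,\ y_p = 6,\ y_q = 6,\ d_r = 7,\ e_r = 103. \]

Suppose $m = 12$, then $k_p = \lfloor 12/13 \rfloor = 0$, $k_q = \lfloor 12/11 \rfloor = 1$, and

\[
\begin{aligned}
s_p &= m^{d_r} \bmod p = 12^7 \bmod 13 = 12,\\
\hat{m} &= ((s_p^{e_r} \bmod p) + k_p p) \bmod q = (12^{103} \bmod 13 + 0) \bmod 11 = (12^7 \bmod 13) \bmod 11 = 12 \bmod 11 = 1,\\
s_q &= \hat{m}^{d_r} \bmod q = 1^7 \bmod 11 = 1,\\
s_{d_r} &= s_p y_q q + s_q y_p p \bmod n = 12 \times 6 \times 11 + 1 \times 6 \times 13 \bmod 143 = 12,\\
\tilde{m} &= (s_q^{e_r} \bmod q) + k_q q = (1^{103} \bmod 11) + 11 = 12,\\
s &= s_{d_r} \tilde{m}^{r} \bmod n = 12 \times 12^4 \bmod 143 = 12,
\end{aligned}
\]

where the last step uses $12^2 = 144 \equiv 1 \bmod 143$. We can check that

\[ s = m^d \bmod n = 12^{11} \bmod 143 = 12. \]

Suppose $s_p$ is faulty and $s_p' = 2$. Then

\[
\begin{aligned}
\hat{m}' &= ((s_p'^{\,e_r} \bmod p) + k_p p) \bmod q = (2^{103} \bmod 13 + 0) \bmod 11 = (2^7 \bmod 13) \bmod 11 = 11 \bmod 11 = 0,\\
s_q' &= \hat{m}'^{\,d_r} \bmod q = 0^7 \bmod 11 = 0.
\end{aligned}
\]

Thus $s_q'$ is also faulty.
By Lemma 5.4.2,

\[ E = 2 (p \bmod q) \left\lfloor \frac{p}{q} \right\rfloor + q \left\lfloor \frac{p}{q} \right\rfloor \left( \left\lfloor \frac{p}{q} \right\rfloor - 1 \right) = 2 \times 2 \times 1 + 0 = 4. \]

By Proposition 5.4.2, the probability for $s_p$ to be faulty and $s_q$ to be computed correctly
is given by

\[ \frac{E}{p(p-1)} = \frac{4}{13 \times (13 - 1)} = \frac{1}{39} \approx 0.0256. \]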

5.4.3 Countermeasure for Attacks on the Square and Multiply Algorithm

In this subsection, we discuss a simple countermeasure proposed in [JPY01] for the
attacks discussed in Sect. 5.3.2. It follows a similar idea as Shamir's countermeasure
(Sect. 5.4.1) for the Bellcore attack.
First, we choose a small random number r. Compute

\[ y = m^d \bmod r, \qquad z = m^d \bmod nr. \]

If $z \not\equiv y \bmod r$, we conclude that an error has occurred; otherwise, the signature is
given by

\[ s = z \bmod n. \]

By Lemma 1.1.1 (6),

\[ y = m^d \bmod r,\ z = m^d \bmod nr \implies r \mid (y - m^d),\ nr \mid (z - m^d) \implies r \mid (y - z). \]

If there is no error during the computation, we have $z \equiv y \bmod r$.
Furthermore, we note that the probability of an undetected fault is the probability
of

\[ z' \equiv y' \bmod r, \tag{5.43} \]

where $z'$ and $y'$ denote the values of z and y when a fault is present during the
computation. If we assume the fault is random, then the probability of achieving
Eq. 5.43 can be approximated by the probability that two random numbers are
congruent modulo r, which is $1/r$. If r is an integer of bit length 20, the probability
is about $10^{-6}$.
Example 5.4.8 Let us consider the computation from Example 5.3.3. We have

\[ p = 3,\ q = 5,\ n = 15,\ d = 3 = d_1 d_0 = 11_2,\ m = 2. \]

Following the above countermeasure, suppose $r = 3$, and we have

\[ y = m^d \bmod r = 2^3 \bmod 3 = 2, \qquad z = m^d \bmod nr = 2^3 \bmod 45 = 8. \]

We can check that

\[ z \equiv y \equiv 2 \bmod r. \]

And the signature is given by

\[ s = z \bmod n = 8 \bmod 15 = 8. \]

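Now suppose a bit flip on the least significant bit of d, $d_0$, occurs while z is being
computed, so that z is computed with $d' = 2$ whereas y is computed correctly. Then

\[ y = 2, \qquad z' = m^{d'} \bmod nr = 2^2 \bmod 45 = 4. \]

We have

\[ z' \equiv 1 \not\equiv y \bmod r, \]

and the fault is detected. Note that if the same faulty exponent $d'$ were also used for y,
we would get $y' = 2^2 \bmod 3 = 1 \equiv z' \bmod r$, and the check would pass: the check only
catches faults that affect y and z differently.
On the other hand, if the bit flip is on $d_1$ and z is computed with $d' = 1$, then

\[ z' = m^{d'} \bmod nr = 2^1 \bmod 45 = 2. \]

We have

\[ y \equiv z' \equiv 2 \bmod r, \]

so the fault is not detected. And

\[ s' = z' \bmod n = 2 \bmod 15 = 2. \]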

In this case, the attack described in Sect. 5.3.2 can be repeated. In particular, for $i = 1$ the attacker
computes

\[ \frac{s'}{s} = \frac{2}{8} \bmod 15 = 2^{-2} \bmod 15, \qquad m^{2^i} = 2^{2} \bmod 15 \implies \frac{s'}{s} \equiv m^{-2^i} \bmod n. \]

By Eq. 5.21, $d_1 = 1$.
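A minimal Python sketch of this check is given below. The function name and the optional `d_for_y`/`d_for_z` parameters are hypothetical; they let the simulated fault corrupt the exponent used for y, for z, or for both, which is an assumption about the fault model rather than something fixed by [JPY01].

```python
def sign_with_check(m, d, n, r, d_for_y=None, d_for_z=None):
    """Compute y = m^d mod r and z = m^d mod nr, return z mod n if
    z is congruent to y modulo r, and None otherwise.

    d_for_y / d_for_z replace the exponent in one of the two
    exponentiations to model a fault (hypothetical fault model).
    """
    y = pow(m, d if d_for_y is None else d_for_y, r)
    z = pow(m, d if d_for_z is None else d_for_z, n * r)
    if z % r != y:
        return None          # error detected
    return z % n


m, d, n, r = 2, 3, 15, 3
print(sign_with_check(m, d, n, r))                        # fault-free
print(sign_with_check(m, d, n, r, d_for_z=2))             # fault only in z
print(sign_with_check(m, d, n, r, d_for_z=1))             # fault only in z
print(sign_with_check(m, d, n, r, d_for_y=2, d_for_z=2))  # same fault in both
```

When the same corrupted exponent is used in both exponentiations, $z' \bmod r$ and $y'$ are equal by construction, so the sketch returns the faulty value $z' \bmod n$ instead of None.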

5.4.4 Countermeasures Against the Safe Error Attack

We note that a simple countermeasure exists for the safe error attack presented in
Sect. 5.3.4.
We first consider protecting the simple algorithm in Algorithm 5.9. Recall that
$a, b \in \mathbb{Z}_n$, $\ell_n$ is the bit length of n, and the bit lengths of a, b are at most $\ell_n$.

\[ \kappa = \lceil \ell_n / \omega \rceil, \]

where $\omega$ is the word size of the computer. We can store a in $\kappa$ registers, each
containing one $a_i$ and

\[ a = \sum_{i=0}^{\kappa-1} a_i (2^{\omega})^i. \]

Similarly, we can write b as

\[ b = \sum_{i=0}^{\kappa-1} b_i (2^{\omega})^i, \]

where each $b_i$ is stored in one register. Then we can modify Algorithm 5.9 to
Algorithm 5.11. Suppose $c = 1$ and the fault is in $b_{i_0}$ when $i < i_0$, for some $i_0$
that satisfies $0 \le i_0 \le \kappa - 1$. Since $b_{i_0}$ is used before the fault happens, the final
result will not be affected. Suppose $c = 0$, then a fault in $b_{i_0}$ at any time will not
change the final output either. If a fault is injected in a, the output will be faulty no
matter what value c takes. Thus, Algorithm 5.11 is not vulnerable to the safe error
attack discussed in Sect. 5.3.4.

Algorithm 5.11: Modified Algorithm 5.9 to counter the safe error attack
  Input: n, a, b, c   // n ∈ Z, n ≥ 2 has bit length ℓ_n; a, b ∈ Z_n; c = 0, 1
  Output: ab mod n if c = 1 and a otherwise
  1 if c = 1 then
  2     R = 0
        // κ = ⌈ℓ_n/ω⌉, where ω is the computer's word size
  3     for i = κ − 1, i ≥ 0, i−− do
  4         R = 2^ω R + b_i a
  5         R = R mod n
  6     a = R
  7 return a
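The effect of swapping the roles of a and b can be illustrated with a small simulation of Algorithm 5.11. Everything beyond the algorithm itself is an assumption made for this sketch: the function name, the fault tuple (register name, word index, new word value, loop index at which the fault lands), and the toy values n = 15, ω = 2, a = 7, b = 6.

```python
def cond_mul_alg_5_11(n, a, b, c, omega, fault=None):
    """Algorithm 5.11; fault = (target, i0, new_word, when) overwrites word i0
    of 'a' or 'b' just before loop index `when` (hypothetical fault model)."""
    kappa = -(-n.bit_length() // omega)
    mask = (1 << omega) - 1
    split = lambda x: [(x >> (omega * k)) & mask for k in range(kappa)]
    join = lambda ws: sum(w << (omega * k) for k, w in enumerate(ws))
    regs = {"a": split(a), "b": split(b)}

    def inject():
        target, i0, new_word, _ = fault
        regs[target][i0] = new_word

    if c == 1:
        R = 0
        for i in range(kappa - 1, -1, -1):            # line 3
            if fault is not None and fault[3] == i:
                inject()
            R = ((R << omega) + regs["b"][i] * join(regs["a"])) % n  # lines 4-5
        regs["a"] = split(R)                          # line 6
    elif fault is not None:
        inject()                                      # loop skipped
    return join(regs["a"])


n, a, b, omega = 15, 7, 6, 2
for c in (1, 0):
    clean = cond_mul_alg_5_11(n, a, b, c, omega)
    fault_b = cond_mul_alg_5_11(n, a, b, c, omega, fault=("b", 1, 0b10, 0))
    fault_a = cond_mul_alg_5_11(n, a, b, c, omega, fault=("a", 1, 0b10, 0))
    print(c, clean, fault_b == clean, fault_a == clean)
```

For both values of c, the late fault in $b_1$ leaves the output unchanged and the fault in $a_1$ changes it, so comparing faulty and correct outputs no longer distinguishes $c = 1$ from $c = 0$ in this toy setting.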

Similarly, we can change line 18 of Algorithm 5.8 to

\[ R = 2^{\omega} R + R_{1i} R_0. \]

We get Algorithm 5.12. In this case, suppose $d_j = 0$, and a fault is injected in the
variable $R_{0 i_0}$, during the jth iteration of the outer loop and at the time $i < i_0$ in
the loop starting from line 6, where $0 \le i_0 \le \kappa - 1$. The final output will be faulty
because the faulty $R_{0 i_0}$ will be used in line 12. If a fault is injected in $R_{0 i_0}$ in the loop
starting from line 11, since the value in the whole variable $R_0$ is used in line 12, the
fault will propagate to the output.
On the other hand, if $d_j = 1$ and a fault is injected in $R_{0 i_0}$ during the jth iteration
of the outer loop, specifically at the time $i < i_0$ in the loop starting from line 17,

Algorithm 5.12: RSA signature computation with Montgomery powering ladder and Blakley's method (Algorithm 5.8), protected against the safe error attack from Sect. 5.3.4
  Input: n, m, d   // n is the RSA modulus of bit length ℓ_n; m is the hash value of the message; d is the private key of bit length ℓ_d
  Output: m^d mod n
   1 R_0 = 1
   2 R_1 = m
   3 for j = ℓ_d − 1, j ≥ 0, j−− do
   4     if d_j = 0 then
             // lines 5 - 9 implement R_1 = R_0 R_1 mod n
   5         R = 0
   6         for i = κ − 1, i ≥ 0, i−− do
                 // κ = ⌈ℓ_n/ω⌉, where ω is the word size of the computer
   7             R = 2^ω R + R_0i R_1
   8             R = R mod n
   9         R_1 = R
             // lines 10 - 14 implement R_0 = R_0^2 mod n
  10         R = 0
  11         for i = κ − 1, i ≥ 0, i−− do
  12             R = 2^ω R + R_0i R_0
  13             R = R mod n
  14         R_0 = R
  15     else
             // lines 16 - 20 implement R_0 = R_0 R_1 mod n
  16         R = 0
  17         for i = κ − 1, i ≥ 0, i−− do
  18             R = 2^ω R + R_1i R_0
  19             R = R mod n
  20         R_0 = R
             // lines 21 - 25 implement R_1 = R_1^2 mod n
  21         R = 0
  22         for i = κ − 1, i ≥ 0, i−− do
  23             R = 2^ω R + R_1i R_1
  24             R = R mod n
  25         R_1 = R
  26 return R_0

where $0 \le i_0 \le \kappa - 1$, the final output will also be faulty because the faulty $R_{0 i_0}$ will
be used in line 18. If the fault is injected in the jth iteration of the outer loop in $R_{0 i_0}$
in the loop starting from line 22, the fault will stay till the next iteration of the outer
loop and affect the output.
If a fault is injected in $R_{1 i_0}$ for some $i_0$, by a similar argument, the signature will
always be faulty whether $d_j = 0$ or $d_j = 1$.
Thus Algorithm 5.12 is resistant to the safe error attack discussed in Sect. 5.3.4.1.

Algorithm 5.13: RSA signature signing computation with the right-to-left square and multiply algorithm and Blakley's method (Algorithm 5.10), protected against the safe error attack from Sect. 5.3.4.2
  Input: n, m, d   // n is the RSA modulus of bit length ℓ_n; m is the hash value of the message; d is the private key of bit length ℓ_d
  Output: m^d mod n
   1 s = 1
   2 t = m
   3 for i = 0, i < ℓ_d, i++ do
         // ith bit of d is 1
   4     if d_i = 1 then
             // lines 5 - 9 implement s = s ∗ t mod n
   5         R = 0
             // κ = ⌈ℓ_n/ω⌉, where ω is the computer's word size
   6         for j = κ − 1, j ≥ 0, j−− do
   7             R = 2^ω R + t_j s
   8             R = R mod n
   9         s = R
         // lines 10 - 14 implement t = t ∗ t mod n
  10     R = 0
  11     for j = κ − 1, j ≥ 0, j−− do
  12         R = 2^ω R + t_j t
  13         R = R mod n
  14     t = R
  15 return s

In the same manner, to protect Algorithm 5.10 against the safe error attack, we
can just change line 7 to

\[ R = 2^{\omega} R + t_j s. \]

We get Algorithm 5.13. If $d_i = 1$, a fault during the ith iteration in $s_{j_0}$ ($0 \le j_0 \le \kappa - 1$)
will affect the result since the faulty $s_{j_0}$ will be used in line 7. If $d_i = 0$, s is
not used in the ith iteration, but the faulty $s_{j_0}$ will be used in the next iteration and
affect the final output. On the other hand, a fault in t will always propagate to the
output since the faulty value will be used in line 12. Thus Algorithm 5.13 is resistant
to the safe error attack discussed in Sect. 5.3.4.2.
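As a sanity check, the same kind of fault that distinguishes $d_0 = 1$ from $d_0 = 0$ in Examples 5.3.7 and 5.3.8 can be replayed against Algorithm 5.13. The Python sketch below is only an illustration for those toy parameters (n = 15, m = 2, ω = 2); the function name and the fault model (a single bit flip in word $s_{j_0}$ once the inner index $j_0 - 1$ is reached) are assumptions, and the squaring in lines 10–14 is done without word-level detail.

```python
def sign_alg_5_13(n, m, d, omega, fault=None):
    """Algorithm 5.13; fault = (i, j0) flips the lowest bit of word s_{j0}
    during outer iteration i, when the inner index j0 - 1 would be
    processed (a hypothetical fault model, j0 >= 1)."""
    kappa = -(-n.bit_length() // omega)
    mask = (1 << omega) - 1
    s, t = 1, m
    for i in range(d.bit_length()):
        hit = fault is not None and fault[0] == i
        if (d >> i) & 1:                               # line 4
            t_words = [(t >> (omega * j)) & mask for j in range(kappa)]
            R = 0
            for j in range(kappa - 1, -1, -1):         # lines 6-8
                if hit and j == fault[1] - 1:
                    s ^= 1 << (omega * fault[1])       # fault in s_{j0}
                R = ((R << omega) + t_words[j] * s) % n   # whole s is used
            s = R                                      # line 9
        elif hit:
            s ^= 1 << (omega * fault[1])               # s idle, fault stays
        t = (t * t) % n                                # lines 10-14
    return s


n, m, omega = 15, 2, 2
for d in (3, 2):                                       # d_0 = 1 and d_0 = 0
    clean = sign_alg_5_13(n, m, d, omega)
    faulty = sign_alg_5_13(n, m, d, omega, fault=(0, 1))
    print(d, clean, faulty)
```

In both runs the faulty output differs from the correct one, so the comparison no longer reveals $d_0$. The sketch covers only this single fault position; it is not a proof that every fault timing is caught.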

5.5 Further Reading

Differential fault analysis We have seen the diagonal DFA attack on AES in
Sect. 5.1.1.2. Tunstall et al. [TMA11] demonstrated that using this attack and
by exploiting the relation between $K_{10}$ (the last round key) and $K_9$ (the second
last round key), the key guesses for $K_{10}$ can be further reduced to $2^{12}$. Piret and
Quisquater [PQ03] discussed another DFA attack on AES that injects a fault into
the input of MixColumns in round 9. Phan et al. [PY06] proposed to combine
cryptanalysis techniques with DFA to recover the secret key of AES.
DFA attacks on PRESENT implementations can be found in, e.g., [BEG13,
WW10, BH15]. A generalization of DFA to SPN ciphers is given in [KHN+19].
Persistent fault analysis (PFA) We discussed the PFA attack on AES in Sect. 5.1.3.
PFA can also be applied to other block ciphers, e.g., PRESENT [ZZJ+20] and Feistel
ciphers [CB19]. In [ZZY+19], the authors demonstrated a practical fault injection
in the AES Sbox lookup table. In 2020, Xu et al. [XZY+20] discussed PFA attacks
in earlier rounds of AES and other SPN ciphers. Notably, AI has also been adopted
for PFA to recover the key for AES [COZZ23].
Other fault attack methodologies on symmetric block ciphers There are many
other fault attack methods. Here we give more information on a few of them.
Ineffective fault analysis (IFA) was first introduced in [Cla07], where the faults
that do not change the intermediate values are exploited. Those faults are called
ineffective faults. Normally a particular fault model is assumed, e.g., a stuck-at-0
fault model. We note that IFA is dependent on the effect a fault has on the corrupted
data. In comparison, the safe error attack (Sect. 5.3.4) does not require a specific
fault model, an intermediate value is changed, and the knowledge of whether the
faulty value is used or not is exploited.
Statistical ineffective fault attack (SIFA) [DEK+ 18] combines both SFA
(Sect. 5.1.2) and IFA. A nonuniform fault model is assumed, and the attack exploits
ineffective faults. More precisely, the dependency between the fault induction being
ineffective and the data that is processed is exploited. Different from SFA, SIFA
does not require each fault to be successful, but the attack requires repeated plaintext
and knowledge of the correct ciphertext (or whether each ciphertext is correct or
not). The fault injection is the same as described in Sect. 5.1.2.2. After the attacker
obtains a set of ciphertexts, they filter out the correct ones. With each hypothesis of
4 bytes of $K_{10}$, the attacker can compute a hypothesis of the original byte value $s_{00}$.
Then, statistical methods, such as maximum likelihood as discussed in Sect. 5.1.2.1,
can be applied to find the correct key hypothesis. In [DEK+ 18], the authors provide
a detailed theoretical analysis of the number of ciphertexts needed and extensive
experimental results.
Collision fault analysis [BK06] injects fault in the earlier rounds of a block cipher
implementation. Then the attacker records the faulty ciphertext and finds plaintext
that produces the same ciphertext, but without fault. Further analysis using those
plaintexts can recover the secret key. If the fault only changes 1 bit or 1 byte of the
intermediate value, the attacker can try different plaintexts that only differ at 1 bit
or 1 byte.
Algebraic fault analysis (AFA) [CJW10] is similar to DFA. It also exploits
differences between correct and faulty ciphertexts. But whereas DFA relies on manual
analysis, AFA expresses the cryptographic algorithm in the form of algebraic
equations and utilizes a SAT solver to recover the key.
Fault sensitivity analysis [LSG+ 10] exploits the sensitivity of a device to faults.
The attack analyzes when a faulty output begins to exhibit some detectable
characteristics and utilizes the information to recover the secret key. No knowledge
of faulty ciphertext is required for the attack.
Fault attacks on RSA and RSA signatures. Shamir’s countermeasure (Sect. 5.4.1)
against the Bellcore attack (Sect. 5.3.1) was broken in 2002 [ABF+ 03]. The infective
countermeasure (Sect. 5.4.2) against the Bellcore attack was broken in 2006 [YKM06].
There are also various other countermeasures, such as the BOS algorithm [BOS03] and
Vigilant’s algorithm [Vig08].
As mentioned in Sect. 5.3.1, the very first fault attack on cryptographic
implementations was proposed in [BDL97] for attacking RSA signatures. In this paper,
the authors also discussed an attack aiming at the intermediate values of the square
and multiply algorithm to recover the private key. More attacks on the square and
multiply algorithm are proposed in, e.g., [Bor06, SH08].
The first attack on the RSA modulus n was proposed in [Sei05], where the goal
of the attacker is to corrupt RSA signature verification in such a way that there is a
high probability that the verification will be successful for signatures created by the
attacker using their own private key and message. In more detail, the attack requires
the faulty n' to be a prime number known to the attacker such that gcd(e, n' − 1) = 1,
where e is the public exponent of RSA. Then the attacker can compute their private key
d' = e^{−1} mod (n' − 1) by the extended Euclidean algorithm and sign their chosen
message with this private key. The authors proved that there is a high probability
of producing a faulty n' with the above property. Another attack on the RSA modulus can
be seen in [BCMCC06].
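The key computation in the modulus attack can be illustrated with a small sketch. It assumes textbook RSA verification without padding, and the values below (the Mersenne prime 2^61 − 1 standing in for the faulty modulus n', the exponent 65537, and the message) are hypothetical choices for illustration; they are not taken from [Sei05]. The three-argument `pow` with a negative exponent requires Python 3.8 or later.

```python
from math import gcd


def forge_private_exponent(e, n_faulty):
    """Return d' = e^{-1} mod (n' - 1) for a prime faulty modulus n'."""
    if gcd(e, n_faulty - 1) != 1:
        raise ValueError("e must be coprime to n' - 1")
    # Modular inverse; equivalent to running the extended Euclidean algorithm.
    return pow(e, -1, n_faulty - 1)


def sign(message, d, n):
    """Textbook RSA signature (no padding): s = m^d mod n."""
    return pow(message, d, n)


def verify(message, signature, e, n):
    """Textbook RSA verification: accept if s^e mod n equals m mod n."""
    return pow(signature, e, n) == message % n


# Hypothetical values: n' is assumed to be the (prime) result of the fault.
n_faulty = 2**61 - 1          # a known Mersenne prime
e = 65537                     # public exponent of the victim
d_forged = forge_private_exponent(e, n_faulty)

message = 0x1234567890ABCDEF  # attacker-chosen message, smaller than n'
signature = sign(message, d_forged, n_faulty)
print(verify(message, signature, e, n_faulty))
```

Because n' is prime, Fermat's little theorem gives m^{e d'} ≡ m (mod n'), so the verification under the faulty modulus accepts the attacker's signature.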
As mentioned in Sect. 5.3, even though no systematic methodologies exist for
fault attacks on public key ciphers, the general attack concept can be applied to a
different cipher based on a similar intractable problem. For example, the attack on
the square and multiply algorithm described in Sect. 5.3.2 can be applied to attack
discrete logarithm-based ciphers [BDH+ 97].
Fault countermeasures for symmetric block ciphers. In Sect. 5.2, we have seen two
countermeasures for symmetric block ciphers.
Detection-based countermeasures. In Sect. 5.2.1, we have discussed encoding-based
countermeasures, and we have seen a proposal to use anticodes for the
implementation. Similar to the reasoning mentioned in Remark 5.2.1, fault attacks based on
knowledge of faulty ciphertext and certain bit-flip fault models, e.g., IFA, SIFA, and
AFA, can be prevented by encoding-based countermeasures.

1 An SAT solver solves Boolean satisfiability problems. It takes a Boolean logic formula and checks
if there is a solution satisfying the formula.



There are also other proposals for different code designs. For example, in
[KKT04], the authors proposed a special type of code for hardware countermeasures.
The code is defined by

\[ C = \{ (x, w) \mid x \in \mathbb{F}_2^k,\ w = (Px)^3 \in \mathbb{F}_2^r \}, \]

where $H = (P \mid I)$ is a parity-check matrix for a binary $[n, k]$-code, and $r = n - k$.


Akdemir et al. [AWKS12] considered robust codes. A binary code $C$ of length $n$ is
said to be $R$-robust if

\[ \max_{e \neq 0} |C \cap (C + e)| = R, \]

where

\[ C + e = \{ c + e \mid c \in C \}, \qquad e \in \mathbb{F}_2^n. \]
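To make the robustness definition concrete, the following sketch computes R by brute force for two tiny codes with k = r = 3. It rests on several assumptions that are not in the text above: P is taken to be the identity matrix, the cube is interpreted as cubing in the field GF(2^3) with the reduction polynomial x^3 + x + 1, and codewords are packed into 6-bit integers. The linear code with w = x is included only for comparison.

```python
def gf8_mul(a, b):
    """Multiply two elements of GF(2^3) modulo x^3 + x + 1 (assumed polynomial)."""
    result = 0
    for _ in range(3):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:      # reduce when the degree reaches 3
            a ^= 0b1011
    return result


def gf8_cube(a):
    """Cube an element of GF(2^3)."""
    return gf8_mul(gf8_mul(a, a), a)


def robustness(code, length):
    """Return max over nonzero e of |C ∩ (C + e)| by exhaustive search."""
    codewords = set(code)
    best = 0
    for e in range(1, 1 << length):
        shifted = {c ^ e for c in codewords}   # the set C + e
        best = max(best, len(codewords & shifted))
    return best


# Codeword (x, w) packed as a 6-bit integer: x in the high 3 bits, w in the low 3 bits.
cubic_code = [(x << 3) | gf8_cube(x) for x in range(8)]   # w = x^3 with P = I
linear_code = [(x << 3) | x for x in range(8)]            # w = x, for comparison

print("cubic code:  R =", robustness(cubic_code, 6))
print("linear code: R =", robustness(linear_code, 6))
```

For a linear code, any nonzero codeword e maps the code onto itself, so such an error is never detected and R equals the code size. The nonlinear check symbols avoid this: every nonzero error pattern is masked by only a small number of codewords.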

In [GGP09], the authors proposed to use a digest value for the cipher state
and update it after each operation. The fault can be detected through the digest
values. See also [MSY06] for a comparative study on a few detection-based
countermeasures for symmetric block ciphers.
Infective countermeasure. The infective countermeasure was first introduced for RSA
[SMKLM02] (see Sect. 5.4.2). It was then adopted for symmetric block ciphers in
2012 [GST12], where the authors discussed the implementation for both SPN and
Feistel ciphers. In 2014, Tupsamudre et al. broke this countermeasure for AES and
proposed an improved version [TBM14] for AES implementations.
As we have seen in Sect. 5.2.2, the infective countermeasure returns a ciphertext
in such a way that an attacker who does not know the correct ciphertext cannot
tell whether the fault injection was successful. However, as shown in [DEK+ 18],
for SIFA, even though the attacker does not know whether the ineffective fault
occurred in the target AES round or anywhere else, they can precalculate the
probability of faulting the target round and analyze the obtained ciphertexts utilizing
this probability.
Generally applicable fault countermeasures. Besides the fault attack countermeasures
introduced in this chapter, there are many other techniques.
Similarly to SCA countermeasures (Sect. 4.6), we can divide them according to the
levels of protection.
Protocol-level countermeasures involve designing the usage of cryptographic
primitives in a way that certain fault attacks are not possible anymore, e.g.,
rekeying [MSGR10] or the tweak-in-plaintext strategy [BBB+ 18].
Cryptographic primitive level approaches provide some sort of fault protection
directly in the cipher design [BLMR19, BBB+ 21]. The main advantage is to
unburden the implementer from the need to apply additional countermeasures.
However, at this point, the fault models covered directly in the design are limited.
Implementation-level countermeasures were the focus of this chapter. We have
seen that one common technique is the infective countermeasure, which was
discussed in Sect. 5.2.2 for symmetric block ciphers and in Sect. 5.4.2 for RSA.
Another common implementation-level countermeasure for both symmetric and
asymmetric ciphers is to introduce redundancy. One option is to repeat the
computation, e.g., by deploying the circuit more than once, so that single-fault
attacks can be detected. Other options are parity check-based countermeasures that
allow the detection of faults [KKG03, WKKG04] and error-detecting/correcting codes,
which were discussed in Sect. 5.2.1 for symmetric block ciphers. Code-based
countermeasures for public key cryptosystems can be found in, e.g., [GSK06].
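The idea of redundancy by repeated computation can be sketched in a few lines. This is only a software illustration under simplifying assumptions: `toy_round` is a hypothetical stand-in for a cipher operation, the `fault` argument simulates a single fault that hits one of the two executions, and a real implementation would have to protect the comparison itself as well.

```python
class FaultDetected(Exception):
    """Raised when the two redundant results disagree."""


def compute_with_redundancy(operation, data, fault=None):
    """Run the operation twice and release the result only if both runs agree."""
    first = operation(data)
    if fault is not None:
        first = fault(first)      # simulated single fault in the first run only
    second = operation(data)
    if first != second:
        raise FaultDetected("redundant results differ; output suppressed")
    return first


def toy_round(state):
    """Hypothetical stand-in for a cipher operation on one byte (not a real cipher)."""
    return (state * 7 + 3) % 256


print(compute_with_redundancy(toy_round, 5))
try:
    compute_with_redundancy(toy_round, 5, fault=lambda value: value ^ 0x01)
except FaultDetected as err:
    print("detected:", err)
```

A single fault changes only one of the two results, so the comparison fails and no faulty output is released. An attacker who can inject the same fault into both executions would bypass this check.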
Hardware-level countermeasures have been studied for a long time in the smart
card industry, for example, using light sensors to detect the chip’s opening and
voltage/temperature sensors to detect fault injections by voltage glitches or
temperature variations [HS13]. A glitch detector can be used against voltage/clock
glitching [ZDT+ 14]. A ring oscillator-based sensor can be utilized for all of these,
including EM injection [HBB+ 16]. On the other hand, new ways to
induce faults are proposed all the time. The main focus in academia is on
countermeasures that aim at managing the effect of fault induction.
Chip package level techniques involve using a special package that prevents the
attacker from accessing the chip, for example, packaging that is hard to remove
without rendering the chip unusable or packaging with random distribution of
connection wires that would be cut during the depackaging process. Also, a layered
chip with the memory attached on top of the computation unit provides additional
security against FA.
Combined attacks. Combined attacks were first proposed in the form of
differential behavioral analysis (DBA), which combines DPA with a safe error
attack [RM07]. Researchers then followed the idea and proposed attacks on
masked implementations of AES [CFGR10, RLK11]. A PRESENT implementation
was targeted by a combined DFA and a SCADPA-like side-channel method
in [PBMB17]. A redundancy-based countermeasure on PRESENT and AES was
broken in [SJB+ 18]. A similar direction was taken in [PNP+ 20] where different
types of countermeasures were evaluated and attacked. A “blind side-channel” (a
method where the attacker does not need the value of the cipher output) SIFA was
proposed in [APZ21]. A “semi-blind” (no knowledge of the cipher input/output,
but ability to repeat the encryption with the same input) combined attack on bit
permutation-based ciphers with application to AEAD (authenticated encryption
with associated data) schemes was proposed in [HBB22].
Combined countermeasures. We have seen encoding-based countermeasures for
SCA in Sect. 4.5.1.1 and for FA in Sect. 5.2.1. A proposal for finding optimal codes
against both attacks can be found in [BH17]. Various combined countermeasures
have also been studied. For example, see [SMG16] for a combined hardware
countermeasure based on masking and error-detecting codes. [BDF+ 09] proposes
a logic design-based solution, and [RDMB+ 18] discusses both hardware and
software countermeasures based on multiparty computation [CD+ 15]. A sensor-based
countermeasure utilizing a ring oscillator with a phase-locked loop was
proposed in [RBBC18].
Attacks on post-quantum cryptographic implementations. The first practical fault
attack on lattice-based key encapsulation schemes was proposed in [RRB+ 19],
targeting the usage of nonce. In [RBRC20] the authors propose several types
of attacks, including SCA, fault attacks, and combined attacks on lattice-based
schemes. A message-recovery attack on the code-based McEliece algorithm was
proposed in [CCD+ 21]. The attack works by changing the syndrome computation
from $\mathbb{F}_2$ to $\mathbb{N}$, making it easy to break the security guarantee of the scheme.
In [XIU+ 21] the authors investigate all the NIST PQC Round 3 KEM candidates
w.r.t. fault attacks.
Attacks on neural networks. Fault attack techniques have recently been adopted for
attacking neural network implementations, with a wide variety of attacker goals.
The very first work, published in 2017, proposes misclassification by bit flips
[LWLX17]. Several works followed in this direction [HFK+ 19, RHF19, RHL+ 21],
proposing more efficient and powerful attacks, mostly utilizing the Rowhammer
technique (see Sect. 6.2.1). The same goal was shown to be achievable by instruction
skips during the activation function execution [BHJ+ 18, HBJ+ 21]. Backdoor/Trojan
insertion to do targeted misclassification (a powerful method where the attacker
can choose the output class of the model) was proposed in [RHF20, CFZK21,
BHOS22]. Model extraction by faults was proposed in [RCYF22, BJH+ 21].
In such an attack, the adversary tries to learn the model parameters (weights and
biases) of a proprietary model.
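The following sketch illustrates why a single bit flip in a stored parameter can be enough to change a model's behavior. It assumes that weights are stored as IEEE 754 single-precision values, which is a common but not universal choice; the weight value 0.5 and the helper name are hypothetical and do not come from the cited works.

```python
import struct


def flip_bit_float32(value, bit):
    """Flip one bit (0 = mantissa LSB, 31 = sign) of the float32 encoding of value."""
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


weight = 0.5  # hypothetical model weight
print(flip_bit_float32(weight, 0))    # lowest mantissa bit: negligible change
print(flip_bit_float32(weight, 30))   # top exponent bit: the weight becomes 2**127
print(flip_bit_float32(weight, 31))   # sign bit: the weight becomes -0.5
```

The impact depends strongly on the bit position, which is why the choice of target bits matters for such attacks.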
Chapter 6
Practical Aspects of Physical Attacks

As physical attacks focus on implementations running on real-world devices,
there are many practical aspects one needs to consider. When developing attacks
and countermeasures, we often work with simplified models of those devices.
However, once we move from theoretical assumptions to practice, deviations might
appear, stemming from process variations, measurement errors, and
various noise sources that are not easy to determine. In this chapter, we will detail
practical aspects of side-channel and fault attacks¹ that might be useful when doing
experimental evaluations. Apart from that, this chapter will also focus on industrial
standards that relate to hardware security.

6.1 Side-Channel Attacks

In the first part of this section, we will explain how information leakage is created
by the operation of integrated circuits. In the second part, we will detail the main
components of a measurement setup—oscilloscopes and probes.

6.1.1 Origins of Leakage

Current microchips are composed of solid-state metal-oxide-semiconductor field-effect
transistors (MOSFETs). There are arrays of n-channel (NMOS) and p-channel
(PMOS) transistors in each chip, which enable processing digital data composed of
0s and 1s. The reason to combine them in a single circuit is to increase the immunity
to noise and decrease the static power dissipation, compared to implementing each
of these types separately [Cal75]. A circuit consisting of NMOS and PMOS
transistors is called a complementary metal-oxide-semiconductor (CMOS) circuit.

1 While we use the term “fault attacks” throughout the book, one can also find the term “fault
injection attacks” in the literature, which refers to the same.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
X. Hou, J. Breier, Cryptography and Embedded Systems Security,
https://doi.org/10.1007/978-3-031-62205-2_6

Fig. 6.1 Power consumption types in CMOS circuits. The main type considered for
SCA is the switching power

Fig. 6.2 Switching of the CMOS circuit, showing (a) the charging path from $V_{DD}$ to $C_L$ and (b)
the discharging path from $C_L$ to GND of the capacitive load
The side-channel leakage that comes in the form of electromagnetic emanation or
power consumption originates from the physical characteristics of data processing
by CMOS-based circuits. Based on these characteristics, leakage models were
developed to recover the processed information (see Sect. 4.2.1, part Leakage
Models). There are two types of power dissipation in CMOS gates: static and
dynamic (see Fig. 6.1). Static power is consumed even if there is no circuit activity.
It is primarily caused by leakage currents that flow when the transistor is in the off
state. While this type of power dissipation is not as investigated in the world of
SCA as the dynamic one, some works focus on exploiting it [MMR19]. Dynamic
power dissipation comes in two forms: short-circuit currents (a short time during
the switching of a gate when PMOS and NMOS are conducting simultaneously)
and switching power consumption (charge and discharge of the load capacitance).
When considering side channels, the switching power is the most relevant as
it directly correlates the processed data with the observable changes in power
consumption [Sta10]. This behavior of the CMOS circuit is depicted in Fig. 6.2.
Generally, the energy delivery to a CMOS gate is split into two parts: the charging and
the discharging of the load capacitance $C_L$. During the charging phase, the input
gate signal makes a 1 → 0 switch, resulting in switching the PMOS transistor
on and its NMOS counterpart off. As shown in Fig. 6.2a, in this scenario, the load
capacitance $C_L$ is connected to the supply voltage ($V_{DD}$) via the PMOS transistor,
thus allowing the current $I(t)$ to charge $C_L$. There are two important equations
defining the transition from the energy point of view [JAB+ 03]:

\[ E_d = C_L V_{DD}^2, \]

\[ E_c = \int_0^\infty I(t) V(t)\, dt = \frac{1}{2} C_L V_{DD}^2, \]

where $E_d$ is the delivered energy and $E_c$ is the energy stored in $C_L$. From
these equations, it can be seen that only half of the delivered energy is stored in
the capacitor, while the PMOS transistor dissipates the other half. Therefore, this power
loss during the logic transition can be measured and correlated with the switching
activity, resulting in SCA leakage. When the gate signal changes from 0 to 1, the
opposite scenario happens: the PMOS transistor is switched off, and the NMOS transistor is
switched on. The energy $E_c$ stored in $C_L$ is drained to the ground via the NMOS
transistor, as can be seen in Fig. 6.2b, thus causing SCA leakage. For more details
regarding the power consumption of CMOS circuits, we refer the interested reader
to [NYGD22], for example.
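The two energy equations can be checked numerically with a simple model. The sketch below assumes that the conducting PMOS transistor behaves like a constant resistance, so that the load charges like an RC circuit; this resistance and the component values (10 fF, 1.2 V, 1 kΩ) are hypothetical and only serve the illustration.

```python
import math


def charging_energies(c_load, v_dd, r_on, steps=50_000, horizon_tau=30.0):
    """Integrate the delivered and the stored energy for an RC charging curve.

    The PMOS transistor is modeled as a constant on-resistance r_on (assumption).
    """
    tau = r_on * c_load
    dt = horizon_tau * tau / steps
    e_delivered = 0.0
    e_stored = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt                               # midpoint rule
        current = (v_dd / r_on) * math.exp(-t / tau)     # I(t)
        v_cap = v_dd * (1.0 - math.exp(-t / tau))        # V(t) across C_L
        e_delivered += current * v_dd * dt               # energy drawn from V_DD
        e_stored += current * v_cap * dt                 # energy stored in C_L
    return e_delivered, e_stored


c_load, v_dd, r_on = 10e-15, 1.2, 1e3   # hypothetical values
e_d, e_c = charging_energies(c_load, v_dd, r_on)
print(e_d / (c_load * v_dd**2))   # close to 1
print(e_c / (c_load * v_dd**2))   # close to 0.5
```

In this model the ratio of stored to delivered energy does not depend on the chosen resistance, which matches the statement that half of the delivered energy is dissipated in the transistor.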
It is important to note that there is usually more than one switch during one clock
cycle. This is because the input signals to the (multi-input) gate normally do not
arrive at the same time, resulting in several switches before the correct output is
generated. The output transitions before the stable state are called glitches. They are
unnecessary for the correct functioning of the circuit and consume a non-negligible
amount of dynamic power, ranging between 20% and 70% [SM12]. Glitches are
the reason why Boolean masking in hardware, although theoretically secure, can be
broken [MPG05]. An approach called threshold implementation [NRS11] solves
the problem of secure Boolean masking by utilizing multiparty computation and
secret sharing.
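A glitch caused by unequal input arrival times can be reproduced with a toy model. The sketch below assumes an ideal two-input XOR gate whose output reacts immediately to each input change; the function name and the input values are hypothetical.

```python
def output_transitions(gate, old_inputs, new_inputs, arrival_order):
    """Count output toggles while the inputs change one at a time.

    arrival_order lists the input indices in the order in which their new
    values reach the gate.
    """
    inputs = list(old_inputs)
    output = gate(*inputs)
    transitions = 0
    for index in arrival_order:
        inputs[index] = new_inputs[index]
        new_output = gate(*inputs)
        if new_output != output:
            transitions += 1
            output = new_output
    return transitions


def xor_gate(a, b):
    return a ^ b


# Both inputs switch from 0 to 1, but input 0 arrives before input 1.
print(output_transitions(xor_gate, (0, 0), (1, 1), arrival_order=[0, 1]))
```

The stable output is 0 both before and after the input change, yet the output toggles twice (0 → 1 → 0) on the way. Each of these transitions charges or discharges the load capacitance and therefore contributes to the dynamic power consumption.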

6.1.2 Measurement Setup

The core of the measurement setup for SCA is the oscilloscope. It can either be
connected to the power supply of the DUT for power measurements or can measure
electromagnetic (EM) signals through an EM probe.

6.1.2.1 Oscilloscopes

The measurement is normally done with a digital sampling oscilloscope, a device
that takes samples of the measured voltage signal over time. The core of such an
instrument is an analog-to-digital converter (ADC), which takes the analog value of
the measured signal (voltage, in our case) at the specified sampling rate and changes
it into a digital value. The precision of this value is generally between 8 and 12 bits
for midrange oscilloscopes, and the sampling rate ranges from hundreds of
megasamples per second (MS/s) to several gigasamples per second (GS/s).

Fig. 6.3 Digital sampling of a continuous signal with ten samples of (a) low-frequency signal and
(b) high-frequency signal
When measuring analog signals, such as voltage, with digital devices, it is
important to note that we are measuring a continuous value with equipment that
samples such a value at periodic intervals (which is why we call it a time sample)
and stores it in a binary format with limited precision. Therefore, discretization is
applied twice: first in the time domain and then to the value itself. According to the
Nyquist–Shannon sampling theorem [Vai01], the sampling rate of the measurement
device should be at least twice the highest frequency component of the measured
signal. It is a good rule of thumb to have the oscilloscope sampling rate at least 4×
the target device frequency when doing measurements for power analysis attacks.
Figure 6.3 illustrates the effect of the sampling rate. The red curve denotes the original
analog signal, while the black lines show a sampling of this signal with ten samples over
the given time interval. While we can easily reconstruct the original signal if the frequency
is low (Fig. 6.3a), it becomes much harder with a high-frequency signal (Fig. 6.3b).
The precision of the oscilloscope specifies how many values the sampled output
can take, e.g., an 8-bit ADC gives a range of 256 values, which is sufficient
for an SCA attack in most cases.
Another important parameter of the oscilloscope is the analog bandwidth. It is
defined as the frequency at which the amplitude measured by the oscilloscope is
reduced by 3 dB. To avoid the unnecessary modification of the measured signal, the
bandwidth should be at least 3× the target device frequency.
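The three oscilloscope parameters discussed above (sampling rate, precision, and bandwidth) can be illustrated with a short calculation. The frequencies, the full-scale range, and the helper names in this sketch are hypothetical; the quantizer is an idealized model of an ADC.

```python
import math


def sample_sine(freq_hz, rate_hz, count):
    """Take `count` samples of a unit-amplitude sine wave at the given sampling rate."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]


def quantize(value, bits, full_scale=1.0):
    """Map a value in [-full_scale, full_scale] to one of 2**bits integer codes."""
    levels = 2 ** bits
    clipped = max(-full_scale, min(full_scale, value))
    return int((clipped + full_scale) / (2 * full_scale) * (levels - 1) + 0.5)


# Time discretization: a 9 Hz sine sampled at only 10 samples per second yields
# exactly the same samples as an inverted 1 Hz sine (aliasing).
fast = sample_sine(9.0, 10.0, 10)
alias = sample_sine(-1.0, 10.0, 10)
print(max(abs(a - b) for a, b in zip(fast, alias)))

# Value discretization: an 8-bit ADC distinguishes 256 codes (0 to 255).
print(quantize(-1.0, 8), quantize(0.0, 8), quantize(1.0, 8))

# Bandwidth: a 3 dB reduction corresponds to about 70.8% of the original amplitude.
amplitude_ratio = 10 ** (-3 / 20)
print(round(amplitude_ratio, 3))
```

The first result shows why undersampling is harmful: once the samples are taken, the high-frequency signal cannot be distinguished from a low-frequency one.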
An important task during the acquisition is capturing the correct time window
corresponding to the operations we want to measure. In laboratory conditions, it
is common to use an artificial trigger signal that indicates the start/end of the
encryption. In real-world settings, it is necessary to identify the correct position
by examining the captured signal—this is usually done based on the evaluator’s
expertise.

6.1.2.2 Probes

Near-field electric and magnetic probes are an essential part of the setup when doing
electromagnetic side-channel analysis. They can be connected to the oscilloscope in
a passive way or with an amplifier. Optionally, a bandpass filter can be used to only
pass the relevant frequencies and discard the rest. Several established companies,
such as Riscure and Langer, provide probes suitable for EM SCA. Due to the
simplicity of the probe design, researchers have also been building their own probes
since the early days of SCA [GMO01]. Generally, a coiled copper wire is sufficient,
with a coil diameter of at most a few hundred microns. More details on designing
near-field probes can be found, for example, in [Siv17].

6.2 Fault Attacks

An interesting aspect of fault attacks is that, unlike with side-channel attacks, the
adversary can break the cryptographic security even without the knowledge of the
underlying algorithm, for example, by skipping the entire encryption routine by
injecting faults in the conditional branches [SWM18].
In this section, we will look into the practical aspects of fault attacks (FAs), such
as sample preparation, fault injection techniques and devices, and mechanisms to
trigger faults in integrated circuits.

6.2.1 Fault Injection Techniques

In this subsection, we will outline the most popular techniques for FA testing of
integrated circuits [BH22].

6.2.1.1 Clock/Voltage Glitching

Clock and voltage glitching techniques are the most accessible in terms of cost
as they do not need sophisticated equipment. Initially, they were only performed
locally with a device at hand, but with power management techniques such as
dynamic voltage and frequency scaling (DVFS), they can also be performed
remotely on chips that utilize that technology.
In the case of a voltage glitch, the faults are caused by precise, large variations in
the power supply or by underpowering the device. Power supply variations, or spikes,
modify the state of latches of flip-flops, influencing the control and data path logic
of the circuit [KJP14]. For example, if the voltage spike happens during memory
reading, wrong data may be retrieved. It was also shown that the shape of the
glitch waveform affects the success of the attack [BFP19]. Underpowering, on the
other hand, affects the algorithm continuously and might cause faults throughout
the computation. Single faults are possible when the insufficient power supply
causes small enough stress so that dysfunctions do not occur immediately after
the computation starts and multiple faults do not happen [SGD08]. When the
attacker can physically access the target device, voltage glitching is generally easy
to implement. It is also the most inexpensive fault injection method, as the only
necessary equipment is wires for connecting to the device and a power source. A local
voltage glitch on a smart card is depicted in Fig. 6.4. Voltage glitching attacks were shown to
be effective even against security enclaves of Intel [CVM+ 21] and AMD [BJKS21].
An inexpensive Teensy 4.0 board (≈30 USD) was used for the abovementioned
attacks, making them highly practical in terms of equipment cost.

Fig. 6.4 Depiction of a voltage glitch on a smart card
Clock glitch is another technique that can be performed with low-cost equipment.
For digital computing devices, it is necessary to synchronize the calculations with
either an internal or an external clock. If the clock signal changes, the resulting com-
putation might have a wrong instruction executed or data corrupted. Devices that
require an external clock generator can be faulted by supplying a bad clock signal—
containing fewer pulses than the normal one [KSV13]. On the other hand, devices
that are configured to use an internal clock signal cannot be easily faulted. Clock
glitches are generally considered the simplest fault injection method as the attack
devices are easy to operate. For example, clock glitches can be achieved by
using low-end field-programmable gate array (FPGA) boards [BGV11, ESH+ 11].
A relatively new direction in clock/voltage glitching is remote attacks that
take advantage of power management systems of modern processors. The security
aspects of these systems are rarely considered due to the complexity of devices
from the hardware point of view and software executed, cost, and time-to-market
constraints [PS19]. CLKSCREW is the first attack in this direction, targeting
frequency and voltage manipulation of the Nexus 6 phone, forcing the processor
to operate beyond recommended limits [TSS17]. The researchers experimentally
injected a one-byte random fault. CLKSCREW can be achieved just by utilizing
software control of energy management hardware regulators in the target devices.
Similar attacks were also proposed for ARM-based Krait processor [QWLQ19]
and Intel SGX [QWL+ 20]. The main advantage of these attacks is that they are
software-based, therefore allowing the threat model to shift from local to remote.

6.2.1.2 Optical Fault Injection

The ionization effect on transistors is a well-known phenomenon, and nowadays it
is common to perform failure tolerance testing of integrated circuits. For example,
testing robustness and reliability with lasers dates back more than half a century
[Hab65]. While there might not be any unexpected effects in standard conditions,
there are environments where ionization effects are common and cause unintentional
faults, such as the Earth’s orbit, where satellites are deployed and exposed to
cosmic rays [BSH75]. The first usage of optical fault injection against cryptographic
circuits dates back to 2002 when researchers used a flash gun and a laser pointer to
set and reset bits in an SRAM [SA02].
The variety of techniques within optical fault injection is vast—from camera
flashes to lasers to X-ray beams. It was also shown that with the usage of lasers, one
can probe the memory without changing it to check its content [CCT+ 18], shifting
the use case to the realm of side channels.
For security evaluation and certification labs, the method of choice is normally a
laser fault injection (LFI), depicted in Fig. 6.5a. Off-the-shelf setups for performing
LFI are readily available from companies selling testing equipment. A standard LFI
setup consists of the following parts: a laser source, an objective lens, a motorized
positioning table, and a controlling device. One can also utilize an infrared ring that
allows taking images of the chip from the backside, making the silicon substrate
transparent to the camera lens. An example of such an image is in Fig. 6.5b. It is also
common to include a digital oscilloscope to precisely check the timing of the laser
activation with respect to the cipher execution. While the cost of off-the-shelf LFI
testing equipment starts around 100k USD, it was shown that a low-cost setup can
be built for around 500 USD [KM20]. Naturally, specialized expertise is required
to design and assemble such a setup.
Optical fault injection requires direct access to the chip, from either the front or
the backside. That means, in most cases, it is necessary to remove the chip package,
by using either mechanical or chemical techniques. There is also an option to use a
focused ion beam (FIB), but this technique is generally outside of the budget of a
standard testing laboratory, so in this part, we will focus on the two abovementioned
methods.

Fig. 6.5 Depiction of (a) laser fault injection on an AVR microcontroller mounted on Arduino
UNO board and (b) zoomed infrared image of the chip
Mechanical techniques are relatively straightforward. They are mostly used for
backside decapsulation as the front side of the chip is too sensitive to any physical
tampering. They can, for example, involve using inexpensive manual rotary milling
machines that grind down the epoxy package. This is recommended mostly for low-
cost chips as there is a high risk of overheating or mechanically damaging the die.
Another way is to use specialized tools for decapsulation, thinning, and polishing
(e.g., Ultra Tec ASAP-1). These tools work in an automated way by slowly milling
down the package layers to avoid any damage. Naturally, the main drawback is the
cost which typically ranges in tens of thousands of dollars.
Chemical techniques are recommended when the front side of the chip needs
to be accessed. In some cases, such as smart cards, acetone is enough to remove
the protective plastic (after the outer hard plastic case is removed, e.g., by using a
scalpel). When removing the black epoxy package, one might need to use strong
acids, such as fuming nitric acid ($\mathrm{HNO_3}$ with a concentration of at least 86%). This
typically involves operation in a safe laboratory environment equipped with a fume
hood and a proper acid disposal facility. A depiction of such a setup is shown in
Fig. 6.6. When using such aggressive acids, there is also a risk of removing the
bonding wires, unless they are made of gold or are at least gold-plated.
More details on decapsulation techniques can be found in [BC16].
When using optical fault injection techniques, it is important to know the
absorption depth in silicon as a function of wavelength. This is depicted in Fig. 6.7.
The green laser (532 nm) has an absorption depth of ≈ 1.3 μm; therefore, it can be
utilized either for front-side injection (where it can directly access the components)
or for almost fully removed silicon substrate from the backside. As the latter might
damage the chip, it makes sense to use lasers with deeper absorption depth, such as
808 nm or 1064 nm, both from the near-infrared light spectrum. The 1064 nm laser
allows a penetration depth up to 1 mm which can often be used even for non-thinned
substrate.

Fig. 6.6 Depiction of a chemical decapsulation by using fuming nitric acid

Fig. 6.7 Absorption depth in silicon. The most common laser wavelengths for testing
integrated circuits are highlighted: 532 nm (green, 1.3 μm), 808 nm (near-infrared,
12.7 μm), and 1064 nm (near-infrared, 1 mm). [Plot: absorption depth (cm, logarithmic
scale from 10^−7 to 10^7) versus wavelength (200 to 1,400 nm)]
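The practical meaning of these absorption depths can be estimated with a short calculation. The sketch assumes simple exponential attenuation (Beer–Lambert law), where the absorption depth is the distance after which the intensity drops to 1/e, and it ignores reflection and doping effects. The absorption depths are the three values highlighted in Fig. 6.7; the substrate thickness of 300 μm is a hypothetical example.

```python
import math

# Absorption depths in silicon in micrometers, as highlighted in Fig. 6.7.
ABSORPTION_DEPTH_UM = {532: 1.3, 808: 12.7, 1064: 1000.0}


def transmitted_fraction(wavelength_nm, thickness_um):
    """Fraction of the light intensity remaining after passing through silicon.

    Assumes exponential attenuation with the tabulated absorption depth.
    """
    return math.exp(-thickness_um / ABSORPTION_DEPTH_UM[wavelength_nm])


substrate_um = 300.0  # hypothetical thickness of a non-thinned substrate
for wavelength in sorted(ABSORPTION_DEPTH_UM):
    fraction = transmitted_fraction(wavelength, substrate_um)
    print(f"{wavelength} nm: {fraction:.3e}")
```

Under these assumptions, practically no green light reaches the transistors through such a substrate, while a large part of the 1064 nm light does, which is consistent with the recommendation to use near-infrared lasers for backside injection.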
There are other fault injection techniques that are related to optical techniques in
the way they work. In the area of failure analysis, electron and ion beams have been
successfully used to test the reliability of circuits [SA93]. X-ray beams were used
to tamper with memories of a microcontroller [ABC+ 17].
All in all, optical fault injection offers precision and repeatability at a relatively
high cost (considering commercial off-the-shelf setups). With specific expertise, it is
possible to construct a DIY setup for a much lower price. The main drawback of this
technique is the necessity to “see” the chip, which normally requires depackaging
and delayering of the chip, making it often impractical outside of laboratory
environments. As it is a powerful technique, it is a de facto standard for security
testing and certification labs which need to consider strong attacker models.

6.2.1.3 Electromagnetic Fault Injection

The electromagnetic fault injection (EMFI) technique is a versatile way to attack
chips, allowing targeting both analog and digital blocks. The working principle
of EMFI is to generate a changing magnetic field that induces a voltage in the
structures on the IC surface [DLM20]. In cryptographic circuits, digital logic is used
for the algorithm itself, while analog circuitry controls the clock and random number
generators. Below, we discuss the EMFI approaches that can be used to target each
of those.
Analog blocks can be targeted by powerful harmonic EM waves. A stable
sinusoidal signal can be generated by the attacker at a given frequency that injects
a harmonic wave creating a parasitic signal [HHS+ 11]. This signal can be used to
bias the clock behavior or to inject additional power directly and locally into the
chip. Harmonic EMFI equipment normally includes a motorized positioning table,
a signal generation module, and an oscilloscope.

Fig. 6.8 Depiction of a pulsed electromagnetic fault injection on an AVR
microcontroller mounted on Arduino UNO board
As digital blocks are clocked, the method of choice is an EM pulse injection
during a specified clock cycle [SH07]. When a sharp and sudden EM pulse is
injected into the integrated circuit, it can create intense transient currents that change
the behavior of logic cells, ultimately causing faults. Standard equipment includes
a high-voltage pulse generator and a coil with a ferrite core, serving as an injection
probe. Such equipment is depicted in Fig. 6.8.
As the fault analysis methods targeting cryptographic implementations mostly use
data faults (bit flips, bit sets/resets, random byte faults, etc.), most of the research is
dedicated to pulse fault injection. It is possible to build low-cost EMFI equipment
for as low as 50 USD [O’F23]. A more comprehensive ready-to-use device can
be bought, for example, from NewAE for ≈ 3.3k USD (ChipSHOUTER²). While
an injector device itself is not enough for a proper testing setup, in [KBJ+ 22]
the authors show how to incorporate ChipSHOUTER in a testbench including
XYZ stage and a controller for ≈ 7k EUR.³ If one needs more powerful and
precise equipment, Avtech pulse generators can be purchased in a price range
between 10k and 20k USD.⁴ In that case, a near-field injection probe is needed,
which can either be bought for a few hundred USD or manufactured from very
inexpensive components. Many resources can be found in the literature on designing
and building custom EMFI probes [ORJ+ 13, Sau13, BKH+ 19]. The basic building
blocks are a ferrite core, a copper wire, and a connector. A generic design of an
EMFI probe is depicted in Fig. 6.9.

2 https://www.newae.com/chipshouter
3 We use currencies stated in original papers, which is why some prices are in USD and some in EUR.
4 https://www.avtechpulse.com/medium/

Fig. 6.9 A depiction of a generic design of an electromagnetic fault injection probe

Recently, several interesting low-cost custom-built setups were proposed in the
literature, capable of performing various attack models:
• Defeating secure boot on a multicore 1GHz+ ARM was shown to be practically
feasible with just a 350 USD EMFI platform named BADFET [CH17].
• Bypassing firmware security protection in various configurations was done by a
device called SiliconToaster, a USB-powered EM injector capable of generating
1.2 kV of voltage [AH20].
• Privilege escalation using a malicious field-replaceable unit (FRU) with a
modified mosquito killer spark gap generator was described in [DO22].
From the above, it is evident that EMFI is a popular fault injection technique
that is easily accessible due to low cost but at the same time offers a localized
and powerful way to defeat secure components on modern chips. Compared to
optical fault injection, it does not need direct visibility over the chip and, therefore,
avoids the need for cumbersome decapsulation. Moreover, enthusiasts can
find many publicly available instructions on how to build a working EMFI setup
from easily available off-the-shelf components.

6.2.1.4 Rowhammer Attacks

Rowhammer is a remote fault injection technique that exploits the physical
characteristics of DRAM (dynamic random access memory) technology. The attack works
by aggressively reading from or writing to memory cells adjacent to the target cell,
which causes bit flips in the target [KDK+ 14]. It is made possible by advancing technology,
which allows shrinking the cells and placing them closer to each other. A smaller
cell holds less charge and therefore provides less tolerance to noise and
greater vulnerability to errors [MDB+ 02]. High cell density further extends this
vulnerability by creating electromagnetic coupling effects between the cells, producing
unwanted interactions [KKY+ 89]. The Rowhammer access patterns are depicted in
Fig. 6.10. The aggressor row refers to a row that is being hammered by the attacker
to flip the bits in the victim row. According to [JVDVF+ 22], three common patterns
were shown to be effective in flipping bits. The single-sided pattern uses one aggressor
row next to the victim row and the other one far apart. The double-sided pattern
tightly surrounds the victim row with aggressor rows, increasing the chance of bit
flips. Finally, there is an n-sided pattern, where n aggressor rows hammer the
n − 1 victim rows between them. The figure shows an example for n = 4.
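The spatial arrangement of the n-sided pattern can be sketched in a few lines of Python. This is only an illustration of the row layout, under the assumption that aggressor and victim rows alternate and that row indices are physically adjacent; the function name `n_sided_pattern` and the abstract row numbering are hypothetical, and a real attack additionally has to deal with the DRAM address mapping.

```python
def n_sided_pattern(first_row: int, n: int):
    """Return (aggressor_rows, victim_rows) for an n-sided layout.

    Assumption: rows alternate aggressor, victim, aggressor, ...,
    so n aggressors enclose n - 1 victims. For n = 2 this is the
    double-sided pattern.
    """
    if n < 2:
        raise ValueError("an n-sided pattern needs at least two aggressor rows")
    aggressors = [first_row + 2 * i for i in range(n)]
    victims = [first_row + 2 * i + 1 for i in range(n - 1)]
    return aggressors, victims


if __name__ == "__main__":
    # 4-sided example, as in panel (c) of Fig. 6.10
    print(n_sided_pattern(10, 4))
```

For n = 2 the function returns two aggressor rows around a single victim row, which is the double-sided arrangement described above.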

Fig. 6.10 Different ways of spatial arrangement of aggressor rows (black) and target/victim rows
(red/pink) in DRAM. (a) Single-sided. (b) Double-sided. (c) 4-sided

As DRAM is the most prevalent technology for main (volatile) memory in modern
devices, it is no surprise that the Rowhammer attack was demonstrated on a
plethora of targets, ranging from smartphones [VDVFL+ 16] to cloud environments
[ORBG17] to browsers [BRBG16]. Aside from obvious targets such as privilege
escalation or cryptanalytic fault attacks, Rowhammer also became popular in the
area of hardware attacks on neural networks [TIA+ 23, YRF20].
There is no need for specialized equipment to perform this attack—it is normally
triggered through a software program. While the standard modus operandi is a code
execution on the target machine, it was shown that Rowhammer can be realized
by sending network packets to the target machine over RDMA-enabled networks
[TKA+ 18].

6.3 Industry Standards

Hardware vulnerability assessment of cryptographic implementations has made its
way to industrial standardization. Some products, such as credit cards, need to be
evaluated and certified to show they are sufficiently resistant against SCA and FIA.
There are two main evaluation frameworks used in the industry: Common Criteria
and NIST FIPS 140. We will outline each of them below.

Note A good overview of cybersecurity standards in various industries is
maintained by the European Cyber Security Organisation (ECSO) in their
Overview of existing Cybersecurity standards and certification schemes
report [Org17].
A more detailed review of side-channel evaluation standards and methods
is given in [ABB+ 20].

6.3.1 Common Criteria

The Common Criteria for Information Technology Security Evaluation (colloquially
known as Common Criteria or CC) is an international standard published in
the ISO/IEC 15408 [2709] document. It is a general security evaluation framework
where users specify their security functional and assurance requirements in a
document called Security Target (ST). ST is defined as an “implementation-
dependent statement of security needs for a specific identified Target of Evaluation
(TOE),” where TOE is the product that is being certified. ST can conform to
one or more Protection Profiles (PPs)—generic documents written by a user or a
community for a family of products, such as smart cards, tokens, or firewalls.
The level of security of the evaluated TOE is divided into seven categories—
Evaluation Assurance Levels (EALs). While EAL 1 mostly focuses on functional
testing with minimum emphasis on security, EAL 7 requires formally verified design
and tests. Higher EALs are typically used for military-grade products and require
lengthy and expensive evaluation. The CC website⁵ lists the accredited labs capable
of certifying to a certain EAL and also provides the list of certified products.
The generic steps to be taken before the evaluation can start are as follows:
1. Choosing the National Scheme. CC Certificate Authorizing Schemes were
established by 17 countries. Each of them developed their own legislation and
norms.
2. Choosing the Target of Evaluation. The TOE and its boundary need to be defined.
The TOE can be a part of an IT product, an IT product, a set of IT products, a
technology, or a combination of those.
3. Picking an Evaluation Assurance Level. The evaluation requirements will be
based on the EAL. Also, the CC Test Laboratory needs to be certified to evaluate
TOEs with the chosen EAL.
4. Choosing the Protection Profile (optional). A suitable PP serves as a guiding
document, ensuring that the security features of the TOE align well with the
requirements tailored to the category of products to which TOE belongs.
5. Preparing the Security Target. The ST is an implementation-dependent declara-
tion of security needs for the given TOE.
6. Preparing the Evaluation Work Plan. This plan is prepared by the CC Test
Laboratory and approved by the Certification Body.
When it comes to SCA, the main area of interest within CC is smart cards. In
this context, two documents are used as guidelines for the evaluation, both of them
produced through the International Security Certification Initiative (ISCI) and the
Joint Interpretation Library (JIL) Hardware Attacks Subgroup (JHAS):
• Application of Attack Potential to Smart Cards [SI20a]: The document specifies
how to express the effort required by the attacker to mount a successful attack.
It is related to risk analysis methods and considers the following rating factors:
elapsed time, expertise, knowledge of TOE, access to TOE, used equipment, and
open samples.
• Attack Methods for Smart Cards and Similar Devices [SI20b]: This is a
companion document, under limited distribution. It describes the attacks themselves.

5 https://www.commoncriteriaportal.org

The rating method from the listed documents is also adopted in the security
evaluation specified by EMVCo, an organization managed by the major payment
security players (American Express, Discover, JCB, MasterCard, UnionPay, and
Visa). Their aim is to maintain a standardized security level of contact and
contactless payment systems by managing and evolving the security requirements
and related testing processes.

6.3.2 FIPS 140-3

The Federal Information Processing Standard (FIPS) 140-3, Security Requirements
for Cryptographic Modules [NIS19] is a document released by the US National
Institute of Standards and Technology (NIST). It is applicable to Federal agencies
that use cryptographic-based systems. Unlike CC, which specifies an evaluation
method and is independent of the underlying algorithms, FIPS 140-3 lists the
approved algorithms allowed for usage. The standard specifies six security levels
related to physical security, with level 1 stating requirements for protective coating
and level 6 requiring countermeasures against differential power/electromagnetic
analysis. The product certification is done through the Cryptographic Module
Validation Program (CMVP), which is a joint effort between the NIST and the
Canadian Centre for Cyber Security.
FIPS 140-3 links side-channel evaluation test metrics to the ISO/IEC
17825:2016 standard (with the new version coming in 2024) and the tools and
methods to the ISO/IEC 20085-1 and 20085-2 standards.
Appendix A
Proofs

A.1 Matrices

Let R be a commutative ring in this section.


In Definition 1.3.4, we have defined the determinant of a matrix A with
coefficients from a commutative ring R. Here we show that the value of $\det(A)$
in Eq. 1.6 does not depend on the choice of $i_0$.
Lemma A.1.1 For any $0 < i_0 \le n - 1$,

\[
\det(A) = \sum_{j=0}^{n-1} (-1)^{i_0+j} a_{i_0 j} \det(A_{i_0 j}) = \sum_{j=0}^{n-1} (-1)^{j} a_{0j} \det(A_{0j}).
\]

Proof We prove by induction. For $n = 1$, it is trivially true. For $n = 2$, we can write

\[
A = \begin{pmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{pmatrix}.
\]

Take $i_0 = 0$,

\[
\sum_{j=0}^{n-1} (-1)^{i_0+j} a_{i_0 j} \det(A_{i_0 j}) = \sum_{j=0}^{1} (-1)^{j} a_{0j} \det(A_{0j}) = a_{00} a_{11} - a_{01} a_{10}.
\]

Take $i_0 = 1$,

\[
\sum_{j=0}^{n-1} (-1)^{i_0+j} a_{i_0 j} \det(A_{i_0 j}) = \sum_{j=0}^{1} (-1)^{1+j} a_{1j} \det(A_{1j}) = -a_{10} a_{01} + a_{11} a_{00}.
\]

Since R is a commutative ring, $a_{00} a_{11} - a_{01} a_{10} = -a_{10} a_{01} + a_{11} a_{00}$, the lemma is
true for $n = 2$.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 447
X. Hou, J. Breier, Cryptography and Embedded Systems Security,
https://doi.org/10.1007/978-3-031-62205-2

Now suppose the lemma is true for $n = k$, where $k \ge 2$. In particular, for any
$0 \le j < k$ and $i_0 \neq 0$, we have

\[
\det(A_{0j}) = \sum_{\ell=0}^{k-1} (-1)^{\ell} a_{0j} \det(A_{00,j\ell}) = \sum_{\ell=0}^{k-1} (-1)^{i_0+\ell} a_{i_0 j} \det(A_{0 i_0, j\ell}), \tag{A.1}
\]

where $A_{0i,j\ell}$ is obtained from $A_{0j}$ by deleting the $i$th row and $\ell$th column. We will
show that the lemma is true for $n = k + 1$. Take $i_0 = 0$, we have

\[
\sum_{j=0}^{n-1} (-1)^{i_0+j} a_{i_0 j} \det(A_{i_0 j}) = \sum_{j=0}^{k} (-1)^{j} a_{0j} \det(A_{0j}).
\]

Take $i_0 \neq 0$, we have

\begin{align*}
\sum_{j=0}^{n-1} (-1)^{i_0+j} a_{i_0 j} \det(A_{i_0 j})
&= \sum_{j=0}^{k} (-1)^{i_0+j} a_{i_0 j} \left( \sum_{\ell=0}^{k-1} (-1)^{\ell} a_{0\ell} \det(A_{i_0 0, j\ell}) \right) \\
&= \sum_{j=0}^{k} (-1)^{i_0+j} a_{i_0 j} \sum_{\ell=0}^{k-1} (-1)^{\ell} a_{0j} \det(A_{i_0 0, j\ell}) \\
&= \sum_{j=0}^{k} (-1)^{j} a_{0j} \left( \sum_{\ell=0}^{k-1} (-1)^{i_0+\ell} a_{i_0 j} \det(A_{i_0 0, j\ell}) \right) \\
&= \sum_{j=0}^{k} (-1)^{j} a_{0j} \det(A_{0j}),
\end{align*}

where the last equality follows from Eq. A.1.
By mathematical induction, we have proved the lemma. □
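As a numerical sanity check of Lemma A.1.1, the following minimal Python sketch expands the determinant of a small integer matrix along each row and confirms that the results agree. The helper names `minor` and `det_along_row` are illustrative choices, not notation from the text, and the recursive expansion is meant only for small examples.

```python
def minor(matrix, row, col):
    """Submatrix obtained by deleting the given row and column (0-indexed)."""
    return [r[:col] + r[col + 1:] for i, r in enumerate(matrix) if i != row]


def det_along_row(matrix, i0=0):
    """Cofactor expansion of det(matrix) along row i0.

    The minors are always expanded along their first row; the lemma says
    the choice of i0 at the top level does not change the result.
    """
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    return sum(
        (-1) ** (i0 + j) * matrix[i0][j] * det_along_row(minor(matrix, i0, j))
        for j in range(n)
    )


if __name__ == "__main__":
    A = [[2, 0, 1],
         [1, 3, 2],
         [1, 1, 4]]
    # Every row gives the same value
    print([det_along_row(A, i0) for i0 in range(3)])
```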

A.2 Invertible Matrices for the Stochastic Leakage Model

In this section, we will focus on matrices with coefficients from the field of real
numbers $\mathbb{R}$. Let $n, m$ be two positive integers.
Definition A.2.1 For any vector $u = (u_0, u_1, \ldots, u_{n-1}) \in \mathbb{R}^n$, the Euclidean norm
of $u$, denoted $\|u\|_2$, is defined to be

\[
\|u\|_2 = \left( \sum_{i=0}^{n-1} u_i^2 \right)^{1/2}.
\]

In other words, the Euclidean norm of $u$ is the square root of the scalar product (see
Definition 1.3.2) between $u$ and $u^T$:

\[
\|u\|_2 = \left( u \cdot u^T \right)^{1/2}. \tag{A.2}
\]

Remark A.2.1 It is easy to see that if $\|u\|_2 = 0$, then $u = 0$.

Example A.2.1 Let $u = (1, 2, 5)$, then

\[
\|u\|_2 = \sqrt{1 + 2^2 + 5^2} = \sqrt{1 + 4 + 25} = \sqrt{30}.
\]

Definition A.2.2 For any two vectors $u, v \in \mathbb{R}^n$, the Euclidean distance, denoted
$d(u, v)$, is defined to be the Euclidean norm of the vector $u - v$:

\[
d(u, v) = \|u - v\|_2.
\]

Definition A.2.3 The row rank (resp. column rank) of a matrix A over $\mathbb{R}$, denoted $\operatorname{rank}(A)$,
is the maximum number of rows (resp. columns) in A that constitute a set of
independent vectors.
The following result is very useful for us. For a proof, see, e.g., [Ber09, Section
2.4].
Theorem A.2.1 The column rank of a matrix A is equal to its row rank.
Definition A.2.4 The rank of a matrix A, denoted $\operatorname{rank}(A)$, is the row rank of A. An
$n \times m$ matrix A is said to have full column rank if $\operatorname{rank}(A) = m$. It is said to have
full row rank if $\operatorname{rank}(A) = n$.

Example A.2.2 Let

\[
A = \begin{pmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 1 \end{pmatrix}.
\]

We can see that the vectors $\{(1, 0, 1), (0, 1, 0), (1, 0, 0)\}$ are independent but

\[
(1, 1, 1) = (1, 0, 1) + (0, 1, 0).
\]

Thus A has rank 3. And A has full row rank.


Let A be an $n \times m$ matrix. Take any row vector $u \in \mathbb{R}^n$, we note that $uA$ is a linear
combination (see Definition 1.3.9) of rows of A. By Definition 1.3.11, the rows of A
are linearly independent if and only if there does not exist a nonzero vector $u$ such
that $uA = 0$. Similar results hold for the columns of A. We have proved

Lemma A.2.1 An $n \times m$ matrix A has full row rank if and only if there does not
exist a nonzero vector $u \in \mathbb{R}^n$ such that $uA = 0$. A has full column rank if and only
if there does not exist a nonzero vector $u \in \mathbb{R}^m$ such that $Au^T = 0$.
Theorem A.2.2 An $n \times n$ square matrix A is invertible if and only if $\operatorname{rank}(A) = n$.
Proof We will provide the proof for the necessity. We refer the readers to [Goc11,
Section 3.6] for the proof of the sufficiency.
By Definition 1.3.3, A is invertible if and only if there exists an $n \times n$ matrix B
such that $AB = BA = I_n$, where $I_n$ is the $n$-dimensional identity matrix. Suppose
A is invertible and $\operatorname{rank}(A) \neq n$. Then by Lemma A.2.1, there exists a nonzero
vector $u \in \mathbb{R}^n$ such that $uA = 0$. Then we have

\[
uAB = 0B = 0.
\]

On the other hand, $AB = I_n$ gives $uAB = uI_n = u \neq 0$, a contradiction. □

Lemma A.2.2 Let M be an $n \times m$ matrix. The matrix $M^T M$ is invertible if and
only if M has full column rank.
Proof Let

\[
A = M^T M.
\]

Then A is a square matrix of size $m \times m$.

$\Longrightarrow$ Suppose A is invertible and $\operatorname{rank}(M) \neq m$. By Lemma A.2.1, there exists a
nonzero vector $u \in \mathbb{R}^m$ such that $Mu^T = 0$. We have

\[
Au^T = M^T M u^T = 0.
\]

By Lemma A.2.1 again, we know that $\operatorname{rank}(A) \neq m$ and according to
Theorem A.2.2, A is not invertible. A contradiction.

$\Longleftarrow$ Suppose $\operatorname{rank}(M) = m$ and A is not invertible. By Theorem A.2.2 and
Lemma A.2.1, there exists a nonzero vector $u \in \mathbb{R}^m$ such that $Au^T = 0$, which
gives (see Eq. A.2 and Remark A.2.1)

\[
0 = u M^T M u^T = (Mu^T)^T (Mu^T) = \|Mu^T\|_2^2 \implies Mu^T = 0.
\]

By Lemma A.2.1, M does not have full column rank. A contradiction. □



Let us consider the matrix $M_v$ from Sect. 4.3.2.2. Suppose we take a collection
of $m_v$ different values of $v$ such that the rows of $M_v$ are linearly independent, then
$M_v$ has row rank equal to $m_v$. By Theorem A.2.1, $M_v$ also has column rank $m_v$.
Since $M_v$ has $m_v$ columns, by definition, $M_v$ has full column rank. It follows from
Lemma A.2.2 that the matrix $M_v^T M_v$ is invertible.
In particular, with all possible values of $v$ appearing in the rows of $M_v$, we will
have $m_v$ linearly independent rows in $M_v$ given by those with Hamming weight 1:

\[
(1, 0, 0, \ldots, 0),\ (0, 1, 0, 0, \ldots, 0),\ (0, 0, 1, 0, \ldots, 0),\ \ldots,\ (0, 0, 0, \ldots, 0, 1).
\]

Then $M_v$ has row rank equal to $m_v$ and $M_v^T M_v$ will be an invertible matrix.
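Lemma A.2.2 can be checked on small examples with exact rational arithmetic. The sketch below is only illustrative: `rank` is a plain Gaussian elimination over fractions and `gram` computes $M^T M$; both names, and the two test matrices in the usage example, are hypothetical and not taken from the text.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix over the rationals via Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    found = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(len(rows)):
            if r != found and rows[r][col] != 0:
                factor = rows[r][col] / rows[found][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def gram(matrix):
    """Return M^T M for a matrix M given as a list of rows."""
    cols = list(zip(*matrix))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


if __name__ == "__main__":
    full = [[1, 0], [0, 1], [1, 1]]        # full column rank (m = 2)
    deficient = [[1, 2], [2, 4], [3, 6]]   # second column is twice the first
    for M in (full, deficient):
        print(rank(M), rank(gram(M)))
```

In the first case both ranks equal the number of columns, so $M^T M$ is invertible by Theorem A.2.2; in the second case both ranks drop to 1.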
Appendix B
Long Division

In primary school, we learned to do long division for calculating the quotient and
remainder of dividing one integer by another integer. For example, to compute

\[
1346 = 25 \times q + r,
\]

we can write

        53
25 ) 1346
     125
       96
       75
       21

and we get $q = 53$, $r = 21$.


Similarly, let us take two polynomials $f(x), g(x) \in F[x]$, where F is a field. We
can also compute $f(x)$ divided by $g(x)$ using long division. Let $F = \mathbb{F}_2$. Take

\[
f(x) = x^8 + x^4 + x^3 + x + 1 \in \mathbb{F}_2[x],
\]

and

\[
g(x) = x + 1 \in \mathbb{F}_2[x].
\]

We have


          x^7 + x^6 + x^5 + x^4 + x^2 + x
x + 1 ) x^8 + x^4 + x^3 + x + 1
        x^8 + x^7
        x^7 + x^4 + x^3 + x + 1
        x^7 + x^6
        x^6 + x^4 + x^3 + x + 1
        x^6 + x^5
        x^5 + x^4 + x^3 + x + 1
        x^5 + x^4
        x^3 + x + 1
        x^3 + x^2
        x^2 + x + 1
        x^2 + x
        1

Thus (see Example 1.5.21)

\[
f(x) = (x + 1)(x^7 + x^6 + x^5 + x^4 + x^2 + x) + 1.
\]
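The same division can be reproduced mechanically. The sketch below assumes a common encoding that is not introduced in this appendix: a polynomial over $\mathbb{F}_2$ is stored as an integer whose bit $i$ is the coefficient of $x^i$, so that subtraction becomes XOR. The function name `gf2_divmod` is hypothetical.

```python
def gf2_divmod(f: int, g: int):
    """Long division of polynomials over F_2, encoded as bit masks.

    Returns (quotient, remainder) with f = g * quotient + remainder
    and deg(remainder) < deg(g).
    """
    if g == 0:
        raise ZeroDivisionError("division by the zero polynomial")
    quotient, remainder = 0, f
    deg_g = g.bit_length() - 1
    while remainder and remainder.bit_length() - 1 >= deg_g:
        shift = remainder.bit_length() - 1 - deg_g
        quotient ^= 1 << shift    # add x^shift to the quotient
        remainder ^= g << shift   # subtract x^shift * g(x); XOR in F_2
    return quotient, remainder


if __name__ == "__main__":
    f = 0b100011011  # x^8 + x^4 + x^3 + x + 1
    g = 0b11         # x + 1
    q, r = gf2_divmod(f, g)
    print(bin(q), bin(r))
```

Each loop iteration corresponds to one subtraction step in the written long division.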
Appendix C
DES Sbox

Table C.1 Sboxes in DES (Sect. 3.1.1) round function


15 1 8 14 6 11 3 4 9 7 2 13 12 0 5 10
3 13 4 7 15 2 8 14 12 0 1 10 6 9 11 5
0 14 7 11 10 4 13 1 5 8 12 6 9 3 2 15
13 8 10 1 3 15 4 2 11 6 7 12 0 5 14 9
(a) $\mathrm{SB}^2_{\mathrm{DES}}$
10 0 9 14 6 3 15 5 1 13 12 7 11 4 2 8
13 7 0 9 3 4 6 10 2 8 5 14 12 11 15 1
13 6 4 9 8 15 3 0 11 1 2 12 5 10 14 7
1 10 13 0 6 9 8 7 4 15 14 3 11 5 2 12
(b) $\mathrm{SB}^3_{\mathrm{DES}}$
7 13 14 3 0 6 9 10 1 2 8 5 11 12 4 15
13 8 11 5 6 15 0 3 4 7 2 12 1 10 14 9
10 6 9 0 12 11 7 13 15 1 3 14 5 2 8 4
3 15 0 6 10 1 13 8 9 4 5 11 12 7 2 14
(c) $\mathrm{SB}^4_{\mathrm{DES}}$
2 12 4 1 7 10 11 6 8 5 3 15 13 0 14 9
14 11 2 12 4 7 13 1 5 0 15 10 3 9 8 6
4 2 1 11 10 13 7 8 15 9 12 5 6 3 0 14
11 8 12 7 1 14 2 13 6 15 0 9 10 4 5 3
(d) $\mathrm{SB}^5_{\mathrm{DES}}$
12 1 10 15 9 2 6 8 0 13 3 4 14 7 5 11
10 15 4 2 7 12 9 5 6 1 13 14 0 11 3 8
9 14 15 5 2 8 12 3 7 0 4 10 1 13 11 6
4 3 2 12 9 5 15 10 11 14 1 7 6 0 8 13
(e) $\mathrm{SB}^6_{\mathrm{DES}}$
4 11 2 14 15 0 8 13 3 12 9 7 5 10 6 1
13 0 11 7 4 9 1 10 14 3 5 12 2 15 8 6
1 4 11 13 12 3 7 14 10 15 6 8 0 5 9 2
6 11 13 8 1 4 10 7 9 5 0 15 14 2 3 12
(f) $\mathrm{SB}^7_{\mathrm{DES}}$
13 2 8 4 6 15 11 1 10 9 3 14 5 0 12 7
1 15 13 8 10 3 7 4 12 5 6 11 0 14 9 2
7 11 4 1 9 12 14 2 0 6 10 13 15 3 5 8
2 1 14 7 4 10 8 13 15 12 9 0 3 5 6 11
(g) $\mathrm{SB}^8_{\mathrm{DES}}$
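To illustrate how such a table is read, the following sketch looks up one entry of the second Sbox. It assumes the standard DES convention, which is described in Sect. 3.1.1 rather than in this appendix: of the six input bits, the first and last select the row and the middle four select the column. The names `SB2_DES` and `des_sbox_lookup` are hypothetical.

```python
# Rows of the second DES Sbox, copied from Table C.1 (a)
SB2_DES = [
    [15, 1, 8, 14, 6, 11, 3, 4, 9, 7, 2, 13, 12, 0, 5, 10],
    [3, 13, 4, 7, 15, 2, 8, 14, 12, 0, 1, 10, 6, 9, 11, 5],
    [0, 14, 7, 11, 10, 4, 13, 1, 5, 8, 12, 6, 9, 3, 2, 15],
    [13, 8, 10, 1, 3, 15, 4, 2, 11, 6, 7, 12, 0, 5, 14, 9],
]


def des_sbox_lookup(table, six_bits: int) -> int:
    """Map a 6-bit input to a 4-bit output using a 4 x 16 Sbox table."""
    if not 0 <= six_bits < 64:
        raise ValueError("input must be a 6-bit value")
    row = ((six_bits >> 4) & 0b10) | (six_bits & 0b1)  # first and last bit
    col = (six_bits >> 1) & 0b1111                      # middle four bits
    return table[row][col]


if __name__ == "__main__":
    print(des_sbox_lookup(SB2_DES, 0b011011))
```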
Appendix D
Algebraic Normal Forms for PRESENT
Sbox Output Bits

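For $i = 1, 2, 3$, define

\[
\varphi_i : \mathbb{F}_2^4 \to \mathbb{F}_2, \qquad x \mapsto \mathrm{SB}_{\mathrm{PRESENT}}(x)_i,
\]

where $\mathrm{SB}_{\mathrm{PRESENT}}(x)_i$ is the $i$th bit of $\mathrm{SB}_{\mathrm{PRESENT}}(x)$, the PRESENT Sbox output
corresponding to $x$. In this section, we will compute the algebraic normal forms for
$\varphi_i$. Similarly to Table 3.13, we construct the table for each $\varphi_i$; see Tables D.1, D.2,
and D.3.
The coefficients $\lambda$ are calculated based on Eq. 3.10 and the following equations:

\begin{align*}
\lambda_{0000} &= \varphi_i(0000), \\
\lambda_{0001} &= \varphi_i(0000) + \varphi_i(0001), \\
\lambda_{0010} &= \varphi_i(0000) + \varphi_i(0010), \\
\lambda_{0011} &= \varphi_i(0000) + \varphi_i(0010) + \varphi_i(0001) + \varphi_i(0011), \\
\lambda_{0100} &= \varphi_i(0000) + \varphi_i(0100), \\
\lambda_{0101} &= \varphi_i(0000) + \varphi_i(0001) + \varphi_i(0100) + \varphi_i(0101), \\
\lambda_{0110} &= \varphi_i(0000) + \varphi_i(0010) + \varphi_i(0100) + \varphi_i(0110), \\
\lambda_{0111} &= \sum_{x=0}^{7} \varphi_i(x), \\
\lambda_{1000} &= \varphi_i(0000) + \varphi_i(1000), \\
\lambda_{1001} &= \varphi_i(0000) + \varphi_i(0001) + \varphi_i(1000) + \varphi_i(1001),
\end{align*}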

Table D.1 The Boolean function ϕ1 takes input x and outputs the 1st bit of SB_PRESENT(x). The
second last row lists the output of ϕ1 for different input values. The last row lists the coefficients
(Eq. 3.10) for the algebraic normal form of ϕ1
x              0 1 2 3 4 5 6 7 8 9 A B C D E F
x3             0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
x2             0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
x1             0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1
x0             0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
SB_PRESENT(x)  C 5 6 B 9 0 A D 3 E F 8 4 7 1 2
ϕ1(x)          0 0 1 1 0 0 1 0 1 1 1 0 0 1 0 1
λx             0 0 1 0 0 0 0 1 1 0 1 1 1 1 0 0

Table D.2 The Boolean function ϕ2 takes input x and outputs the 2nd bit of SB_PRESENT(x). The
second last row lists the output of ϕ2 for different input values. The last row lists the coefficients
(Eq. 3.10) for the algebraic normal form of ϕ2
x              0 1 2 3 4 5 6 7 8 9 A B C D E F
x3             0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
x2             0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
x1             0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1
x0             0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
SB_PRESENT(x)  C 5 6 B 9 0 A D 3 E F 8 4 7 1 2
ϕ2(x)          1 1 1 0 0 0 0 1 0 1 1 0 1 1 0 0
λx             1 0 0 1 1 0 0 0 1 1 1 1 0 1 0 0

Table D.3 The Boolean function ϕ3 takes input x and outputs the 3rd bit of SB_PRESENT(x). The
second last row lists the output of ϕ3 for different input values. The last row lists the coefficients
(Eq. 3.10) for the algebraic normal form of ϕ3
x              0 1 2 3 4 5 6 7 8 9 A B C D E F
x3             0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
x2             0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
x1             0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1
x0             0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
SB_PRESENT(x)  C 5 6 B 9 0 A D 3 E F 8 4 7 1 2
ϕ3(x)          1 0 0 1 1 0 1 1 0 1 1 1 0 0 0 0
λx             1 1 1 0 0 0 1 1 1 0 0 1 0 1 0 0

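\begin{align*}
\lambda_{1010} &= \varphi_i(0000) + \varphi_i(0010) + \varphi_i(1000) + \varphi_i(1010), \\
\lambda_{1011} &= \varphi_i(0) + \varphi_i(1) + \varphi_i(2) + \varphi_i(3) + \varphi_i(8) + \varphi_i(9) + \varphi_i(A) + \varphi_i(B), \\
\lambda_{1100} &= \varphi_i(0000) + \varphi_i(0100) + \varphi_i(1000) + \varphi_i(1100), \\
\lambda_{1101} &= \varphi_i(0) + \varphi_i(1) + \varphi_i(4) + \varphi_i(5) + \varphi_i(8) + \varphi_i(9) + \varphi_i(C) + \varphi_i(D), \\
\lambda_{1110} &= \varphi_i(0) + \varphi_i(2) + \varphi_i(4) + \varphi_i(6) + \varphi_i(8) + \varphi_i(A) + \varphi_i(C) + \varphi_i(E), \\
\lambda_{1111} &= \sum_{x=0}^{F} \varphi_i(x).
\end{align*}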

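By Eq. 3.9, we have

\begin{align*}
\varphi_1(x) &= \lambda_{0010} x_1 + \lambda_{0111} x_2 x_1 x_0 + \lambda_{1000} x_3 + \lambda_{1010} x_3 x_1 \\
&\quad + \lambda_{1011} x_3 x_1 x_0 + \lambda_{1100} x_3 x_2 + \lambda_{1101} x_3 x_2 x_0 \\
&= x_1 + x_3 + x_1 x_3 + x_2 x_3 + x_0 x_1 x_2 + x_0 x_1 x_3 + x_0 x_2 x_3, \\
\varphi_2(x) &= \lambda_{0000} + \lambda_{0011} x_1 x_0 + \lambda_{0100} x_2 + \lambda_{1000} x_3 + \lambda_{1001} x_3 x_0 \\
&\quad + \lambda_{1010} x_3 x_1 + \lambda_{1011} x_3 x_1 x_0 + \lambda_{1101} x_3 x_2 x_0 \\
&= 1 + x_2 + x_3 + x_0 x_1 + x_0 x_3 + x_1 x_3 + x_0 x_1 x_3 + x_0 x_2 x_3, \\
\varphi_3(x) &= \lambda_{0000} + \lambda_{0001} x_0 + \lambda_{0010} x_1 + \lambda_{0110} x_2 x_1 + \lambda_{0111} x_2 x_1 x_0 \\
&\quad + \lambda_{1000} x_3 + \lambda_{1011} x_3 x_1 x_0 + \lambda_{1101} x_3 x_2 x_0 \\
&= 1 + x_0 + x_1 + x_3 + x_1 x_2 + x_0 x_1 x_2 + x_0 x_1 x_3 + x_0 x_2 x_3.
\end{align*}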
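The coefficient rows of Tables D.1, D.2, and D.3 can be recomputed with a short script. The sketch assumes, consistently with the equations listed above, that each coefficient is the sum over $\mathbb{F}_2$ of $\varphi_i(x)$ over all inputs $x$ whose set bits are contained in the index; the function names `output_bit` and `anf_coefficients` are hypothetical.

```python
# PRESENT Sbox, as listed in Tables D.1 to D.3
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def output_bit(i: int):
    """Truth table of phi_i: the i-th bit (bit 0 is the least significant) of the Sbox output."""
    return [(y >> i) & 1 for y in PRESENT_SBOX]


def anf_coefficients(truth_table):
    """ANF coefficients: lambda_u is the XOR of phi(x) over all x covered by u."""
    size = len(truth_table)
    coefficients = []
    for u in range(size):
        acc = 0
        for x in range(size):
            if x & u == x:  # every set bit of x is also set in u
                acc ^= truth_table[x]
        coefficients.append(acc)
    return coefficients


if __name__ == "__main__":
    for i in (1, 2, 3):
        print(i, anf_coefficients(output_bit(i)))
```

A coefficient equal to 1 at index $u$ means that the monomial formed by the variables at the set bit positions of $u$ appears in the algebraic normal form.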
Appendix E
Encoding-Based Countermeasure for
Symmetric Block Ciphers

In Table E.1, we list values in $T_{SG}$, which are signals for each integer between
00 and FF with Hamming weight 6, computed with the stochastic leakage model
obtained in Code-SCA Step 6 from Sect. 4.5.1.1. The sorted version of $T_{SG}$ is shown
in Table E.2, where the signals are in ascending order and the words from $\mathbb{F}_2^8$ with
Hamming weight 6 are recorded accordingly.


Table E.1 Table T_SG, estimated signals for each integer between 00 and FF
with Hamming weight 6, computed with the stochastic leakage model obtained in
Code-SCA Step 6 from Sect. 4.5.1.1. The first (resp. second) column contains the
hexadecimal (resp. binary) representations of the integers. The last column lists
the corresponding estimated signals
3F  00111111  −0.00980
5F  01011111  −0.00976
6F  01101111  −0.01058
77  01110111  −0.01066
7B  01111011  −0.01053
7D  01111101  −0.01059
7E  01111110  −0.00943
9F  10011111  −0.00987
AF  10101111  −0.01069
B7  10110111  −0.01078
BB  10111011  −0.01065
BD  10111101  −0.01071
BE  10111110  −0.00955
CF  11001111  −0.01066
D7  11010111  −0.01074
DB  11011011  −0.01061
DD  11011101  −0.01067
DE  11011110  −0.00951
E7  11100111  −0.01156
EB  11101011  −0.01143
ED  11101101  −0.01149
EE  11101110  −0.01033
F3  11110011  −0.01152
F5  11110101  −0.01158
F6  11110110  −0.01042
F9  11111001  −0.01145
FA  11111010  −0.01029
FC  11111100  −0.01035

Table E.2 Sorted version of T_SG from Table E.1 such that the estimated signals
(values in the last column) are in ascending order. The hexadecimal (resp. binary)
representations of the corresponding integers are in the first (resp. second)
column. Words highlighted in blue constitute the chosen binary code with
Algorithm 4.5
F5  11110101  −0.01158
E7  11100111  −0.01156
F3  11110011  −0.01152
ED  11101101  −0.01149
F9  11111001  −0.01145
EB  11101011  −0.01143
B7  10110111  −0.01078
D7  11010111  −0.01074
BD  10111101  −0.01071
AF  10101111  −0.01069
DD  11011101  −0.01067
77  01110111  −0.01066
CF  11001111  −0.01066
BB  10111011  −0.01065
DB  11011011  −0.01061
7D  01111101  −0.01059
6F  01101111  −0.01058
7B  01111011  −0.01053
F6  11110110  −0.01042
FC  11111100  −0.01035
EE  11101110  −0.01033
FA  11111010  −0.01029
9F  10011111  −0.00987
3F  00111111  −0.00980
5F  01011111  −0.00976
BE  10111110  −0.00955
DE  11011110  −0.00951
7E  01111110  −0.00943
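The step from Table E.1 to Table E.2 is a plain sort of the candidate words by their estimated signal. The sketch below reproduces it with the values copied from Table E.1; the names `T_SG` and `sorted_words` are illustrative, and the selection of the final code by Algorithm 4.5 is not reproduced here.

```python
# Estimated signals from Table E.1, keyed by the 8-bit word
T_SG = {
    0x3F: -0.00980, 0x5F: -0.00976, 0x6F: -0.01058, 0x77: -0.01066,
    0x7B: -0.01053, 0x7D: -0.01059, 0x7E: -0.00943, 0x9F: -0.00987,
    0xAF: -0.01069, 0xB7: -0.01078, 0xBB: -0.01065, 0xBD: -0.01071,
    0xBE: -0.00955, 0xCF: -0.01066, 0xD7: -0.01074, 0xDB: -0.01061,
    0xDD: -0.01067, 0xDE: -0.00951, 0xE7: -0.01156, 0xEB: -0.01143,
    0xED: -0.01149, 0xEE: -0.01033, 0xF3: -0.01152, 0xF5: -0.01158,
    0xF6: -0.01042, 0xF9: -0.01145, 0xFA: -0.01029, 0xFC: -0.01035,
}

# All 8-bit words with Hamming weight 6; these should be exactly the table keys
weight_six_words = [w for w in range(256) if bin(w).count("1") == 6]

# Ascending order of estimated signal; Python's sort is stable, so words with
# equal signals keep their order from Table E.1
sorted_words = sorted(T_SG, key=T_SG.get)

if __name__ == "__main__":
    for word in sorted_words:
        print(f"{word:02X}  {word:08b}  {T_SG[word]:.5f}")
```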
References

[2709] ISO/IEC JTC 1/SC 27. ISO/IEC 15408-1: Information technology—Security
techniques—Evaluation criteria for IT security—Part 1: Introduction and general
model, International Organization for Standardization, 2009.
[ABB+ 20] Melissa Azouaoui, Davide Bellizia, Ileana Buhan, Nicolas Debande, Sébastien
Duval, Christophe Giraud, Éliane Jaulmes, François Koeune, Elisabeth Oswald,
François-Xavier Standaert, et al. A systematic appraisal of side channel evaluation
strategies. In Security Standardisation Research: 6th International Conference,
SSR 2020, London, UK, November 30–December 1, 2020, Proceedings 6, pages
46–66. Springer, 2020.
[ABC+ 17] Stéphanie Anceau, Pierre Bleuet, Jessy Clédière, Laurent Maingault, Jean-luc
Rainard, and Rémi Tucoulou. Nanofocused x-ray beam to reprogram secure
circuits. In International Conference on Cryptographic Hardware and Embedded
Systems, pages 175–188. Springer, 2017.
[ABF+ 03] Christian Aumüller, Peter Bier, Wieland Fischer, Peter Hofreiter, and J-P Seifert.
Fault attacks on RSA with crt: Concrete results and practical countermeasures.
In International Workshop on Cryptographic Hardware and Embedded Systems,
pages 260–275. Springer, 2003.
[AFV07] Frederic Amiel, Benoit Feix, and Karine Villegas. Power analysis for secret
recovering and reverse engineering of public key algorithms. In Selected Areas in
Cryptography: 14th International Workshop, SAC 2007, Ottawa, Canada, August
16–17, 2007, Revised Selected Papers 14, pages 110–125. Springer, 2007.
[AG01] Mehdi-Laurent Akkar and Christophe Giraud. An implementation of DES and
AES, secure against some attacks. In Cryptographic Hardware and Embedded
Systems–CHES 2001: Third International Workshop Paris, France, May 14–16,
2001 Proceedings 3, pages 309–318. Springer, 2001.
[Age15] National Security Agency. Commercial National Security Algorithm Suite. https://
apps.nsa.gov/iaarchive/programs/iad-initiatives/cnsa-suite.cfm, 2015.
[AGF21] Rabin Yu Acharya, Fatemeh Ganji, and Domenic Forte. Infoneat: Information
theory-based neuroevolution of augmenting topologies for side-channel analysis.
arXiv preprint arXiv:2105.00117, 2021.
[AH20] Karim M Abdellatif and Olivier Hériveaux. SiliconToaster: a cheap and
programmable EM injector for extracting secrets. In 2020 Workshop on Fault Detection
and Tolerance in Cryptography (FDTC), pages 35–40. IEEE, 2020.


[AK96] Ross Anderson and Markus Kuhn. Tamper resistance-a cautionary note. In
Proceedings of the second Usenix workshop on electronic commerce, volume 2,
pages 1–11, 1996.
[ANP20] Alexandre Adomnicai, Zakaria Najm, and Thomas Peyrin. Fixslicing: a new gift
representation: fast constant-time implementations of gift and gift-cofb on arm
cortex-m. IACR Transactions on Cryptographic Hardware and Embedded Systems,
pages 402–427, 2020.
[AP20] Alexandre Adomnicai and Thomas Peyrin. Fixslicing AES-like ciphers: New
bitsliced AES speed records on ARM-cortex M and RISC-V. Cryptology ePrint
Archive, 2020.
[APZ21] Melissa Azouaoui, Kostas Papagiannopoulos, and Dominik Zürner. Blind side-
channel SIFA. In 2021 Design, Automation & Test in Europe Conference &
Exhibition (DATE), pages 555–560. IEEE, 2021.
[Atm16] Atmel. AVR Instruction Set Manual. http://ww1.microchip.com/downloads/en/
devicedoc/atmel-0856-avr-instruction-set-manual.pdf, 2016.
[AV13] Kostas Papagiannopoulos Aram Verstegen. Present speed implementation. https://
github.com/kostaspap88/PRESENT_speed_implementation, 2013.
[AWKS12] Kahraman D Akdemir, Zhen Wang, Mark Karpovsky, and Berk Sunar. Design
of cryptographic devices resilient to fault injection attacks using nonlinear robust
codes. Fault analysis in cryptography, pages 171–199, 2012.
[BBB+ 07] Elaine Barker, William Barker, William Burr, William Polk, and Miles Smid. NIST
special publication 800-57. NIST Special publication, 2007.
[BBB+ 18] Anubhab Baksi, Shivam Bhasin, Jakub Breier, Mustafa Khairallah, and Thomas
Peyrin. Protecting block ciphers against differential fault attacks without re-keying.
In 2018 IEEE International Symposium on Hardware Oriented Security and Trust
(HOST), pages 191–194. IEEE, 2018.
[BBB+ 21] Anubhab Baksi, Shivam Bhasin, Jakub Breier, Mustafa Khairallah, Thomas Peyrin,
Sumanta Sarkar, and Siang Meng Sim. DEFAULT: Cipher Level Resistance
Against Differential Fault Attack. In Mehdi Tibouchi and Huaxiong Wang,
editors, Advances in Cryptology—ASIACRYPT 2021, pages 124–156, Cham, 2021.
Springer International Publishing.
[BBB+ 22] Lejla Batina, Shivam Bhasin, Jakub Breier, Xiaolu Hou, and Dirmanto Jap.
On implementation-level security of edge-based machine learning models. In
Security and Artificial Intelligence: A Crossdisciplinary Approach, pages 335–359.
Springer, 2022.
[BBH+ 20] Shivam Bhasin, Jakub Breier, Xiaolu Hou, Dirmanto Jap, Romain Poussier,
and Siang Meng Sim. SITM: See-in-the-middle side-channel assisted middle
round differential cryptanalysis on SPN block ciphers. IACR Transactions on
Cryptographic Hardware and Embedded Systems, pages 95–122, 2020.
[BBJP19] Lejla Batina, Shivam Bhasin, Dirmanto Jap, and Stjepan Picek. CSI NN:
Reverse engineering of neural network architectures through electromagnetic side
channel. In 28th USENIX Security Symposium (USENIX Security 19), pages 515–
532, 2019.
[BC16] Jakub Breier and Chien-Ning Chen. On determining optimal parameters for testing
devices against laser fault attacks. In 2016 International Symposium on Integrated
Circuits (ISIC), pages 1–4. IEEE, 2016.
[BCDG10] Alexandre Berzati, Cécile Canovas-Dumas, and Louis Goubin. Public key
perturbation of randomized RSA implementations. In Cryptographic Hardware
and Embedded Systems, CHES 2010: 12th International Workshop, Santa Barbara,
USA, August 17–20, 2010. Proceedings 12, pages 306–319. Springer, 2010.
[BCG08] Alexandre Berzati, Cécile Canovas, and Louis Goubin. Perturbating RSA public
keys: An improved attack. In Cryptographic Hardware and Embedded Systems–
CHES 2008: 10th International Workshop, Washington, DC, USA, August 10–13,
2008. Proceedings 10, pages 380–395. Springer, 2008.

[BCMCC06] Eric Brier, Benoît Chevallier-Mames, Mathieu Ciet, and Christophe Clavier. Why
one should also secure RSA public key elements. In Cryptographic Hardware and
Embedded Systems-CHES 2006: 8th International Workshop, Yokohama, Japan,
October 10–13, 2006. Proceedings 8, pages 324–338. Springer, 2006.
[BD00] Dan Boneh and Glenn Durfee. Cryptanalysis of RSA with private key d less than
N^0.292. IEEE Transactions on Information Theory, 46(4):1339–1349, 2000.
[BD16] Elaine Barker and Quynh Dang. NIST special publication 800-57 part 1, revision
4. NIST Special publication, 2016.
[BDF98] Dan Boneh, Glenn Durfee, and Yair Frankel. An attack on RSA given a small
fraction of the private key bits. In International Conference on the Theory and
Application of Cryptology and Information Security, pages 25–34. Springer, 1998.
[BDF+ 09] Shivam Bhasin, Jean-Luc Danger, Florent Flament, Tarik Graba, Sylvain Guilley,
Yves Mathieu, Maxime Nassar, Laurent Sauvage, and Nidhal Selmane. Combined
SCA and DFA countermeasures integrable in a FPGA design flow. In 2009
International Conference on Reconfigurable Computing and FPGAs, pages 213–
218. IEEE, 2009.
[BDH+ 97] Feng Bao, Robert H Deng, Yongfei Han, A Jeng, A Desai Narasimhalu, and
T Ngair. Breaking public key cryptosystems on tamper resistant devices in the
presence of transient faults. In International Workshop on Security Protocols, pages
115–124. Springer, 1997.
[BDL97] Dan Boneh, Richard A DeMillo, and Richard J Lipton. On the importance of
checking cryptographic protocols for faults. In International conference on the
theory and applications of cryptographic techniques, pages 37–51. Springer, 1997.
[BDPA13] Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van Assche. Keccak. In
Annual international conference on the theory and applications of cryptographic
techniques, pages 313–314. Springer, 2013.
[BDPVA07] Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van Assche. Sponge
functions. In ECRYPT hash workshop, 2007.
[BECN+ 06] Hagai Bar-El, Hamid Choukri, David Naccache, Michael Tunstall, and Claire
Whelan. The sorcerer’s apprentice guide to fault attacks. Proceedings of the IEEE,
94(2):370–382, 2006.
[BEG13] Nasour Bagheri, Reza Ebrahimpour, and Navid Ghaedi. New differential fault
analysis on PRESENT. EURASIP Journal on Advances in Signal Processing, 2013:1–
10, 2013.
[Bei11] Amos Beimel. Secret-sharing schemes: A survey. In International conference on
coding and cryptology, pages 11–46. Springer, 2011.
[Ber09] Dennis S Bernstein. Matrix mathematics: theory, facts, and formulas. Princeton
university press, 2009.
[BFGV12] Josep Balasch, Sebastian Faust, Benedikt Gierlichs, and Ingrid Verbauwhede.
Theory and practice of a leakage resilient masking scheme. In Advances in
Cryptology–ASIACRYPT 2012: 18th International Conference on the Theory and
Application of Cryptology and Information Security, Beijing, China, December 2–
6, 2012. Proceedings 18, pages 758–775. Springer, 2012.
[BFP19] Claudio Bozzato, Riccardo Focardi, and Francesco Palmarini. Shaping the glitch:
optimizing voltage fault injection attacks. IACR Transactions on Cryptographic
Hardware and Embedded Systems, pages 199–224, 2019.
[BGE+ 17] Jan Burchard, Maël Gay, Ange-Salomé Messeng Ekossono, Jan Horáček, Bernd
Becker, Tobias Schubert, Martin Kreuzer, and Ilia Polian. Autofault: towards
automatic construction of algebraic fault attacks. In 2017 Workshop on Fault
Diagnosis and Tolerance in Cryptography (FDTC), pages 65–72. IEEE, 2017.
[BGK04] Johannes Blömer, Jorge Guajardo, and Volker Krummel. Provably secure masking
of AES. In International workshop on selected areas in cryptography, pages 69–
83. Springer, 2004.
[BGLP13] Ryad Benadjila, Jian Guo, Victor Lomné, and Thomas Peyrin. Implementing
lightweight block ciphers on x86 architectures. In International Conference on
Selected Areas in Cryptography, pages 324–351. Springer, 2013.
[BGLT04] Marco Bucci, Michele Guglielmo, Raimondo Luzzi, and Alessandro Trifiletti.
A power consumption randomization countermeasure for DPA-resistant crypto-
graphic processors. In Integrated Circuit and System Design. Power and Timing
Modeling, Optimization and Simulation: 14th International Workshop, PATMOS
2004, Santorini, Greece, September 15–17, 2004. Proceedings 14, pages 481–490.
Springer, 2004.
[BGM+ 03] Luca Benini, Angelo Galati, Alberto Macii, Enrico Macii, and Massimo Poncino.
Energy-efficient data scrambling on memory-processor interfaces. In Proceedings
of the 2003 international symposium on Low power electronics and design, pages
26–29, 2003.
[BGN+ 15] Begül Bilgin, Benedikt Gierlichs, Svetla Nikova, Ventzislav Nikov, and Vincent
Rijmen. Trade-offs for threshold implementations illustrated on AES. IEEE
Transactions on Computer-Aided Design of Integrated Circuits and Systems,
34(7):1188–1200, 2015.
[BGV11] Josep Balasch, Benedikt Gierlichs, and Ingrid Verbauwhede. An in-depth and
black-box characterization of the effects of clock glitches on 8-bit MCUs. In
2011 Workshop on Fault Diagnosis and Tolerance in Cryptography, pages 105–
114. IEEE, 2011.
[BH15] Jakub Breier and Wei He. Multiple fault attack on PRESENT with a hardware
trojan implementation in FPGA. In Gabriel Ghinita and Pedro Peris-Lopez, editors,
2015 International Workshop on Secure Internet of Things, SIoT 2015, Vienna,
Austria, September 21–25, 2015, pages 58–64. IEEE Computer Society, 2015.
[BH17] Jakub Breier and Xiaolu Hou. Feeding two cats with one bowl: On designing a fault
and side-channel resistant software encoding scheme. In Topics in Cryptology–CT-
RSA 2017: The Cryptographers’ Track at the RSA Conference 2017, San Francisco,
CA, USA, February 14–17, 2017, Proceedings, pages 77–94. Springer, 2017.
[BH22] Jakub Breier and Xiaolu Hou. How practical are fault injection attacks, really?
IEEE Access, 10:113122–113130, 2022.
[BHJ+ 18] Jakub Breier, Xiaolu Hou, Dirmanto Jap, Lei Ma, Shivam Bhasin, and Yang Liu.
Practical fault attack on deep neural networks. In Proceedings of the 2018 ACM
SIGSAC Conference on Computer and Communications Security, pages 2204–
2206. ACM, 2018.
[BHL18] Jakub Breier, Xiaolu Hou, and Yang Liu. Fault attacks made easy: Differential
fault analysis automation on assembly code. IACR Transactions on Cryptographic
Hardware and Embedded Systems, pages 96–122, 2018.
[BHL19] Jakub Breier, Xiaolu Hou, and Yang Liu. On evaluating fault resilient encoding
schemes in software. IEEE Transactions on Dependable and Secure Computing,
18(3):1065–1079, 2019.
[BHOS22] Jakub Breier, Xiaolu Hou, Martín Ochoa, and Jesus Solano. Foobar: Fault fooling
backdoor attack on neural network training. IEEE Transactions on Dependable
and Secure Computing, 2022.
[BHT01] Eric Brier, Helena Handschuh, and Christophe Tymen. Fast primitives for internal
data scrambling in tamper resistant hardware. In Cryptographic Hardware and
Embedded Systems–CHES 2001: Third International Workshop Paris, France, May
14–16, 2001 Proceedings 3, pages 16–27. Springer, 2001.
[BHvW12] Lejla Batina, Jip Hogenboom, and Jasper GJ van Woudenberg. Getting more
from PCA: first results of using principal component analysis for extensive power
analysis. In Topics in Cryptology–CT-RSA 2012: The Cryptographers’ Track at
the RSA Conference 2012, San Francisco, CA, USA, February 27–March 2, 2012.
Proceedings, pages 383–397. Springer, 2012.
[Bih97] Eli Biham. A fast new DES implementation in software. In International Workshop
on Fast Software Encryption, pages 260–272. Springer, 1997.
[BILT04] Jean-Claude Bajard, Laurent Imbert, Pierre-Yvan Liardet, and Yannick Teglia.
Leak resistant arithmetic. In Cryptographic Hardware and Embedded Systems-
CHES 2004: 6th International Workshop Cambridge, MA, USA, August 11–13,
2004. Proceedings 6, pages 62–75. Springer, 2004.
[BJB18] Jakub Breier, Dirmanto Jap, and Shivam Bhasin. SCADPA: side-channel assisted
differential-plaintext attack on bit permutation based ciphers. In Jan Madsen and
Ayse K. Coskun, editors, 2018 Design, Automation & Test in Europe Conference
& Exhibition, DATE 2018, Dresden, Germany, March 19–23, 2018, pages 1129–
1134. IEEE, 2018.
[BJH+ 21] Jakub Breier, Dirmanto Jap, Xiaolu Hou, Shivam Bhasin, and Yang Liu. Sniff:
reverse engineering of neural networks with fault attacks. IEEE Transactions on
Reliability, 71(4):1527–1539, 2021.
[BJHB19] Jakub Breier, Dirmanto Jap, Xiaolu Hou, and Shivam Bhasin. On side channel
vulnerabilities of bit permutations in cryptographic algorithms. IEEE Transactions
on Information Forensics and Security, 15:1072–1085, 2019.
[BJHB23] Jakub Breier, Dirmanto Jap, Xiaolu Hou, and Shivam Bhasin. A
desynchronization-based countermeasure against side-channel analysis of neural
networks. In International Symposium on Cyber Security, Cryptology, and Machine
Learning, pages 296–306. Springer, 2023.
[BJKS21] Robert Buhren, Hans-Niklas Jacob, Thilo Krachenfels, and Jean-Pierre Seifert.
One glitch to rule them all: Fault injection attacks against AMD’s secure encrypted
virtualization. In Proceedings of the 2021 ACM SIGSAC Conference on Computer
and Communications Security, pages 2875–2889, 2021.
[BJP20] Shivam Bhasin, Dirmanto Jap, and Stjepan Picek. AES HD dataset—50 000 traces.
AISyLab repository, 2020. https://github.com/AISyLab/AES_HD.
[BK06] Johannes Blömer and Volker Krummel. Fault based collision attacks on AES. In
International Workshop on Fault Diagnosis and Tolerance in Cryptography, pages
106–120. Springer, 2006.
[BKH+ 19] Arthur Beckers, Masahiro Kinugawa, Yuichi Hayashi, Daisuke Fujimoto, Josep
Balasch, Benedikt Gierlichs, and Ingrid Verbauwhede. Design considerations for
EM pulse fault injection. In International Conference on Smart Card Research and
Advanced Applications, pages 176–192. Springer, 2019.
[BKHL20] Jakub Breier, Mustafa Khairallah, Xiaolu Hou, and Yang Liu. A countermeasure
against statistical ineffective fault analysis. IEEE Transactions on Circuits and
Systems II: Express Briefs, 67(12):3322–3326, 2020.
[BKL+ 07] Andrey Bogdanov, Lars R Knudsen, Gregor Leander, Christof Paar, Axel
Poschmann, Matthew JB Robshaw, Yannick Seurin, and Charlotte Vikkelsoe.
PRESENT: An ultra-lightweight block cipher. In International workshop on
cryptographic hardware and embedded systems, pages 450–466. Springer, 2007.
[Bla83] George R Blakley. A computer algorithm for calculating the product AB modulo M.
IEEE Transactions on Computers, C-32(5):497–500, 1983.
[BLMR19] Christof Beierle, Gregor Leander, Amir Moradi, and Shahram Rasoolzadeh. CRAFT:
lightweight tweakable block cipher with efficient protection against DFA attacks.
IACR Transactions on Symmetric Cryptology, 2019(1):5–45, 2019.
[BMV07] Sanjay Burman, Debdeep Mukhopadhyay, and Kamakoti Veezhinathan. LFSR
based stream ciphers are vulnerable to power attacks. In International Conference
on Cryptology in India, pages 384–392. Springer, 2007.
[Bor06] Michele Boreale. Attacking right-to-left modular exponentiation with timely ran-
dom faults. In Fault Diagnosis and Tolerance in Cryptography: Third International
Workshop, FDTC 2006, Yokohama, Japan, October 10, 2006. Proceedings, pages
24–35. Springer, 2006.
[BOS03] Johannes Blömer, Martin Otto, and Jean-Pierre Seifert. A new CRT-RSA algorithm
secure against Bellcore attacks. In Proceedings of the 10th ACM conference on
Computer and communications security, pages 311–320, 2003.
[BP82] HJ Beker and FC Piper. Communications security: a survey of cryptography. IEE
Proceedings A (Physical Science, Measurement and Instrumentation, Management
and Education, Reviews), 129(6):357–376, 1982.
[BPS+ 20] Ryad Benadjila, Emmanuel Prouff, Rémi Strullu, Eleonora Cagli, and Cécile
Dumas. Deep learning for side-channel analysis and introduction to ASCAD
database. Journal of Cryptographic Engineering, 10(2):163–188, 2020.
[BPS+ 21] Ryad Benadjila, Emmanuel Prouff, Rémi Strullu, Eleonora Cagli, and Cécile
Dumas. ASCAD SCA database. https://github.com/ANSSI-FR/ASCAD.git, 2021.
[BRBG16] Erik Bosman, Kaveh Razavi, Herbert Bos, and Cristiano Giuffrida. Dedup est
machina: Memory deduplication as an advanced exploitation vector. In 2016 IEEE
symposium on security and privacy (SP), pages 987–1004. IEEE, 2016.
[BS97] Eli Biham and Adi Shamir. Differential fault analysis of secret key cryptosystems.
In Advances in Cryptology–CRYPTO’97: 17th Annual International Cryptology
Conference Santa Barbara, California, USA August 17–21, 1997 Proceedings 17,
pages 513–525. Springer, 1997.
[BS08] Bhaskar Biswas and Nicolas Sendrier. McEliece cryptosystem implementation:
Theory and practice. In International Workshop on Post-Quantum Cryptography,
pages 47–62. Springer, 2008.
[BS12] Eli Biham and Adi Shamir. Differential cryptanalysis of the data encryption
standard. Springer Science & Business Media, 2012.
[BSH75] Daniel Binder, Edward C Smith, and AB Holman. Satellite anomalies from galactic
cosmic rays. IEEE Transactions on Nuclear Science, 22(6):2675–2680, 1975.
[BT12] Alessandro Barenghi and Elena Trichina. Fault attacks on stream ciphers. In Fault
Analysis in Cryptography, pages 239–255. Springer, 2012.
[Buc04] Johannes Buchmann. Introduction to cryptography, volume 335. Springer, 2004.
[Cal75] Stephen Calebotta. CMOS, the ideal logic family. National Semiconductor CMOS
Databook, Rev, 1:2–3, 1975.
[CB19] Andrea Caforio and Subhadeep Banik. A study of persistent fault analysis.
In Security, Privacy, and Applied Cryptography Engineering: 9th International
Conference, SPACE 2019, Gandhinagar, India, December 3–7, 2019, Proceedings
9, pages 13–33. Springer, 2019.
[CCD+ 21] Pierre-Louis Cayrel, Brice Colombier, Vlad-Florin Drăgoi, Alexandre Menu, and
Lilian Bossuet. Message-recovery laser fault injection attack on the Classic
McEliece cryptosystem. In Annual International Conference on the Theory and
Applications of Cryptographic Techniques, pages 438–467. Springer, 2021.
[CCT+ 18] Samuel Chef, Chung Tah Chua, Jing Yun Tay, Yu Wen Siah, Shivam Bhasin,
J Breier, and Chee Lip Gan. Descrambling of embedded SRAM using a laser probe.
In 2018 IEEE International Symposium on the Physical and Failure Analysis of
Integrated Circuits (IPFA), pages 1–6. IEEE, 2018.
[CD+ 15] Ronald Cramer, Ivan Bjerre Damgård, et al. Secure multiparty computation.
Cambridge University Press, 2015.
[CFGR10] Christophe Clavier, Benoit Feix, Georges Gagnerot, and Mylene Roussellet. Pas-
sive and active combined attacks on AES combining fault attacks and side channel
analysis. In 2010 Workshop on Fault Diagnosis and Tolerance in Cryptography,
pages 10–19. IEEE, 2010.
[CFZK21] Huili Chen, Cheng Fu, Jishen Zhao, and Farinaz Koushanfar. Proflip: Targeted tro-
jan attack with progressive bit flips. In Proceedings of the IEEE/CVF International
Conference on Computer Vision, pages 7718–7727, 2021.
[CG16] Claude Carlet and Sylvain Guilley. Complementary dual codes for counter-
measures to side-channel attacks. Adv. Math. Commun., 10(1):131–150, 2016.
[CH17] Ang Cui and Rick Housley. BADFET: Defeating Modern Secure Boot Using
Second-Order Pulsed Electromagnetic Fault Injection. In 11th USENIX Workshop
on Offensive Technologies (WOOT 17), 2017.
[Cho22] Charles Q. Choi. IBM Unveils 433-Qubit Osprey Chip. IEEE Spectrum, November
2022.
[CJRR99] Suresh Chari, Charanjit S Jutla, Josyula R Rao, and Pankaj Rohatgi. Towards sound
approaches to counteract power-analysis attacks. In Advances in Cryptology—
CRYPTO’99: 19th Annual International Cryptology Conference Santa Barbara,
California, USA, August 15–19, 1999 Proceedings 19, pages 398–412. Springer,
1999.
[CJW10] Nicolas T Courtois, Keith Jackson, and David Ware. Fault-algebraic attacks on
inner rounds of DES. In E-Smart’10 Proceedings: The Future of Digital Security
Technologies. Strategies Telecom and Multimedia, 2010.
[CK09] Jean-Sébastien Coron and Ilya Kizhvatov. An efficient method for random delay
generation in embedded software. In Cryptographic Hardware and Embed-
ded Systems-CHES 2009: 11th International Workshop Lausanne, Switzerland,
September 6–9, 2009 Proceedings, pages 156–170. Springer, 2009.
[CK10] Jean-Sébastien Coron and Ilya Kizhvatov. Analysis and improvement of the
random delay countermeasure of CHES 2009. In Cryptographic Hardware and
Embedded Systems, CHES 2010: 12th International Workshop, Santa Barbara,
USA, August 17–20, 2010. Proceedings 12, pages 95–109. Springer, 2010.
[CK18] Jean-Sébastien Coron and Ilya Kizhvatov. Trace sets with random delays. https://
github.com/ikizhvatov/randomdelays-traces.git, 2018.
[Cla07] Christophe Clavier. Secret external encodings do not prevent transient fault
analysis. In Cryptographic Hardware and Embedded Systems-CHES 2007: 9th
International Workshop, Vienna, Austria, September 10–13, 2007. Proceedings 9,
pages 181–194. Springer, 2007.
[Cor99] Jean-Sébastien Coron. Resistance against differential power analysis for elliptic
curve cryptosystems. In Cryptographic Hardware and Embedded Systems:
First International Workshop, CHES’99 Worcester, MA, USA, August 12–13, 1999
Proceedings 1, pages 292–302. Springer, 1999.
[COZZ23] Yukun Cheng, Changhai Ou, Fan Zhang, and Shihui Zheng. DLPFA: Deep learning
based persistent fault analysis against block ciphers. Cryptology ePrint Archive,
2023.
[CRR03] Suresh Chari, Josyula R Rao, and Pankaj Rohatgi. Template attacks. In
Cryptographic Hardware and Embedded Systems-CHES 2002: 4th International
Workshop Redwood Shores, CA, USA, August 13–15, 2002 Revised Papers 4, pages
13–28. Springer, 2003.
[CT03] Jean-Sébastien Coron and Alexei Tchulkine. A new algorithm for switching
from arithmetic to boolean masking. In International Workshop on Cryptographic
Hardware and Embedded Systems, pages 89–97. Springer, 2003.
[CVM+ 21] Zitai Chen, Georgios Vasilakis, Kit Murdock, Edward Dean, David Oswald, and
Flavio D Garcia. VoltPillager: Hardware-based fault injection attacks against Intel
SGX Enclaves using the SVID voltage scaling interface. In 30th USENIX Security
Symposium (USENIX Security 21), pages 699–716, 2021.
[DAP+ 22] Anuj Dubey, Afzal Ahmad, Muhammad Adeel Pasha, Rosario Cammarota, and
Aydin Aysu. Modulonet: Neural networks meet modular arithmetic for efficient
hardware masking. IACR Transactions on Cryptographic Hardware and Embedded
Systems, pages 506–556, 2022.
[dBLW03] Bert den Boer, Kerstin Lemke, and Guntram Wicke. A DPA attack against the
modular reduction within a CRT implementation of RSA. In Cryptographic Hardware
and Embedded Systems-CHES 2002: 4th International Workshop Redwood
Shores, CA, USA, August 13–15, 2002 Revised Papers 4, pages 228–243. Springer,
2003.
[DCA20] Anuj Dubey, Rosario Cammarota, and Aydin Aysu. Maskednet: The first hardware
inference engine aiming power side-channel protection. In 2020 IEEE Inter-
national Symposium on Hardware Oriented Security and Trust (HOST), pages
197–208. IEEE, 2020.
[DCRB+ 16] Thomas De Cnudde, Oscar Reparaz, Begül Bilgin, Svetla Nikova, Ventzislav
Nikov, and Vincent Rijmen. Masking AES with shares in hardware. In
International Conference on Cryptographic Hardware and Embedded Systems,
pages 194–212. Springer, 2016.
[DCSA22] Anuj Dubey, Rosario Cammarota, Vikram Suresh, and Aydin Aysu. Guarding
machine learning hardware against physical side-channel attacks. ACM Journal
on Emerging Technologies in Computing Systems (JETC), 18(3):1–31, 2022.
[DEK+ 18] Christoph Dobraunig, Maria Eichlseder, Thomas Korak, Stefan Mangard, Florian
Mendel, and Robert Primas. SIFA: exploiting ineffective fault inductions on
symmetric cryptography. IACR Transactions on Cryptographic Hardware and
Embedded Systems, pages 547–572, 2018.
[DLM20] Mathieu Dumont, Mathieu Lisart, and Philippe Maurine. Modeling and simulating
electromagnetic fault injection. IEEE Transactions on Computer-Aided Design of
Integrated Circuits and Systems, 40(4):680–693, 2020.
[DO22] Shaked Delarea and Yossi Oren. Practical, low-cost fault injection attacks on
personal smart devices. Applied Sciences, 12(1):417, 2022.
[DPRS11] Julien Doget, Emmanuel Prouff, Matthieu Rivain, and François-Xavier Standaert.
Univariate side channel attacks and leakage modeling. Journal of Cryptographic
Engineering, 1:123–144, 2011.
[DR02] Joan Daemen and Vincent Rijmen. The design of Rijndael, volume 2. Springer,
2002.
[Dud14] Richard M Dudley. Uniform central limit theorems, volume 142. Cambridge
university press, 2014.
[Dur19] Rick Durrett. Probability: theory and examples, volume 49. Cambridge university
press, 2019.
[Dwo15] Morris Dworkin. SHA-3 Standard: Permutation-Based Hash and Extendable-
Output Functions, 2015-08-04.
[DZD+ 18] A Adam Ding, Liwei Zhang, François Durvaux, François-Xavier Standaert, and
Yunsi Fei. Towards sound and optimal leakage detection procedure. In Smart Card
Research and Advanced Applications: 16th International Conference, CARDIS
2017, Lugano, Switzerland, November 13–15, 2017, Revised Selected Papers,
pages 105–122. Springer, 2018.
[EJ96] Artur Ekert and Richard Jozsa. Quantum computation and Shor’s factoring
algorithm. Reviews of Modern Physics, 68(3):733, 1996.
[ESH+ 11] Sho Endo, Takeshi Sugawara, Naofumi Homma, Takafumi Aoki, and Akashi
Satoh. An on-chip glitchy-clock generator for testing fault injection attacks.
Journal of Cryptographic Engineering, 1(4):265–270, 2011.
[Far70] PG Farrell. Linear binary anticodes. Electronics Letters, 13(6):419–421, 1970.
[FJLT13] Thomas Fuhr, Éliane Jaulmes, Victor Lomné, and Adrian Thillard. Fault attacks
on AES with faulty ciphertexts only. In 2013 Workshop on Fault Diagnosis and
Tolerance in Cryptography, pages 108–118. IEEE, 2013.
[FMP03] Pierre-Alain Fouque, Gwenaëlle Martinet, and Guillaume Poupard. Attacking
unbalanced RSA-CRT using SPA. In Cryptographic Hardware and Embedded
Systems-CHES 2003: 5th International Workshop, Cologne, Germany, September
8–10, 2003. Proceedings 5, pages 254–268. Springer, 2003.
[Fou98] Electronic Frontier Foundation. Cracking DES: Secrets of encryption research,
wiretap politics and chip design. https://cryptome.org/jya/cracking-des/cracking-
des.htm, 1998.
[FRVD08] Pierre-Alain Fouque, Denis Réal, Frédéric Valette, and Mhamed Drissi. The carry
leakage on the randomized exponent countermeasure. In International Workshop
on Cryptographic Hardware and Embedded Systems, pages 198–213. Springer,
2008.
[FV03] Pierre-Alain Fouque and Frédéric Valette. The doubling attack–why upwards is
better than downwards. In Cryptographic Hardware and Embedded Systems-CHES
2003: 5th International Workshop, Cologne, Germany, September 8–10, 2003.
Proceedings 5, pages 269–280. Springer, 2003.
[GBTP08] Benedikt Gierlichs, Lejla Batina, Pim Tuyls, and Bart Preneel. Mutual information
analysis: A generic side-channel distinguisher. In International Workshop on
Cryptographic Hardware and Embedded Systems, pages 426–442. Springer, 2008.
[GE21] Craig Gidney and Martin Ekerå. How to factor 2048 bit RSA integers in 8 hours
using 20 million noisy qubits. Quantum, 5:433, 2021.
[GGJR+ 11] Gilbert Goodwill, Benjamin Jun, Josh Jaffe, Pankaj Rohatgi, et al. A testing
methodology for side-channel resistance validation. In NIST non-invasive attack
testing workshop, volume 7, pages 115–136, 2011.
[GGP09] Laurie Genelle, Christophe Giraud, and Emmanuel Prouff. Securing AES
implementation against fault attacks. In 2009 Workshop on Fault Diagnosis and
Tolerance in Cryptography (FDTC), pages 51–62. IEEE, 2009.
[GHNZ09] Zheng Gong, Pieter H Hartel, Svetla Nikova, and Bo Zhu. Towards secure and
practical MACs for body sensor networks. In INDOCRYPT, pages 182–198.
Springer, 2009.
[GHO15] Richard Gilmore, Neil Hanley, and Maire O’Neill. Neural network based attack
on a masked implementation of AES. In 2015 IEEE International Symposium on
Hardware Oriented Security and Trust (HOST), pages 106–111. IEEE, 2015.
[GHP04] Sylvain Guilley, Philippe Hoogvorst, and Renaud Pacalet. Differential power anal-
ysis model and some results. In Smart Card Research and Advanced Applications
VI: IFIP 18th World Computer Congress TC8/WG8. 8 & TC11/WG11. 2 Sixth
International Conference on Smart Card Research and Advanced Applications
(CARDIS) 22–27 August 2004 Toulouse, France, pages 127–142. Springer, 2004.
[GJJ22] Qian Guo, Andreas Johansson, and Thomas Johansson. A key-recovery side-
channel attack on Classic McEliece implementations. IACR Transactions on
Cryptographic Hardware and Embedded Systems, pages 800–827, 2022.
[GM11] Louis Goubin and Ange Martinelli. Protecting AES with Shamir’s secret sharing
scheme. In Cryptographic Hardware and Embedded Systems–CHES 2011: 13th
International Workshop, Nara, Japan, September 28–October 1, 2011. Proceedings
13, pages 79–94. Springer, 2011.
[GMO01] Karine Gandolfi, Christophe Mourtel, and Francis Olivier. Electromagnetic analy-
sis: Concrete results. In Cryptographic Hardware and Embedded Systems–CHES
2001: Third International Workshop Paris, France, May 14–16, 2001 Proceedings
3, pages 251–261. Springer, 2001.
[GMWM16] Daniel Gruss, Clémentine Maurice, Klaus Wagner, and Stefan Mangard. Flush+
flush: a fast and stealthy cache attack. In Detection of Intrusions and Malware,
and Vulnerability Assessment: 13th International Conference, DIMVA 2016, San
Sebastián, Spain, July 7–8, 2016, Proceedings 13, pages 279–299. Springer, 2016.
[Goc11] Mark S Gockenbach. Finite-dimensional linear algebra. CRC Press, 2011.
[Gou01] Louis Goubin. A sound method for switching between boolean and arithmetic
masking. In Cryptographic Hardware and Embedded Systems–CHES 2001: Third
International Workshop Paris, France, May 14–16, 2001 Proceedings 3, pages 3–
15. Springer, 2001.
[GP99] Louis Goubin and Jacques Patarin. DES and differential power analysis the
“duplication” method. In Cryptographic Hardware and Embedded Systems:
First International Workshop, CHES’99 Worcester, MA, USA, August 12–13, 1999
Proceedings 1, pages 158–172. Springer, 1999.
[GSK06] Gunnar Gaubatz, Berk Sunar, and Mark G Karpovsky. Non-linear residue codes for
robust public-key arithmetic. In Fault Diagnosis and Tolerance in Cryptography:
Third International Workshop, FDTC 2006, Yokohama, Japan, October 10, 2006.
Proceedings, pages 173–184. Springer, 2006.
[GST12] Benedikt Gierlichs, Jörn-Marc Schmidt, and Michael Tunstall. Infective com-
putation and dummy rounds: Fault protection for block ciphers without check-
before-output. In Progress in Cryptology–LATINCRYPT 2012: 2nd International
Conference on Cryptology and Information Security in Latin America, Santiago,
Chile, October 7–10, 2012. Proceedings 2, pages 305–321. Springer, 2012.
[Hab65] Donald H Habing. The use of lasers to simulate radiation-induced transients
in semiconductor devices and circuits. IEEE Transactions on Nuclear Science,
12(5):91–100, 1965.
[HBB+ 16] Wei He, Jakub Breier, Shivam Bhasin, Noriyuki Miura, and Makoto Nagata. Ring
oscillator under laser: Potential of PLL-based countermeasure against laser fault
injection. In Fault Diagnosis and Tolerance in Cryptography (FDTC), 2016
Workshop on, pages 102–113. IEEE, 2016.
[HBB21] Xiaolu Hou, Jakub Breier, and Shivam Bhasin. DNFA: Differential no-fault
analysis of bit permutation based ciphers assisted by side-channel. In 2021 Design,
Automation & Test in Europe Conference & Exhibition (DATE), pages 182–187.
IEEE, 2021.
[HBB22] Xiaolu Hou, Jakub Breier, and Shivam Bhasin. SBCMA: Semi-blind com-
bined middle-round attack on bit-permutation ciphers with application to AEAD
schemes. IEEE Transactions on Information Forensics and Security, 17:3677–
3690, 2022.
[HBJ+ 21] Xiaolu Hou, Jakub Breier, Dirmanto Jap, Lei Ma, Shivam Bhasin, and Yang Liu.
Physical security of deep learning on edge devices: Comprehensive evaluation of
fault injection attack vectors. Microelectronics Reliability, 120:114116, 2021.
[HBK23] Xiaolu Hou, Jakub Breier, and Mladen Kovacevic. Another look at side-channel
resistant encoding schemes. IACR Cryptol. ePrint Arch., page 1698, 2023.
[HBZL19] Xiaolu Hou, Jakub Breier, Fuyuan Zhang, and Yang Liu. Fully automated
differential fault analysis on software implementations of block ciphers. IACR
Transactions on Cryptographic Hardware and Embedded Systems, pages 1–29,
2019.
[HDD11] Philippe Hoogvorst, Guillaume Duc, and Jean-Luc Danger. Software implementa-
tion of dual-rail representation. COSADE, February, pages 24–25, 2011.
[Her96] Israel N Herstein. Abstract algebra. Prentice Hall, 1996.
[HFK+ 19] Sanghyun Hong, Pietro Frigo, Yiğitcan Kaya, Cristiano Giuffrida, and Tudor
Dumitras. Terminal brain damage: Exposing the graceless degradation in deep
neural networks under hardware fault attacks. In 28th USENIX Security Symposium
(USENIX Security 19), pages 497–514, 2019.
[HH11] Ludger Hemme and Lars Hoffmann. Differential fault analysis on the SHA1
compression function. In 2011 Workshop on Fault Diagnosis and Tolerance in
Cryptography, pages 54–62. IEEE, 2011.
[HHS+ 11] Yu-ichi Hayashi, Naofumi Homma, Takeshi Sugawara, Takaaki Mizuki, Takafumi
Aoki, and Hideaki Sone. Non-invasive EMI-based fault injection attack against
cryptographic modules. In 2011 IEEE International Symposium on Electromag-
netic Compatibility, pages 763–767. IEEE, 2011.
[HLMS14] Ronglin Hao, Bao Li, Bingke Ma, and Ling Song. Algebraic fault attack on the
SHA-256 compression function. International Journal of Research in Computer
Science, 4(2):1, 2014.
[HOM06] Christoph Herbst, Elisabeth Oswald, and Stefan Mangard. An AES smart card
implementation resistant to power analysis attacks. In International conference on
applied cryptography and network security, pages 239–252. Springer, 2006.
[HPS98] Jeffrey Hoffstein, Jill Pipher, and Joseph H Silverman. NTRU: A ring-based public
key cryptosystem. In International algorithmic number theory symposium, pages
267–288. Springer, 1998.
[HS13] Michael Hutter and Jörn-Marc Schmidt. The temperature side channel and heating
fault attacks. In International Conference on Smart Card Research and Advanced
Applications, pages 219–235. Springer, 2013.
[HSP20] Max Hoffmann, Falk Schellenberg, and Christof Paar. ARMORY: fully automated
and exhaustive fault simulation on ARM-M binaries. IEEE Transactions on
Information Forensics and Security, 16:1058–1073, 2020.
[Hun12] Thomas W Hungerford. Algebra, volume 73. Springer Science & Business Media,
2012.
[HZ12] Annelie Heuser and Michael Zohner. Intelligent machine homicide: Breaking
cryptographic devices using support vector machines. In Constructive Side-
Channel Analysis and Secure Design: Third International Workshop, COSADE
2012, Darmstadt, Germany, May 3–4, 2012. Proceedings 3, pages 249–264.
Springer, 2012.
[JAB+ 03] Jan M Rabaey, Anantha Chandrakasan, Borivoje Nikolić, et al. Digital integrated
circuits: a design perspective. Prentice Hall, 2003.
[Jea16] Jérémy Jean. TikZ for Cryptographers. https://www.iacr.org/authors/tikz/, 2016.
[JP04] Jean Jacod and Philip Protter. Probability essentials. Springer Science & Business
Media, 2004.
[JPY01] Marc Joye, Pascal Paillier, and Sung-Ming Yen. Secure evaluation of modular
functions. In 2001 International Workshop on Cryptology and Network Security,
pages 227–229. Citeseer, 2001.
[JQBD97] Marc Joye, Jean-Jacques Quisquater, Feng Bao, and Robert H Deng. RSA-type
signatures in the presence of transient faults. In IMA International Conference on
Cryptography and Coding, pages 155–160. Springer, 1997.
[JVDVF+ 22] Patrick Jattke, Victor Van Der Veen, Pietro Frigo, Stijn Gunter, and Kaveh Razavi.
Blacksmith: Scalable rowhammering in the frequency domain. In 2022 IEEE
Symposium on Security and Privacy (SP), pages 716–734. IEEE, 2022.
[JY03] Marc Joye and Sung-Ming Yen. The Montgomery powering ladder. In
Cryptographic Hardware and Embedded Systems-CHES 2002: 4th International
Workshop Redwood Shores, CA, USA, August 13–15, 2002 Revised Papers, pages
291–302. Springer, 2003.
[KAF+ 10] Thorsten Kleinjung, Kazumaro Aoki, Jens Franke, Arjen K Lenstra, Emmanuel
Thomé, Joppe W Bos, Pierrick Gaudry, Alexander Kruppa, Peter L Montgomery,
Dag Arne Osvik, et al. Factorization of a 768-bit RSA modulus. In Annual
Cryptology Conference, pages 333–350. Springer, 2010.
[KBJ+ 22] Niclas Kühnapfel, Robert Buhren, Hans Niklas Jacob, Thilo Krachenfels, Christian
Werling, and Jean-Pierre Seifert. EM-fault it yourself: Building a replicable EMFI
setup for desktop and server hardware. In 2022 IEEE Physical Assurance and
Inspection of Electronics (PAINE), pages 1–7. IEEE, 2022.
[KDB+ 22] Satyam Kumar, Vishnu Asutosh Dasu, Anubhab Baksi, Santanu Sarkar, Dirmanto
Jap, Jakub Breier, and Shivam Bhasin. Side channel attack on stream ciphers:
A three-step approach to state/key recovery. IACR Transactions on Cryptographic
Hardware and Embedded Systems, 2022(2):166–191, 2022.
[KDK+ 14] Yoongu Kim, Ross Daly, Jeremie Kim, Chris Fallin, Ji Hye Lee, Donghyuk
Lee, Chris Wilkerson, Konrad Lai, and Onur Mutlu. Flipping bits in memory
without accessing them: An experimental study of DRAM disturbance errors. ACM
SIGARCH Computer Architecture News, 42(3):361–372, 2014.
[KHN+ 19] Mustafa Khairallah, Xiaolu Hou, Zakaria Najm, Jakub Breier, Shivam Bhasin,
and Thomas Peyrin. SoK: On DFA vulnerabilities of substitution-permutation
networks. In Steven D. Galbraith, Giovanni Russello, Willy Susilo, Dieter
Gollmann, Engin Kirda, and Zhenkai Liang, editors, Proceedings of the 2019
ACM Asia Conference on Computer and Communications Security, AsiaCCS 2019,
Auckland, New Zealand, July 09–12, 2019, pages 403–414. ACM, 2019.
[KJ01] Paul C Kocher and Joshua M Jaffe. Secure modular exponentiation with leak
minimization for smartcards and other cryptosystems, October 2 2001. US Patent
6,298,442.
[KJJ99] Paul Kocher, Joshua Jaffe, and Benjamin Jun. Differential power analysis. In
Advances in Cryptology—CRYPTO’99: 19th Annual International Cryptology
Conference Santa Barbara, California, USA, August 15–19, 1999 Proceedings 19,
pages 388–397. Springer, 1999.
[KJJ10] Paul C Kocher, Joshua M Jaffe, and Benjamin C Jun. Cryptographic computation
using masking to prevent differential power analysis and other attacks, February 23
2010. US Patent 7,668,310.
[KJJR11] Paul Kocher, Joshua Jaffe, Benjamin Jun, and Pankaj Rohatgi. Introduction to
differential power analysis. Journal of Cryptographic Engineering, 1:5–27, 2011.
[KJP14] Raghavan Kumar, Philipp Jovanovic, and Ilia Polian. Precise fault-injections using
voltage and temperature manipulation for differential cryptanalysis. In 2014 IEEE
20th International On-Line Testing Symposium (IOLTS), pages 43–48. IEEE, 2014.
[KKG03] Ramesh Karri, Grigori Kuznetsov, and Michael Goessel. Parity-based con-
current error detection of substitution-permutation network block ciphers. In
Cryptographic Hardware and Embedded Systems-CHES 2003: 5th International
Workshop, Cologne, Germany, September 8–10, 2003. Proceedings 5, pages 113–
124. Springer, 2003.
[KKT04] Mark Karpovsky, Konrad J Kulikowski, and Alexander Taubin. Robust protection
against fault-injection attacks on smart cards implementing the advanced encryp-
tion standard. In International Conference on Dependable Systems and Networks,
2004, pages 93–101. IEEE, 2004.
[KKY+ 89] Yasuhiro Konishi, Masaki Kumanoya, Hiroyuki Yamasaki, Katsumi Dosaka, and
Tsutomu Yoshihara. Analysis of coupling noise between adjacent bit lines in
megabit DRAMs. IEEE Journal of Solid-State Circuits, 24(1):35–42, 1989.
[KM20] Martin S Kelly and Keith Mayes. High precision laser fault injection using low-
cost components. In 2020 IEEE International Symposium on Hardware Oriented
Security and Trust (HOST), pages 219–228. IEEE, 2020.
[KMBM17] Fatma Kahri, Hassen Mestiri, Belgacem Bouallegue, and Mohsen Machhout. Fault
attacks resistant architecture for Keccak hash function. International Journal of
Advanced Computer Science and Applications, 8(5), 2017.
[Koç94] CK Koç. High-speed RSA implementation technical report. RSA Laboratories,
Redwood City, 1994.
[Koc96] Paul C Kocher. Timing attacks on implementations of Diffie-Hellman, RSA,
DSS, and other systems. In Advances in Cryptology—CRYPTO’96: 16th Annual
International Cryptology Conference Santa Barbara, California, USA August 18–
22, 1996 Proceedings 16, pages 104–113. Springer, 1996.
[Kos02] Thomas Koshy. Elementary number theory with applications. Academic press,
2002.
[KPH+ 19] Jaehun Kim, Stjepan Picek, Annelie Heuser, Shivam Bhasin, and Alan Hanjalic.
Make some noise. Unleashing the power of convolutional neural networks for
profiled side-channel analysis. IACR Transactions on Cryptographic Hardware
and Embedded Systems, pages 148–179, 2019.
[KPP+ 22] Alexandr Alexandrovich Kuznetsov, Oleksandr Volodymyrovych Potii, Niko-
lay Alexandrovich Poluyanenko, Yurii Ivanovich Gorbenko, and Natalia Kryvin-
ska. Stream Ciphers in Modern Real-time IT Systems. Springer, 2022.
[KQ07] Chong Hee Kim and Jean-Jacques Quisquater. Fault attacks for CRT based RSA:
New attacks, new results, and new countermeasures. In IFIP International
Workshop on Information Security Theory and Practices, pages 215–228. Springer,
2007.
[KS09] Emilia Käsper and Peter Schwabe. Faster and timing-attack resistant AES-GCM.
In International Workshop on Cryptographic Hardware and Embedded Systems,
pages 1–17. Springer, 2009.
[KSV13] Duško Karaklajić, Jörn-Marc Schmidt, and Ingrid Verbauwhede. Hardware
designer’s guide to fault attacks. IEEE Transactions on Very Large Scale
Integration (VLSI) Systems, 21(12):2295–2306, 2013.
[Kwa00] Matthew Kwan. Reducing the gate count of bitslice DES. IACR Cryptol. ePrint
Arch., 2000(51):51, 2000.
[LBM15] Liran Lerman, Gianluca Bontempi, and Olivier Markowitch. A machine learning
approach against a masked AES: Reaching the limit of side-channel attacks with a
learning model. Journal of Cryptographic Engineering, 5:123–139, 2015.
[Len96] Arjen K Lenstra. Memo on RSA signature generation in the presence of faults.
Technical report, EPFL, 1996.
[LSG+ 10] Yang Li, Kazuo Sakiyama, Shigeto Gomisawa, Toshinori Fukunaga, Junko Taka-
hashi, and Kazuo Ohta. Fault sensitivity analysis. In Cryptographic Hardware
and Embedded Systems, CHES 2010: 12th International Workshop, Santa Barbara,
USA, August 17–20, 2010. Proceedings 12, pages 320–334. Springer, 2010.
[LWLX17] Yannan Liu, Lingxiao Wei, Bo Luo, and Qiang Xu. Fault injection attack on deep
neural network. In Proceedings of the 36th International Conference on Computer-
Aided Design, pages 131–138. IEEE, 2017.
[LX04] San Ling and Chaoping Xing. Coding theory: a first course. Cambridge University
Press, 2004.
[LZC+ 21] Xiangjun Lu, Chi Zhang, Pei Cao, Dawu Gu, and Haining Lu. Pay attention to
raw traces: A deep learning architecture for end-to-end profiling attacks. IACR
Transactions on Cryptographic Hardware and Embedded Systems, pages 235–274,
2021.
[Mah45] Patrick Mahon. History of hut 8 to December 1941 (1945). B. Jack Copeland, page
265, 1945.
[Man03] Stefan Mangard. A simple power-analysis (SPA) attack on implementations of the
AES key expansion. In Information Security and Cryptology—ICISC 2002: 5th
International Conference Seoul, Korea, November 28–29, 2002 Revised Papers 5,
pages 343–358. Springer, 2003.
[May03] Alexander May. New RSA vulnerabilities using lattice reduction methods. PhD
thesis, Citeseer, 2003.
[MBFC22] Saurav Maji, Utsav Banerjee, Samuel H Fuller, and Anantha P Chandrakasan.
A threshold implementation-based neural network accelerator with power and
electromagnetic side-channel countermeasures. IEEE Journal of Solid-State
Circuits, 2022.
[MDB+ 02] Jack A Mandelman, Robert H Dennard, Gary B Bronner, John K DeBrosse, Rama
Divakaruni, Yujun Li, and Carl J Radens. Challenges and future directions for the
scaling of dynamic random-access memory (DRAM). IBM Journal of Research
and Development, 46(2.3):187–212, 2002.
[MDS99a] Thomas S Messerges, Ezzy A Dabbish, and Robert H Sloan. Investigations of
power analysis attacks on smartcards. Smartcard, 99:151–161, 1999.
[MDS99b] Thomas S Messerges, Ezzy A Dabbish, and Robert H Sloan. Power analysis
attacks of modular exponentiation in smartcards. In Cryptographic Hardware and
Embedded Systems: First International Workshop, CHES’99 Worcester, MA, USA,
August 12–13, 1999 Proceedings 1, pages 144–157. Springer, 1999.
[Mes00] Thomas S Messerges. Securing the AES finalists against power analysis attacks.
In International Workshop on Fast Software Encryption, pages 150–164. Springer,
2000.
[MMR19] Thorben Moos, Amir Moradi, and Bastian Richter. Static power side-channel
analysis—an investigation of measurement factors. IEEE Transactions on Very
Large Scale Integration (VLSI) Systems, 28(2):376–389, 2019.
[MMS01a] David May, Henk L Muller, and Nigel P Smart. Non-deterministic processors.
In Information Security and Privacy: 6th Australasian Conference, ACISP 2001
Sydney, Australia, July 11–13, 2001 Proceedings 6, pages 115–129. Springer, 2001.
[MMS01b] David May, Henk L Muller, and Nigel P Smart. Random register renaming to foil
DPA. In Cryptographic Hardware and Embedded Systems—CHES 2001: Third
International Workshop Paris, France, May 14–16, 2001 Proceedings 3, pages 28–
38. Springer, 2001.
[Mon85] Peter L Montgomery. Modular multiplication without trial division. Mathematics
of computation, 44(170):519–521, 1985.
[Mon87] Peter L Montgomery. Speeding the Pollard and elliptic curve methods of
factorization. Mathematics of computation, 48(177):243–264, 1987.
[MOP08] Stefan Mangard, Elisabeth Oswald, and Thomas Popp. Power analysis attacks:
Revealing the secrets of smart cards, volume 31. Springer Science & Business
Media, 2008.
[MPC00] Lauren May, Lyta Penna, and Andrew Clark. An implementation of bitsliced DES
on the Pentium MMX processor. In Australasian Conference on Information
Security and Privacy, pages 112–122. Springer, 2000.
[MPG05] Stefan Mangard, Thomas Popp, and Berndt M Gammel. Side-channel leakage of
masked CMOS gates. In Cryptographers’ Track at the RSA Conference, pages
351–365. Springer, 2005.
[MPP16] Houssem Maghrebi, Thibault Portigliatti, and Emmanuel Prouff. Breaking crypto-
graphic implementations using deep learning techniques. In Security, Privacy, and
Applied Cryptography Engineering: 6th International Conference, SPACE 2016,
Hyderabad, India, December 14–18, 2016, Proceedings 6, pages 3–26. Springer,
2016.
[MS77] Florence Jessie MacWilliams and Neil James Alexander Sloane. The theory of
error correcting codes, volume 16. Elsevier, 1977.
[MS00] Rita Mayer-Sommer. Smartly analyzing the simplicity and the power of simple
power analysis on smartcards. In International Workshop on Cryptographic
Hardware and Embedded Systems, pages 78–92. Springer, 2000.
[MSB16] Houssem Maghrebi, Victor Servant, and Julien Bringer. There is wisdom in
harnessing the strengths of your enemy: Customized encoding to thwart side-
channel attacks. In Fast Software Encryption: 23rd International Conference, FSE
2016, Bochum, Germany, March 20–23, 2016, Revised Selected Papers 23, pages
223–243. Springer, 2016.
[MSGR10] Marcel Medwed, François-Xavier Standaert, Johann Großschädl, and Francesco
Regazzoni. Fresh re-keying: Security against side-channel and fault attacks for
low-cost devices. In International Conference on Cryptology in Africa, pages 279–
296. Springer, 2010.
[MSY06] Tal G Malkin, François-Xavier Standaert, and Moti Yung. A comparative
cost/security analysis of fault attack countermeasures. In Fault Diagnosis and Tol-
erance in Cryptography: Third International Workshop, FDTC 2006, Yokohama,
Japan, October 10, 2006. Proceedings, pages 159–172. Springer, 2006.
[MVOV18] Alfred J Menezes, Paul C Van Oorschot, and Scott A Vanstone. Handbook of
applied cryptography. CRC press, 2018.
[MWK+ 22] Catinca Mujdei, Lennert Wouters, Angshuman Karmakar, Arthur Beckers, Jose
Maria Bermudo Mera, and Ingrid Verbauwhede. Side-channel analysis of lattice-
based post-quantum cryptography: Exploiting polynomial multiplication. ACM
Transactions on Embedded Computing Systems, 2022.
[MWM21] Thorben Moos, Felix Wegener, and Amir Moradi. DL-LA: Deep learning leakage
assessment: A modern roadmap for SCA evaluations. IACR Transactions on
Cryptographic Hardware and Embedded Systems, pages 552–598, 2021.
[MZMM16] Zdenek Martinasek, Vaclav Zeman, Lukas Malina, and Josef Martinasek. K-
nearest neighbors algorithm in profiling power analysis attacks. Radioengineering,
25(2):365–382, 2016.
[NIS01] NIST. Federal Information Processing Standards Publication (FIPS) 197. Advanced
Encryption Standard (AES), 2001.
[NIS19] NIST. FIPS 140-3: Security Requirements for Cryptographic Modules, National
Institute of Standards and Technology. Technical report, Federal Inf. Process. Stds.
(NIST FIPS), National Institute of Standards and Technology, Gaithersburg, MD,
2019.
[Nov02] Roman Novak. SPA-based adaptive chosen-ciphertext attack on RSA implemen-
tation. In International Workshop on Public Key Cryptography, pages 252–262.
Springer, 2002.
[NRS11] Svetla Nikova, Vincent Rijmen, and Martin Schläffer. Secure hardware implemen-
tation of nonlinear functions in the presence of glitches. Journal of Cryptology,
24:292–321, 2011.
[NY21] Yusuke Nozaki and Masaya Yoshikawa. Shuffling countermeasure against power
side-channel attack for MLP with software implementation. In 2021 IEEE
4th International Conference on Electronics and Communication Engineering
(ICECE), pages 39–42. IEEE, 2021.
[NYGD22] Len Luet Ng, Kim Ho Yeap, Magdalene Wan Ching Goh, and Veerendra Dakulagi.
Power consumption in CMOS circuits. In Electromagnetic Field in Advancing
Science and Technology. IntechOpen, 2022.
[O’D14] Ryan O’Donnell. Analysis of boolean functions. Cambridge University Press,
2014.
[O’F23] Colin O’Flynn. PicoEMP: A low-cost EMFI platform compared to BBI and voltage
fault injection using TDC and external VCC measurements. Cryptology ePrint
Archive, 2023.
[Ogg] Frédérique Oggier. Lecture notes. https://feog.github.io/. Accessed: 2012-11-30.
[ORBG17] Marco Oliverio, Kaveh Razavi, Herbert Bos, and Cristiano Giuffrida. Secure Page
Fusion with VUsion: https://www.vusec.net/projects/VUsion. In Proceedings of
the 26th Symposium on Operating Systems Principles, pages 531–545, 2017.
[Org17] European Cyber Security Organisation. Overview of existing cybersecurity stan-
dards and certification schemes v2, wg1—standardisation, certification, labelling
and supply chain management, 2017.
[ORJ+ 13] Rachid Omarouayache, Jérémy Raoult, Sylvie Jarrix, Laurent Chusseau, and
Philippe Maurine. Magnetic microprobe design for EM fault attack. In 2013
International Symposium on Electromagnetic Compatibility, pages 949–954. IEEE,
2013.
[OS05] Elisabeth Oswald and Kai Schramm. An efficient masking scheme for AES
software implementations. In International Workshop on Information Security
Applications, pages 292–305. Springer, 2005.
[Osw] David Oswald. Lecture notes: Hardware and embedded systems security. https://
github.com/david-oswald/hwsec_lecture_notes. Accessed: 2012-12-03.
[PBMB17] Sikhar Patranabis, Jakub Breier, Debdeep Mukhopadhyay, and Shivam Bhasin.
One plus one is more than two: a practical combination of power and fault analysis
attacks on PRESENT and PRESENT-like block ciphers. In 2017 Workshop on Fault
Diagnosis and Tolerance in Cryptography (FDTC), pages 25–32. IEEE, 2017.
[PBP21] Guilherme Perin, Ileana Buhan, and Stjepan Picek. Learning when to stop: a mutual
information approach to prevent overfitting in profiled side-channel analysis. In
Constructive Side-Channel Analysis and Secure Design: 12th International Work-
shop, COSADE 2021, Lugano, Switzerland, October 25–27, 2021, Proceedings 12,
pages 53–81. Springer, 2021.
[PCP20] Guilherme Perin, Łukasz Chmielewski, and Stjepan Picek. Strength in numbers:
Improving generalization with ensembles in machine learning-based profiled side-
channel analysis. IACR Transactions on Cryptographic Hardware and Embedded
Systems, pages 337–364, 2020.
[PGP+ 19] Ilia Polian, Mael Gay, Tobias Paxian, Matthias Sauer, and Bernd Becker. Auto-
matic construction of fault attacks on cryptographic hardware implementations.
Automated Methods in Cryptographic Fault Analysis, pages 151–170, 2019.
[PHJ+ 19] Stjepan Picek, Annelie Heuser, Alan Jovic, Shivam Bhasin, and Francesco Regaz-
zoni. The curse of class imbalance and conflicting metrics with machine learning
for side-channel evaluations. IACR Transactions on Cryptographic Hardware and
Embedded Systems, 2019(1):1–29, 2019.
[PM05] Thomas Popp and Stefan Mangard. Masked dual-rail pre-charge logic: DPA-
resistance without routing constraints. In International Workshop on Crypto-
graphic Hardware and Embedded Systems, pages 172–186. Springer, 2005.
[PMK+ 11] Axel Poschmann, Amir Moradi, Khoongming Khoo, Chu-Wee Lim, Huaxiong
Wang, and San Ling. Side-channel resistant crypto for less than 2,300 GE. Journal
of Cryptology, 24:322–345, 2011.
[PNP+ 20] Athanasios Papadimitriou, Konstantinos Nomikos, Mihalis Psarakis, Ehsan Aerabi,
and David Hely. You can detect but you cannot hide: Fault assisted side channel
analysis on protected software-based block ciphers. In 2020 IEEE International
Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems
(DFT), pages 1–6. IEEE, 2020.
[PP09] Christof Paar and Jan Pelzl. Understanding cryptography: a textbook for students
and practitioners. Springer Science & Business Media, 2009.
[PPM17] Robert Primas, Peter Pessl, and Stefan Mangard. Single-trace side-channel attacks
on masked lattice-based encryption. In Cryptographic Hardware and Embedded
Systems–CHES 2017: 19th International Conference, Taipei, Taiwan, September
25–28, 2017, Proceedings, pages 513–533. Springer, 2017.
[PQ03] Gilles Piret and Jean-Jacques Quisquater. A differential fault attack technique
against SPN structures, with application to the AES and Khazad. In Cryptographic
Hardware and Embedded Systems-CHES 2003: 5th International Workshop,
Cologne, Germany, September 8–10, 2003. Proceedings 5, pages 77–88. Springer,
2003.
[PR13] Emmanuel Prouff and Matthieu Rivain. Masking against side-channel attacks:
A formal security proof. In Annual International Conference on the Theory and
Applications of Cryptographic Techniques, pages 142–159. Springer, 2013.
[Pro13] Emmanuel Prouff. Side channel attacks against block ciphers implementations and
countermeasures. Tutorial presented in CHES, 2013.
[PS19] Sandro Pinto and Nuno Santos. Demystifying ARM TrustZone: A Comprehensive
Survey. ACM Computing Surveys (CSUR), 51(6):1–36, 2019.
[PSKH18] Aesun Park, Kyung-Ah Shim, Namhun Koo, and Dong-Guk Han. Side-channel
attacks on post-quantum signature schemes based on multivariate quadratic
equations: Rainbow and UOV. IACR Transactions on Cryptographic Hardware and
Embedded Systems, pages 500–523, 2018.
[PSQ07] Eric Peeters, François-Xavier Standaert, and Jean-Jacques Quisquater. Power
and electromagnetic analysis: Improved model, consequences and comparisons.
Integration, 40(1):52–60, 2007.
[PV13] Konstantinos Papagiannopoulos and Aram Verstegen. Speed and size-optimized
implementations of the PRESENT cipher for tiny AVR devices. In Radio Frequency
Identification: Security and Privacy Issues 9th International Workshop, RFIDsec
2013, Graz, Austria, July 9–11, 2013, Revised Selected Papers 9, pages 161–175.
Springer, 2013.
[PY06] Raphael C W Phan and Sung-Ming Yen. Amplifying side-channel attacks with
techniques from block cipher cryptanalysis. In Smart Card Research and Advanced
Applications: 7th IFIP WG 8.8/11.2 International Conference, CARDIS 2006,
Tarragona, Spain, April 19–21, 2006. Proceedings 7, pages 135–150. Springer,
2006.
[QWL+ 20] Pengfei Qiu, Dongsheng Wang, Yongqiang Lyu, Ruidong Tian, Chunlu Wang, and
Gang Qu. Voltjockey: A new dynamic voltage scaling-based fault injection attack
on Intel SGX. IEEE Transactions on Computer-Aided Design of Integrated Circuits
and Systems, 40(6):1130–1143, 2020.
[QWLQ19] Pengfei Qiu, Dongsheng Wang, Yongqiang Lyu, and Gang Qu. Voltjockey:
Breaching TrustZone by software-controlled voltage manipulation over multi-core
frequencies. In Proceedings of the 2019 ACM SIGSAC Conference on Computer
and Communications Security, pages 195–209, 2019.
[RAL17] Tiago Reis, Diego F Aranha, and Julio López. PRESENT runs fast. In International
Conference on Cryptographic Hardware and Embedded Systems, pages 644–664.
Springer, 2017.
[RBBC18] Prasanna Ravi, Shivam Bhasin, Jakub Breier, and Anupam Chattopadhyay. PPAP
and iPPAP: PLL-based protection against physical attacks. In 2018 IEEE Computer
Society Annual Symposium on VLSI, ISVLSI 2018, Hong Kong, China, July 8–11,
2018, pages 620–625. IEEE Computer Society, 2018.
[RBRC20] Prasanna Ravi, Shivam Bhasin, Sujoy Sinha Roy, and Anupam Chattopadhyay.
Drop by drop you break the rock-exploiting generic vulnerabilities in lattice-
based PKE/KEMs using EM-based physical attacks. IACR Cryptol. ePrint Arch.,
2020:549, 2020.
[RCDB22] Prasanna Ravi, Anupam Chattopadhyay, Jan Pieter D’Anvers, and Anubhab Baksi.
Side-channel and fault-injection attacks over lattice-based post-quantum schemes
(Kyber, Dilithium): Survey and new results. ACM Transactions on Embedded
Computing Systems, 2022.
[RCYF22] Adnan Siraj Rakin, Md Hafizul Islam Chowdhuryy, Fan Yao, and Deliang Fan.
Deepsteal: Advanced model extractions leveraging efficient weight stealing in
memories. In 2022 IEEE Symposium on Security and Privacy (SP), pages 1157–
1174. IEEE, 2022.
[RDMB+ 18] Oscar Reparaz, Lauren De Meyer, Begül Bilgin, Victor Arribas, Svetla Nikova,
Ventzislav Nikov, and Nigel Smart. CAPA: the spirit of Beaver against physical
attacks. In Advances in Cryptology–CRYPTO 2018: 38th Annual International
Cryptology Conference, Santa Barbara, CA, USA, August 19–23, 2018, Proceed-
ings, Part I 38, pages 121–151. Springer, 2018.
[RGN13] Pablo Rauzy, Sylvain Guilley, and Zakaria Najm. Formally proved security of
assembly code against leakage. IACR Cryptol. ePrint Arch., 2013:554, 2013.
[RHF19] Adnan Siraj Rakin, Zhezhi He, and Deliang Fan. Bit-flip attack: Crushing neural
network with progressive bit search. In Proceedings of the IEEE/CVF International
Conference on Computer Vision, pages 1211–1220, 2019.
[RHF20] Adnan Siraj Rakin, Zhezhi He, and Deliang Fan. TBT: Targeted neural network
attack with bit trojan. In Proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition, pages 13198–13207, 2020.
[RHL+ 21] Adnan Siraj Rakin, Zhezhi He, Jingtao Li, Fan Yao, Chaitali Chakrabarti, and
Deliang Fan. T-BFA: Targeted bit-flip adversarial weight attack. IEEE Trans-
actions on Pattern Analysis and Machine Intelligence, 44(11):7928–7939, 2021.
[Riv09] Matthieu Rivain. Differential fault analysis on DES middle rounds. In CHES,
volume 5747, pages 457–469. Springer, 2009.
[RLK11] Thomas Roche, Victor Lomné, and Karim Khalfallah. Combined fault and side-
channel attack on protected implementations of AES. In Smart Card Research and
Advanced Applications: 10th IFIP WG 8.8/11.2 International Conference, CARDIS
2011, Leuven, Belgium, September 14–16, 2011, Revised Selected Papers 10, pages
65–83. Springer, 2011.
[RM07] Bruno Robisson and Pascal Manet. Differential behavioral analysis. In
Cryptographic Hardware and Embedded Systems-CHES 2007: 9th International
Workshop, Vienna, Austria, September 10–13, 2007. Proceedings 9, pages 413–
426. Springer, 2007.
[Ros20] Sheldon M Ross. Introduction to probability and statistics for engineers and
scientists. Academic press, 2020.
[RP10] Matthieu Rivain and Emmanuel Prouff. Provably secure higher-order masking
of AES. In International Workshop on Cryptographic Hardware and Embedded
Systems, pages 413–427. Springer, 2010.
[RRB+ 19] Prasanna Ravi, Debapriya Basu Roy, Shivam Bhasin, Anupam Chattopadhyay,
and Debdeep Mukhopadhyay. Number “not used” once: practical fault attack on
pqm4 implementations of NIST candidates. In Constructive Side-Channel Analysis
and Secure Design: 10th International Workshop, COSADE 2019, Darmstadt,
Germany, April 3–5, 2019, Proceedings 10, pages 232–250. Springer, 2019.
[RS09] Mathieu Renauld and François-Xavier Standaert. Algebraic side-channel attacks.
In International Conference on Information Security and Cryptology, pages 393–
410. Springer, 2009.
[RWPP21] Jorai Rijsdijk, Lichao Wu, Guilherme Perin, and Stjepan Picek. Reinforcement
learning for hyperparameter tuning in deep learning-based side-channel analysis.
IACR Transactions on Cryptographic Hardware and Embedded Systems, pages
677–707, 2021.
[RZC+ 21] Damien Robissout, Gabriel Zaid, Brice Colombier, Lilian Bossuet, and Amaury
Habrard. Online performance evaluation of deep learning networks for profiled
side-channel analysis. In Constructive Side-Channel Analysis and Secure Design:
11th International Workshop, COSADE 2020, Lugano, Switzerland, April 1–3,
2020, Revised Selected Papers 11, pages 200–218. Springer, 2021.
[SA93] Jerry M Soden and Richard E Anderson. IC failure analysis: techniques and tools
for quality reliability improvement. Proceedings of the IEEE, 81(5):703–715,
1993.
[SA02] Sergei P Skorobogatov and Ross J Anderson. Optical fault induction attacks. In
International Workshop on Cryptographic Hardware and Embedded Systems, pages
2–12. Springer, 2002.
[Sau13] Laurent Sauvage. Electric probes for fault injection attack. In 2013 Asia-Pacific
Symposium on Electromagnetic Compatibility (APEMC), pages 1–4. IEEE, 2013.
[SBM18] Pascal Sasdrich, René Bock, and Amir Moradi. Threshold implementation in
software: Case study of PRESENT. In Constructive Side-Channel Analysis and
Secure Design: 9th International Workshop, COSADE 2018, Singapore, April 23–
24, 2018, Proceedings 9, pages 227–244. Springer, 2018.
[SC78] Hiroaki Sakoe and Seibi Chiba. Dynamic programming algorithm optimization
for spoken word recognition. IEEE transactions on acoustics, speech, and signal
processing, 26(1):43–49, 1978.
[Sch00] Bruce Schneier. A self-study course in block-cipher cryptanalysis. Cryptologia,
24(1):18–33, 2000.
[Sei05] Jean-Pierre Seifert. On authenticated computing and RSA-based authentication.
In Proceedings of the 12th ACM conference on Computer and communications
security, pages 122–127, 2005.
[SGD08] Nidhal Selmane, Sylvain Guilley, and Jean-Luc Danger. Practical setup time
violation attacks on AES. In 2008 Seventh European Dependable Computing
Conference, pages 91–96. IEEE, 2008.
[SH07] Jörn-Marc Schmidt and Michael Hutter. Optical and EM fault-attacks on CRT-
based RSA: Concrete results. 2007.
[SH08] Jörn-Marc Schmidt and Christoph Herbst. A practical fault attack on square
and multiply. In 2008 5th Workshop on Fault Diagnosis and Tolerance in
Cryptography, pages 53–58. IEEE, 2008.
[Sha45] Claude E Shannon. A mathematical theory of cryptography. Mathematical Theory
of Cryptography, 1945.
[Sha97] A Shamir. Method and apparatus for protecting public key schemes from timing
and fault attacks. In EUROCRYPT’97, 1997.
[Sha00] Adi Shamir. Protecting smart cards from passive power analysis with detached
power supplies. In Cryptographic Hardware and Embedded Systems–CHES
2000: Second International Workshop Worcester, MA, USA, August 17–18, 2000
Proceedings 2, pages 71–77. Springer, 2000.
[SHS16] Bodo Selmke, Johann Heyszl, and Georg Sigl. Attack on a DFA protected AES
by simultaneous laser fault injections. In 2016 Workshop on Fault Diagnosis and
Tolerance in Cryptography (FDTC), pages 36–46. IEEE, 2016.
[SI20a] SOG-IS. Application of attack potential to smartcards and similar devices, v3.1,
2020.
[SI20b] SOG-IS. Attack methods for smartcards and similar devices, 2020.
[Sie88] Waclaw Sierpinski. Elementary Theory of Numbers: Second English Edition
(edited by A. Schinzel). Elsevier, 1988.
[Siv17] Nimisha Sivaraman. Design of magnetic probes for near field measurements and
the development of algorithms for the prediction of EMC. PhD thesis, Université
Grenoble Alpes, 2017.
[SJB+ 18] Sayandeep Saha, Dirmanto Jap, Jakub Breier, Shivam Bhasin, Debdeep
Mukhopadhyay, and Pallab Dasgupta. Breaking redundancy-based countermea-
sures with random faults and power side channel. In 2018 Workshop on Fault
Diagnosis and Tolerance in Cryptography (FDTC), pages 15–22. IEEE, 2018.
[SM12] Pushpa Saini and Rajesh Mehra. A novel technique for glitch and leakage power
reduction in CMOS VLSI circuits. International Journal of Advanced Computer
Science and Applications, 3(10), 2012.
[SM15] Tobias Schneider and Amir Moradi. Leakage assessment methodology: A clear
roadmap for side-channel evaluations. In Cryptographic Hardware and Embedded
Systems–CHES 2015: 17th International Workshop, Saint-Malo, France, Septem-
ber 13–16, 2015, Proceedings 17, pages 495–513. Springer, 2015.
[SMG16] Tobias Schneider, Amir Moradi, and Tim Güneysu. ParTI – towards combined
hardware countermeasures against side-channel and fault-injection attacks. In
Advances in Cryptology–CRYPTO 2016: 36th Annual International Cryptology
Conference, Santa Barbara, CA, USA, August 14–18, 2016, Proceedings, Part II
36, pages 302–332. Springer, 2016.
[SMKLM02] Sung-Ming Yen, Seungjoo Kim, Seongan Lim, and Sangjae Moon. RSA speedup
with residue number system immune against hardware fault cryptanalysis. In
International Conference on Information Security and Cryptology, pages 397–413.
Springer, 2002.
[SMR09] Dhiman Saha, Debdeep Mukhopadhyay, and Dipanwita RoyChowdhury. A
diagonal fault attack on the advanced encryption standard. Cryptology ePrint
Archive, 2009.
[SMY09] François-Xavier Standaert, Tal G Malkin, and Moti Yung. A unified framework
for the analysis of side-channel key recovery attacks. In Advances in Cryptology-
EUROCRYPT 2009, pages 443–461. Springer, 2009.
[Sor84] Arthur Sorkin. Lucifer, a cryptographic algorithm. Cryptologia, 8(1):22–42, 1984.
[SP06] Kai Schramm and Christof Paar. Higher order masking of the AES. In Topics
in Cryptology–CT-RSA 2006: The Cryptographers’ Track at the RSA Conference
2006, San Jose, CA, USA, February 13–17, 2006. Proceedings, pages 208–225.
Springer, 2006.
[SS16] Peter Schwabe and Ko Stoffelen. All the AES you need on Cortex-M3 and M4.
In International Conference on Selected Areas in Cryptography, pages 180–194.
Springer, 2016.
[Sta10] François-Xavier Standaert. Introduction to side-channel attacks. Secure integrated
circuits and systems, pages 27–42, 2010.
[Sti05] Douglas R Stinson. Cryptography: theory and practice. Chapman and Hall/CRC,
2005.
[SVK+ 03] H Saputra, N Vijaykrishnan, M Kandemir, MJ Irwin, and R Brooks. Masking
the energy behaviour of encryption algorithms. IEE Proceedings-Computers and
Digital Techniques, 150(5):274–284, 2003.
[SWM18] Robert Schilling, Mario Werner, and Stefan Mangard. Securing conditional
branches in the presence of fault attacks. In 2018 Design, Automation & Test in
Europe Conference & Exhibition (DATE), pages 1586–1591. IEEE, 2018.
[SWP03] Kai Schramm, Thomas Wollinger, and Christof Paar. A new class of collision
attacks and its application to DES. In Fast Software Encryption: 10th International
Workshop, FSE 2003, Lund, Sweden, February 24–26, 2003. Revised Papers 10,
pages 206–222. Springer, 2003.
[TAV02] Kris Tiri, Moonmoon Akmal, and Ingrid Verbauwhede. A dynamic and differential
CMOS logic with signal independent power consumption to withstand differential
power analysis on smart cards. In Proceedings of the 28th European solid-state
circuits conference, pages 403–406. IEEE, 2002.
[TBM14] Harshal Tupsamudre, Shikha Bisht, and Debdeep Mukhopadhyay. Destroying fault
invariant with randomization: A countermeasure for AES against differential fault
attacks. In Cryptographic Hardware and Embedded Systems–CHES 2014: 16th
International Workshop, Busan, South Korea, September 23–26, 2014. Proceedings
16, pages 93–111. Springer, 2014.
[THM07] Stefan Tillich, Christoph Herbst, and Stefan Mangard. Protecting AES software
implementations on 32-bit processors against power analysis. In Applied Cryptog-
raphy and Network Security: 5th International Conference, ACNS 2007, Zhuhai,
China, June 5–8, 2007. Proceedings 5, pages 141–157. Springer, 2007.
[TIA+ 23] M Caner Tol, Saad Islam, Andrew J Adiletta, Berk Sunar, and Ziming Zhang.
Don’t knock! Rowhammer at the backdoor of DNN models. In 2023 53rd Annual
IEEE/IFIP International Conference on Dependable Systems and Networks (DSN),
pages 109–122. IEEE, 2023.
[Tim19] Benjamin Timon. Non-profiled deep learning-based side-channel attacks with sen-
sitivity analysis. IACR Transactions on Cryptographic Hardware and Embedded
Systems, pages 107–131, 2019.
[TKA+ 18] Andrei Tatar, Radhesh Krishnan Konoth, Elias Athanasopoulos, Cristiano Giuf-
frida, Herbert Bos, and Kaveh Razavi. Throwhammer: Rowhammer attacks
over the network and defenses. In 2018 USENIX Annual Technical Conference
(USENIX ATC 18), pages 213–226, 2018.
[TMA11] Michael Tunstall, Debdeep Mukhopadhyay, and Subidh Ali. Differential fault
analysis of the advanced encryption standard using a single fault. In Information
Security Theory and Practice. Security and Privacy of Mobile Devices in Wireless
Communication: 5th IFIP WG 11.2 International Workshop, WISTP 2011, Her-
aklion, Crete, Greece, June 1–3, 2011. Proceedings 5, pages 224–233. Springer,
2011.
[TSS+ 06] Pim Tuyls, Geert Jan Schrijen, Boris Skoric, Jan Van Geloven, Nynke Verhaegh,
and Rob Wolters. Read-proof hardware from protective coatings. In Ches,
volume 6, pages 369–383. Springer, 2006.
[TSS17] Adrian Tang, Simha Sethumadhavan, and Salvatore Stolfo. CLKSCREW: Expos-
ing the Perils of Security-Oblivious Energy Management. In 26th USENIX Security
Symposium (USENIX Security 17), pages 1057–1074, 2017.
[TV06] Kris Tiri and Ingrid Verbauwhede. A digital design flow for secure integrated
circuits. IEEE Transactions on Computer-Aided Design of Integrated Circuits and
Systems, 25(7):1197–1208, 2006.
[UXT+ 22] Rei Ueno, Keita Xagawa, Yutaro Tanaka, Akira Ito, Junko Takahashi, and Naofumi
Homma. Curse of re-encryption: A generic power/EM analysis on post-quantum
References 485

KEMs. IACR Transactions on Cryptographic Hardware and Embedded Systems,


pages 296–322, 2022.
[Vai01] PP Vaidyanathan. Generalizations of the sampling theorem: Seven decades after
nyquist. IEEE Transactions on Circuits and Systems I: Fundamental Theory and
Applications, 48(9):1094–1109, 2001.
[VCGRS13] Nicolas Veyrat-Charvillon, Benoît Gérard, Mathieu Renauld, and François-Xavier
Standaert. An optimal key enumeration algorithm and its application to side-
channel attacks. In Selected Areas in Cryptography: 19th International Conference,
SAC 2012, Windsor, ON, Canada, August 15–16, 2012, Revised Selected Papers 19,
pages 390–406. Springer, 2013.
[VCGS14] Nicolas Veyrat-Charvillon, Benoît Gérard, and François-Xavier Standaert. Soft
analytical side-channel attacks. In Advances in Cryptology–ASIACRYPT 2014:
20th International Conference on the Theory and Application of Cryptology and
Information Security, Kaoshiung, Taiwan, ROC, December 7–11, 2014. Proceed-
ings, Part I 20, pages 282–296. Springer, 2014.
[VDVFL+ 16] Victor Van Der Veen, Yanick Fratantonio, Martina Lindorfer, Daniel Gruss,
Clémentine Maurice, Giovanni Vigna, Herbert Bos, Kaveh Razavi, and Cristiano
Giuffrida. Drammer: Deterministic rowhammer attacks on mobile platforms. In
Proceedings of the 2016 ACM SIGSAC conference on computer and communica-
tions security, pages 1675–1689, 2016.
[VEW12] Camille Vuillaume, Takashi Endo, and Paul Wooderson. RSA key generation:
new attacks. In Constructive Side-Channel Analysis and Secure Design: Third
International Workshop, COSADE 2012, Darmstadt, Germany, May 3–4, 2012.
Proceedings 3, pages 105–119. Springer, 2012.
[Vig08] David Vigilant. RSA with crt: A new cost-effective solution to thwart fault attacks.
In International Workshop on Cryptographic Hardware and Embedded Systems,
pages 130–145. Springer, 2008.
[VW01] Manfred Von Willich. A technique with an information-theoretic basis for
protecting secret data from differential power attacks. In IMA International
Conference on Cryptography and Coding, pages 44–62. Springer, 2001.
[vWWB11] Jasper GJ van Woudenberg, Marc F Witteman, and Bram Bakker. Improving
differential power analysis by elastic alignment. In Topics in Cryptology–CT-RSA
2011: The Cryptographers’ Track at the RSA Conference 2011, San Francisco, CA,
USA, February 14–18, 2011. Proceedings, pages 104–119. Springer, 2011.
[Wal02] Colin D Walter. Mist: An efficient, randomized exponentiation algorithm for resist-
ing power analysis. In Topics in Cryptology—CT-RSA 2002: The Cryptographers’
Track at the RSA Conference 2002 San Jose, CA, USA, February 18–22, 2002
Proceedings, pages 53–66. Springer, 2002.
[Wel47] Bernard L Welch. The generalization of ‘student’s’ problem when several different
population varlances are involved. Biometrika, 34(1–2):28–35, 1947.
[WHJ+ 21] Yoo-Seung Won, Xiaolu Hou, Dirmanto Jap, Jakub Breier, and Shivam Bhasin.
Back to the basics: Seamless integration of side-channel pre-processing in deep
neural networks. IEEE Transactions on Information Forensics and Security,
16:3215–3227, 2021.
[WJB20] Yoo-Seung Won, Dirmanto Jap, and Shivam Bhasin. Push for more: On compari-
son of data augmentation and smote with optimised deep learning architecture for
side-channel. In Information Security Applications: 21st International Conference,
WISA 2020, Jeju Island, South Korea, August 26–28, 2020, Revised Selected Papers
21, pages 227–241. Springer, 2020.
[WKKG04] Kaijie Wu, Ramesh Karri, Grigori Kuznetsov, and Michael Goessel. Low
cost concurrent error detection for the advanced encryption standard. In 2004
International Conference on Test, pages 1242–1248. IEEE, 2004.
[WP20] Lichao Wu and Stjepan Picek. Remove some noise: On pre-processing of side-
channel measurements with autoencoders. IACR Transactions on Cryptographic
Hardware and Embedded Systems, pages 389–415, 2020.
486 References

[WPP22] Lichao Wu, Guilherme Perin, and Stjepan Picek. I choose you: Automated
hyperparameter tuning for deep learning-based side-channel analysis. IEEE
Transactions on Emerging Topics in Computing, 2022.
[WvWM11] Marc F Witteman, Jasper GJ van Woudenberg, and Federico Menarini. Defeating
RSA multiply-always and message blinding countermeasures. In Topics in
Cryptology–CT-RSA 2011: The Cryptographers’ Track at the RSA Conference
2011, San Francisco, CA, USA, February 14–18, 2011. Proceedings, pages 77–88.
Springer, 2011.
[WW10] Gaoli Wang and Shaohui Wang. Differential fault analysis on present key schedule.
In 2010 International Conference on Computational Intelligence and Security,
pages 362–366. IEEE, 2010.
[XIU+ 21] Keita Xagawa, Akira Ito, Rei Ueno, Junko Takahashi, and Naofumi Homma.
Fault-injection attacks against nist’s post-quantum cryptography round 3 KEM
candidates. In International Conference on the Theory and Application of
Cryptology and Information Security, pages 33–61. Springer, 2021.
[XLZ+ 18] Sen Xu, Xiangjun Lu, Kaiyu Zhang, Yang Li, Lei Wang, Weijia Wang, Haihua
Gu, Zheng Guo, Junrong Liu, and Dawu Gu. Similar operation template attack on
RSA-crt as a case study. Science China Information Sciences, 61:1–17, 2018.
[XZY+ 20] Guorui Xu, Fan Zhang, Bolin Yang, Xinjie Zhao, Wei He, and Kui Ren. Pushing
the limit of PFA: enhanced persistent fault analysis on block ciphers. IEEE
Transactions on Computer-Aided Design of Integrated Circuits and Systems,
40(6):1102–1116, 2020.
[Yeh14] James J Yeh. Real analysis: theory of measure and integration. World Scientific
Publishing Company, 2014.
[YJ00] Sung-Ming Yen and Marc Joye. Checking before output may not be enough against
fault-based cryptanalysis. IEEE Transactions on computers, 49(9):967–970, 2000.
[YKM06] Sung-Ming Yen, Dongryeol Kim, and SangJae Moon. Cryptanalysis of two
protocols for RSA with crt based on fault infection. In International Workshop
on Fault Diagnosis and Tolerance in Cryptography, pages 53–61. Springer, 2006.
[YMY+ 20] Honggang Yu, Haocheng Ma, Kaichen Yang, Yiqiang Zhao, and Yier Jin. Deepem:
Deep neural networks model recovery through em side-channel information leak-
age. In 2020 IEEE International Symposium on Hardware Oriented Security and
Trust (HOST), pages 209–218. IEEE, 2020.
[YRF20] Fan Yao, Adnan Siraj Rakin, and Deliang Fan. {DeepHammer}: Depleting the
intelligence of deep neural networks through targeted chain of bit flips. In 29th
USENIX Security Symposium (USENIX Security 20), pages 1463–1480, 2020.
[ZBHV20] Gabriel Zaid, Lilian Bossuet, Amaury Habrard, and Alexandre Venelli. Method-
ology for efficient CNN architectures in profiling attacks. IACR Transactions on
Cryptographic Hardware and Embedded Systems, pages 1–36, 2020.
[ZDT+ 14] Loic Zussa, Amine Dehbaoui, Karim Tobich, Jean-Max Dutertre, Philippe Mau-
rine, Ludovic Guillaume-Sage, Jessy Clediere, and Assia Tria. Efficiency of a
glitch detector against electromagnetic fault injection. In 2014 Design, Automation
& Test in Europe Conference & Exhibition (DATE), pages 1–6. IEEE, 2014.
[ZLZ+ 18] Fan Zhang, Xiaoxuan Lou, Xinjie Zhao, Shivam Bhasin, Wei He, Ruyi Ding,
Samiya Qureshi, and Kui Ren. Persistent fault analysis on block ciphers. IACR
Transactions on Cryptographic Hardware and Embedded Systems, pages 150–172,
2018.
[ZZJ+ 20] Fan Zhang, Yiran Zhang, Huilong Jiang, Xiang Zhu, Shivam Bhasin, Xinjie Zhao,
Zhe Liu, Dawu Gu, and Kui Ren. Persistent fault attack in practice. IACR
Transactions on Cryptographic Hardware and Embedded Systems, pages 172–195,
2020.
[ZZY+ 19] Yiran Zhang, Fan Zhang, Bolin Yang, Guorui Xu, Bin Shao, Xinjie Zhao, and Kui
Ren. Persistent fault injection in FPGA via BRAM modification. In 2019 IEEE
Conference on Dependable and Secure Computing (DSC), pages 1–6. IEEE, 2019.
Index

Symbols
$M_{n \times n}(R)$, 23
$\mathbb{R}^d$, 67
$\varphi(n)$, 38
n-bit binary string, 27
$\mathbb{Z}_n$, 34
$\mathbb{Z}_n^*$, 38

A
Advanced Encryption Standard (AES), 139
AES T-tables, 158
Affine cipher, 109

B
Bellcore attack, 392
Binary code, 57
  anticode, 65
  binary $(n, M)$-code, 57
  binary $(n, M, d)$-code, 58
  binary $[n, k, d]$-linear code, 61
  codeword, 57
  dimension, 61
  dual code, 62
  error correcting, 59
  error-detecting, 58
  generator matrix, 63
  length, 57
  linear, 60
  maximum distance, 65
  (minimum) distance, 58
  $n$-repetition code, 61
  parity-check code, 62
  parity-check matrix, 63
  size, 57
Bit, 20
Bitwise AND, 27
Bitwise OR, 27
Bitwise XOR, 27
Blakely’s method, 182
Boolean function, 159
  algebraic normal form, 160
  indicator function, 160
  truth table, 159
Borel set, 67
Byte, 27

C
Caesar cipher, 108
CBC mode, 127
Chinese Remainder Theorem, 44
Correlation coefficient, 81
Cryptographic primitives, 102
Cryptosystem, 104
  block cipher, 105
    block length, 132
    Feistel cipher, 133
    key length, 132
    key schedule, 132
    master key, 132
    round function, 132
    Sbox, 133
    SPN cipher, 133
  computationally secure, 107
  perfectly secure, 107, 124
  secure in practice, 107
  stream cipher, 105


D
Data Encryption Standard (DES), 135
Difference distribution table, 276
Differential fault analysis, 354
Distribution, 72
  $\chi^2$-distribution, 83
  Gaussian distribution, 80
  multivariate normal distribution, 80
  normal distribution, 77
  standard normal distribution, 74
  $t$-distribution, 84
  uniform, 74

E
ECB mode, 126
Equivalence class, 33
Equivalence relation, 33
Euler’s Theorem, 39
Euler’s totient function, 38
Event, 65
  independent, 69

F
Fault mask, 353
Fault model, 353
Fermat’s Little Theorem, 40
Field, 18
  characteristic, 19
  $\mathbb{F}_{p^n}$, 20
  finite field, 18
  isomorphism, 20
  subfield, 19
Forgery, 168
  existential forgery, 168
  selective forgery, 168
Frequency analysis, 117
Function, 3
  bijective, 4
  codomain, 3
  composition, 4
  domain, 3
  injective, 4
  inverse, 4
  surjective, 4

G
Garner’s algorithm, 176
Gauss’s algorithm, 176
Group, 12
  abelian, 12
  cyclic, 15
  order, 14
  order of an element, 15
  symmetric group of degree n, 14
Guessing entropy, 270

H
Hamming distance, 58
Hamming weight, 62
Hash function, 103
Hill cipher, 114

I
Infective countermeasure, 386, 414
Integer
  base-$b$ representation, 6
  Bézout’s identity, 7
  binary representation, 6
  bit length, 6
  composite (number), 10
  congruence class modulo n, 34
  congruent modulo n, 33
  Euclidean algorithm, 9
  Euclid’s division, 8
  extended Euclidean algorithm, 10
  Fundamental Theorem of Arithmetic, 11
  greatest common divisor, 7
  hexadecimal representation, 6
  modulus, 33
  prime (number), 10
Integral domain, 18
Interval estimator, 88

K
Kerckhoffs’ principle, 106

L
Linear congruence, 42
Logical AND, 17
Logical XOR, 13

M
Matrix, 21
  addition, 22
  adjoint matrix, 25
  determinant, 24
  diagonal, 21
  identity matrix, 21
  inverse, 23
  multiplication, 22
  rank, 449
  scalar product, 22
  square matrix, 21
  transpose, 21
Measurable space, 66
Montgomery powering ladder, 174
Montgomery product algorithm (MonPro), 192, 194

N
Nibble, 27

O
OFB mode, 127
One-time pad, 123

P
Permutation, 13
Persistent fault analysis, 375
Point estimator, 88
Polynomial, 48
  congruence class, 51
  congruent modulo $f(x)$, 51
  degree, 48
  Division Algorithm, 49
  greatest common divisor, 52
  polynomial ring, 48
  reducible, 49
PRESENT, 149
Probability measure, 67
  Bayes’ Theorem, 70
  conditional probability, 69
  probability, 67
  probability space, 67
  uniform, 68

R
Random variable, 71
  continuous, 73
    expectation, 75
  covariance, 78
  cumulative distribution function (CDF), 72
  discrete, 72
    expectation, 74
  independent, 78
  normal random variable, 77
  probability density function (PDF), 73
  probability mass function (PMF), 73
  standard normal random variable, 74
  uncorrelated, 79
  variance, 76
Random vector, 80
Ring, 16
  commutative, 16
  unit, 17
  zero divisor, 17
RSA, 165
RSA signatures, 168

S
Safe error attack, 402
Sample mean, 86
Sample space, 65
Sample variance, 86
Set, 1
  cardinality, 1
  Cartesian product, 2
  complement, 2
  difference, 2
  intersection, 2
  power set, 1
  union, 2
Shamir’s countermeasure, 411
Shift cipher, 108
Square and multiply algorithm, 171
  left-to-right, 172
  right-to-left, 171
Statistical fault analysis, 368
Student’s t-test, 98
Substitution cipher, 111
Success rate, 270
System of simultaneous congruences, 42

T
Test vector leakage assessment (TVLA), 218

V
Vector space, 26
  basis, 30
  dimension, 31
  generating set, 29
  linearly independent, 29
  orthogonal complement, 32
  scalar, 26
  subspace, 28
  vector, 26
Vigenère cipher, 113

W
Welch’s $t$-test, 99
Word size of an architecture, 106
