0% found this document useful (0 votes)
652 views432 pages

Cryptography and Coding

Libro de criptografía

Uploaded by

Ariel López
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
652 views432 pages

Cryptography and Coding

Libro de criptografía

Uploaded by

Ariel López
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Lecture Notes in Computer Science 4887

Commenced Publication in 1973


Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
University of Dortmund, Germany
Madhu Sudan
Massachusetts Institute of Technology, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Moshe Y. Vardi
Rice University, Houston, TX, USA
Gerhard Weikum
Max-Planck Institute of Computer Science, Saarbruecken, Germany
Steven D. Galbraith (Ed.)

Cryptography
and Coding

11th IMA International Conference


Cirencester, UK, December 18-20, 2007
Proceedings

13
Volume Editor

Steven D. Galbraith
Royal Holloway University of London
Mathematics Department
Egham, Surrey, TW20 0EX, UK
E-mail: [Link]@[Link]

Library of Congress Control Number: 2007940956

CR Subject Classification (1998): E.3-4, G.2.1, C.2, J.1

LNCS Sublibrary: SL 4 – Security and Cryptology

ISSN 0302-9743
ISBN-10 3-540-77271-5 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-77271-2 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
Springer is a part of Springer Science+Business Media
[Link]
© Springer-Verlag Berlin Heidelberg 2007
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper SPIN: 12203093 06/3180 543210
Preface

The 11th IMA Conference on Cryptography and Coding was held at the Royal
Agricultural College, Cirencester, UK during December 18–20, 2007. As usual,
the venue provided a relaxed and convivial atmosphere for attendees to enjoy
the conference programme and discuss current and future research ideas.
The programme comprised three invited talks and 22 contributed papers. The
invited speakers were Jonathan Katz (University of Maryland, USA), Patrick
Solé (Ecole Polytechnique de l’Université de Nice-Sophia Antipolis, France) and
Whit Diffie (Sun Microsystems, USA). Special thanks are due to these speakers.
Two of the invited speakers provided papers, included in this volume, which high-
light the connections between cryptography, coding theory and discrete mathe-
matics.
The contributed talks were selected from 48 submissions. The accepted pa-
pers cover a range of topics in mathematics and computer science, including
symmetric and public key cryptography, Boolean functions, sequences, efficient
implementation and side-channel analysis.
I would like to thank all the people who helped with the conference pro-
gramme and organization. First, I thank the Steering Committee for their guid-
ance on the general format of the conference and for suggestions of members
of the Programme Committee. I also heartily thank the Programme Committee
and the sub-reviewers listed on the following pages for their thoroughness during
the review process. Each paper was reviewed by at least three people. There was
significant online discussion about a number of papers.
The submission and review process was greatly simplified by the ichair soft-
ware developed by Thomas Baignères and Matthieu Finiasz. Thanks also to Jon
Hart for running the submissions Web server and Sriram Srinivasan for designing
and maintaining the conference Web page.
Thanks go to the authors of all submitted papers. I also thank the authors
of accepted papers for revising their papers according to referee suggestions and
returning latex source files in good time. The revised versions were not checked by
the Programme Committee so authors bear full responsibility for their contents.
I thank the staff at Springer for their help with producing the proceedings.
I thank Hewlett-Packard and Vodafone for their sponsorship of this event.
Finally, I wish to thank the conference staff of the Institute for Mathematics
and its Applications, especially Lucy Nye and Sammi Lauesen, for their help
with running the conference and handling the finances.

October 2007 Steven Galbraith


Cryptography and Coding 2007
Royal Agricultural College, Cirencester, UK
December 18–20, 2007

Sponsored by
The Institute of Mathematics and its Applications
in cooperation with
Hewlett-Packard Laboratories and Vodafone Ltd.

Programme Chair
Steven Galbraith Royal Holloway University of London

Steering Committee
Bahram Honary Lancaster University
Chris Mitchell Royal Holloway
Kenny Paterson Royal Holloway
Fred Piper Royal Holloway
Nigel Smart University of Bristol
Mike Walker Vodafone Ltd. and Royal Holloway

Programme Committee
Steve Babbage Vodafone Group Services Ltd.
Nigel Boston South Carolina/Wisconsin
Pascale Charpin INRIA Rocquencourt
Liqun Chen Hewlett-Packard
Carlos Cid Royal Holloway
YoungJu Choie Postech, Korea
Arjen Lenstra EPFL, Lausanne
Alexander May Ruhr Universität Bochum
Gary McGuire University College Dublin
Alfred Menezes University of Waterloo
David Naccache Ecole Normale Supérieure
Matthew Parker University of Bergen
Matt Robshaw France Telecom
Ana Sălăgean Loughborough University
Berry Schoenmakers Technical University Eindhoven
Michael Scott Dublin City University
Amin Shokrollahi EPFL, Lausanne
Nigel Smart University of Bristol
Frederik Vercauteren K. U. Leuven
Gilles Zemor Université Bordeaux
VIII Organization

External Reviewers
Joppe Bos Christophe De Cannière Anne Canteaut
Claude Carlet Mahdi Cheraghchi Alex Dent
Fabien Galand Philippe Guillot Darrel Hankerson
Marcelo Kaihara Cedric Lauradoux Lorenz Minder
Marine Minier Stephane Manuel Ashley Montanaro
Sean Murphy Dag Arne Osvik Gregory Neven
Dan Page Kenny Paterson Benny Pinkas
Louis Salvail Amin Shokrollahi Andrey Sidorenko
Patrick Solé Martijn Stam Søren Steffen Thomsen
Eran Tromer José Villegas
Table of Contents

Invited Papers
Efficient Cryptographic Protocols Based on the Hardness of Learning
Parity with Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Jonathan Katz

Galois Rings and Pseudo-random Sequences . . . . . . . . . . . . . . . . . . . . . . . . . 16


Patrick Solé and Dmitrii Zinoviev

Signatures I
Finding Invalid Signatures in Pairing-Based Batches . . . . . . . . . . . . . . . . . . 34
Laurie Law and Brian J. Matt

How to Forge a Time-Stamp Which Adobe’s Acrobat Accepts . . . . . . . . . 54


Tetsuya Izu, Takeshi Shimoyama, and Masahiko Takenaka

Boolean Functions
Efficient Computation of the Best Quadratic Approximations of Cubic
Boolean Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Nicholas Kolokotronis, Konstantinos Limniotis, and
Nicholas Kalouptsidis

On the Walsh Spectrum of a New APN Function . . . . . . . . . . . . . . . . . . . . 92


Carl Bracken, Eimear Byrne, Nadya Markin, and Gary McGuire

Non-linear Cryptanalysis Revisited: Heuristic Search for


Approximations to S-Boxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
Juan M.E. Tapiador, John A. Clark, and Julio C. Hernandez-Castro

Block Cipher Cryptanalysis


Cryptanalysis of the EPBC Authenticated Encryption Mode . . . . . . . . . . 118
Chris J. Mitchell

Blockwise-Adaptive Chosen-Plaintext Attack and Online Modes of


Encryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Gregory V. Bard

Algebraic Cryptanalysis of the Data Encryption Standard . . . . . . . . . . . . . 152


Nicolas T. Courtois and Gregory V. Bard
X Table of Contents

Side Channels

Cryptographic Side-Channels from Low Power Cache Memory . . . . . . . . . 170


Philipp Grabher, Johann Großschädl, and Daniel Page

New Branch Prediction Vulnerabilities in OpenSSL and Necessary


Software Countermeasures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
Onur Acıiçmez, Shay Gueron, and Jean-Pierre Seifert

Linear Complexity
Remarks on the New Attack on the Filter Generator and the Role of
High Order Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
Panagiotis Rizomiliotis

Modified Berlekamp-Massey Algorithm for Approximating the k-Error


Linear Complexity of Binary Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
Alexandra Alecu and Ana Sălăgean

Public Key Encryption


Efficient KEMs with Partial Message Recovery . . . . . . . . . . . . . . . . . . . . . . 233
Tor E. Bjørstad, Alex W. Dent, and Nigel P. Smart

Randomness Reuse:Extensions and Improvements . . . . . . . . . . . . . . . . . . . . 257


Manuel Barbosa and Pooya Farshim

On the Connection Between Signcryption and One-Pass Key


Establishment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
M. Choudary Gorantla, Colin Boyd, and
Juan Manuel González Nieto

Curves
Optimised Versions of the Ate and Twisted Ate Pairings . . . . . . . . . . . . . . 302
Seiichi Matsuda, Naoki Kanayama, Florian Hess, and Eiji Okamoto

Extractors for Jacobian of Hyperelliptic Curves of Genus 2 in Odd


Characteristic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
Reza Rezaeian Farashahi

Constructing Pairing-Friendly Elliptic Curves Using Gröbner Basis


Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
Waldyr D. Benits Junior and Steven D. Galbraith
Table of Contents XI

RSA Implementation
Efficient 15,360-bit RSA Using Woop-Optimised Montgomery
Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
Kamel Bentahar and Nigel P. Smart

Toward Acceleration of RSA Using 3D Graphics Hardware . . . . . . . . . . . . 364


Andrew Moss, Daniel Page, and Nigel P. Smart

Signatures II
Multi-key Hierarchical Identity-Based Signatures . . . . . . . . . . . . . . . . . . . . . 384
Hoon Wei Lim and Kenneth G. Paterson

Verifier-Key-Flexible Universal Designated-Verifier Signatures . . . . . . . . . 403


Raylin Tso, Juan Manuel González Nieto, Takeshi Okamoto,
Colin Boyd, and Eiji Okamoto

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423


Efficient Cryptographic Protocols Based on the
Hardness of Learning Parity with Noise

Jonathan Katz

Dept. of Computer Science


University of Maryland
jkatz@[Link]

Abstract. The problem of learning parity with noise (The LPN prob-
lem), which can be re-cast as the problem of decoding a random linear
code, has attracted attention recently as a possible tool for developing
highly-efficient cryptographic primitives suitable for resource-constrained
devices such as RFID tags. This article surveys recent work aimed at de-
signing efficient authentication protocols based on the conjectured hard-
ness of this problem.

1 Introduction
1.1 The LPN Problem
Fix a binary vector (i.e., a bit-string) s of length k. Given a sequence of randomly-
chosen binary vectors a1 , . . . , a along with the values of their inner-product
zi = s, ai  with s, it is a simple matter to reconstruct s in its entirety as soon as
 is slightly larger than k. (All that is needed is to wait until the set {ai } contains
k linearly-independent vectors.) In the presence of noise, however, where each
bit zi is flipped (independently) with probability ε, determining s becomes much
more difficult. We refer to the problem of learning s in this latter case as the
problem of learning parity with noise, or the LPN problem.
Formally, let Berε be the Bernoulli distribution with parameter ε ∈ (0, 12 ) (so
if ν ∼ Berε then Pr[ν = 1] = ε and Pr[ν = 0] = 1 − ε), and let As,ε be the
distribution defined by:
 
a ← {0, 1}k ; ν ← Berε : (a, s, a ⊕ ν) .

Let As,ε also denote an oracle which outputs (independent) samples according
to this distribution. Algorithm M is said to (t, q, δ)-solve the LPNε problem if
 
Pr s ← {0, 1}k : M As,ε (1k ) = s ≥ δ,

and furthermore M runs in time at most t and makes at most q queries to its
oracle. (This formulation of the LPN problem follows [18]; an alternative but

Supported in part by NSF CyberTrust grant #0627306 and NSF CAREER award
#0447075.

S.D. Galbraith (Eds.): Cryptography and Coding 2007, LNCS 4887, pp. 1–15, 2007.

c Springer-Verlag Berlin Heidelberg 2007
2 J. Katz

essentially equivalent formulation allows M to output any s satisfying at least a


(1−ε) fraction of the equations returned by As,ε .) In asymptotic terms, the LPNε
problem is “hard” if every probabilistic polynomial-time algorithm M solves the
LPNε problem with only negligible probability (where the algorithm’s running
time and success probability are functions of k).
Note that ε is usually taken to be a fixed constant independent of k, as we will
assume here. The value of ε to use depends on a number of tradeoffs and design
decisions: although, roughly speaking, the LPNε problem becomes “harder” as
ε increases, a larger value of ε also affects the error rate (for honest parties) in
schemes based on the LPN problem; this will becomes more clear in the sections
that follow. For concreteness, the reader can think of ε ≈ 18 .
The hardness of the LPNε problem, for any constant ε ∈ (0, 12 ), has been
studied in many previous works. It can be formulated also as the problem of
decoding a random linear code, and is known to be N P-complete [2] as well as
hard to approximate within a factor better than 2 (where the optimization prob-
lem is phrased as finding an s satisfying the most equations) [12]. These worst-
case hardness results are complemented by numerous studies of the average-case
hardness of the problem [3,4,6,20,13,14,24]. Currently, the best algorithms for
solving the LPNε problem [4,9,23] require t, q = 2Θ(k/ log k) to achieve δ = O(1).
We refer the reader to [23] for additional heuristic improvements, as well as a
tabulation of the time required to solve the LPNε problem (for various settings
of the parameters) using the best-known algorithm.
The LPN problem can be generalized to fields other than F2 (or even other
algebraic structures such as rings), and these generalizations have interesting
cryptographic consequences also [24]. Such extensions will not be discussed here.

1.2 Cryptographic Applications of the LPN Problem

It is not too difficult to see that hardness of the LPNε problem implies the
existence of a one-way function. More interesting is that such hardness would
imply efficient and direct constructions of pseudorandom generators [3,24]; see
Lemma 1 for an indication of the basic underlying ideas. Furthermore, gener-
ating an instance of the distribution As,ε is extremely “cheap”, requiring only
k bit-wise “AND” operations and k − 1 “XOR” operations.1 Finally, as men-
tioned earlier, the best-known algorithms for solving the LPNε problem are only
slightly sub-exponential in the length k of the hidden vector. Taken together,
these observations suggest the possibility of using the LPNε problem to construct
efficient cryptographic primitives and protocols, as first suggested in [3].
Actually, if the LPNε problem is indeed “hard” enough, there is the potential
of using it to construct extremely efficient cryptographic primitives, suitable
either for implementation by humans (using pencil-and-paper) [13,14] or for
implementation on low-cost radio-frequency identification (RFID) tags [17] or
sensor nodes. Focusing on the case of RFID tags, Juels and Weis [17,25] estimate
1
This assumes that generating the appropriate random coins is “free”, which may not
be a reasonable assumption in practice.
Efficient Cryptographic Protocols Based on the Hardness of Learning Parity 3

that current RFID tags contain, in the best case, ≈ 2000 gate equivalents that
can be dedicated to performing security functions; even optimized block cipher
implementations may require many more gates than this (see [8,1] for current
state-of-the-art).

1.3 Efficient Authentication Based on the LPN Problem


In the remainder of this work, we survey recent work directed toward developing
authentication protocols based on the LPN problem; these protocols have been
suggested as suitable for the secure identification of RFID tags. All protocols we
will consider are intended for the shared-key (i.e., symmetric-key) setting, and
provide unidirectional authentication only; typically, this would permit an RFID
tag, acting as a prover, to authenticate itself to a tag reader acting as a verifier.
We begin with a brief outline of the history of the developments, and defer all
technical details to the sections that follow.
The first protocol we will present — following [17], we will refer to it as the HB
protocol — was introduced by Hopper and Blum [13,14] and provides security
against a passive (eavesdropping) adversary. Juels and Weis [17,25] were the
first to rigorously prove security of the HB protocol, and to suggest its use for
RFID authentication. (Hopper and Blum proposed it as a way to authenticate
humans using pencil-and-paper only.) Juels and Weis also proposed a second
protocol, called HB+ , that could be proven secure against an active attacker
who can impersonate the tag reader to an RFID tag. In each case, Juels and
Weis focus on a single, “basic authentication step” of the protocol and prove
that a computationally-bounded adversary cannot succeed in impersonating a
tag in this case with probability noticeably better than 1/2; that is, a single
iteration of the protocol has soundness error 1/2. The implicit assumption is
that repeating these “basic authentication steps” sufficiently-many times yields
a protocol with negligible soundness error, though this intuition was not formally
proven by Juels and Weis.
Two papers of my own [18,19] (along with Ji-Sun Shin and Adam Smith)
provide a simpler and improved analysis of the HB and HB+ protocols. Be-
sides giving what is arguably a cleaner framework for analyzing the security
of these protocols, the proofs in these works also yield the following concrete
improvements: (1) they show that the HB+ protocol remains secure under arbi-
trary concurrent executions of the protocol; this, in particular, means that the
HB+ protocol can be parallelized so as to run in 3 rounds (regardless of the
desired soundness error); (2) the proofs explicitly incorporate the dependence
of the soundness error on the number of iterations of a “basic authentication
step”; and (3) the proofs deal with the inherent error probability in even honest
executions of the protocol. (The reader is referred to [18] for further detailed dis-
cussion of these points.) The initial work [18] was limited to the case of ε < 1/4;
subsequent work [19] extended these results to the case of arbitrary ε < 1/2.
In work tangential to the above, Gilbert et al. [10] show that the HB+ protocol
is not secure against a man-in-the-middle attack, in the sense that a man-in-the-
middle attacker is able to reconstruct the entire secret key of the RFID tag after
4 J. Katz

sufficiently-many interactions. (The reader is additionally referred to the work


of Wool et al. [21,22], for an illuminating discussion on the feasibility of man-in-
the-middle attacks in RFID systems.) This has motivated numerous proposals
(e.g., [5]) of HB-variants that are claimed to be secure against the specific attack
of Gilbert et al., but I am not aware of any HB-variant that is provably-secure
against all man-in-the-middle attacks. To my mind, the existence of a man-in-
the-middle attack on the HB+ protocol shows that the protocol must be used
with care, but does not rule out its usefulness; specifically, I view the aim of
the line of research considered here to be the development of protocols which
are exceptionally efficient while still guaranteeing some useful level of (provable)
security. The possibility of man-in-the-middle attacks does not mean that it
is useless to explore the security of authentication protocols in weaker attack
models. Furthermore, as a practical matter, Juels and Weis [17, Appendix A]
note that the man-in-the-middle attack of [10] does not apply in a detection-based
system where numerous failed authentication attempts immediately raise an
alarm. Nevertheless, the design of an HB-variant with provable security against
man-in-the-middle attacks remains an interesting open problem.

1.4 Overview of This Paper


The remainder of this paper is devoted to a description of the HB and HB+ pro-
tocols, as well as the technical proofs of security for these protocols (adapted from
[18,19]). It is not the goal of this paper to replace [18,19]; instead, the main moti-
vation is to give a high-level treatment of the proofs with a focus on those aspects
that might be of greatest interest to coding-theorists. Proof steps that are “techni-
cal” but otherwise uninteresting will be glossed over, and some proofs are omitted
entirely. The interested reader can find full details of all proofs in [18,19].

2 Definitions and Preliminaries


We have already formally defined the LPN problem in the Introduction. Here, we
state and prove the main technical lemma on which we will rely. We also define
notion(s) of security for identification; these are standard, but some complications
arise due to the fact that the HB/HB+ protocols do not have perfect completeness.

2.1 A Technical Lemma


In this section we prove a key technical lemma due to Regev [24, Sect. 4] (though
without the explicit dependence on the parameters given below, which is taken
from [18]): hardness of the LPNε problem implies “pseudorandomness” of As,ε .
Specifically, let Uk+1 denote the uniform distribution on (k + 1)-bit strings. The
following lemma shows that oracle access to As,ε (for randomly-chosen s) is
indistinguishable from oracle access to Uk+1 .
Lemma 1. Say there exists an algorithm D making q oracle queries, running
in time t, and such that
Efficient Cryptographic Protocols Based on the Hardness of Learning Parity 5

    
Pr s ← {0, 1}k : DAs,ε (1k ) = 1 − Pr DUk+1 (1k ) = 1  ≥ δ.
 
Then there exists an algorithm
 M making
 q  = O q · δ −2 log k oracle queries,
running in time t = O t · kδ −2 log k , and such that
 
Pr s ← {0, 1}k : M As,ε (1k ) = s ≥ δ/4.

(Various tradeoffs are possible between the number of queries/running time of


M and its success probability in solving LPNε ; see [24, Sect. 4]. We do not discuss
these here.)
Proof (Sketch). Algorithm M As,ε (1k ) proceeds as follows:

1. Fix random coins for D.


2. Estimate the probability that D outputs 1 when it interacts with oracle
Uk+1 . Call this estimate p.
3. For i ∈ [k] do:
(a) Estimate the probability that D outputs 1 when it interacts with an
oracle implementing the following distribution:
def  
hybi = a ← {0, 1}k ; c ← {0, 1}; ν ← Berε : (a ⊕ (c · ei ), s, a ⊕ ν) ,

where ei is the vector with 1 at position i and 0s elsewhere. Note that


M can generate this distribution using its own access to oracle As,ε .
Call the estimate obtained in this step pi .
(b) If |pi − p| ≥ δ/4 set si = 0; else set si = 1.
4. Output s = (s1 , . . . , sk ).

Let us analyze the behavior of M . First, note that with “high” probability
over choice of s and random coins for D it holds that
  A   
Pr D s,ε (1k ) = 1 − Pr DUk+1 (1k ) = 1  ≥ δ/2, (1)

where the probabilities are now taken only over the answers D receives from its
oracle. We restrict our attention to s, ω for which Eq. (1) holds and show that
in this case M outputs s = s with probability at least 1/2. The lemma follows.
Setting the accuracy of our estimations appropriately, we can ensure that
  U  
Pr D k+1 (1k ; ω) = 1 − p  ≤ δ/16 (2)

except with probability at most O(1/k). Now focus on a particular iteration i of


steps 3(a) and 3(b). We may once again ensure that
  hyb k  
Pr D i (1 ; ω) = 1 − pi  ≤ δ/16 (3)

except with probability at most O(1/k). Applying a union bound (and setting
parameters appropriately) we see that with probability at least 1/2 both Eqs. (2)
6 J. Katz

and (3) hold (the latter for all i ∈ [k]), and so we assume this to be the case for
the rest of the proof.
An easy observation is that if si = 0 then hybi = As,ε , while if si = 1 then
hybi = Uk+1 . It follows that if si = 0 then
  hyb k   
Pr D i (1 ; ω) = 1 − Pr DUk+1 (1k ; ω) = 1  ≥ δ/2

(by Eq. (1)), and so |pi − p | ≥ 2δ − 2 · 16


δ
= 3δ
8 (by Eqs. (2) and (3)) and

si = 0 = si . When si = 1 then
   
Pr Dhybi (1k ; ω) = 1 = Pr DUk+1 (1k ; ω) = 1 ,
and so |pi − p | ≤ 2 · 16
δ
= δ8 (again using Eqs. (2) and (3)) and si = 1 = si . Since
this holds for all i ∈ [k], we conclude that s = s.

2.2 Overview of the HB/HB+ Protocols, and Security Definitions


The HB and HB+ protocols as analyzed here consist of n parallel iterations
of a “basic authentication step.” (As remarked in the Introduction, the fact
that these iterations can be run in parallel follows from the proofs we will give
here.) We describe the basic authentication step for the HB protocol, and defer
a discussion of the HB+ protocol to Section 3.2. In the HB protocol, a tag T and
a reader R share a random secret key s ∈ {0, 1}k ; a basic authentication step
consists of the reader sending a random challenge a ∈ {0, 1}k to the tag, which
replies with z = s, a ⊕ ν for ν ∼ Berε . The reader can then verify whether the
?
response z of the tag satisfies z = s, a; we say the iteration is successful if this
is the case. See Figure 1.

T (s, ε) R(s)

 a a ← {0, 1}k
ν ← Berε
z := s, a ⊕ ν z -
?
verify: z = s, a

Fig. 1. The basic authentication step of the HB protocol

Even for an honest tag a basic iteration is unsuccessful with probability ε.


For this reason, a reader accepts upon completion of all n iterations of the
basic authentication step as long as at most ≈ ε · n of these iterations were
unsuccessful. More precisely, let u = u(k) be such that ε · n ≤ u; then the reader
accepts as long as the number of unsuccessful iterations is at most2 u. (Overall,
2
As suggested in [18], an improvement in practice is to also fix a lower bound l and
accept iff the number of unsuccessful iterations is in the range [l, u]. Setting l = 0 (as
we do here) makes no difference in an asymptotic sense.
Efficient Cryptographic Protocols Based on the Hardness of Learning Parity 7

then, the entire HB protocol is parameterized by ε, u, and n.) Since εn is the


expected number of unsuccessful iterations for an honest tag, the completeness
error εc (i.e., the probability that an honest tag is rejected) can be calculated
via a Chernoff bound. In particular, we have that for any positive constant δ,
setting u = (1 + δ)εn suffices to achieve εc negligible in n.
Observe that by sending random answers in each of the n iterations, an ad-
versary trying to impersonate a valid tag succeeds with probability
u

∗ def −n n
δε,u,n = 2 · ;
i=0
i

that is, δε,u,n is the best possible soundness error we can hope to achieve for the
given setting of the parameters. Asymptotically, as long as u ≤ (1 − δ) · n/2 for
positive constant δ, the success of this trivial attack will be negligible in n. (This
can again be analyzed using a Chernoff bound.) Our definitions of security will

be expressed in terms of the adversary’s ability to do better than δε,u,n .
Let Ts,ε,n denote the tag algorithm in the HB protocol when the tag holds
HB

secret key s (note that the tag algorithm is independent of u), and let RHB s,ε,u,n
similarly denote the algorithm run by the tag reader. We denote a complete exe-

cution of the HB protocol between a party T̂ and the reader R by T̂ , RHB s,ε,u,n
and say this equals 1 iff the reader accepts.
For a passive attack on the HB protocol, we imagine an adversary A running
in two stages: in the first stage the adversary obtains q transcripts3 of (honest)
executions of the protocol by interacting with an oracle transHB s,ε,n (this models
eavesdropping); in the second stage, the adversary interacts with the reader and
tries to impersonate the tag. We define the adversary’s advantage as
   
def transHB ∗
Advpassive
A,HB (ε, u, n) = Pr s ← {0, 1} k
; A s,ε,n (1k ) : A, RHB
s,ε,u,n = 1 − δε,u,n .

As we will describe in Section 3.2, the HB+ protocol uses two keys s1 , s2 . We
+ +
let TsHB
1 ,s2 ,ε,n
denote the tag algorithm in this case, and let RHB
s1 ,s2 ,ε,u,n denote
the algorithm run by the tag reader. For the case of an active attack on the
HB+ protocol, we again imagine an adversary running in two stages: in the first
stage the adversary interacts at most q times with the honest tag algorithm
(with concurrent executions allowed), while in the second stage the adversary
interacts only with the reader. The adversary’s advantage in this case is
Advactive
A,HB+ (ε, u, n)
 
HB+
= Pr s1 , s2 ← {0, 1}k ; ATs1 ,s2 ,ε,n (1k ) : A, RHB ∗
def +
s1 ,s2 ,ε,u,n = 1 − δε,u,n .

3
Following [13,14,17], a transcript comprises only the messages exchanged between
the parties and does not include the reader’s decision of whether or not to accept.
If the adversary is given this additional information, the adversary’s advantage may
increase by (at most) an additive factor of q · εc . Note further that, since u can be
set so that the reader accepts the honest tag with all but negligible probability, this
has no effect as far as asymptotic security is concerned.
8 J. Katz

We remark that in both the HB and HB+ protocols, the tag reader’s actions
are independent of the secret key(s) it holds except for its final decision whether
or not to accept. So, allowing the adversary to interact with the reader multiple
times (even concurrently) does not give the adversary much additional advantage
(other than the fact that, as usual, the probability that the adversary succeeds in
at least one impersonation attempt scales linearly with the number of attempts).

3 Proofs of Security for the HB and HB+ Protocols


3.1 Security of the HB Protocol Against Passive Attacks
We first prove security assuming ε < 1/4, and then show how the argument can
be extended for the case of ε < 1/2.

Theorem 1. Say there exists an adversary A eavesdropping on at most q execu-


tions of the HB protocol, running in time t, and achieving Advpassive
A,HB (ε, u, n) ≥ δ.
Then there exists an algorithm D making (q + 1) · n oracle queries, running in
time O(t), and such that
    
Pr s ← {0, 1}k : DAs,ε (1k ) = 1 − Pr DUk+1 (1k ) = 1 
2u

∗ −n n
≥ δ + δε,u,n − εc − 2 · .
i=0
i

Asymptotically, for any ε < 14 and n = Θ(k) all terms of the above expression
(other than δ) are negligible for appropriate choice of u. Thus, assuming the
hardness of the LPNε problem for ε < 1/4 and appropriate choice of n, u, the
HB protocol is secure against a passive adversary.

Proof. D, given access to an oracle returning (k + 1)-bit strings (a, z), proceeds
as follows:

1. D runs the first phase of A. Each time A requests to view a transcript of


the protocol, D obtains n samples {(ai , zi )}ni=1 from its oracle and returns
these to A.
2. For A’s second phase, D again obtains n samples {(āi , z̄i )}ni=1 from its
oracle. D then sends the challenge (ā1 , . . . , ān ) to A and receives in return
a response Z  = (z1 , . . . , zn ).
3. D outputs 1 iff Z̄ = (z̄1 , . . . , z̄n ) and Z  differ in at most 2u entries.

When D’soracle is Uk+1 , it is clear that D outputs 1 with probability ex-


2u  
actly 2−n · i=0 ni since Z̄ is in this case uniformly distributed and inde-
pendent of everything else. On the other hand, when D’s oracle is As,ε then
the transcripts D provides to A during the first phase of A’s execution are
distributed identically to real transcripts in an execution of the HB protocol.
def
Let Z ∗ = (s, ā1  , . . . , s, ān ) be the vector of “correct” answers to the chal-
lenge (ā1 , . . . , ān ) sent by D in the second phase. Then with probability at least
Efficient Cryptographic Protocols Based on the Hardness of Learning Parity 9


δ + δε,l,u,n it holds that Z  and Z ∗ differ in at most u entries (since A successfully
impersonates the tag with this probability). Also, since Z̄ is distributed exactly
as the answers of an honest tag, Z̄ and Z ∗ differ in at most u positions except

with probability at most εc . It follows that with probability at least δ +δε,u,n −εc

the vectors Z and Z̄ differ in at most 2u entries, and so D outputs 1 with at
least this probability.
2 u  
When ε ≥ 1/4 then 2u ≥ 2ε · n ≥ n/2 and so 2−n · i=0 ni ≥ 1/2; thus,
the above theorem does not guarantee meaningful security in this case and a
different analysis is needed.
Theorem 2. For n = Θ(k) and appropriate setting of u, the HB protocol is
(asymptotically) secure for any ε < 1/2.
Proof. Let u = ε+ n, where ε+ is a constant satisfying ε < ε+ < 12 . Set ε++ to
be a constant satisfying ε+ − 2ε+ ε + ε < ε++ < 12 . The reduction D we use here
is similar to that used in the previous proof, except that D now outputs 1 only
def
if Z̄ and Z  differ in at most u = ε++ · n entries.
When D’s oracle is Uk+1 , it is once again clear that D outputs 1 with proba-
 u  
bility 2−n · i=0 ni . Since u < n/2, this quantity is negligible in k.
When D’s oracle is As,ε then the transcripts D provides to A during the
first phase of A’s execution are distributed identically to real transcripts in an
def
execution of the HB protocol. Letting Z ∗ = (s, ā1  , . . . , s, ān ) be the vector
of correct answers to the challenge (ā1 , . . . , ān ) sent by D in the second phase,
it follows that with probability δ (i.e., the impersonation probability of A) the
vector of responses Z  given by A differs from Z ∗ in at most u entries. We show
below that, conditioned on this event, Z  and Z̄ differ in at most u entries with
all but negligible probability. Thus, D outputs 1 in this case with probability
negligibly close to δ. We conclude from Lemma 1 that δ must be negligible.
Let wt(Z) denote the weight of a vector Z; i.e., the number of entries of Z
equal to 1. Note that the distance between two binary vectors Z1 , Z2 is equal to
wt(Z1 ⊕ Z2 ). It remains to show that, conditioned on wt(Z  ⊕ Z ∗ ) ≤ u, we have
wt(Z  ⊕ Z̄) ≤ u with all but negligible probability.
Write Z  = Z ∗ ⊕ w for some vector w of weight at most u = ε+ n. The vector
Z̄ is selected by the following process: choose an error vector e by setting each
position of e (independently) to 1 with probability ε, and then set Z̄ = Z ∗ ⊕ e.
We see that the probability that Z̄ differs from Z  in at most u entries is equal
to the probability that

wt(Z  ⊕ Z̄) = wt(w ⊕ e) ≤ u .

It is easy to see that this probability is minimized when wt(w) = u, and so we


assume this to be the case. The random variable wt(w ⊕ e) can be written as a
sum of n indicator random variables, one for each position of the vector w ⊕ e.
The expectation of wt(w ⊕ e) is

u · (1 − ε) + (n − u) · ε = (ε+ − 2ε+ ε + ε) · n.
10 J. Katz

Since ε++ is a constant strictly larger than (ε+ − 2ε+ ε + ε), a Chernoff bound
then implies that wt(w ⊕ e) ≤ ε++ n with all but negligible probability.

3.2 Security of the HB+ Protocol Against Active Attacks


The HB protocol is insecure against an active attack, as an adversary can repeat-
edly query the tag with the same challenge (a1 , . . . , an ) and thereby determine
with high probability the correct values of s, a1  , . . . , s, an  (after which solv-
ing for s is easy). To combat such an attack, Juels and Weis [17] propose to
modify the HB protocol by having the tag and reader share two (independent)
keys s1 , s2 ∈ {0, 1}k . A basic authentication step now consists of three rounds:
first the tag sends a random “blinding factor” b ∈ {0, 1}k ; the reader replies
with a random challenge a ∈ {0, 1}k as before; and finally the tag replies with
z = s1 , b ⊕ s2 , a ⊕ ν for ν ∼ Berε . As in the HB protocol, the tag reader can
?
then verify whether the response z of the tag satisfies z = s1 , b ⊕ s2 , a, and
we again say the iteration is successful if this is the case. See Figure 2.

T (s1 , s2 , ε) R(s1 , s2 )

b ← {0, 1}k b -
 a a ← {0, 1}k
ν ← Berε
z := s1 , b ⊕ s2 , a ⊕ ν z -
?
verify: z = s1 , b ⊕ s2 , a

Fig. 2. The basic authentication step of the HB+ protocol

The actual HB+ protocol consists of n parallel iterations of the basic authenti-
cation step (and so the entire protocol requires only three rounds). The protocol
also depends upon parameters u as in the case of the HB protocol, and the values

εc and δε,u,n are defined exactly as there.
The following result shows that the HB+ protocol is secure against active
attacks, assuming the hardness of the LPNε problem. An important point is
that the proof does not require any rewinding of the adversary in simulating
the first phase of the attack, and the proof therefore holds even when various
executions of the HB+ protocol are run in parallel (or even concurrently).
We refer to [18] for a simpler proof when ε < 1/4, and present here only the
more complicated proof (from [19]) that deals with any ε < 1/2. This, more
complicated proof relies on known bounds on the size of constant-weight codes
having certain minimum distance [15,16,11].

Theorem 3. Let ε < 1/2 and assume the LPNε problem is hard. Let n = Θ(k)
and u = ε+ n where ε+ is a constant satisfying ε < ε+ < 1/2. Then the HB+
Efficient Cryptographic Protocols Based on the Hardness of Learning Parity 11

protocol with this setting of the parameters has negligible completeness error, and
is (asymptotically) secure against active attacks.

Proof. A standard Chernoff bound shows that the completeness error is negligi-
ble for the given setting of the parameters. For any probabilistic polynomial-time
(ppt) adversary A attacking the HB+ protocol, we construct a ppt adversary
D attempting to distinguish whether it is given oracle access to As,ε or to Uk+1
(as in Lemma 1). Relating the advantage of D to the advantage of A gives the
stated result.
D, given access to an oracle returning (k + 1)-bit strings (b, z̄), proceeds as
follows:

1. D chooses s2 ∈ {0, 1}k uniformly at random. Then, it runs the first phase
of A. To simulate a basic authentication step, D does the following: it obtains
a sample (b, z̄) from its oracle and sends b as the initial message. A replies
with a challenge a, and then D responds with z = z̄ ⊕ s2 , a.
2. When A is ready for the second phase of its attack, A sends an initial mes-
sage b1 , . . . , bn . In response, D chooses random a11 , . . . , a1n ∈ {0, 1}k , sends
these challenges to A, and records A’s response z11 , . . . , zn1 . Then D rewinds
A, chooses random a21 , . . . , a2n ∈ {0, 1}k , sends these to A, and records A’s
response z12 , . . . , zn2 .
def  
3. Let zi⊕ := zi1 ⊕ zi2 and set Z ⊕ = z1⊕ , . . . , zn⊕ . Let âi = a1i ⊕ a2i and
def
ẑi = s2 , âi , and set Ẑ = (ẑ1 , . . . , ẑn ). D outputs 1 iff Z ⊕ and Ẑ differ in
fewer than u = ε++ · n entries, for some constant ε++ < 12 to be fixed later.

Let us analyze the behavior of D:


Case 1: Say D’s oracle is Uk+1 . In step 1, since z̄ is uniformly distributed
and independent of everything else, the answers z that D returns to A are
uniformly distributed and independent of everything else. It follows that A’s
view is independent of the secret s2 chosen by D.
The {âi }ni=1 are uniformly and independently distributed, and so except with
n
probability 22k they are linearly independent and non-zero (via a standard ar-
gument; see [18]). Assuming this to be the case, Ẑ is uniformly distributed over
{0, 1}n from the point of view of A. But then the probability that Z ⊕ and Ẑ
u   
differ in fewer than u entries is at most 2−n · i=0 ni . Since u /2 is a constant
strictly less than 12 , we conclude that D outputs 1 in this case with negligible
n u   
probability 22k + 2−n · i=0 ni .
Case 2: Say D’s oracle is As1 ,ε for randomly-chosen s1 . In this case, D provides a
perfect simulation for the first phase of A. Let ω denote all the randomness used
to simulate the first phase of A, which includes the keys s1 , s2 , the randomness
of A, and the randomness used in responding to A’s queries. For a fixed ω, let δω
denote the probability (over random challenges a1 , . . . , an sent by the tag reader)
that A successfully impersonates the tag in the second phase. The probability
that A successfully responds to both sets of queries a11 , . . . , a1n and a21 , . . . , a2n
12 J. Katz

sent by D is δω2 . The overall probability that A successfully responds to both


sets of queries is thus
   2
Expω δω2 ≥ Expω (δω ) = δA
2
,

using Jensen’s inequality.


We show below that conditioned on both challenges being answered success-
fully (and for appropriate choice of ε++ ), Z ⊕ differs from Ẑ in fewer than u
entries with constant probability. Putting everything together, we conclude that
2
D outputs 1 in this case with probability Ω(δA ). It follows from Lemma 1 that
δA must be negligible.
We now prove the above claim regarding the probability that Z ⊕ differs from
Ẑ in fewer than u entries. Set 12 > ε++ > 12 · (1 − (1 − 2ε+ )2 ). Fixing all
randomness used in the first phase (as above) induces a function fA from queries
a1 , . . . , an (with each ai ∈ {0, 1}k ) to vectors (z1 , . . . , zn ) (with each zi ∈ {0, 1})
given by the response function of A in the second phase. Define the function
fcorrect that returns the “correct” answers for a particular query; i.e.,
def
fcorrect(a1 , . . . , an ) = (s1 , b1  ⊕ s2 , a1  , . . . , s1 , bn  ⊕ s2 , an )

(recall that b1 , . . . , bn are the vectors sent by A in the first round). Define

def
Δ(a1 , . . . , an ) = fA (a1 , . . . , an ) ⊕ fcorrect (a1 , . . . , an ),

and say a query a1 , . . . , an is good if4 wt(Δ(a1 , . . . , an )) ≤ u. That is, a query


a1 , . . . , an is good if A’s response is within distance u of the “correct” response;
i.e., A successfully impersonates the tag in response to such a query.
Let D denote the distribution over Δ(a1 , . . . , an ) induced by a uniform choice
of a good query a1 , . . . , an (we assume at least one good query exists since we
are only interested in analyzing this case). Note that, by definition of a good
query, each vector in the support of D has weight at most u. Our goal is to show
that with constant probability over Δ1 , Δ2 generated according to D, we have
wt(Δ1 ⊕ Δ2 ) < u . We remark that this claim does not involve any assumptions
regarding the probability that a randomly-chosen query is good.
To see how this maps on to the reduction being analyzed above, note that
conditioning on the event that A successfully responds to queries a11 , . . . , a1n and
a21 , . . . , a2n is equivalent to choosing these two queries uniformly from the set of
def
good queries. Setting Δ1 = Δ(a11 , . . . , a1n ) and Δ2 analogously, we have

Δ1 ⊕ Δ2
= fA (a11 , . . . , a1n ) ⊕ fcorrect(a11 , . . . , a1n ) ⊕ fA (a21 , . . . , a2n ) ⊕ fcorrect(a21 , . . . , a2n )
= Z ⊕ ⊕ fcorrect(a11 , . . . , a1n ) ⊕ fcorrect (a21 , . . . , a2n ).
4
As in the proof of the previous theorem, the weight wt(Z) of a vector Z is the
number of its entries equal to 1.
Efficient Cryptographic Protocols Based on the Hardness of Learning Parity 13

D cannot compute fcorrect (a11 , . . . , a1n ) or fcorrect (a21 , . . . , a2n ) since it does not
know s1 . However, it can compute

fcorrect(a11 , . . . , a1n ) ⊕ fcorrect(a21 , . . . , a2n )


    
= s1 , b1  ⊕ s2 , a11 , . . . , s1 , bn  ⊕ s2 , a1n
    
+ s1 , b1  ⊕ s2 , a21 , . . . , s1 , bn  ⊕ s2 , a2n
       
= s2 , a11 ⊕ s2 , a21 , . . . , s2 , a1n ⊕ s2 , a2n
   
= s2 , (a11 ⊕ a21 ) , . . . , s2 , (a1n ⊕ a2n ) = Ẑ.

We thus see that Z ⊕ and Ẑ differ in fewer than u entries exactly when Δ1 and
Δ2 differ in fewer than u = ε++ n entries. It is the latter probability that we
now analyze.
def
Let δ be a (positive) constant such that ε++ = 12 · (1 − δ). Let γ = 1 − 2ε+ ,
++ 2
and note that by our choice of ε we have δ < γ . Set
def 1−δ
c = + 1.
γ2 − δ
We show that for two vectors Δ1 , Δ2 chosen independently according to distri-
bution D, we have wt(Δ1 ⊕ Δ2 ) < ε++ n with (constant) probability at least c12 .
Assume not. So
1
Pr[Δ1 , Δ2 ← D : wt(Δ1 ⊕ Δ2 ) < ε++ n] < .
c2
But then, by a union bound,
1
Pr[Δ1 , . . . , Δc ← D : ∃i = j s.t. wt(Δ1 ⊕ Δ2 ) < ε++ n] < .
2
In particular, there exist c vectors Δ1 , . . . , Δc in the support of D whose pairwise
distances are all at least ε++ n = 12 · (1 − δ)n. Furthermore, each Δi has weight
at most u = 12 · (1 − γ)n since it lies in the support of D. However, the Johnson
bound [15,16] (our notation was chosen to be consistent with the formulation
in [11, Theorem 1]), which gives bounds on the size of constant-weight codes of
certain minimum distance, shows that no such set {Δi }ci=1 exists.

4 Conclusions and Open Questions


There are a number of interesting questions regarding the HB and HB+ protocols
themselves. First, it would be nice to improve the concrete security reductions
obtained here, or to propose new protocols with tighter security reductions. It
would also be interesting to obtain simpler proofs even for ε ≥ 1/4 (there seems
to be no inherent reason why the analysis should become more complex in that
case). As discussed in the Introduction, it would also be very interesting to see
an efficient protocol based on the LPN problem that is provably resistant to
man-in-the-middle attacks.
14 J. Katz

More generally, it remains to be seen whether the LPN problem can be used to
construct efficient cryptographic protocols for other tasks. Some hope is provided
by Regev’s result [24] showing that public-key encryption can be based on a
generalization of the LPN problem.

Acknowledgments
I would like to thank Steven Galbraith for inviting me to contribute this survey
to the 11th IMA International Conference on Cryptography and Coding Theory.

References
1. Bogdanov, A., Knudsen, L., Leander, G., Paar, C., Poschmann, A., Robshaw,
M., Seurin, Y., Vikkelsoe, C.: PRESENT: An Ultra-Lightweight Block Cipher.
In: CHES 2007. Workshop on Cryptographic Hardware and Embedded Systems
(to appear, 2007)
2. Berlekamp, E.R., McEliece, R.J., van Tilborg, H.C.A.: On the Inherent Intractabil-
ity of Certain Coding Problems. IEEE Trans. Info. Theory 24, 384–386 (1978)
3. Blum, A., Furst, M., Kearns, M., Lipton, R.: Cryptographic Primitives Based on
Hard Learning Problems. 773, 278–291 (1994)
4. Blum, A., Kalai, A., Wasserman, H.: Noise-Tolerant Learning, the Parity Problem,
and the Statistical Query Model. J. ACM 50(4), 506–519 (2003)
5. Bringer, J., Chabanne, H., Dottax, E.: HB++ : A Lightweight Authentication Pro-
tocol Secure against Some Attacks, [Link]
6. Chabaud, F.: On the Security of Some Cryptosystems Based on Error-Correcting
Codes. In: De Santis, A. (ed.) EUROCRYPT 1994. LNCS, vol. 950, pp. 131–139.
Springer, Heidelberg (1995)
7. Diffie, W., Hellman, M.: New Directions in Cryptography. IEEE Trans. Info. The-
ory 22(6), 644–654 (1976)
8. Feldhofer, M., Dominikus, S., Wolkerstorfer, J.: Strong Authentication for RFID
Systems using the AES Algorithm. In: Joye, M., Quisquater, J.-J. (eds.) CHES
2004. LNCS, vol. 3156, pp. 357–370. Springer, Heidelberg (2004)
9. Fossorier, M.P.C., Mihaljevı́c, M.J., Imai, H., Cui, Y., Matsuura, K.: A Novel
Algorithm for Solving the LPN Problem and its Application to Security Evaluation
of the HB Protocol for RFID Authentication, [Link]
10. Gilbert, H., Robshaw, M., Silbert, H.: An Active Attack against HB+ : A Provably
Secure Lightweight Authentication Protocol, [Link]
11. Guruswami, V., Sudan, M.: Extensions to the Johnson Bound (unpublished
manuscript 2001), [Link]
12. Håstad, J.: Optimal Inapproximability Results. J. ACM 48(4), 798–859 (2001)
13. Hopper, N., Blum, M.: A Secure Human-Computer Authentication Scheme. Tech-
nical Report CMU-CS-00-139, Carnegie Mellon University (2000)
14. Hopper, N., Blum, M.: Secure Human Identification Protocols. In: Boyd, C. (ed.)
ASIACRYPT 2001. LNCS, vol. 2248, pp. 52–66. Springer, Heidelberg (2001)
15. Johnson, S.M.: Upper Bound for Error-Correcting Codes. IEEE Trans. Info. The-
ory 8, 203–207 (1962)
16. Johnson, S.M.: Improved Asympototic Bounds for Error-Correcting Codes. IEEE
Trans. Info. Theory 9, 198–205 (1963)
Efficient Cryptographic Protocols Based on the Hardness of Learning Parity 15

17. Juels, A., Weis, S.: Authenticating Pervasive Devices with Human Protocols,
vol. 3621, pp. 293–308 (2005),
[Link]
pdfs/[Link]
18. Katz, J., Shin, J.-S.: Parallel and Concurrent Security of the HB and HB+ Pro-
tocols. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, Springer,
Heidelberg (2006)
19. Katz, J., Smith, A.: Analyzing the HB and HB+ Protocols in the Large Error Case,
Available at [Link]
20. Kearns, M.: Efficient Noise-Tolerant Learning from Statistical Queries. J.
ACM 45(6), 983–1006 (1998)
21. Kfir, Z., Wool, A.: Picking Virtual Pockets using Relay Attacks on Contactless
Smartcard Systems, [Link]
22. Kirschenbaum, I., Wool, A.: How to Build a Low-Cost, Extended-Range RFID
Skimmer, [Link]
23. Levieil, E., Fouque, P.-A.: An Improved LPN Algorithm. In: De Prisco, R., Yung,
M. (eds.) SCN 2006. LNCS, vol. 4116, pp. 348–359. Springer, Heidelberg (2006)
24. Regev, O.: On Lattices, Learning with Errors, Random Linear Codes, and Cryp-
tography. In: 37th ACM Symposium on Theory of Computing, pp. 84–93. ACM,
New York (2005)
25. Weis, S.: New Foundations for Efficient Authentication, Commutative Cryptogra-
phy, and Private Disjointness Testing. Ph.D. thesis, MIT (2006)
Galois Rings
and
Pseudo-random Sequences

Patrick Solé1 and Dmitrii Zinoviev1,2


1
CNRS-I3S, Les Algorithmes, Euclide B, 2000, route des Lucioles, 06 903 Sophia
Antipolis, France
sole@[Link], zinoviev@[Link]
2
Institute for Problems of Information Transmission, Russian Academy of Sciences,
Bol’shoi Karetnyi, 19, GSP-4, Moscow, 127994, Russia
dzinov@[Link]

Abstract. We survey our constructions of pseudo random sequences


(binary, Z8 , Z2l ,. . . ) from Galois rings. Techniques include a local Weil
bound for character sums, and several kinds of Fourier transform. Ap-
plications range from cryptography (boolean functions, key generation),
to communications (multi-code CDMA), to signal processing (PAPR re-
duction).

Keywords: Correlation, Galois Rings, MSB, CDMA, OFDM, PAPR.

1 Introduction

Since the seminal work of Claude Shannon in the 40’s the algebraic structure of
coding alphabets was a finite field. However, there was a push towards finite rings
due to modulation requirements, a 4− PSK modulation being more powerful
than an antipodal modulation, for instance [1]. This led researchers in the 70’s
to consider cyclic codes over Z/M Z, with a special attention for M a power of
a prime [2,19]. In the 90’s cyclic codes over rings received a lot of attention due
to many pure applications: the solution of the Kerdock Preparata riddle [8], and
a new construction of the Leech lattice [3], (both for M = 4), to name but a
few. In this golden dawn, new and powerful mathematical techniques arose, the
most profound being probably a p−adic analogue of the Weil bound on character
sums [10], and families of low correlation sequences (based on weighted degree
trace codes) tailored for that bound [20].
In the present decade, new engineering applications emerged that required
that machinery, and are significantly different, but not unrelated to the classical
use in Spread Spectrum and CDMA systems illustrated in [11,7], or to take a
more recent example the 3GPP standard (see [Link]). In particular an
important quantity for the safe processing of electronic equipment is the Peak
to Average Power Ratio (PAPR), signals with high dynamic range being
potentially harmful to operational amplifiers. Two models of signalling sequences

S.D. Galbraith (Eds.): Cryptography and Coding 2007, LNCS 4887, pp. 16–33, 2007.

c Springer-Verlag Berlin Heidelberg 2007
Galois Rings and Pseudo-random Sequences 17

both due to Paterson and co-workers lead to somewhat different mathematical


formulation. In the first scenario, motivated by OFDM applications, and based
on DFT calculation, we are to control some hybrid character sums (i.e. involving
not only additive but also multiplicative characters) in relation to many valued
sequences [17]. In the second approach, motivated by multi-code CDMA, we
are to control the nonlinearity of binary sequences of length a power of 2
[16]. This notion of nonlinearity coincides with the one familiar from boolean
functions theory. In both models sequences live in families and are supposed to
form a code either spherical in the first case or binary in the second. Both papers
[17,16] already use weighted degree trace codes and our contribution is mostly
of extension from Z4 to Z2 for  > 2 [22], or to non primitive periods [24]. In
general, the use of larger rings provides extra flexibility in the design choice; the
sequences families constructed in that way having a larger size but a somewhat
larger PAPR. For a book on PAPR reduction see [13]. For recent results on codes
for PAPR reduction see [18]. For books on Galois rings see [25,26].
A distinct, older motivation to study pseudo random sequences over rings is
the use of the Most Significant Bit (MSB) to obtain highly nonlinear binary
sequences, mostly for cryptographic purposes (key generation in stream ciphers).
This is due to Dai and her school [5,6]. In [21] we improve some bounds of
autocorrelation and imbalance of such sequences in [6].
The material is organized as follows. Section 2 contains definitions and no-
tation on Galois rings. Section 3 collects the bounds we need on characters
sums. In particular the bound on Fourier coefficients of the most significant bit
(MSB) function that was obtained in [12]. Section 4 defines the polynomials
over Galois rings needed to define the trace codes. Section 5 considers the bi-
nary sequence constructed from sequences over Z8 that was introduced in [12].
Section 6 considers short binary sequences of [22], motivated by [16]. Section
7 contains the binary sequences constructed by taking the MSB of 2l −ary se-
quences [21].Section 8 contains the PAPR reduction sequences over large rings
of [24]. Section 9 considers quadriphase sequences of maximal period of [23].

2 Preliminaries

Let R = GR(2l , m) denote the Galois ring of characteristic 2l . It is the unique


Galois extension of degree m of Z2l , with 2lm elements.

R = GR(2l , m) = Z2l [X]/(h(X)).

where h(X) is a basic irreducible polynomial of degree m. Let ξ be an element in


GR(2l , m) that generates the Teichmüller set T of GR(2l , m) which reduces to F2m
modulo 2. Specifically, let T ={0, 1, ξ, ξ 2 , . . . , ξ 2 −2 } and T ∗={1, ξ, ξ 2 , . . . , ξ 2 −2 }.
m m


We use the convention that ξ = 0.
The 2-adic expansion of x ∈ GR(2l , m) is given by

x = x0 + 2x1 + · · · + 2l−1 xl−1 ,


18 P. Solé and D. Zinoviev

where x0 , x1 , . . . , xl−1 ∈ T . The Frobenius operator F is defined for such an x


as
F (x0 + 2x1 + · · · + 2l−1 xl−1 ) = x20 + 2x21 + · · · + 2l−1 x2l−1 ,
and the trace Tr, from GR(2l , m) down to Z2l , as

m−1
Tr := F j.
j=0

We also define another trace tr from F2m down to F2 as



m−1
j
tr(x) := x2 .
j=0

Throughout this note, we let n = 2m and R∗ = R\2R.


Let MSB : Zn2l → Zn2 be the most-significant-bit map, i.e.
MSB(y) = yl−1 , where y = y0 + 2y1 + . . . + 2l−1 yl−1 ∈ Z2l ,
is its 2-adic expansion.

3 DFT and the Local Weil Bound


We assume henceforth in the whole paper that l ≥ 3. Let l be a positive integer
l
and ω = e2πi/2 be a primitive 2l -th root of 1 in C. Let ψk be the additive
character of Z2l such that
ψk (x) = ω kx .
Let μ : Z2l → {±1} be the be the mapping μ(t) = (−1)c , where c is the most
significant bit of t ∈ Z2l , i.e. it maps 0, 1, ..., 2l−1 − 1 to +1 and 2l−1 , 2l−1 +
1, ..., 2l − 1 to −1. Our goal is to express this map as a linear combination of
characters. Recall the Fourier transformation formula on Z2l :
2 −1 2 −1
1 
l l

μ = μj ψj , where μj = l μ(x)ψj (−x). (1)


j=0
2 x=0

Combining Lemma 4.1 and Corollary 7.4 of [12], we obtain


Lemma 1. Let q = 8. For the constants μj = (1 + ζ −j + ζ −2j + ζ −3j )/4 j =
1, 3, 5, 7 we have
μ = μ1 ψ1 + μ3 ψ3 + μ5 ψ5 + μ7 ψ7 ,
and μj = 0, for even j. Furthermore

(|μ1 | + |μ3 | + |μ5 | + |μ7 |)2 = 2 + 2.
Let q = 2l where l ≥ 4. Then

q−1
2
|μj | < ln(q) + 1. (2)
j=0
π
Galois Rings and Pseudo-random Sequences 19

For all β = 0 in the ring R = GR(2l , m), we denote by Ψβ the character

Ψβ : R → C∗ , x → ω Tr(βx) .
Note that for the defined above ψk and Ψβ , we have:
ψk (Tr(βx)) = Ψβk (x). (3)
Let f (X) denote a polynomial in R[X] and let
f (X) = F0 (X) + 2F1 (X) + . . . + 2l−1 Fl−1 (X)
denote its 2-adic expansion. Let di be the degree in x of Fi . Let Ψ be an arbitrary
additive character of R, and set Df to be the weighted degree of f , defined as
Df = max {d0 2l−1 , d1 2l−2 , . . . , dl−1 }.
With the above notation, we have (under mild technical conditions) the bound

| Ψ (f (x))| ≤ (Df − 1)2m/2 . (4)
x∈T

See [10] for details.

4 Polynomials over the Galois Ring GR(2l, m)


Recall that R = GR(2l , m). A polynomial

d
f (X) = cj xj ∈ R[X]
j=0

is called canonical if cj = 0 for all even j.


Given an integer D ≥ 4, define
SD = {f (X) ∈ R[X] | Df ≤ D, f is canonical},
where Df is the weighted degree of f . Observe that SD is an GR(2l , m)−module.
Recall [21, Lemma 4.1]. For a weaker condition on D see [18, Theorem 6.13].
Lemma 2. For any integer D ≥ 4, we have:

|SD | = 2(D−D/2 )m ,


l

where x
is the largest integer ≤ x.
Recall the following property of the weighted degree [22, Lemma 3.1]
Lemma 3. Let f (X) ∈ R[X] and α ∈ R∗ = R\2R is a unit of R and let
g(X) = f (αX) ∈ R[X]. Then
Dg = Df ,
where Df , Dg are respectively the weighted degrees of the polynomials f (X) and
g(X).
20 P. Solé and D. Zinoviev

5 Binary Sequences over Z8


In this section we consider the case of GR(8, m) and will consider the periodic
sequences c0 , c1 , . . . of period 2m − 1. Let α ∈ R∗ , then define

ct = MSB(Tr(αξ t )), (5)

where t = 0, . . . , n − 2 and n = 2m . This sequence was introduced in [12].


We now have the following results on, respectively, the imbalance and the
crosscorrelation function of the binary sequence (ct )t∈N .

Theorem 1. With notation as above, we have


 n−2  
  √ √
 (−1)ct  ≤ 2 + 2(3 2m + 1).
t=0

Proof. By definition of ψj and Ψα , (where 0 = α ∈ R), for any 0 ≤ j ≤ 7, we


have:
ψj (Tr(αx)) = Ψαj (x).
As we have ct = MSB(Tr(αξ t )), and by (1), we obtain that (−1)ct is equal to:


7 
7
μ(Tr(αξ t )) = μj ψj (Tr(αξ t )) = μj Ψαj (ξ t )). (6)
j=0 j=0

Changing the order of summation, we obtain that:


n−2 
7 
n−2
(−1)ct = μj Ψαj (ξ t ). (7)
t=0 j=0 t=0

Inequality (4) implies that:


   √
 
 Ψλ (x) ≤ 3 2m + 1, (8)
x∈T ∗

for all λ ∈ GR(8, m), λ = 0. Thus, the absolute value of the Right Hand Side of
(7) can be estimated from above by:

√ 
7
(3 2m + 1) |μj |.
j=0

Applying Lemma 1 the Theorem follows.


Theorem 2. With notation as above, and for all phase shifts τ, 0 < τ < 2m − 1,
let  
Θ(τ ) = (−1)ct (−1)ct+τ ,
t∈I
Galois Rings and Pseudo-random Sequences 21

where ct = MSB(Tr(αξ t )) and ct = MSB(Tr(α ξ t )) are two elements such that
for any pair 0 ≤ j1 , j2 ≤ 7 the following condition holds
j1 α + j2 α ξ τ = 0.
We then have the bound
√ √
|Θ(τ )| ≤ 3(2 + 2) 2m .
Proof. As we have ct = MSB(Tr(αξ t )) and ct = MSB(Tr(α ξ t )), where t ranges
between 0 and n − 2 and by (1), applying (6) and changing the order of summa-
tion, we obtain that:

7 
7 
n−2
Θ(τ ) = μj1 μj2 Ψj1 (αξ t )Ψj2 (α ξ t+τ ). (9)
j1 =0 j2 =0 t=0

By definition of Ψ , we have:
Ψj1 (αξ t )Ψj2 (α ξ t+τ ) = Ψβ (ξ t ),
where β = j1 α + j2 α ξ τ = 0. Applying (4), we obtain:
 n−2   n−2 
    √
 Ψj1 (αξ t )Ψj2 (α ξ t+τ ) =  Ψβ (ξ t ) ≤ 3 2m + 1. (10)
t=0 t=0

Applying Lemma 1, we obtain


7 
7  2
l
−1 2 √
|μj1 μj2 | = |μj | = 2 + 2. (11)
j1 =0 j2 =0 j=0

Combining it with (10) the result follows.


6 Binary Sequences of Period 2m − 1


In this section we will consider periodic sequences c0 , c1 , . . . of period n − 1. For
any integer D ≥ 4, let f ∈ SD and set
ct = MSB(Tr(f (ξ t ))), (12)
where t = 0, . . . , n − 2.
We now have the following results on, respectively, the imbalance and the
crosscorrelation function of the binary sequence defined by (12).
Theorem 3. With notations as above, we have:
 n−2 
  √
 (−1)ct  ≤ (2l ln(2)/π + 1)(D − 1) n,
t=0

where ct = MSB(Tr(f (ξ t ))).


22 P. Solé and D. Zinoviev

Proof. As we have ct = MSB(Tr(f (ξ t ))), and by (1), we obtain that (−1)ct is


equal to:
2
l
−1 2
l
−1
t t
μ(Tr(f (ξ ))) = μj ψj (Tr(f (ξ ))) = μj Ψj (f (ξ t )).
j=0 j=0

Changing the order of summation, we obtain that:


n−2 2
l
−1 
ct
(−1) = μj Ψj (f (x)). (13)
t=0 j=0 x∈T ∗

Applying (4), the absolute value of the Right Hand Side of (13) can be esti-
mated from above by:
−1
2 l

((Df − 1) 2m + 1) |μj |. (14)
j=0

Applying Lemma 1 the sum (14) can be estimated from above by:

(2l ln(2)/π + 1)(Df − 1) 2m .
The result follows.

Theorem 4. With notation as above, and for all phase shifts τ, in the range
0 < τ < 2m − 1, let (n = 2m )

n−2

Θ(τ ) = (−1)ct (−1)ct+τ ,
t=0

ct
where ct = MSB(Tr(f1 (ξ ))) and = MSB(Tr(f2 (ξ t ))). We then have the bound
t

(l ≥ 4):
 2
2l √
|Θ(τ )| ≤ ln(2) + 1 [1 + (D − 1) 2m ],
π
where D = max{Df1 , Df2 } and for any pair 0 ≤ j1 , j2 ≤ 2l − 1 the following
condition holds
j1 f1 (x) + j2 f2 (xξ τ ) = 0.
Proof. As we have ct = MSB(Tr(f1 (ξ t ))) and ct = MSB(Tr(f2 (ξ t ))), where t
ranges between 0 and n − 2 and by (1), we obtain that (−1)ct is equal to:
2
l
−1 2
l
−1
t t
μ(Tr(f1 (ξ ))) = μj ψj (Tr(f1 (ξ ))) = μj Ψj (f1 (ξ t )).
j=0 j=0

Changing the order of summation, we obtain that:


2
l
−1 2
l
−1 
n−2
Θ(τ ) = μj1 μj2 Ψj1 (f1 (ξ t ))Ψj2 (f2 (ξ t+τ )). (15)
j1 =0 j2 =0 t=0
Galois Rings and Pseudo-random Sequences 23

By definition of Ψ , we have:

Ψj1 (f1 (ξ t ))Ψj2 (f2 (ξ t+τ )) = Ψ (g(ξ t )),


where g(X) = j1 f1 (X) + j2 f2 (Xξ τ ). Note that if f (X) ∈ SD then f (Xξ τ ) ∈ SD
since, by Lemma 3 the change of variable X → Xξ τ does not increase the
weighted degree. Moreover SD is an R-linear space. Thus the polynomial g(X)
belongs to SD along with f1 and f2 . Applying (4), we obtain:
 n−2    
    √
 Ψj1 (f1 (ξ t ))Ψj2 (f2 (ξ t+τ )) =  Ψ (g(x)) ≤ 1 + (D − 1) 2m . (16)
t=0 x∈T ∗

Applying Lemma 1, we obtain


2
l
−1 2
l
−1  2
l
−1 2  2
2l ln(2)
|μj1 μj2 | = |μj | ≤ +1 . (17)
j1 =0 j2 =0 j=0
π

Combining (16) with (17) the result follows.


7 Binary Codes and Sequences of Maximal Length


Let γ = ξ(1 + 2λ) ∈ R, where ξ ∈ T and λ ∈ R∗ . Assume 1 + 2λ is of order 2l−1 .
Since ξ is of order 2m − 1 then γ is an element of order N = 2l−1 (2m − 1).
In this section we consider the periodic sequences c0 , c1 , . . . of period N . Let
α ∈ R∗ , then define
ct = MSB(Tr(αγ t )), (18)
where t = 0, . . . , N − 1. This sequence was introduced and studied in [21].
We now have the following results on, respectively, the imbalance and the
crosscorrelation function of the binary sequence (ct )t∈N , (18) under the MSB
map.
First, we need the following technical lemma:
Lemma 4. Let γ = ξ(1 + 2λ) ∈ R, where ξ ∈ T ∗ , λ ∈ R∗ = R\2R, and
1 + 2λ ∈ R is an element of order 2l−1 . Set N = 2l−1 (2m − 1). Then, for any
0 = β ∈ R, we have:
−1


N −1 
2l−1
j
Ψβ (γ ) = Ψβ(1+2λ)j (x) .
j=0 j=0 x∈T ∗

Proof. Since (2l−1 , 2m − 1) = 1, as j ranges over {0, 1, . . . , N − 1}, the set of


ordered pairs
{(j (mod 2l−1 ), j (mod 2m − 1))}
runs over all pairs (j1 , j2 ), where j1 ∈ {0, 1, . . . , 2l−1 −1} and j2 ∈ {0, 1, . . . , 2m −
2}. Thus the set

{γ t ; t = 0, 1, . . . , N − 1} = {ξ t (1 + 2λ)t ; t = 0, 1, . . . , N − 1}.
24 P. Solé and D. Zinoviev

is equal to the cartesian product of sets:

{ξ t1 ; t1 = 0, 1, . . . , 2m − 2} × {(1 + 2λ)t2 ; t = 0, 1, . . . , 2l−1 − 1}. (19)

From (19) we obtain:


N −1 −1 2
2l−1 m
−2
j
Ψβ (γ ) = Ψβ ((1 + 2λ)j1 ξ j2 ),
j=0 j1 =0 j2 =0

whereas observing that

Ψβ ((1 + 2λ)j1 ξ j2 ) = Ψβ  (ξ j2 ),

where β  = β(1 + 2λ)j1 , yields:

−1 2
2l−1 −2
m

Ψβ(1+2λ)j1 (ξ j2 ).
j1 =0 j2 =0

The Lemma follows.


Theorem 5. With notation as above, we have:


N −1 
  √
 (−1)ct  ≤ (2l ln(2)/π + 1)2l−1 [(2l−1 − 1) 2m + 1].
t=0

Proof. Recall that γ = ξ(1 + 2λ) ∈ R, where ξ ∈ T ∗ is a generator of the


Teichmüller set and λ ∈ R∗ . By definition of ψj and Ψα , (where 0 = q ∈ R), for
any 0 ≤ j ≤ 2l − 1, we have:

ψj (Tr(αx)) = Ψαj (x).

As we have ct = MSB(Tr(αγ t )), and by (1), we obtain that (−1)ct is equal to:

2
l
−1 2
l
−1
t t
μ(Tr(αγ )) = μj ψj (Tr(αγ )) = μj Ψαj (γ t ).
j=0 j=0

Changing the order of summation, we obtain that:


N −1 2
l
−1 
N −1
ct
(−1) = μj Ψαj (γ t ). (20)
t=0 j=0 t=0

Applying Lemma 4, we have:

−1


N −1
2l−1 
t
Ψαj (γ ) = Ψαj(1+2λ)t (x) . (21)
t=0 t=0 x∈T ∗
Galois Rings and Pseudo-random Sequences 25

Applying Lemma 3.1 of [21] to each of the 2l−1 sums over x, we obtain that:
 N
−1  √
 
 Ψαj (γ t ) ≤ 2l−1 [(2l−1 − 1) 2m + 1].
t=0

Thus, the absolute value of the Right Hand Side of (20) can be estimated from
above by:
2
l
−1

2l−1 [(2l−1 − 1) 2m + 1] |μj |. (22)
j=0

Recall Corollary 7.4 of [12] which states that for l ≥ 4 the following estimate
holds:
2l
−1
2l ln(2)
|μj | < + 1.
j=0
π

Thus (22) can be estimated from above by:



(2l ln(2)/π + 1)2l−1 [(2l−1 − 1) 2m + 1].

The Lemma follows.



We now proceed to bound the crosscorrelation.
Theorem 6. With notation as above, and for all phase shifts τ, 0 < τ < 2m − 1,
let

N −1

Θ(τ ) = (−1)ct (−1)ct+τ ,
t=0

where ct = M SB(Tr(αγ )) and = M SB(Tr(α γ t )) are such that for any pair
t
ct
0 ≤ j1 , j2 ≤ 2l − 1 the following condition holds

j1 α + j2 α ξ τ = 0.

We then have (for l ≥ 4) the bound :


 2
2l √ √
|Θ(τ )| ≤ ln(2) + 1 2l−1 [(2l−1 − 1) 2m + 1] = Cl 2m .
π

Here Cl is a constant in l of order l2 22l , i.e.

Cl = (ln(2)/π)2 (l2 22l + o(1)).

Proof. Again let γ = ξ(1+2λ) ∈ R, where ξ ∈ T is a generator of the Teichmüller


set and λ ∈ R∗ . As we have ct = MSB(Tr(αγ t )), and by (1), we obtain that
(−1)ct is equal to:
2
l
−1 2
l
−1
t t
μ(Tr(αγ )) = μj ψj (Tr(αγ )) = μj Ψαj (γ t ).
j=0 j=0
26 P. Solé and D. Zinoviev

Changing the order of summation, we obtain that:

2
l
−1 2
l
−1 
N −1
Θ(τ ) = μj1 μj2 Ψβ (γ t ).
j1 =0 j2 =0 t=0

Here β = j1 α + j2 α γ τ = 0. Applying Corollary 7.4 of [12] (for l ≥ 4), we have:


⎛ ⎞2
2
l
−1 2
l
−1 2
l
−1  2
2l
|μj1 μj2 | = ⎝ |μj |⎠ ≤ ln(2) + 1 . (23)
j1 =0 j2 =0 j=0
π

Applying Lemma 4, we have:

−1


N −1
2l−1 
j
Ψβ (γ ) = Ψβ(1+2λ)j (x) ,
j=0 j=0 x∈T

where 0 = β(1 + 2λ)j ∈ R so that each sum over x can be estimated using
Lemma 3.1 of [21]. Thus, we have:

 N
−1  √
 
 Ψβ (γ j ) ≤ 2l−1 [(2l−1 − 1) 2m + 1]. (24)
j=0

Combining (23) with (24) the Lemma follows.


8 Non Binary Codes of Maximal Length

Let γ = ξ(1 + 2λ) ∈ R, where ξ ∈ T and λ ∈ R∗ . Assume 1 + 2λ is of order 2l−1 .


Since ξ is of order 2m − 1 then γ is an element of order N = 2l−1 (2m − 1).
Following [6, Lemma 2], we define the code of length N :
−1
Sl,m,D = {(Tr(f (γ t )))N
t=0 | f ∈ SD }. (25)

This sequence was introduced and studied in [24].


We prepare the upper bound on the PAPR of codewords by a result on a
character sum.

Theorem 7. Let γ = ξ(1 + 2λ) ∈ R, where ξ ∈ T ∗ , λ ∈ R∗ = R\2R, and


1 + 2λ ∈ R is an element of order 2l−1 . Set N = 2l−1 (2m − 1). Then for any
j ∈ [0, 2l−1 − 1], we have

 N
−1  √
 
 Ψ (f (γ k ))e2πikj/N  ≤ 2l−1 [Df 2m + 1]. (26)
k=0
Galois Rings and Pseudo-random Sequences 27

Proof. Since (2l−1 , 2m − 1) = 1, as j ranges over {0, 1, . . . , N − 1}, the set of


pairs
{(j (mod 2l−1 ), j (mod 2m − 1))}
runs over all pairs (j1 , j2 ), where j1 ∈ {0, 1, . . . , 2l−1 } and j2 ∈ {0, 1, . . . , 2m −1}.
Thus the set
{γ k ; k = 0, 1, . . . , N − 1} = {ξ k (1 + 2λ)k ; k = 0, 1, . . . , N − 1}.
is equal to the direct product of sets:
{ξ k1 ; t1 = 0, 1, . . . , 2m − 1} × {(1 + 2λ)k2 ; t = 0, 1, . . . , 2l−1 − 1}, (27)
where
k ≡ k1 (mod 2m − 1), k ≡ k2 (mod 2l−1 ).
By the CRT there exist integers c1 , c2 such that for all k = 0, 1, . . . , N − 1 we
can write
k = 2l−1 c1 k1 + (2m − 1)c2 k2
m
−1 l−1
Consequently since γ = ξ(1 + 2λ), where ξ 2 = 1 and (1 + 2λ)2 = 1, the
sum of the left hand side of (26) is equal to:

−1 2
2l−1 −2
m

Ψ (f (ξ k1 (1 + 2λ)k2 ))e2πikj/N . (28)


k2 =0 k1 =0

Set fk2 (x) = f (x(1 + 2λ)k2 ). Then, expressing k as a function of k1 and k2 , the
above sum is equal to:

−1 2
2l−1 −2
m
m
−1) 2πic2 k2 j/2l−1
Ψ (fk2 (ξ k1 ))e2πic1 k1 j/(2 e .
k2 =0 k1 =0

Using Lemma 3 the weighted degree of fk2 is bounded by Df . Applying (4) to


each of the 2l−1 inner sums yields:
 2
m
−2  √
 
Ψ (fk2 (ξ t ))e2πic1 k1 j/(2 −1)  ≤ Df 2m + 1.
m

t=0

Thus, the absolute value of (28) can be estimated from above by



2l−1 [Df 2m + 1].
The result follows.

This character sum estimate translates immediately in terms of PAPR.
Corollary 1. For every c ∈ Sl,m,D the PAPR is at most

2l−1 √ 2 2
(1 + D 2m )2 log(2N ) + 2 ,
2 −1
m π
where log stands for the natural logarithm.
28 P. Solé and D. Zinoviev

We prepare the bound on the minimum Euclidean distance of the code Sl,m,D
by a correlation approach.
Theorem 8. With notation as above, and for all phase shifts τ, in the range
0 < τ < N , let

N −1

Θ(τ ) = ω ct −ct+τ ,
t=0
where ct = Tr(f1 (γ t )) and ct = Tr(f2 (γ t )). We then have the bound (l ≥ 4):

|Θ(τ )| ≤ 2l−1 [(Df − 1) 2m + 1].
Proof. As we have ct = Tr(f1 (γ t )) and ct = Tr(f2 (γ t )), where t ranges between
0 and N − 1 and γ = ξ(1 + 2λ) is of order N = 2l−1 (2m − 1).
We obtain that:

N −1
Θ(τ ) = Ψ (f1 (γ t ) − f2 (γ t+τ )). (29)
t=0

By definition of Ψ , we have:
Ψ (f1 (γ t ) − f2 (γ t+τ )) = Ψ (f3 (γ t )),
where f3 (x) = f1 (x) − f2 (xγ τ ). Note that if f (x) ∈ SD then by Lemma 3
f (xγ τ ) ∈ SD since the change of variables x → xγ τ does not increase the
weighted degree. Moreover SD is an R-linear space. Thus the polynomial f3 (x)
belongs to SD along with f1 and f2 . Further, as t ranges between 0 and N − 1,
the set {γ t ; t = 0, 1, . . . , N − 1} is equal to the product
{ξ k1 ; t1 = 0, 1, . . . , 2m − 1} × {(1 + 2λ)k2 ; t = 0, 1, . . . , 2l−1 − 1}.
Thus, in (29), the sum over t is equal to
−1 2
2l−1 −2 −1 2 −2
m
2l−1 m

Ψ (f3 (ξ k1 (1 + 2λ)k2 )) = Ψ (gk2 (ξ k1 )), (30)


k2 =0 k1 =0 k2 =0 k1 =0

where gk2 (x) = f3 (x(1 + 2λ)k2 ) and for any k2 , gk2 ∈ SD . Applying (4) to each
of the 2l−1 sums:
 2
m
−2  √
 
 Ψ (fk2 (ξ t )) ≤ (Df − 1) 2m + 1.
t=0

Thus, the absolute value of (30) can be estimated above by



2l−1 [(Df − 1) 2m + 1]. (31)
The result follows.

We are now in a position to estimate the minimum Euclidean distance dE of our
code Sl,m,D .
Corollary 2. For all integers l ≥ 4 and m ≥ 3 we have

dE ≥ 2l (2m − 2 − D 2m ).
Galois Rings and Pseudo-random Sequences 29

9 Biphase Sequences
In this section, we construct biphase sequences using the composition of the
Carlet map and inverse Gray map (see [23] for details).
Let a = a0 + 2a1 + 4a2 ∈ Z8 be 2-adic expansion of a. Define two maps Z1 ,Z2
from Z8 to Z4 and a map Z from Z8 to Z4 × Z4 :

Z(a) = (Z1 (a), Z2 (a)) = (a1 + 2(a0 + a2 ), a1 + 2a2 ),

where a0 , a1 , a2 ∈ Z2 .
For any integer k ≥ 1, extend this map (also denoted by Z) to the following
map from Zk8 to Zk4 × Zk4 :

Definition 1. Let c = (c1 , c2 , . . . , ck ) ∈ Zk8 and Z(cj ) = (aj , bj ) ∈ Z4 × Z4


where aj = Z1 (cj ) and bj = Z2 (cj ) for j = 1, 2, . . . , k. Define Z as follows:

Z(c) = (a, b), (32)

where a = (a1 , a2 , . . . , ak ) ∈ Zk4 and b = (b1 , b2 , . . . , bk ) ∈ Zk4 defined above.

Define the map MSBZ: Z8 → Z2 × Z2 via

MSBZ(u) = (MSB(Z1 (u)), MSB(Z2 (u))),

in other words
MSBZ(a0 + 2a1 + 4a2 ) = (a0 + a2 , a2 ),
obtained by taking the most significant bit in each component of Z (which are
elements of Z4 ).
In this section we consider the periodic sequences c0 , c1 , . . . of period 2(n − 1),
where n = 2m . Let β = 5ξ. Since 52 = 1 in Z8 , we have that

β n−1 = (5ξ)n−1 = 5,

and thus β 2(n−1) = 1.


Any two polynomials f (X), g(X) ∈ R[X] are considered equivalent if g(X) =
f (aXα) for some a ∈ Z∗8 and α ∈ T ∗ . For any positive integer D ≥ 4, let f ∈ SD
modulo the equivalence relation, and set

MSB(Z1 (Tr(f (β t )))), if 0 ≤ t < n − 1
ct =
MSB(Z2 (Tr(f (β t )))), if n − 1 ≤ t < 2(n − 1).

The same DFT technique as above can be applied to the functions


(−1)MSB(Z1 (u)) , (−1)MSB(Z2 (u)) , which map the element a0 + 2a1 + 4a2 of Z8
into the respectively (−1)a0 +a2 and (−1)a2 . Then the following holds

Lemma 5. Let ω = e2πi/8 be a primitive 8th root of 1, and set set ψk (u) = ω ku ,
where k is an integer. Then we have that

(−1)MSB(Z1 (u)) = μ1 ψ1 (u) + μ3 ψ3 (u) + μ5 ψ5 (u) + μ7 ψ7 (u),


30 P. Solé and D. Zinoviev

(−1)MSB(Z2 (u)) = μ5 ψ1 (u) + μ7 ψ3 (u) + μ1 ψ5 (u) + μ3 ψ7 (u),


where
1 1
(1 + ω − ω 2 + ω 3 ),
μ1 = μ3 = (1 + ω + ω 2 + ω 3 ),
4 4
1 1
μ5 = (1 − ω − ω 2 − ω 3 ), μ7 = (1 − ω + ω 2 − ω 3 );
4 4
and moreover √
(|μ1 | + |μ3 | + |μ5 | + |μ7 |)2 =2+ 2.
Proof. Recall the Fourier transformation formula on the additive group of Z8 .
Let μ be an arbitrary function from Z8 to the complex numbers. Then for any
u ∈ Z8 we have

7
1
7
μ(u) = μj ψj (u), where μj = μ(x)ψj (−x).
j=0
8 x=0

In particular, when μ(u) = (−1)MSB(Z1 (u)) and x = a0 + 2a1 + 4a2 we have


1 
μj = (−1)a0 +a2 ω −j(a0 +2a1 +4a2 ) .
8 a ,a ,a
0 1 2

Once simplified, we obtain


1
μj = (1 − ω −j )(1 + i−j )(1 + (−1)1−j ).
8
Substituting j = 0, 1, . . . , 7 and simplifying, the result follows (in particular μj =
0, j = 0, 2, 4, 6). The derivation for (−1)MSB(Z2 (u)) is analogous and omitted.


We now proceed to bound the imbalance.
Theorem 9. With notations as above, we have that
 n−2 
  √ √
 ((−1)MSB(Z1 (ut )) + (−1)MSB(Z2 (ut )) ) ≤ 2(2 + 2)1/2 (D − 1) 2m ,
t=0

where ut = Tr(f (β t ))
Proof. Applying Lemma 5, and changing the order of summation, we obtain that

n−2
((−1)MSB(Z1 (ut )) + (−1)MSB(Z2 (ut )) )
t=0

is equal to

n−2 
(μ1 +μ5 )(Ψ1 (f (β t ))+Ψ5 (f (β t )))+(μ3 +μ7 )(Ψ3 (f (β t ))+Ψ7 (f (β t ))) , (33)
t=0

where Ψk (u) = ψk (Tr(u)) = ψ(kTr(u)). Let Ψ = Ψ1 . Then the sum (33) equals
to
Galois Rings and Pseudo-random Sequences 31


n−2 
(μ1 +μ5 )(Ψ (f (β t ))+Ψ (5f (β t )))+(μ3 +μ7 )(Ψ (3f (β t ))+Ψ (7f (β t ))) . (34)
t=0

Recall that 
t f (ξ t ), if t is even
f (β ) =
5f (ξ t ), if t is odd
Thus, since 52 ≡ 1 (mod 8), we have


n−2 
n−2
(Ψ (f (β t )) + Ψ (5f (β t ))) = (Ψ (f (ξ t )) + Ψ (5f (ξ t ))).
t=0 t=0

Since 3 × 5 ≡ 7 (mod 8) and 7 × 5 ≡ 3 (mod 8), we have


n−2 
n−2
(Ψ (3f (β t )) + Ψ (7f (β t ))) = (Ψ (3f (ξ t )) + Ψ (7f (ξ t ))).
t=0 t=0

Recalling that
 n−2 
  √
 Ψj (Tr(f (β t ))) ≤ (D − 1) 2m (35)
t=0

and combining it with (5) the result follows.


We now proceed to bound the crosscorrelation.


Theorem 10. With notations as above, and for all phase shifts τ in the range
0 < τ < 2m − 1, let


n−2
Θ(τ ) = ((−1)MSB(Z1 (ut ))−MSB(Z1 (vt+τ )) + (−1)MSB(Z2 (ut ))−MSB(Z2 (vt+τ )) ),
t=0

where ut = Tr(f1 (β t )) and vt+τ = Tr(f2 (β t+τ )). We then have the bound
√ √
|Θ(τ )| ≤ 2(2 + 2)(D − 1) 2m .

Proof. For i = 1, 2, define


n−2
Θi (τ ) = ((−1)MSB(Z1 (uj ))+MSB(Zi (vj+τ )) .
j=0

Consider the contribution from Θ1 (τ ). Applying Lemma 5 to


(−1)MSB(Z1 (ut ))+MSB(Z1 (vt+τ )) , we obtain


7 
7 
n−2
Θ1 (τ ) = μj1 μj2 Ψj1 (f1 (β t ))Ψj2 (f2 (β t+τ )).
j1 =0 j2 =0 t=0
32 P. Solé and D. Zinoviev

By definition of Ψ , we have:

Ψj1 (f1 (β t ))Ψj2 (f2 (β t+τ )) = Ψ (g(β t )),

where g(X) = j1 f1 (X) + j2 f2 (Xξ τ ) or g(X) = j1 f1 (X) + j2 5f2 (Xξ τ ).


Note that if f (X) ∈ SD then f (Xβ τ ) ∈ SD since, by Lemma 3 the change
of variable X → Xβ τ does not increase the weighted degree. Moreover SD is an
R-linear space. Thus the polynomial g(X) belongs to SD along with f1 and f2 .
Following the ideas of Theorem 9, we reduce to the sums
 n−2   n−2 
    √
 Ψj1 (f1 (ξ t ))Ψj2 (f2 (ξ t+τ )) =  Ψ (g(ξ t )) ≤ (D − 1) 2m . (36)
t=0 t=0

Applying Lemma 5, we obtain


7 
7 
7 2 √
|μj1 μj2 | = |μj | = 2 + 2. (37)
j1 =0 j2 =0 j=0

Combining it with (36) the result follows. The constant 2 in the Theorem comes
from the contributions of Z1 and Z2 .

Acknowledgement. We thank Steven Galbraith, Kenny Paterson, Ana Salagean,


and Kai-Uwe Schmidt for helpful remarks that greatly improved the presentation
of this material.

References
1. Berlekamp, E.R.: Algebraic coding theory. McGraw-Hill, New York (1968)
2. Blake, I.F.: Codes over integer residue rings. Information and Control 29(4), 295–
300 (1975)
3. Bonnecaze, A., Solé, P., Calderbank, A.R.: Quaternary quadratic residue codes and
unimodular lattices. IEEE Trans. Inform. Theory IT-41(2), 366–377 (1995)
4. Carlet, C., Charpin, P., Zinoviev, V., etal.: Codes, bent functions and permutations
suitable for DES-like cryptosystems. Designs Codes and Cryptography 15, 125–156
(1998)
5. Dai, Z.-D.: Binary sequences derived from ML-sequences over rings I: period and
minimal polynomial. J. Cryptology 5, 193–507 (1992)
6. Fan, S., Han, W.: Random properties of the highest level sequences of primitive
Sequences over Z2e . IEEE Trans. Inform. Theory IT-49, 1553–1557 (2003)
7. Fan, P., Darnell, M.: Sequence Design for Communications Applications. John
Wiley and sons Inc, Chichester (1996)
8. Hammons Jr., A.R., Kumar, P.V., Calderbank, A.R., Sloane, N.J.A., Solé, P.: The
Z4 -linearity of Kerdock, Preparata, Goethals and related codes. IEEE Trans. In-
form. Theory IT-40, 301–319 (1994)
9. Helleseth, T., Kumar, P.V.: Sequences with low Correlation. In: Pless, V.S., Huff-
man, W.C. (eds.) Handbook of Coding theory, North Holland, vol. II, pp. 1765–
1853 (1998)
Galois Rings and Pseudo-random Sequences 33

10. Kumar, P.V., Helleseth, T., Calderbank, A.R.: An upper bound for Weil exponen-
tial sums over Galois rings and applications. IEEE Trans. Inform. Theory IT-41,
456–468 (1995)
11. Lam, A.W.: Theory and applications of spread spectrum systems. IEEE Press, Los
Alamitos (1994)
12. Lahtonen, J., Ling, S., Solé, P., Zinoviev, D.: Z8 -Kerdock codes and pseudo-random
binary sequences. Journal of Complexity 20(2-3), 318–330 (2004)
13. Litsyn, S.: Peak Power Control in Multicarrier Communications, Cambridge (2007)
14. MacWilliams, F.J., Sloane, N.J.A.: The Theory of Error-Correcting Codes, North-
Holland (1977)
15. McDonald, B.R.: Finite Rings with Identity. Marcel Dekker, New York (1974)
16. Paterson, K.G.: On Codes with Low Peak-to-Average Power Ratio for Multi-Code
CDMA. IEEE Trans. on Inform. Theory IT-50, 550–559 (2004)
17. Paterson, K.G., Tarokh, V.: On the existence and construction of good codes with
low peak-to-average power ratios. IEEE Trans. on Inform. Theory IT-46, 1974–1987
(2000)
18. Schmidt, K.-U.: PhD thesis, On Spectrally Bounded Codes for Multicarrier Com-
munications, Vogt Verlag, Dresden, Germany (2007),
[Link] schmidtk/#publications
19. Shankar, P.: On BCH codes over arbitrary integer rings. IEEE Trans. Inform.
Theory IT-25, 480–483 (1979)
20. Shanbhag, A., Kumar, P.V., Helleseth, T.: Improved Binary Codes and Sequence
Families from Z4 -Linear Codes. IEEE Trans. on Inform. Theory IT-42, 1582–1586
(1996)
21. Solé, P., Zinoviev, D.: The Most Significant Bit of Maximum Length Sequences
Over Z2l : Autocorrelation and Imbalance. IEEE Transactions on Information The-
ory 50, 1844–1846 (2004)
22. Solé, P., Zinoviev, D.: Low Correlation, High Nonlinearity Sequences for multi-code
CDMA. IEEE Transactions on Information Theory 52, 5158–5163 (2006)
23. Solé, P., Zinoviev, D.: Quaternary Codes and Biphase Sequences from Z8 -Codes.
Problems of Information Transmission 40(2), 147–158 (2004)
24. Solé, P., Zinoviev, D.: Weighted degree trace codes for PAPR reduction. In: Helle-
seth, T., Sarwate, D., Song, H.-Y., Yang, K. (eds.) SETA 2004. LNCS, vol. 3486,
pp. 406–413. Springer, Heidelberg (2005)
25. Wan, Z.X.: Quaternary Codes. World Scientific, Singapore (1997)
26. Wan, Z.X.: Lectures on Finite Fields and Galois Rings. World Scientific, Singapore
(2003)
Finding Invalid Signatures in Pairing-Based
Batches

Laurie Law1 and Brian J. Matt2,


1
National Security Agency, Fort Meade, MD 20755, USA
lelaw@[Link]
2
JHU Applied Physics Laboratory Laurel, MD, 21102, USA
[Link]@[Link]

Abstract. This paper describes efficient methods for finding invalid


digital signatures after a batch verification has failed. We present an
improvement to the basic binary “divide-and-conquer” method, which
can identify an invalid signature in half the time. We also present new,
efficient methods for finding invalid signatures in some pairing-based
batches with low numbers of invalid signatures. We specify these meth-
ods for the Cha-Cheon signature scheme of [5]. These new methods offer
significant speedups for Cha-Cheon batches as well as other pairing-based
signature schemes.

Keywords: Pairing-based signatures, ID-based signatures, Batch


verification.

1 Introduction
When a large number of digital signatures need to be verified, it is sometimes pos-
sible to save time by verifying many signatures together. This process is known as
batch verification. If the batch verification fails, then it is often necessary to iden-
tify the invalid (“bad”) signature(s) that caused the batch to fail. This process
can be time consuming, so batch sizes are typically chosen to be small enough
so that most batches will be expected to pass. “Divide-and-conquer” techniques
have been proposed for identifying invalid signatures in bad batches. These meth-
ods are faster than verifying each signature individually, requiring only O(log2 N )
verifications, where N is the number of signatures in the batch [11].

The views and conclusions contained in this presentation are those of the authors
and should not be interpreted as representing the official policies, either expressed
or implied, of the National Security Agency, the Army Research Laboratory, or the
U. S. Government.

Dr. Matt’s participation was through collaborative participation in the Communi-
cations and Networks Consortium sponsored by the U. S. Army Research Labora-
tory under the Collaborative Technology Alliance Program, Cooperative Agreement
DAAD-19-01-2-0011. The U. S. Government is authorized to reproduce and dis-
tribute reprints for Government purposes notwithstanding any copyright notation
thereon.

S.D. Galbraith (Eds.): Cryptography and Coding 2007, LNCS 4887, pp. 34–53, 2007.

c Springer-Verlag Berlin Heidelberg 2007
Finding Invalid Signatures in Pairing-Based Batches 35

Several digital signature schemes have recently been proposed that are based
on bilinear pairings [2,5,6]. This is because the mathematical properties of pair-
ings can be used to design signatures with improved features such as a shorter
length or the ability to derive the public verification key from the signer’s identity
(i.e., identity-based signatures). Identity-based signatures can reduce the number
of bits that need to be transmitted because the signer does not need to transmit
his public key or certificate to the verifier. Some identity-based signatures also
have short signature lengths, making them ideal for bandwidth-constrained envi-
ronments. However, verification of these signatures often involves bilinear pairing
operations, which are relatively expensive to compute. While it may be feasible
to verify a small number of pairing-based signatures, the number of verifica-
tions required to find the invalid signatures in a large batch can be prohibitive,
even with divide-and-conquer methods. Therefore, finding faster methods for
identifying invalid signatures in bad batches is very important for pairing-based
signature schemes.
All pairing-based schemes discussed in this paper are assumed to use bilinear
pairings on an elliptic curve E, defined over Fq , where q is a large prime. G1 and
G2 are distinct subgroups of prime order r on this curve, where G1 is a subset
of the points on E with coordinates in Fq and G2 is a subset of the points on
E with coordinates in Fqd , for a small integer d. The pairing e is a map from
G1 × G2 into Fqd .
In this paper, we present an improvement to the divide-and-conquer method
for finding invalid signatures in a bad batch. We also present new, efficient meth-
ods for finding invalid signatures in some pairing-based batches. We specify these
methods for the Cha-Cheon signature scheme of [5]. However, these methods are
applicable to other pairing-based schemes with similar form, such as BLS short
signatures [2] when all signatures in the batch are applied to the same message
or signed by the same signer.

2 Background

2.1 Batch Verification

Batch verification of digital signatures was introduced by Naccache et al. [10] to


verify modified DSA signatures. Yen and Laih [17] applied batch verification to
a variant of the Schnorr signature and improved its performance by removing a
requirement on small exponents that are generated by the verifier. A similiar test,
called the “small exponents test”, was one of three batch verification techniques
for modular exponentiation given in [1]. The small exponents test has been
applied to several signatures schemes including BLS short signatures [2,4].
In the small exponents test, the verifier chooses a new random value (a “small
exponent”) for each signature in the batch. Each random value is mathemati-
cally combined with its corresponding signature and then all the “randomized”
signatures are combined into two values that should be equal if and only if all
of the signatures in the batch are valid. The random values prevent an attacker
36 L. Law and B.J. Matt

from inserting invalid signatures that may cancel each other out, resulting in a
batch that appears valid.
Small exponents test:
Input: a security parameter l, a generator g of the group G of prime order q, and
(x1 , y1 ), (x2 , y2 ), . . . , (xn , yn ) with xi ∈ Zq and yi ∈ G.
Check: That g xi = yi for all i, 1 ≤ i ≤ n.
1. Choose n random integers r1 , . . . , rn in the range [0, 2l − 1].

n n
2. Compute x = xi ri and y = y i ri .
i=1 i=1
3. If g x = y then accept; else reject.

The probability that this test accepts a batch containing invalid signatures is
at most 2−l [1]. The order of G must be prime to prevent a weakness described
in [3]. Observe that we can set r1 = 1 without any impact on the security of this
test.

2.2 Cha-Cheon Signature Scheme


Cha and Cheon [5] proposed an identity-based signature scheme that can be
constructed with bilinear pairings. We describe their scheme using pairings that
map G1 × G2 into Fqd as defined in Section 1. H(m, U ) is a cryptographic hash
that maps a bit string m and a point U ∈ G1 to an integer between 1 and r.
1. Setup phase. The system manager selects an order r point T ∈ G2 and
randomly selects an integer s in the range [1, r − 1]. The manager computes
S = sT . The public system parameters are T and S, which are made available
to anyone that will need to verify signatures. The master secret key is s,
which is known to the system manager only.
2. Extract phase. Each user is given a key pair. The user’s public key, Q, is a
point in G1 that is derived from the user’s identity using a public algorithm.
The user’s private key, C = sQ is computed by the system manager and
given to the user through a secure channel.
3. Signing. To sign a message m, the signer randomly generates an integer t in
the range [1, r − 1] and outputs a signature (U, V ) where
U = tQ
V = (t + H(m, U ))C
4. Verification. To verify a signature (U, V ) of message m, the verifier derives
the signer’s public key Q from the purported signer’s identity and computes
h = H(m, U ). If e(U + hQ, S) = e(V, T ) then the signature is accepted.
Otherwise, the signature is rejected.

2.3 Fast Batch Verification for the Cha-Cheon Signature Scheme


Cheon et al. [6] proposed a batch signature scheme for a variant of the Cha-Cheon
scheme. Their scheme is partially aggregate, meaning that a portion of each sig-
nature can be combined into into a single short string. Aggregate signatures
Finding Invalid Signatures in Pairing-Based Batches 37

are shorter than batch signatures, but when a batch verification fails they do
not provide enough information to allow identification of the bad signatures.
Cheon et al. claimed that batch verification is not secure for Cha-Cheon and
demonstrated an attack. However, their attack is for a particular batch verifica-
tion scheme and they did not consider the batch verification methods of [1].
By applying the small exponents test to the Cha-Cheon signature we can
obtain an efficient batch verification method that can support verification of
multiple signatures by distinct signers on distinct messages. The verifier obtains
N messages mk , for k = 1 to N , and the signature (Uk , Vk ) and signer’s identity
for each message. The verifier derives each public key Qk from the signer’s iden-
tity. The verifier sets r1 = 1 and generates random values rk from [0, 2l − 1], for
k = 2 to N .
The batch is valid if
N  N 
 
e rk (Uk + H(mk , Uk ) · Qk ), S = e rk Vk , T . (1)
k=1 k=1

If this equality does not hold, then at least one signature in the batch is not
valid.
This batch verification requires only two pairings (and some other relatively
inexpensive operations). This is much more efficient than the aggregate scheme
of [6], which requires N + 1 pairings.

2.4 Divide-and-Conquer Methods


Although many papers have been written on efficient methods for batch veri-
fication, there has been less attention devoted to finding the invalid signatures
after a batch verification fails. One method for identifying invalid signatures was
applied to batches of RSA signatures [9]. It was shown in [15] that this approach
is not secure for RSA signatures. This method is similar to an independently
developed method we present in Section 4.1 for the special case of identifying a
single bad signature. Pastuszak et al. [11] investigated the “divide-and-conquer”
method, in which the set of signatures in a failed batch is repeatedly split into
z smaller sub-batches to verify. We refer to the case where z = 2 as “Simple
Binary Search”. They found that Simple Binary Search is more efficient than
naively testing each signature individually if there are fewer than N/8 invalid
signatures in the batch.
The following recursive algorithm describes the Simple Binary Search on a
batch of N messages and their corresponding digital signatures (called “mes-
sage / signature pairs”). On the initial call to Algorithm SBS(X, V ), X contains
the entire batch of N signatures and V ← true.
Note: verif y(X, V ) is a function of a list X = ((m1 , s1 ), . . . , (mn , sn )) of n
message / signature pairs and a flag V . If V = true then the list is verified,
and the function returns true if the verification passes and f alse otherwise. If
V = f alse, then no verification is performed and the function returns f alse.
38 L. Law and B.J. Matt

Algorithm SBS(X, V ). (Simple Binary Search)


Input: A list X of n message / signature pairs, and a flag V .
Output: A list of all invalid signatures in the batch.
1. If n = 1, then call verif y((m1 , s1 ), V ). If the verification passes (true), then
return. Otherwise (f alse), output (m1 , s1 ) and return.
2. If n = 1, then call verif y(X, V ). If verification passes (true), then return.
Otherwise (f alse), go to Step 3.
3. Divide the batch X into 2 sub-batches, lef t(X) with n/2 signatures, and
right(X) with n/2 signatures. Call SBS(lef t(X), true). If SBS(lef t(X),
true) or a descendant sub-instance of SBS finds an invalid signature, then
call SBS(right(X), true) otherwise call SBS(right(X), f alse).
Algorithm SBS can be illustrated with a perfect binary tree. The tree’s root
represents all N message / signature pairs. Each left child represents half of
the message / signature pairs from its parent node, and each right child repre-
sents the other half. The depth of the tree is k + 1, so there are N leaves, each
representing a single message / signature pair. A leaf is called “invalid” if its
corresponding signature is invalid. Note that if the left child of an invalid node
is valid, then its sibling is not verified because it must be invalid.

Costs for Simple Binary Search. If there is only one invalid signature in the
batch, there will be at most 2 verifications for each non-root level of the tree.
Therefore, as shown in [11], the maximum number of verifications to find a single
invalid signature after a batch has failed is 2log2 N .
If there are 0 < w < N/2 invalid signatures in a batch of N signatures, the
worst-case cost is 2(2log2 w − 1 + w(log2 N  − log2 w)) batch verifications
when the tree is perfectly balanced. If w > N/2 then the worst-case cost is the
same as the cost for w = N/2.
Pastuszak et al. [11] also observed that if the verifier knows in advance that
the batch contains exactly one invalid signature, then the number of verifica-
tions can be reduced to log2 N . In the following section, we will show how to
modify Simple Binary Search to find a single invalid signature with log2 N  ver-
ifications without advance knowledge of the number of invalid signatures. This
modified method also reduces the worst-case cost when there are multiple invalid
signatures.

3 Binary Quick Search


Batch verification tests typically compare two quantities, X and Y , and the
batch is accepted as valid if they are equal. In many batch signature meth-
ods [1,2,3,6,17,18] as well as the batch Cha-Cheon method of Section 2.3, an
equivalent test is to compute A = XY −1 and accept the batch as valid if A = 1.
In most of these signature schemes, XY −1 can be computed at least as quickly
as computing X and Y individually. When this equivalent form of verification is
used, the “A” values can be used to eliminate some of the verifications required
by divide-and-conquer methods.
Finding Invalid Signatures in Pairing-Based Batches 39

For a batch of N signatures, A can be written as the product A1 A2 . . . AN .


Let Ā = A1 A2 . . . Aj for a sub-batch of j < N signatures labeled 1 through
j. Suppose that A = 1 and Ā = 1. Since Ai = 1 if and only if signature i is
valid, we can instantly determine whether or not the remaining signatures j + 1
through N are a valid batch. If A = Ā then Aj+1 Aj+2 . . . AN = 1, so signatures
j + 1 through N can be accepted as valid. Otherwise, the set of signatures j + 1
through N must contain at least one invalid signature.
We can modify the Simple Binary Search method of the previous section to
obtain the Binary Quick Search method as follows. The initial verification and
verification of all lef t(X) sub-batches are replaced by a computation of Ā and
checking whether Ā = 1. Verification of all right(X) sub-batches are replaced
by a comparison of the A values for its left sibling and its parent. This means
that the expensive verification will never be needed for right sub-batches. When
required, the A value for invalid right sub-batches can be efficiently computed
from the A values of its left sibling and its parent.
The Binary Quick Search can be applied to pairing-based batch verification
schemes that are equivalent to accepting the batch as valid if α0 = 1, where
N  N 
 
α0 = e Bk , P e Dk , R . (2)
k=1 k=1

For example, the Cha-Cheon Batch verification method of Section 2.2, equa-
tion (1), is equivalent to this test, where
Bk = rk (Uk + H(mk , Uk )Qk ) ,
Dk = rk Vk ,
P = S, and
R = −T.
Binary Quick’s worst-case cost (beyond the initial batch verification) is half
that of Simple Binary Search. If there are 0 < w < N/2 invalid signatures in a
batch of N signatures, the worst-case cost for this new method is

2log2 w − 1 + w(log2 N  − log2 w)

batch verifications when the tree is perfectly balanced. If w > N/2 then the cost
is the same as the cost for w = N/2.

4 Finding Invalid Signatures in Pairing-Based Batches


The Binary Quick Search method will find invalid signatures in pairing-based
batches using no more than approximately w log2 N verifications. While faster
than previously known methods, Binary Quick Search’s lower bound is expensive
for large N when the verifications require pairing computations. In this section
we propose two alternative methods for finding invalid signatures in pairing-
based batch signature schemes such as the Cha-Cheon verification presented
above, when the verification can be written as in equation (2).
40 L. Law and B.J. Matt

These new methods make use of the following observation. We can insert
integer multipliers mj,k into equation (2) to obtain
N  N 
 
αj = e mj,k Bk , P e mj,k Dk , R
k=1 k=1

where m0,k = 1 for all k = 1 to N .


Let I be the set of indices of invalid signatures, and let

Xk = e(Bk , P ) e(Dk , R).

Using the properties of bilinear pairings, we have


N
mj,k
αj = Xk .
k=1

However, Xk = 1 if the k th signature is valid, so we have


 m
αj = Xk j,k .
k∈I

We can identify the set I of invalid signatures by looking for relationships


between the αj ’s. For example, if the k th signature is the only invalid signature
in the batch, then we have α1 = α0 m1,k .
Since we are working in groups of prime order, we are free to choose the values
for the scalars mj,k to maximize the efficiency of our methods. Note that these
scalars can have lengths much smaller than the security parameter l used in
generating the randomizers rk in the initial batch verification.

4.1 Exponentiation Method


We begin with the same initial batch verification as in the Binary Quick Search
method of Section 3, given by equation (2). If α0 = 1 then all signatures in the
batch are accepted as valid. Otherwise, the following algorithm will identify the
bad signatures in the batch:

Input: A bad batch of N message / signature pairs to be verified, along with the
identities of each signer.
Output: A list of the invalid signatures.
1. Perform an computation similar to the batch computation of equation (2),
but first multiply each Bk and Dk by the signature identifier k, (k = 1
through N ):
N  N 
 
α1 = e kBk , P e kDk , R .
k=1 k=1
Finding Invalid Signatures in Pairing-Based Batches 41

Search for an i, 1 ≤ i ≤ N , such that α1 = αi0 .


If such an i is found, then output “the ith signature is invalid” and exit. If
no match is found, then there are at least 2 bad signatures in the batch. Go
to the next step.
2. Compute N  N 
 
α2 = e k (kBk ), P e k (kDk ), R .
k=1 k=1
−ij
Search for an (i, j), 1 ≤ i ≤ N , 1 ≤ j ≤ N , i < j such that α2 = αi+j1 α0 .
If an (i, j) pair is found, then output “the ith and j th signatures are invalid”
and exit. If no match is found, then there are at least 3 bad signatures in
the batch. Set w ← 3 and go to the next step.
3. Compute
N  N 
   
w−1 w−1
αw = e k k Bk , P e k k Dk , R . (3)
k=1 k=1

For all w-subsets of signatures x1 , . . . , xw , x1 < x2 < . . . < xw


Check that
w
(−1)t−1 pt
αw = (αw−t ) (4)
t=1

where pt is the tth elementary symmetric polynomial in x1 , . . . , xw .


Equivalently, we can search for p1 , . . . , pw that will solve equation (4). Once
we know p1 , . . . , pw , it is easy to solve for the signature identifiers x1 , . . . , xw .
If a match is found, then output “signatures x1 , . . . , and xw are invalid” and
exit. If no match is found, then there are at least w + 1 bad signatures in the
batch. Set w ← w + 1 and repeat Step 3, or stop and switch to a different
method.
Cost. We will assume that the number of invalid signatures, w, is small. We need
to compute each αi , for i = 1 to w, and to then solve equation (4).
N 
To compute each αi in (3), we first need to compute the quantities k i Bk
k=1
N 

and k i Dk . Solinas [14] observed that these two quantities can be computed
k=1
for i = 1 to w with only 2w(N − 1) elliptic curve additions (see Appendix B).
Each αi computation will also require 2 pairings and 1 multiplication in Fqd .
For w ≥ 2, each step of this algorithm will require the inverse of αw−2 (other
inverses can be saved from previous steps), so there will be w − 1 inverse com-
putations in Fqd .
Finally, we need to solve equation (4) for x1 , . . . , xw , where 1 ≤ xi ≤ N . If
w = 1, then this equation reduces to the discrete logarithm problem in Fqd ,
where the exponents are from an interval of length N . This can be solved using
square-root
√ methods such as Shanks’ baby-step giant-step method [12] with only
2 N multiplications in Fqd .
42 L. Law and B.J. Matt

For w ≥ 2, we can solve equation (4) for x1 , . . . , xw with (w−1)!


8
N w−1 +
w−2
O(N ) multiplications in Fqd (see Appendix A).
The approximate upper bound for the total cost to find w invalid signatures
in a bad batch of N signatures using the exponentiation method is 2w pairings,
2w(N − 1) elliptic curve additions, w − 1 inverses in Fqd , and the number of
√ 8
multiplications in Fqd is 2 N for w = 1 or (w−1)! N w−1 for w ≥ 2.

4.2 Exponentiation with Sectors Method


The exponentiation method of the previous section works efficiently when there
is a single invalid signature, but the number of combinations of potentially invalid
signatures grows quickly as the number of bad signatures increases, even for just
a few invalid signatures. We can slow this growth by dividing the N signatures
in a failed batch into Z sectors of T ≈ N Z signatures each and modifying the
Exponentiation method of the previous section to identify bad sectors instead of
bad signatures. Once these bad sectors have been identified, we use a variant of
the Exponentiation Method to identify the bad signatures from within the bad
sectors. We call this the Exponentiation with Sectors Method.
We begin with the same initial batch verification of the entire batch as in the
Binary Quick Search method of Section 3, given by equation (2). If the batch
fails, we use the following algorithm to identify the bad signature(s).

Input: A bad batch of N message / signature pairs to be verified, along with the
identities of each signer.
Output: A list of the invalid signatures.
Stage 1. Divide the N signatures into Z sectors of T ≈ N Z signatures each. Label
the sectors s1 through sZ . This stage will identify the bad sectors.
1. Compute
⎛ ⎛ ⎞ ⎞ ⎛ ⎛ ⎞ ⎞

Z  Z 
β1 = e ⎝ j⎝ Bk ⎠, P ⎠ e ⎝ j⎝ Dk ⎠, R⎠ .
j=1 k∈sj j=1 k∈sj

Search for an i, 1 ≤ i ≤ Z, such that β1 = αi0 . If such an i is found, then


Sector i is the only bad sector. Set v ← 1 and go to Step 1 of Stage 2. If no
match is found, then there are at least 2 bad sectors in the batch. Set v ← 2
and go to the next step.
2. Compute
⎛ ⎛ ⎞ ⎞ ⎛ ⎛ ⎞ ⎞
Z  Z 
βv = e ⎝ j ⎝j v−1 Bk ⎠, P ⎠ e ⎝ j ⎝j v−1 Dk ⎠, R⎠ . (5)
j=1 k∈sj j=1 k∈sj

Search for a v-subset of sectors s1 , . . . , sv , s1 < s2 < . . . < sv


such that
v
(−1)t−1 pt
βv = (βv−t ) (6)
t=1
Finding Invalid Signatures in Pairing-Based Batches 43

where β0=α0 and pt is the tth elementary symmetric polynomial in s1 , . . . , sv .


If a v-subset is found, then go to Step 1 of Stage 2. If no match is found,
then there are at least v + 1 bad sectors. Set v ← v + 1 and repeat Step 2,
or stop and switch to a different method.

Stage 2. Now v bad sectors of T signatures each have been identified. Therefore,
there are at least v invalid signatures in the batch. Set w ← v. Stage 2 will
identify these invalid signatures.

1. Label the signatures in the bad sectors from 1 to vT . Compute γi for i = 1


to w as follows:
 vT   vT 
 
i i
γi = e k Bk , P e k Dk , R . (7)
k=1 k=1

Compute the inverses of α0 (if w is even) and γw−2 , γw−4 , . . . Search for a
w-subset of signatures x1 , . . . , xw , x1 < x2 < . . . < xw such that


w
(−1)t−1 pt
γw = (γw−t ) (8)
t=1

where γ0=α0 and pt is the tth elementary symmetric polynomial in x1 , . . . , xw .


If a match is found then output the list of w invalid signatures and exit. If no
match is found, then there are more that w invalid signatures in the batch.
Set w ← w + 1 and go to the next step.
2. Compute γw as in equation (7) with i = w. Compute the inverses of γ’s as
needed and search for a w-subset of signatures x1 , . . . , xw , x1 < x2 < . . . <
xw to solve equation (8). If a match is found then output the list of w invalid
signatures and exit. If no match is found, then there are more than w invalid
signatures in the batch. Set w ← w + 1 and repeat this step.

Cost. To simplify the cost computation, we will assume that there are w bad
sectors and exactly 1 invalid signature per sector (w = v), which is the worst case.
This is a reasonable assumption because batch verification is most useful when
most batches are valid. When a batch does fail, the number of invalid signatures
is expected to be very small (unless the batch was intentionally flooded with bad
signatures). In most cases, there is unlikely to be more than one bad signature
in any given sector.
The Stage 1 cost to identify w bad sectors in a batch with Z Sectors will be
the same as the cost for the exponentiation method of Section 4.1 to identify w
bad signatures in a batch of Z signatures. This cost is 2w pairings, 2w(Z − 1)
elliptic curve additions, w − 1 inverses in Fqd and the number of multiplications

in Fqd is 2 Z for w = 1 or (w−1)!8
Z w−1 for w ≥ 2.
In Stage 2, Step 1, we need to identify w invalid signatures in w distinct
bad sectors of T = N Z signatures each. (We are assuming w = v so we will be
done after Step 1.) There are wT signatures in the w bad sectors. The costs for
44 L. Law and B.J. Matt


vT
equation (7) are 2w(wT − 1) elliptic curve additions to compute k i Bk and
k=1

vT
k i Dk for i = 1 to w (see Appendix B), and 2w pairings to compute the γi ’s.
k=1
For equation (8), we will need to compute  w2 inverses in Fqd .
Finally, we need to solve equation (8) for x1 , . . . , xw , where 1 ≤ xi ≤ wT .
If w = 1, then this equation reduces to the discrete logarithm problem in Fqd ,
where the exponents are from an interval of length T . This can be solved using
square-root
√ methods such as Shanks’ baby-step giant-step method [12] with only
2 T multiplications in Fqd .
For w ≥ 2, we can solve equation (8) for x1 , . . . , xw with (w−1)! 8
(wT )w−1
multiplications in Fqd (see Appendix A).
The approximate upper bound for the total cost for both stages of the Ex-
ponentiation with Sectors method to identify w bad signatures in a batch of N
signatures (assuming exactly one bad signature per sector) is 4w pairings, 2w(Z+
Z −2) elliptic curve additions, 1.5w −1
wN
  1/2
 inverses in Fqd and  the number

of mul-
w−1 
N 8 wN
tiplications in Fqd is: 2 Z 1/2 + Z for w = 1 and (w−1)! Z w−1 + Z
for w ≥ 2.
To minimize the cost, we need to select the number of sectors, Z, to balance the
work in Stage 1 and Stage√2. We can minimize the work for the most likely case,
w = 1, by choosing
√ Z = N sectors. In this case the total cost is 4w pairings,
2w((w + 1) N − 2) elliptic curve additions, 1.5w − 1 inverses in Fqd and the
8(w w−1 +1) (w−1)/2
number of multiplications in Fqd is 4(N )1/4 for w = 1 and (w−1)! N
for w ≥ 2.

5 Performance
Table 1 summarizes the approximate upper bound for the number of opera-
tions to identify a small number of invalid signatures in a bad batch for each of
the three new methods presented in this paper, and compares them to the cost
of Simple Binary Search and to the cost of verifying each signature individu-
ally. For each batch method, we have omitted the cost of the initial verification
of the entire batch. Note that other computational techniques exist that may
lower these costs in some cases, such as techniques for computing products of
pairings [8].
As shown in Table 1, Binary Quick Search is twice as fast as the worst case
for Simple Binary Search. The Exponentiation and Sector methods use fewer
pairings than Binary Quick Search, but the number of multiplications in F (q d )
for these methods increases proportionally to a power of the batch size. For very
large batch sizes, Binary Quick Search will always be the best method. However,
for reasonably sized batches of mostly valid signatures, the Exponentiation or
Sector method will often be the most efficient.
Finding Invalid Signatures in Pairing-Based Batches 45

Table 1. Number of Operations

Method Pairings Inverses EC additions Multiplications


in Fqd in Fqd
N individual 2N  0 0 0
Simple Binary 4w log2 N 0 0 0
Binary Quick 2w log2 N 0 0 0
Exponentiation

w=1 2 0 2(N − 1) 2 N
w−1
w≥2 2w w−1 2w(N − 1) (w−1)! N
8

N Sectors

w=1 4 0 4( N − 1) 4N 1/4
√ 8(w w−1
+1) (w−1)/2
w≥2 4w 1.5(w) − 1 2w((w + 1) N − 2) (w−1)! N

Table 2. Number of multiplies in Fq , where r and q are 160-bit values and d = 6

Batch Size
w Method 16 64 256 1024 4096

N individual 3.18·1005 1.27·1006 5.08·1006 2.03·1007 8.13·1007

0 Initial Verify 6.29·1004 1.99·1005 7.45·1005 2.93·1006 1.17·1007

Simple Binary 1.46·1005 2.19·1005 2.92·1005 3.65·1005 4.38·1005


1 Binary Quick 7.30·1004 1.09·1005 1.46·1005 1.82·1005 2.19·1005
√Exponent. 1.87·10  1.99·10  2.43·10  4.17·10 1.10·1005
04 04 04 04

N Sectors 3.67·1004
3.69·1004
3.73·1004
3.82·10  3.97·1004 
04

Simple Binary 2.55·1005 4.01·1005 5.47·1005 6.93·1005 8.39·1005


2 Binary Quick 1.28·1005 2.01·1005 2.74·1005 3.47·1005 4.20·1005
√Exponent. 3.91·10  4.70·10  7.85·10  2.04·10 7.08·1005
04 04 04 05

N Sectors 7.48·1004
7.67·1004
8.07·1004
8.85·10  1.04·1005 
04

Simple Binary 3.28·1005 5.47·1005 7.66·1005 9.85·1005 1.20·1006


3 Binary Quick 1.64·1005 2.74·1005 3.83·1005 4.92·1005  6.02·1005 
√Exponent. 7.12·1004  3.05·1005 4.00·1006 6.30·1007 1.01·1009
N Sectors 1.20·1005
1.50·10  2.67·10  7.32·10
05 05 05
2.58·1006

Simple Binary 4.01·1005 6.93·1005 9.85·1005 1.28·1006 1.57·1006


4 Binary Quick 2.01·1005 3.47·1005  4.92·1005  6.38·1005  7.84·1005 
√Exponent. 1.56·1005  5.32·1006 3.36·1008 2.15·1010 1.37·1012
N Sectors 2.30·1005
8.14·1005
5.48·1006
4.28·1007
3.41·1008


Plus N point multiplications by a scalar the size of the group order r.
46 L. Law and B.J. Matt

Tables 2 and 3 compare methods for finding invalid signatures in bad batches
for Cases A and C of [7]. In Case A, the group order r is a 160-bit value, the
elliptic curve E is defined over Fq , where q is a 160-bit value, and the embedding
degree d = 6. In Case C, the group order r is a 256-bit value, q is a 256-bit value,
and the embedding degree d = 12. All costs are given in terms of the number of
multiplications (m) in Fq using the following estimates from [7]:

– For Case A, 1 pairing = 9120m, 1 multiplication in Fq6 = 15m, 1 inverse


in Fq6 = 44m (assuming 1 inverse in Fq = 10m), 1 elliptic curve addition
= 11m, and an elliptic point multiplication by a 160-bit value is 614m and by
an 80-bit value is 827m. The cost of a pair of elliptic point multiplications by
160-bit and 80-bit integers simultaneously is 2017m, using the Joint Sparse
Form [13].
– For Case C, 1 pairing = 43, 703m, 1 multiplication in Fq12 = 45m, 1 inverse
in Fq12 = 104m, 1 elliptic curve addition = 11m, and an elliptic point multi-
plication by a 256-bit value is 2535m and by an 128-bit value is 1299m. The
cost of a pair of elliptic point multiplications by 256-bit and 128-bit values
simultaneously is 3225m, using the Joint Sparse Form.

Table 3. Number of multiplies in Fq , where r and q are 256-bit values and d = 12

Batch Size
w Method 16 64 256 1024 4096

N individual 1.44·10 06
5.76·10 06
2.30·1007
9.21·10 07
3.68·1008

0 Initial Verify 1.58·1005 3.76·1005 1.24·1006 4.72·1006 1.86·1007

Simple Binary 6.99·1005 1.05·1006 1.40·1006 1.75·1006 2.10·1006


1 Binary Quick 3.50·1005
5.24·1005
6.99·1005
8.74·1005
1.05·1006
√Exponent. 8.81·10  8.95·10  9.45·10  1.13·10  1.83·1005
04 04 04 05

N Sectors 1.75·1005 1.76·1005 1.76·1005 1.77·1005 1.79·1005 

Simple Binary 1.22·1006 1.92·1006 2.62·1006 3.22·1006 4.02·1006


2 Binary Quick 6.12·1005
9.61·1005
1.31·1006
1.66·1006
2.01·1006
√Exponent. 1.81·10  2.01·10  2.78·10  5.89·10 1.83·1006
05 05 05 05

N Sectors 3.54·1005 3.59·1005 3.69·1005 3.88·1005  4.27·1005 

Simple Binary 1.57·1006 2.62·1006 3.67·1006 4.72·1006 5.77·1006


3 Binary Quick 7.87·1005
1.31·1006
1.84·1006
2.36·10  2.88·1006 
06

√Exponent. 3.09·10  1.00·10 1.21·10 1.89·1008 3.02·1009


05 06 07

N Sectors 5.54·1005
6.41·10  9.89·10  2.38·10
05 05 06
7.91·1006

Simple Binary 1.92·1006 3.32·1006 4.72·1006 6.12·1006 7.52·1006


4 Binary Quick 9.61·1005
1.66·10  2.36·10  3.06·10  3.76·1006 
06 06 06

√Exponent. 5.97·10  1.61·1007 1.01·1009 6.44·1010 4.12·1012


05

N Sectors 9.50·1005
2.70·1006
1.67·1007
1.29·1008
1.02·1009
Finding Invalid Signatures in Pairing-Based Batches 47

To indicate the best method for each batch size and number of invalid signatures,
the table entry with the lowest cost is given in brackets.
The results in Table 2 for Case A show that the Exponentiation method is the
most efficient method when there are one or two invalid signatures in a batch of
up to 256 signatures, with costs that are as much as 83% lower than the cost of
Binary Quick Search and 92% lower than Simple Binary Search. For batches of
1024 √ to 4096 signatures with one or two invalid signatures, the Sector method
with N sectors is the most efficient, with costs that are up to 82% lower cost
than Binary Quick Search and 91% lower than Simple Binary Search. If the batch
size is large enough, Binary Quick Search will be the most efficient method, and
the size at which this happens is smaller if there are more invalid signatures
in the batch. However, when there are 4 invalid signatures, the Exponentiation
method is still 22% faster than Binary Quick Search, and 61% faster than Simple
Binary, for batches of 16 signatures.
The results in Table 3 for Case C show that the benefit of using the Expo-
nentiation and Sector methods can be even more significant at higher security
levels. For example, the cost of the Exponentiation or Sector methods are up to
87% less than the cost of Binary Quick Search, and up to 94% less than the cost
of Simple Binary Search, for batches with up to 4096 signatures in which one or
two signatures are invalid.

6 Conclusion

We have presented two new methods for identifying invalid signatures in pairing-
based, batch signature schemes with low numbers of invalid signatures, and have
analyzed their performance. These methods are applicable to batch verification
schemes employing small exponent techniques such as the Cha-Cheon scheme
described here, or BLS [2] short signatures when all signatures in the batch are
applied to the same message or signed by the same signer. These new methods
offer significant speedups for such schemes.
We have presented Binary Quick Search, an improvement to previous “divide-
and-conquer” methods. This method is better suited for identifying larger num-
bers of invalid signatures, especially in large batches, than our other methods,
and has application to batch signature schemes that do not use pairings, includ-
ing [1,3,17].

References

1. Bellare, M., Garay, J., Rabin, T.: Fast Batch Verification for Modular Exponen-
tiation and Digital Signatures. In: Nyberg, K. (ed.) EUROCRYPT 1998. LNCS,
vol. 1403, pp. 236–250. Springer, Heidelberg (1998)
2. Boneh, D., Lynn, B., Shacham, H.: Short Signatures from the Weil Pairing. In:
Boyd, C. (ed.) ASIACRYPT 2001. LNCS, vol. 2248, pp. 514–532. Springer, Hei-
delberg (2001)
48 L. Law and B.J. Matt

3. Boyd, C., Pavlovski, C.: Attacking and Repairing Batch Verification Schemes. In:
Okamoto, T. (ed.) ASIACRYPT 2000. LNCS, vol. 1976, pp. 58–71. Springer, Hei-
delberg (2000)
4. Camenisch, J., Hohenberger, S., Pedersen, M.: Batch Verification of Short Sig-
natures. In: EUROCRYPT 2007. LNCS, vol. 4515, pp. 246–263. Springer, Hei-
delberg (2007), See also Cryptology ePrint Archive, Report 2007/172 (2007),
[Link]
5. Cha, J., Cheon, J.: An Identity-Based Signature from Gap Diffie-Hellman Groups.
In: Desmedt, Y.G. (ed.) PKC 2003. LNCS, vol. 2567, pp. 18–30. Springer, Heidel-
berg (2002)
6. Cheon, J., Kim, Y., Yoon, H.: A New ID-based Signature with Batch Verification,
Cryptology ePrint Archive, Report 2004/131 (2004),
[Link]
7. Granger, R., Page, D., Smart, N.P.: High Security Pairing-Based Cryptography
Revisited. In: Hess, F., Pauli, S., Pohst, M. (eds.) ANTS VII. LNCS, vol. 4076, pp.
480–494. Springer, Heidelberg (2006)
8. Granger, R., Smart, N.P.: On Computing Products of Pairings, Cryptology ePrint
Archive, Report 2006/172 (2006), [Link]
9. Lee, S., Cho, S., Choi, J., Cho, Y.: Efficient Identification of Bad Signatures in
RSA-Type Batch Signature. IEICE Transactions on Fundamentals of Electronics,
Communications and Computer Sciences E89-A(1), 74–80 (2006)
10. Naccache, D., M’Raihi, D., Vaudenay, S., Raphaeli, D.: Can D.S.A. be improved?
Complexity Trade-offs with the Digital Signature Standard. In: De Santis, A. (ed.)
EUROCRYPT 1994. LNCS, vol. 950, pp. 77–85. Springer, Heidelberg (1995)
11. Pastuszak, J., Michalek, D., Pieprzyk, J., Seberry, J.: Identification of Bad Signa-
tures in Batches. In: Imai, H., Zheng, Y. (eds.) PKC 2000. LNCS, vol. 1751, pp.
28–45. Springer, Heidelberg (2000)
12. Shanks, D.: Class Number, a Theory of Factorization and Genera. Proc. Symp.
Pure Math. 20, 415–440 (1969) (AMS 1971)
13. Solinas, J.: Low-Weight Binary Representations for Pairs of Integers, Technical
Report CORR 2001-41, Centre for Applied Cryptographic Research (2001)
14. Solinas, J.: Personal communication
15. Stanek, M.: Attacking LCCC Batch Verification of RSA Signatures, Cryptology
ePrint Archive, Report 2006/111 (2006), [Link]
16. Sury, B., Wang, T., Zhao, F.: Identities Involving Reciprocals of Binomial Coeffi-
cients. Journal of Integer Sequences 7, Article 04.2.8 (2004)
17. Yen, S., Laih, C.: Improved Digital Signature Suitable for Batch Verification. IEEE
Transactions on Computers 44(7), 957–959 (1995)
18. Yoon, H., Cheon, J.H., Kim, Y.: Batch verifications with ID-based signatures. In:
Park, C., Chee, S. (eds.) ICISC 2004. LNCS, vol. 3506, pp. 223–248. Springer,
Heidelberg (2005)

A Solving Equations (4), (6) and (8) Using the Factor


Method
In equation (4) of Section 4.1, and equations (6) and (8) of Section 4.2, we need
to solve an equation of the form

w
pt
B= (At ) (9)
t=1
Finding Invalid Signatures in Pairing-Based Batches 49

where pt is the tth symmetric polynomial in x1 , x2 , . . . , xw , At ∈ Fqd and

1 ≤ xi ≤ M.
t−1
For equation (4): B = αw , M ← N and At = (αw−t )(−1) for t = 1 to w.
(−1)t−1
For equation (6): B = βw , M ← Z and At = (βw−t ) for t = 1 to w.
(−1)t−1
For equation (8): B = γw , M ← wN
Z and At = (γw−t ) for t = 1 to w.

A.1 w = 2: Factor Method


When w = 2, equation (9) reduces to
(x1 +x2 ) (x1 x2 )
B = A1 A2 . (10)

If we raise both sides of equation (10) to the 4th power and observe that 4x1 x2 =
(x2 + x1 )2 − (x2 − x1 )2 , equation (10) becomes
(x2 +x1 )2 (x2 −x1 )2
B 4 (A−4
1 )
(x2 +x1 )
(A2 −1 ) = (A2 −1 ) .
This can be written as
2 2
δ0 (δ1 )s (δ2 )(s) = (δ2 )(m) (11)

where
δ0 = B 4 , δ1 = A1 −4 , δ2 = A2 −1
s = x2 + x1 , m = x2 − x1
1 ≤ s ≤ 2M − 1
1 ≤ m ≤ M − 1.
To solve equation (11), the M − 1 possible values for the right hand side of the
equation, RHS(i) , are computed and stored in a search tree. Then the 2M − 1
values for the left side, LHS(i) , are computed and compared, each in log M time,
with the stored values. If a match is found, the values x1 and x2 can be easily
computed. To compute the values for the right hand side

1. Set RHS1 = δ2 and compute δ2 (2) .


2. For 2 ≤ i < M , compute
(2i−1) (2i−3) (2)
(a) (δ2 ) = (δ2 ) (δ2 ) and
(i) 2 (i−1) 2 (2i−1)
(b) RHS(i) = (δ2 ) = (δ2 ) (δ2 ) .

Computing the values for the left hand side of equation (11) is performed in two
2
stages. In the first stage, the values (δ2 )(i) , which have already been computed
for the right hand side, are re-used to reduce the number of multiplications.
50 L. Law and B.J. Matt

Stage 1:
(3) (3)2 (3)2
1. Compute ζ3 = δ0 δ1 , lookup δ2 and compute LHS3 = ζ3 δ2 .
2. For 4 ≤ i < M , compute
(a) ζ(i) = ζ(i−1) δ1 ,
(i)2 (i)2
(b) lookup δ2 and compute LHS(i) = ζ(i) δ2 .
Stage 2:
1. For M ≤ i < 2M compute
(a) ζ(i) = ζ(i−1) δ1 ,
(b) (δ2 )(2i−1) = (δ2 )(2i−3) (δ2 )(2) ,
2 2
(c) (δ2 )(i) = (δ2 )(i−1) (δ2 )(2i−1) and
2
(d) LHS(i) = ζ(i) (δ2 )(i) .

Cost.
Computing the values for the right hand side requires 2M multiplications. Com-
puting the values for the left hand side requires 2M multiplications for Stage 1
and 4M for Stage 2. The total cost of using the Factor method for w = 2 is no
more than 8M multiplications in Fqd .

A.2 w ≥ 3: Iterative Use of the Factor Method


When w ≥ 3, equation  (9) can
 be solved by reducing the problem to the w = 2
M −2
case for each of the possible combinations of the values of x3 , . . . , xw .
w−2
We can express pt , the tth elementary symmetric polynomial in x1 , . . . , xw as

pt = (x1 x2 )ut−2 + (x1 + x2 )ut−1 + ut .


If 0 ≤ i ≤ w − 2 then ui is the ith elementary symmetric polynomial in x3 , . . . , xw
and ui = 0 otherwise. Equation (9) becomes

w
((x1 x2 ) ut−2 +(x1 +x2 ) ut−1 + ut )
B= (At ) .
t=1

Rewriting this equation in the form of equation (11)

⎛ ⎞(s) ⎛ ⎞(s)2 ⎛ ⎞(m)2


w−2 ⎜w−1 ⎟ ⎜w ⎟ ⎜w ⎟
 ⎜ −4ut−1 ⎟ ⎜ ⎟ ⎜ −ut−2 ⎟
B4 (At )
−4ut⎜ (A ) ⎟ ⎜ (At )−ut−2 ⎟ = ⎜ (A ) ⎟ .
⎜ t ⎟ ⎜ ⎟ ⎜ t ⎟
t=1 ⎝ t=1 ⎠ ⎝t=2 ⎠ ⎝t=2 ⎠
           
δ0 δ1 δ2 δ2

The Factor method can be used with the values δ0 , δ1 and δ2 as shown above,
with 1 ≤ s ≤ 2x3 − 3 and 1 ≤ m ≤ x3 − 1.
Finding Invalid Signatures in Pairing-Based Batches 51

Cost.  
M −2
The Factor method is used a maximum of times to test equation (9).
w−2
Therefore, for w ≥ 2, the cost of all calls to the Factor method is bounded by

 
M −2 8
8M < M w−1
w−2 (w − 2)!
multiplications in Fqd .

We can establish a tighter upper bound if we consider the fact that x1 and
x2 must lie in the range [1, x3 − 1] instead of [1, M ]. The number 
of times the

M − x3
Factor method is called for any given value of x3 is no more than .
w−3
Therefore, excluding the cost of computing the δ’s, the cost of the using the
Factor method for w ≥ 3 is
M−(w−3)  
 M − x3
8 (x3 − 1)
w−3
x3 =3

8 
M−w+3
≤ x (M − x)w−3 (12)
(w − 3)! x3 =3

multiplications in Fqd .
The most significant term of (12) can be written as
⎛ ⎞

w−2   
8 w−3 1
M w−1 ⎝ (−1)j−1 ⎠.
(w − 3)! j−1 j + 1
j=1

However, by an identity given in [16], we have



w−2   
j−1 w−3 1 1
(−1) = .
j−1 j+1 (w − 1)(w − 2)
j=1

Therefore, the most significant term of (12) becomes


8
M w−1 . (13)
(w − 1)!

Cost of computing the δ’s. We also need to compute the different δ0 , δ1 and δ2
required for each use of the Factor method.

w−2
ut 
w−2
ut 
w−2
δ0 = B 4 (A−4
t ) , δ1 = (A−4
t+1 ) , δ2 = Aut+2
t
.
t=1 t=0 t=0
52 L. Law and B.J. Matt

ut
To compute δ0 in the following algorithm, all possible values of each (A−4 t ) are
computed and combined to produce the value of δ0 for each possible combination
of the values x3 , . . . , xw .
ut
Next, all possible values of (A−4 t )  are computed
 for each t. Since the max-
w−2 t
imum value of each ut is less than M , we can compute all possible
  t
ut w−2
values of (A−4
t ) in M t multiplications.
t
Finally, we compute the δ’s. We can compute all possible values of δ0 by
u1
computing all possible values for η1 = B 4 (A−4 1 ) and then computing all pos-
t ut
sible values for ηt = B 4
(Az ) , for 2 ≤ t ≤ w − 2, from ηt−1 and (A−4
−4 uz
t ) .

z=1 
M
Since there are less than possible values of x3 , . . . , xw , we can compute
w−2
 
M
δ0 = ηw−2 with (w − 3) multiplications.
w−2
The method computes δ1 and δ2 in a similar fashion. The total number of
multiplications required to compute the δ’s is bounded by
w−2  
 w − 2 
M 3 (w−3) w−2
t
3 M +(w−3) < 3 (M+1)w−2 + M . (14)
t w−2 (w−2)!
t=1

The total cost to iteratively use the Factor Method is the sum of equations (13)
and (14), but this cost is dominated by equation (13). Therefore, the approximate
number of multiplications in Fqd to solve equation (9) is

8
M w−1 (15)
(w − 1)!

for w ≥ 2.

B Faster Method for Computing Sums of the Form


N
(ktPk)
k=1

Let Pk , for k = 1 to N , be points on an elliptic curve. We want to compute


N
 t
SU Mt = k Pk (16)
k=1

for t = 1 to w. Computing these values in the obvious way would require w(N −1)
elliptic scalar multiplications by a log2 N bit-integer and w(N − 1) elliptic curve
additions. We show a method due to Solinas [14] to compute these w quantities
with only w(N − 1) elliptic curve additions.
Finding Invalid Signatures in Pairing-Based Batches 53

First, we compute
N 
 
t+k−1
Ut = Pk
t
k=1

for t = 1 to w. These w values can be computed from the Pk ’s as follows:


Vk ← Pk for 1 ≤ k ≤ N
For t = 0 to w
For k from N − 1 downto 1 do
Vk ← Vk + Vk+1
Next k
Ut ← V1
Next t

The cost of this algorithm is w(N − 1) elliptic curve additions. (We don’t count
the cost for the first time through the loop because the computations for t = 0
can be computed during the initial batch verification.)
We can now compute SU Mt (equation (16)) from U1 , . . . , Ut :


t
SU Mt = (−1)t−k (k!)st,k Uk (17)
k=1

where the st,k ’s are the Stirling numbers of the second kind. We can compute
SU Mt , for t = 1 to w, with only O(w2 ) operations:
SU M1 ← U1
s0 ← 0
s1 ← 1
For i = 2 to w
si ← 0
Next i
For t = 2 to w
SU Mt ← 0
For k from t downto 1 do
sk ← sk−1 + ksk
SU Mt ← k (−1)t−k sk Uk + SU Mt
Next k
Return SU Mt
Next t

For large N and small w, we can ignore the O(w2 ) operations. Therefore, the
total cost of computing SU Mt , for t = 1 to w, is approximately w(N − 1) elliptic
curve additions.
How to Forge a Time-Stamp
Which Adobe’s Acrobat Accepts

Tetsuya Izu, Takeshi Shimoyama, and Masahiko Takenaka

FUJITSU LABORATORIES Ltd.


4-1-1, Kamikodanaka, Nakahara-ku,
Kawasaki, 211-8588, Japan
{izu,shimo,takenaka}@[Link]

Abstract. This paper shows how to forge a time-stamp which the latest
version of Adobe’s Acrobat and Acrobat Reader accept improperly. The
target signature algorithm is RSASSA-PKCS1-v1 5 with a 1024-bit pub-
lic composite and the public key e = 3, and our construction is based
on Bleichenbacher’s forgery attack presented in CRYPTO 2006. Since
the original attack is not able to forge with these parameters, we used
an extended attack described in this paper. Numerical examples of the
forged signatures and times-stamp are also provided.

Keywords: Bleichenbacher’s forgery attack, RSASSA-PKCS-v1 5, time-


stamp forgery, Acrobat, Acrobat Reader.

1 Introduction

In the rump session of CRYPTO 2006, held on August 2006, Bleichenbacher pre-
sented a new forgery attack [6] against the signature scheme RSASSA-PKCS1-
v1 5 (PKCS#1v1.5 for short) defined in PKCS#1 [13] and RFC 3447 [16], a
cryptographic standard developed and maintained by RSA Laboratories [13].
The attack allows an adversary to forge a valid signature on an (almost) arbi-
trary message in a very simple way, if an implementation of the signature scheme
is loose, namely, a format check in the verification is not adequate. In fact, sev-
eral implementations of PKCS#1v1.5 including OpenSSL, Firefox2 and Sun’s
JRE (Java Runtime Environment) library had this vulnerability. In response to
Bleichenbacher’s attack, US-CERT published a vulnerability note on September
2006 [7], and these implementations resist the attack now.
Since Bleichenbacher’s presentation was limited to the case when the bit-
length of the public composite n (denoted by |n|) is 3072 and the public exponent
e is 3, applicability to other parameters was unclear. Though Tews showed the
applicability of the extended forgery attack when |n| = 1024 and e = 3 [19],
other cases such as e = 17, 65537 has not been discussed yet.
In this paper, we analyze Bleichenbacher’s forgery attack and show applica-
ble composite sizes for given exponents. Then we propose the extended attack
with assuming the same implementational error, which is a generalization of the

S.D. Galbraith (Eds.): Cryptography and Coding 2007, LNCS 4887, pp. 54–72, 2007.

c Springer-Verlag Berlin Heidelberg 2007
How to Forge a Time-Stamp Which Adobe’s Acrobat Accepts 55

original attack and Tew’s extended attack. For fixed n and e, the success proba-
bility of the proposed attack is 2(|n|−15)/e−353 in the random oracle model. When
|n| = 1024 and e = 3, the proposed attack succeeds the forgery with probabil-
ity 2−16.6 which coincides the Tew’s experiment [19]. Note that the preliminary
version of the proposed attack was published in [9].
In the proposed attack, the success probability of a forgery on the chosen
message is 2(|n|−15)/e−353 . However, when the message is ‘redundant’, namely,
it includes supplementary data (such as a name of used tool, a name of author,
date) besides a body of the message and whose size is large enough, the adver-
sary can forge on the chosen message by changing the redundant part. As an
example of this scenario, we show how to forge a time-stamp defined in RFC
3161 [4] (in which 64-bit nonce space is available) with our proposed attack. Fi-
nally and most importantly, we check some verification client programs whether
they accept forged time-stamps by (1) the proposed attack and (2) a variant of
Bleichenbacher’s attack by Oiwa et al. [11]., As a result, a forged time-stamp
by the proposed attack embedded in a PDF (Portable Document Format) file
was improperly accepted by the latest version of Adobe’s Acrobat and Acrobat
Reader 1 .
The rest of this paper is organized as follows: in section 2, Bleichenbacher’s
forgery attack against PKCS#1v1.5 and analytic results are described. Then the
extended attack is proposed in section 3, and its application to the time-stamp
scheme is described in section 4. Some numerical examples of forged signatures
are in the appendix.

2 Bleichenbacher’s Attack
This section describes Bleichenbacher’s forgery attack [6] against RSASSA-
PKCS1-v1 5 (PKCS#1v1.5 for short) with the loose implementation. Let n be
an RSA composite whose size is denoted by |n| (in bit). In the followings, a
variable in the typewriter font denotes an octet string and a variable in the
Roman font denotes an integer. Two variables in the same letter correspond to
each other, namely, A is an octet representation of an integer A, vice versa.

2.1 RSASSA-PKCS1-v1 5
Let us introduce the signature scheme RSASSA-PKCS1-v1 5 defined in PKCS#1
[13]. For a given public composite n (a product of two large primes with same
size) such that |n| is a multiple of 8, a message m (to be signed) is encoded to
an integer M , an integer representation of an octet string M defined by

M = 00||01||PS||00||T||H, (PKCS#1v1.5 message format)

where PS is an octet string with ff such that |M| = |n| (and |PS| ≥ 64), T is an
identifier of the signature scheme and the hash function (Table 1), and H is an
1
The authors have already submitted the vulnerability report to a governmental or-
ganization.
56 T. Izu, T. Shimoyama, and M. Takenaka

Table 1. Identifiers of the algorithm and the hash function [13]

Hash Function Length (bit) Octet String


MD2 144 3020300c06082a864886f70d020205000410
MD5 144 3020300c06082a864886f70d020505000410
SHA-1 120 3021300906052b0e03021a05000414 (= TSHA1 )
SHA-256 152 3031300d060960864801650304020105000420
SHA-384 152 3041300d060960864801650304020205000430
SHA-512 152 3051300d060960864801650304020305000440

octet representation of the hash value H(m). Then, a signature s is generated


by s = M d mod n for the signer’s secret integer d.
On input the original message m, its signature s and the signer’s public expo-
nent e, a verifier obtains an octet string M representing an integer M  = se mod n
and checks whether it satisfies the format

M = 00||01||PS||00||T||H .

Then the verifier obtains a value H  , an integer representation of the octet string
H , and compares whether H  = H(m). If this equation holds, the signature is
accepted by the verifier.
In the implementation level, a part of the format check is sometimes inade-
quate by some reasons. For example, when an octet string

00||01||PS||00||T||H ||garbage.

(a garbage data is followed) is obtained by a verifier as a decoded message, it


should be rejected because it is in the illegal format. However, some implemen-
tations accept the string because they do not check the number of ff and they
stop the scan at the end of H’ (namely, they do not notice the existence of the
garbage). Such loose implementation is the target of Bleichenbacher’s forgery
attack described in the next subsection.

2.2 Outline of Bleichenbacher’s Attack

Next, let us introduce Bleichenbacher’s forgery attack [6] against PKCS#1 v1.5.
Here we assume that the hash function SHA-1 and parameters |n| = 3072 and
e = 3 are used [8]. In the attack, an adversary chooses a message m̄ with arbitrary
bit-length such that
a = 2288 − (T × 2160 + H(m̄))
is divisible by 3, where T is an integer representation of the octet string TSHA1
(as in Table 1). Note that such m̄ can be obtained by generating m̄ randomly (3
trials are required on average). The adversary also computes two integers

g = a2 /3 × 21087 − a3 /27 × 2102 , s̄ = 21019 − a/3 × 234 .


How to Forge a Time-Stamp Which Adobe’s Acrobat Accepts 57

Observe that
s̄e = (21019 − a/3 × 234 )3
= 23057 − a × 22072 + a2 /3 × 21087 − a3 /27 × 2102
= 23057 − 22360 + T × 22232 + H(m̄) × 22072 + g
= (2985 − 2288 + T × 2160 + H(m̄)) × 22072 + g.
Since an integer 2985 − 2288 + T × 2160 + H(m̄) corresponds to an octet string
00||01||ff...ff||00||T||H (the number of ff is different from that of the original
PS), s̄ is a forged signature on the message m̄, if an implementation of the
verification ignores the number of ff and the garbage g. In the forgery, the
adversary only requires to compute H(m̄), a and s̄. This is why the attack is
called “the pencil and paper attack” [6]. Note that the adversary does not use
modulus computations and thus integers n, d are not required in the forgery.
A numerical example of Bleichenbacher’s forgery attack with a 3072-bit com-
posite and e = 3 is shown in Table 7 in appendix A.

2.3 Analysis
This subsection analyzes Bleichenbacher’s forgery attack with general parame-
ters. Only SHA-1 is considered in the following, however, similar attacks and
analysis can be easily obtained for other hash functions. For simplicity, we con-
sider the public composite n with arbitrary length (rather than a multiple of 8).
Firstly, we consider the case with general n but e = 3. Since the padding
00||TSHA1 is 128-bit and the hash value is 160-bit, we use the same a as in the
original attack, namely a = 2288 − (T × 2160 + H(m̄)) such that 3|a. Let
s̄(α, β) = 2α − a/3 × 2β ,
be a forged signature. Then, we have
s̄(α, β)3 = 23α − a × 22α+β + g(α, β)
for the garbage g(α, β) = a2 /3 × 2α+2β − a3 /27 × 23β . Since s̄(α, β)3 should be in
the PKCS#1v1.5 format, we have 3α = |n|−15, namely, α = (|n|−15)/3 and |n|
should be divisible by 3. On the other hand, since the garbage should be smaller
than 22α+β , we have 2α+β > 576+α+2β−log2 3, namely, β < |n|/3−581+log2 3.
By substituting β ≥ 0 in this inequality, we have a condition on n that
|n| > 1743 − 3 log2 3 = 1738.24....
Consequently, Bleichenbacher’s attack with e = 3 is applicable to the case with
|n| ≥ 1739 with |n| is divisible by 3. More precisely, |n| can be parameterized by
|n| = 3k for k ≥ 580 and β is in a form β = 8 + 2 (0 ≤  ≤ 55) since PS is a
repetition of the octet string ff.
Next, let us discuss with general n and e. Similar to the above discussion, we
set s̄(α, β) = 2α − a/e × 2β for a = 2288 − (T × 2160 + H(m̄)) such that e|a and
α = (|n| − 15)/e. Then, we have
s̄(α, β)e = 2eα − a × 2(e−1)α+β + g(α, β)
58 T. Izu, T. Shimoyama, and M. Takenaka

for the garbage g(α, β) = a2 (e − 1)/(2e) × 2(e−2)α+2β + . . . . By the same discus-


sion, we have conditions on n that
 
2e
|n| > 576e + 15 − e log2
e−1

and |n| − 15 is divisible by e. Also, we have 0 ≤ β < |n|/3 − 581 + log2 3 and
β ≡ 2 (mod 8) on β. Especially, we have |n| = 17k + 15 (k ≥ 575) for e = 17,
and |n| = 65537k + 15 (k ≥ 1061) for e = 65537. Consequently, Bleichenbacher’s
attack for general e is far from feasible. Even if e = 3, Bleichenbacher’s attack
cannot be applicable to 1024-bit (since 1024 is smaller than 1739) or 2048-bit
composites (since n − 15 = 2033 is not divisible by 3, 17, 65537).

2.4 Oiwa et al.’s Variant


Recently, Oiwa et al. proposed a variant of Bleichenbacher’s attack [11]. In the
message format 00||01||PS||00||T||H, T||H can be described by {{OID, PF}, H} in
ASN.1 language, where {, ..., } denotes the enumerate type, OID is the hash
object ID and PF is the parameter field. In PKCS#1, PF is defined as NULL.
When PF is replaced by non-null data, though the message format is not accepted
in PKCS#1, it is acceptable by an ASN.1 parser. An idea of a variant attack by
Oiwa et al. [11] is to insert the garbage into the parameter field rather than to
the end of the message format. If message format is checked by generic ASN.1
parser, the forgery will be successful. In fact, they actually forged a signature
and found the vulnerability in GNUTLS ver 1.4.3 and earlier (though they are
resistant to Bleichenbacher’s attack).
By the same analysis, it is easily shown that Oiwa et al.’s variant has the
same ability to Bleichenbacher’s attack. Moreover, the same extension proposed
in the next section can be possible.

3 Extending Bleichenbacher’s Attack


The security of PKCS#1v1.5 relies on the hardness of factoring n and computing
the e-th root mod n. A key idea of Bleichenbacher’s forgery is to set the forged
signature s̄ in the special form so that upper bits of s̄e are in the PKCS#1v1.5
message format by using the garbage g. In this scenario, an adversary computes
the e-th power only, however, because of the speciality of the forged signature,
the public composites should be large as described in the previous section. In
this section, we extend Bleichenbacher’s attack by using computers rather than
pencils and papers. Our strategy is to obtain a forged signature in non-special
forms. To do so, for a given hash value H(m̄), we search s̄e such that the e-th
root over integer exists, by computing the e-th root over real numbers (note that
the e-th root computation over real number is easy with computers).
How to Forge a Time-Stamp Which Adobe’s Acrobat Accepts 59

Fig. 1. Outline of the proposed forgery attack

3.1 Description of Proposed Forgery Attack


Let m̄ be a message and H(m̄) be its hash value by SHA-1. For given bit-length
of the composite |n| ≥ 369, define f as a function of m̄ by

f = 2|n|−15 + 15 × (2|n|−23 + · · · + 2|n|−79 ) + T × 2|n|−208 + H(m̄) × 2|n|−368


= (2192 − 2128 + T ) × 2|n|−208 + H(m̄) × 2|n|−368

where T denotes an integer representation of the octet string TSHA1 (Table 1).
Note that the integer 2192 − 2128 + T in the above equation represents √ an octet
e
string 00||01||ffffffffffffffff||00||T.
√  Next, compute√ the
e e-th root f as a
real number and its ceiling e
f . If the difference g = e
f − f is smaller than
√ √ 
2|n|−368 , the forgery succeeds, since the forged signature s̄ = e f + g = e f is
valid on the message m̄ with the garbage g. If g is not small, change the message
m̄ until the forgery succeeds.
Let us analyze the proposed forgery attack with general n and e. In the
failure case, some least significant
√ bits of H(m̄) in f , say t bits, differs from
the corresponding part in  e f e (in other words, these t bits coincide in the
successful case), where t is a parameter determined by n√and e (see Figure 1).
In the forgery, f is (|n| − 15)-bit
√ and the integer part of e f is (|n| − 15)/e-bit,
and the uncontrolled part of  f e is (e − 1)(|n| − 15)/e-bit. Thus we have a
e

condition |n| − 208 − (160 − t) > (e − 1)(|n| − 15)/e, namely,

|n| > (353 − t)e + 15. (1)

Here we implicitly used the random oracle assumption. Especially, when |n| =
1024 and e = 3, this condition implies that t > 50/3 ≈ 16.6. That is, in order to
forge a signature with 1024-bit composites, the proposed forgery attack succeeds
60 T. Izu, T. Shimoyama, and M. Takenaka

Table 2. A forged signature by the extended forgery attack (1024-bit, e = 3)

m̄ “00002e36”
H(m̄) 701f0dd6 f28a0bab 4b647db8 ddcbde40 1f810d4e
f 0001ffff ffffffff ffff0030 21300906 052b0e03 021a0500 0414701f 0dd6f28a
0bab4b64 7db8ddcb de401f81 0d4e0000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
s̄ 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 0001428a 2f98d728 ae220823
1fc5cff6 ac440735 9b078378 24846240 4cebfc71 5690f34c 7119d1da 99227fd0
s̄e 0001ffff ffffffff ffff0030 21300906 052b0e03 021a0500 0414701f 0dd6f28a
0bab4b64 7db8ddcb de401f81 0d4e 06dd 391b3fd4 ace323ee de903694 dd78887f
5f8a73e0 5ea698ae 72a6bdfa cb7c359e 1f78cbee 96939eea 4d9b8f3e 47aebae3
90f4fe61 73ef7535 80c4cb88 edd95623 84b7e5ed ccc19fa3 ca64c0a2 a37e5000

with probability 2−16.6 , namely, 216.6 messages are required to forge a signature,
which is feasible in practice. Note that the proposed attack is a generalization
of the extension by Tews [19] in which 275992 ≈ 218.1 messages are required in
the experiment.
In the above construction, the number of the octet ff in f was fixed to 8 (this
is minimum for the forgery). Similar construction is possible with more octets
than 8, but requires larger composites instead.

Numerical Example. As an example of the proposed forgery attack, a forged


signature on the message m̄ =“00002e36” (as a binary data with big endian) with
|n| = 1024 and e = 3 is shown in Table 2, where underlined octets correspond to
the hashed value H(m̄) and masked octets correspond to the garbage g. Here,
the messages were incrementally generated (as integers) from “00000000”, and
0x00002e36 = 262144 = 213.53 messages were generated until the forgery
succeeds.

3.2 Special Cases


Let us consider two special cases of the proposed forgery attack, namely t = 0
or t = 160 cases.
When we set t = 0, the forgery attack always succeeds. In this case, the
condition (1) implies
|n| > 353e + 15. (2)
Even when e = 3, this condition implies that |n| > 1074 which is beyond 1024.
Also, we have |n| > 6017 for e = 17 and |n| > 23134577 for e = 65537. Since this
case only uses the garbage space, it allows a forgery on arbitrary chosen messages
with smaller composites than the original attack. Especially, this attack does not
How to Forge a Time-Stamp Which Adobe’s Acrobat Accepts 61

require a condition on the target message m̄ and the attack always succeeds. As
a numerical example, a forged signature on the message m̄ (“[Link]”
[14]) with |n| = 1152 and e = 3 in Table 8 in appendix B, which succeed a
forgery for e = 3. Note that Bleichenbacher’s original attack cannot forge for
1152-bit composites nor the exponent e = 3.
On the other hand, when we set t = 160, the attack becomes most powerful
but the adversary can not control the hash value at all. In this case, the condition
(1) implies
|n| > 193e + 15 (3)
which is obtained by substituting t = 160 into the condition (1). Consequently,
we have |n| > 595 for e = 3, |n| > 3297 for e = 17 and |n| > 12648657 for
e = 65537. However, since the adversary can not control the hash value, the
success probability (in the sense that the adversary obtains the target message
m̄) is 2−160 which is beyond feasible. Another forged signature on the hash value

H = 7fa66ee7 e5cc4a9f bd6e13a8 11d298c2 6b9b3302

with |n| = 4096 and e = 17 in Table 9 in appendix B, which succeed a forgery


for e = 17, however, the success probability is 2−113 . Note that Bleichenbacher’s
original attack cannot forge for 1024-bit composites nor the exponent e = 17.

Table 3. A comparison of forgery attacks

Bleichenbacher’s Attack Proposed Attack


  t=0 General t t = 160
M (e) 576e + 15 − e log2 2e
e−1
353e + 15 (353 − t)e + 15 193e + 15
|n| − 15 is divisible by e
M (3) 1740 1075 1075 − 3t 595
M (17) 9790 6017 6017 − 17t 3297
M (65537) 37683790 23134577 23134577 − 65537t 12648657
Success Probability 1 1 2−t 2−160

3.3 Comparison
A comparison between the original and proposed attacks are shown in Table 3,
where M (e) denotes the minimum bit-length of the composites to which the
attack succeeds with a general exponent e. Since exponents e = 3, 17, 65537 are
widely used, corresponding values M (3), M (17), M (65537) are also included in
the comparison. As in the table, the proposed attack with t = 160 forges with
smallest composites. Especially, it only forges for |n| = 1024 (with e = 3).

4 Time-Stamp Forgery
In the previous section, we proposed a forgery attack against PKCS#1v1.5 based
on Bleichenbacher’s attack. When |n| = 1024 and e = 3, the success probability
62 T. Izu, T. Shimoyama, and M. Takenaka

Fig. 2. Outline of the time-stamp protocol

of a forgery on a chosen message is 2−16.6 . However, when the message is ‘re-


dundant’, namely, it includes supplementary data besides a body of the message
whose space is larger than 216.6 , the adversary can always forge a signature on
the chosen message by the proposed attack. In order to clarify this scenario, we
forge time-stamps by the proposed attack and (extended) Oiwa et al.’s attack,
since a time-stamp contains a 64-bit nonce. We also checked some time-stamp
verification applications whether they accept the forged time-stamp. As a result,
we would like to report that the latest version of Adobe’s Acrobat and Acrobat
Reader improperly accept the forged time-stamp (generated by the proposed
attack).

4.1 Time-Stamp Protocol


The time-stamp is a cryptographic scheme which assures the existence of a doc-
ument at a certain time. We briefly explain the time-stamp protocol defined in
RFC 3161 [4], which is a target of our forgery attack. When a user requires a
time-stamp on his document m, he sends its hash value H(m) with a 64-bit
random value (as a nonce) to a time-stamp server. The server firstly creates
a file TSTInfo which includes the given hash value H(m) (messageImprint), the
given random value (nonce) and the time information (genTime). Then the server
computes signedAttrs which includes a hash value of TSTInfo as messageDigest.
Finally, the time-stamp server generates a signature signature on signedAttrs
How to Forge a Time-Stamp Which Adobe’s Acrobat Accepts 63

with his secret key and publishes it as a time-stamp on the document m and the
random value. By verifying the time-stamp, a verifier confirms the existence of
the document at the time described in genTime. An outline of the time-stamp
protocol is shown in Figure 2.
In the time-stamp protocol [4], PKCS#1v1.5 with SHA-1 can be used as a sig-
nature scheme. The center column in Table 4 shows an example of a time-stamp
issued by a trial time-stamp server maintained by Seiko Instruments Inc. [17] at
2006-09-21 [Link].64 on the document m (“[Link]” [15]) based on
PKCS#1v1.5 (|n| = 1024 and e = 3) with SHA-1. TSTInfo and signerInfos are
in PKCS#1v1.5 format. A verification result (se mod n) is also included in the
table, where underlined octets correspond to a hash value of signerInfos.

4.2 Applying Proposed Attack to Time-Stamp Protocol


Since the time-stamp is a kind of signature scheme, we can apply the proposed
forgery attack. A difference is that since a message has a 64-bit nonce, an adver-
sary can fix the message in the forgery attack: the adversary only has to change
the nonce until the forgery succeeds. Thus, the adversary can forge a chosen
message if 64-bit space is enough.
In the attack, an adversary firstly fixes the time to be stamped and sets
genTime. Then, on a chosen message, he randomly generates a 64-bit nonce
and computes messageDigest from chosen genTime and nonce. If he can forge a
signature on messageDigest, the time-stamp forgery is finished. When |n| = 1024
and e = 3, it is estimated that about 216.6 messageDigest, namely, 216.6 nonce are
required to forge a time-stamp. Since the nonce space is 64-bit, the adversary
always succeeds a forgery.
An example of a forged time-stamp is shown in the right-most column in
Table 4, where changed data (from a valid time-stamp) are only described. Here
underlined octets correspond to a hash value of signerInfos and masked octets
correspond to the garbage. If a verification implementation is loose, the forged
time-stamp will be accepted, namely, it is confirmed that the document existed
at 0001-01-01 [Link].0. Note that in this forgery, only genTime is changed
from a valid time-stamp for simplicity. The adversary can change the message,
if required.
In the above description, we only dealt with the proposed attack which is
based on Bleichenbacher’s attack. By the same approach, Oiwa et al.’s forgery
attack can be extended and applied to the time-stamp protocol with the same
ability (but the different implementation assumptions).

4.3 Checking Verification Implementations


We checked some time-stamp verification client programs whether they accept
two forged time-stamps generated by the proposed attack (denoted by TST-P)
and by Oiwa et al.’s variant (TST-O).
64 T. Izu, T. Shimoyama, and M. Takenaka

Table 4. Valid and forged time-stamps (1024-bit, e = 3)

A valid time-stamp A forged time-stamp


TSTInfo 3081f6
version 020101
policy 06090283 388c9b3d 01030230
21300906 052b0e03 021a0500
messageImprint 0414784a 21902486 c00c6dbb
a87ef918 7bdf08fc 3f9f
serialNumber 020357ee 1b
genTime 18123230 30363039 32313130 18123030 30313031 30313030
32393237 2e36345a 30303030 2e30305a
(“2006-09-21 [Link].64”) (“0001-01-01 [Link].0”)
accuracy 300a0201 00800201 f4810100
nonce 02082dff 78d3789b 9c97 02080000 00000001 bf67
tsa a08193a4 81903081 8d310b30
09060355 04061302 4a50311f
301d0603 55040a13 16536569
6b6f2049 6e737472 756d656e
74732049 6e632e31 14301206
0355040b 130b4368 726f6e6f
74727573 74312d30 2b060355
040b1324 536f7665 72656967
6e205469 6d652054 53205365
72766572 20534e3a 39314430
30363234 31183016 06035504
03130f44 656d6f54 53536f6e
53494931 3031
signerInfos
signedAttrs 3181f3
contentInfo 301a0609 2a864886 f70d0109
03310d06 0b2a8648 86f70d01
09100104 30230609 2a864886
f70d0109 043116
messageDigest 0414feca 1088d59e e69a1553 0414e8e6 4aa7ec9c 2fdb3d22
172fbc92 a8636195 6335 f7b5682a bcf9afd0 f9c7
eSSSigningCertificate 3081af06 0b2a8648 86f70d01
0910020c 31819f30 819c3081
99307f04 140bde7d 9e80ee5b
4d802804 b4c8382b ac8c4d0c
05306730 5aa45830 56310b30
09060355 04061302 4a50311f
301d0603 55040a13 16536569
6b6f2049 6e737472 756d656e
74732049 6e632e31 14301206
0355040b 130b4368 726f6e6f
74727573 74311030 0e060355
04031307 44656d6f 43413102
0900f6e2 0bb648ca fd8e3016
0414342d 517f9a5f 7ebfa4ab
4bcaffdf cf1ae903 30d6
How to Forge a Time-Stamp Which Adobe’s Acrobat Accepts 65

Table 4. (continued)

signature (s) 3868bb42 9f71a918 e906f2e0 00000000 00000000 00000000


674798fd ffeef81d 5942a300 00000000 00000000 00000000
08c45e1b a8b2a966 3be95650 00000000 00000000 00000000
d6bb8501 06dea5c7 e373f820 00000000 00000000 00000000
f538f860 cf06bd78 313e51dc 00000000 00000000 00000000
070e03b5 f286bace 7dff72af 00000000 00000000 00000000
88e32a5a 6629ae72 65541ea3 00000000 00000000 00000000
28014181 b7a6424c aee030d6 0001428a 2f98d728 ae220823
548636af 20f5d6d4 f43936a2 1fc5cff6 ac440735 9b078378
f732e728 b1fbdfc6 a4cf5b0e 2484756e 10093903 9319a13e
3637bcd9 c8aa8f73 eb2d174d 90fe10aa
A verification result 0001ffff ffffffff ffffffff 0001ffff ffffffff ffff0030
(se mod n) ffffffff ffffffff ffffffff 21300906 052b0e03 021a0500
ffffffff ffffffff ffffffff 0414cb74 56c4991f 270c1044
ffffffff ffffffff ffffffff 5a7c525b 14836ec0 fe33 6a89
ffffffff ffffffff ffffffff ea7f9c08 e357202c 288839f2
ffffffff ffffffff ffffffff 0ce9cbd3 75925ad5 45f6ebf2
ffffffff ffffffff ffffffff 99a247b1 f995ae2e 7365203c
ffffffff ffffffff 00302130 ba83acf1 7ca3964d 1a204b0b
0906052b 0e03021a 05000414 2be547f6 91771716 55f5e7dd
26eee9a4 46e35f03 5c1f6857 51c0a4a3 6b06b235 6eb173da
95fc927d dc158653 482f36d7 5a1db768

Table 5. Verification results of forged time-stamps

Client Version TST-P TST-O


TrustPort Rejected Rejected
Chronotrust client program 1.0 Rejected Rejected
PFU time-stamp client tool V2.0L30 Rejected Rejected
Acrobat (Professional / Standard / Reader) 7.x, 8.x Accepted Rejected
e-timing EVIDENCE Verifier for Acrobat 2.30 (input error) (input error)

Target verification clients are as follows:


– TrustPort by AEC [2]
– Chronotrust client program by Seiko Instruments Inc. [18]
– Time-stamp client tool by PFU [12]
– Acrobat (Professional, Standard, Reader) by Adobe [1]
– e-timing EVIDENCE Verifier for Acrobat by Amano [3]
These clients are categorized by two groups: a time-stamp in the first group
should be held separately from the corresponding message while a time-stamp
in the second group should be embedded in the same file as the message.
66 T. Izu, T. Shimoyama, and M. Takenaka

Fig. 3. Verification with the forged time-stamp by Adobe’s Acrobat Reader 8.00 (1)

Verification results of above clients are summarized in Table 5. As in the table,


only Acrobat improperly accepts the forged time-stamp TST-P. More details are
described in the next subsection. Note that the output by e-timing EVIDENCE
Verifier for Acrobat are “input error”, (not “rejected”), since the time-stamp was
not generated by Amano’s time-stamp server. Thus, we couldn’t check whether
Amano’s client has the implementation error or not.

4.4 How to Deceive Adobe’s Acrobat


A Portable Document Format (PDF) is a de facto standard for describing dig-
ital documents developed by Adobe. The format allows a PDF file to embed
signatures and time-stamps in the same file. A verifier confirms the signatures
and time-stamps by specific tools (application softwares). In the current version
of PDF, the time-stamp based on RFC 3161 [4] can be used. Adobe’s Acrobat is
a widely used application for making digital files in PDF, and Adobe’s Acrobat
Reader is used for reading PDF files. In addition, Acrobat and Acrobat Reader
can verify the signatures and time-stamps embedded in the file.
We are motivated by an observation that Adobe has not published any re-
sponse to the vulnerability report VU#845620 [7]. A PDF file of the schedule
of the rump session of CRYPTO 2006 (held on August 2006) was used as a
target message. First, we obtain a time-stamp from a trial-server maintained
by Seiko Instruments Inc. [17] based on PKCS#1v1.5 (|n| = 1024 and e = 3)
with SHA-1. Then, we generate a forged time-stamp which asserts that the doc-
ument existed at 2006-04-01 [Link].0. In fact, Adobe’s Acrobat Reader 7.09
accepts the forged time-stamp as in Figure 3 and 4. As far as we examined,
the same vulnerability is found in Acrobat 7.09 and Acrobat Reader 7.09, 8.00.
Note that we have already submitted the vulnerability report to a governmental
organization.
How to Forge a Time-Stamp Which Adobe’s Acrobat Accepts 67

Fig. 4. Verification with the forged time-stamp by Adobe’s Acrobat Reader 8.00 (2)

5 Concluding Remarks
This paper analyzes Bleichenbacher’s forgery attack against the signature scheme
RSASSA-PKCS1-v1 5 (PKCS#1v1.5) with the implementation error, and pro-
posed an extended attack. As an example, we forged a time-stamp with 1024-bit
composite and the exponent 3, and showed that the latest version of Adobe’s
Acrobat and Acrobat Reader accept the forged time-stamp improperly. Resist-
ing the original and proposed forgery attacks is easy and simple: fixing the loose
implementation is enough. Avoiding small exponents might be another counter-
measure, however, once a certificate with exponent e = 3 is issued by a trusted
party, one cannot refuse it when the verification is accepted.
As described in this paper, the proposed attack succeeds a forgery on an arbi-
trary message with redundancy when |n| = 1024 and e = 3. Though this paper
only dealt with the time-stamp, similar forgery is possible on other messages
with redundancy. One future work is a signature forgery on a message in PDF
or PS files and other is a certificate forgery. Searching loose implementations
which accept these forgeries is also required.

Acknowledgements
The authors would like to thank the program committee of the conference and
anonymous reviewers on the preliminary version of this paper for their helpful
comments and suggestions.
68 T. Izu, T. Shimoyama, and M. Takenaka

References
1. Adobe Systems Inc., Adobe Acrobat family,
[Link]
2. AEC, TrustPort. [Link]
3. Amano, E-timing EVIDENCE Verifier for Acrobat (in Japanese),
[Link]
4. Adams, C., Chain, P., Pinkas, D., Zuccherato, R.: Internet X.509 Public
Key Infrastructure: Time-Stamp Protocol (TSP), RFC 3161 (August 2001),
[Link]
5. Bleichenbacher, D.: Chosen Ciphertext Attacks against Protocols Based on RSA
Encryption Standard PKCS#1. In: Krawczyk, H. (ed.) CRYPTO 1998. LNCS,
vol. 1462, pp. 1–12. Springer, Heidelberg (1998)
6. Bleichenbacher, D.: Forging Some RSA Signatures with Pencil and Paper. In:
Dwork, C. (ed.) CRYPTO 2006. LNCS, vol. 4117, Springer, Heidelberg (2006)
7. US-CERT, Multiple RSA Implementations Fail to Properly Handle Signatures,
Vulnerability Note VU#845620 (September 5, 2006),
[Link]
8. Finney, H.: Bleichenbacher’s RSA Signature Forgery Based on Implementation
Error. e-mail (August 27, 2006),
[Link]
9. Izu, T., Takenaka, M., Shimoyama, T.: Analysis on Bleichenbacher’s Forgery At-
tack. In: WAIS 2007, pp. 1167–1174. IEEE Computer Society, Los Alamitos (2007)
10. NTT Communications, Certificates for the Internal Credit Application CA. (in
Japanese) [Link]
11. Oiwa, Y., Kobara, K., Watanabe, H.: A New Variant for an Attack Against RSA
Signature Verification using Parameter Field. In: EUROPKI 2007 (June 2007)
12. PFU, PFU time-stamp service (in Japanese), [Link]
13. RSA Laboratories, RSA PKCS #1 v2.1: RSA Cryptography Standard (June 14,
2002)
14. A digital file of [13] in WORD format,
[Link]
15. A digital file of [13]in PDF format,
[Link]
16. RSA Laboratories, Public-Key Cryptography Standards (PKCS) #1:
RSA Cryptography Specifications Version 2.1, RFC 3447 (February 2003)
[Link]
17. Seiko Instruments Inc., a trial time-stamp service (in Japanese),
[Link]
18. Seiko Instruments Inc., Chronotrust (in Japanese),
[Link]
19. Tews, E.: Real World Exploit for Bleichenbacher’s Attack on SSL. e-mail submitted
to the Cryptography Mailing List (September 14, 2006),
[Link]
How to Forge a Time-Stamp Which Adobe’s Acrobat Accepts 69

A Numerical Example of Bleichenbacher’s Forgery

We show a numerical example of Bleichenbacher’s original forgery attack [6].


For obtaining a valid signature, a 3072-bit composite n and a secret key d were
generated by OpenSSL as in Table 6. We used the hash function SHA-1 and a
public key e = 3 and chose a digital file “[Link]” [14] as a message m
(whose corresponding a is divisible by 3, fortunately).
A valid signature s on the message m, and a forged signature s̄ on the same
message m are shown in Table 7, where underlined octets correspond to the
hashed value H(m) and masked octets correspond to the garbage g. Comparing
se and s̄e in Table 7, all octets are same except the number of the octet ff and
the garbage. Thus, if an implementation ignores these differences, the forged
signature s̄ is accepted in the verification. Actually, OpenSSL 0.9.7f accepts the
forged signature s̄ on the message m.

Table 6. RSA parameters (3072-bit)

n d9057e4d 2e231c66 f0a35c2c b7eddb75 04b181d6 535b81b3 83eb4765 1d76950d


76c0c513 9efc0933 16255a5a a958007c 1b698c4c 2641418a dab6419f 8c8cf6a9
ac799a12 7b0ec916 b5837e9c 0ecb3dc3 9629427c 08b9b076 1014d3fb c2d6d26f
aade8a49 7aa8b03a 8e0fa396 6f6b54bd 2735a972 85cbaaed 4760ff5c 7c8b4fe2
3d6c053c 69d0fa64 ef3ec8ad 4fa03c16 9b8e5a68 466f7dbb 1f05f6ec caf9706c
d524b148 c41ccb67 512bcf40 b6456321 1a420f22 fedeaf1a 44ff940d eeec2117
9ce14bec 73b5b294 f0723d03 3a810ac3 a98dc56a a9e94eca 798c2033 3fa79eb8
ea10d25b cca36cc2 b14f4c53 3c42560c aafbb7c6 5d524591 68f8b4e0 99351f23
8f5fbf52 ee002fb8 240f7323 938207e3 59a17330 b7df56ef e8660f9a 5cc319ce
d3d93f25 84f5e42a 80f0acdd dec65d4d 629e2250 cbbb06f5 7ceab655 b22216d7
9120bdc9 216310be 4c3b81ea 92017a0b 8205e92d afb9c402 9b0f4603 2a847f67
ba0c271e a3c8f60d 5c48f4fe 22e0d3e9 3b72e9ce 1e5191bc 6167decd cde29c89
d 90ae5433 74176844 a06ce81d cff3e7a3 5876568e e23d0122 57f22f98 be4f0e08
f9d5d8b7 bf52b0cc b96e3c3c 70e555a8 12465d88 1980d65c 91ced66a 5db34f1b
c8511161 a75f30b9 ce57a9bd 5f32292d 0ec62c52 b07bcaf9 600de2a7 d739e19f
c73f06db a71b2027 095fc264 4a478dd3 6f791ba1 ae87c748 da40aa3d a85cdfec
28f2ae28 468b5198 9f7f3073 8a6ad2b9 bd09919a d99fa927 6a03f9f3 31fba048
8e187630 82bddcef 8b728a2b 242e4216 11815f6c a9e9ca11 83550d5e 9f48160e
83a1eb18 fd99ce23 eb14095c 333f0375 747bec29 cbe110e8 4aee7d3b 98b0e20a
53586ce9 319c9857 50fd3c8f 7cc6613f 773748a6 9aa5550c fb691771 f5921b52
dedacd6c cbcee703 1663f656 2019fcd7 2fcd66f5 2f6b5d86 f148f420 5eed94b1
170937ea 8cd536d2 932435c3 adb9d529 98ab1613 8a24f1e2 c9b1c7ad cd57713c
c08f4ccf d8a2dc47 681ef8c0 3fe709d9 52dd12ca edbba76c 21629613 fe8e0343
193e73a5 26533256 aedda14e 6517c092 52a66013 4a2acb98 2c5e2ec1 9fdd7cab
e 03
70 T. Izu, T. Shimoyama, and M. Takenaka

Table 7. Valid and forged signatures (3072-bit, e = 3)

m “[Link]” [14]
H(m)f7497dac 551ec010 2f0da8f1 bc8cad52 f93476c3
s 8e33fd97 65de866e 6af1c2ee 0beea1fc 26f7207c 3c9881ef f37876a0 6332d88c
526f8102 93d21d6e 392c248a 1d2b0d6f 2f8ade54 29420bdb 78bd384c 7ef5a52f
2249759e 1edef3f3 88f5d67f c53e8e68 f3dcb403 59716aca 1c3d911d 73fb031d
8cb7b0d3 c3b4a378 02ad1ad5 595859e9 1bd61f51 95e7c275 cc0bfe93 96aee5d2
69474578 7f8b2488 95fd7676 d1dbd964 50cf6ad6 10869c65 aa1520df 508a4376
354b27b5 49677f28 5bcd54e3 b4c3aaa9 1225a955 7e630201 3343b6f8 56de4cbd
af8e227e 4c755675 71c86627 af4ea910 8ecc1d1f 00331169 597d31b5 2028877c
3904b4c1 03077f11 fe4cf28a 79e41bf3 473083ce af4039ae aa92ac62 2826fc90
aef29c49 66bfc99c 01421130 d2b6313d 07031652 1862e9d5 fb3715e7 00fc168b
abc17ac4 c3b1a83c abe59ab6 34e29539 0c51fafa 685aeeb9 c53aa717 c2cb3960
eae314b8 ba09ef93 bef18bea 59502641 08e31ffc 569ed6aa b3f145f8 d0e82466
8d2ca851 e6a279c7 474387ea 3d300923 dbbaa193 a0baf928 2668fa60 469ecc14
se mod n 0001ffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff 00302130
0906052b 0e03021a 05000414 f7497dac 551ec010 2f0da8f1 bc8cad52 f93476c3
s̄ 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
07ffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff fffffffe aaead6ea b6b2b18e
bd595822 b1555ac6 9f0ca790 717e556a e9678bec fb663c6e a19b4904 00000000
How to Forge a Time-Stamp Which Adobe’s Acrobat Accepts 71

Table 7. (continued)

s̄e 0001ffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff


ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ff003021 30090605
2b0e0302 1a050004 14f7497d ac551ec0 102f0da8 f1bc8cad 52f93476 c3 000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 2a9aa11c bb60cb35 cb569ddd 576c2729
34a1298d 905793b0 24ba9a39 7f041398 a7622310 78e8099f 87faed46 0fbb8f46
67ace20c a1940f81 bced58bf 9ac3671c a2551f73 4cb80ec1 7fffffff ffffffff
ffffffff fffffffd a285694c d9347ab7 528d15f9 d0dbf0cc 704f592f da3facc6
210397ee 5d034b6d 269467e8 329d478c 53a8e99d 80f0732a 05d709d4 00e7ada7
7ddc41a8 e640296f b2a8eae6 f4888211 591f0578 a07d6ec4 f147f08e ccb06340
4439cb38 fc8144b0 cb0e382b 65583078 a7e9b040 00000000 00000000 00000000

B Numerical Example of Proposed Forgery with t = 0


As an example of the proposed attack with t = 0, we show a forged signa-
ture on the message m̄ (”[Link]” [14]) with |n| = 1152 and e = 3
in Table 8. Here, underlined octets correspond to the hashed value H(m̄) and
masked octets correspond to the garbage g. Note that this case succeeds a
forgery for 1152-bit composites while the original attack cannot. Also note that
a certificate with |n| = 1152 and e = 3 is used in practice [10].

Table 8. A forged signature by the extended forgery attack (1152-bit, e = 3, t = 0)

m̄ “[Link]” [14]
H(m̄) f7497dac 551ec010 2f0da8f1 bc8cad52 f93476c3
s̄ 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
07ffffff ffffffff feaaead6 eab6b2b1 8e848b2b fc6229a1 298029f9 27529629
bb642126 87226bf8 913ab27d 52295002
s̄e 0001ffff ffffffff ffff0030 21300906 052b0e03 021a0500 0414f749 7dac551e
c0102f0d a8f1bc8c ad52f934 76c3 0000 008e30ab d25ce35d 65cd0c25 1fc29df3
37419efd 4d08694d f3b45d86 42970cbe ef3cb225 c0e88433 552da1d0 dc35aaa1
73f1189f e0b341fc 56d5c5ea 45db5483 15e79d2a 71b6235a 44891287 00bb02f9
ffabe940 83af15c8 eabb0c30 2fefc008
72 T. Izu, T. Shimoyama, and M. Takenaka

C Numerical Example of Proposed Forgery with t = 160

As an example of the proposed attack with t = 160, we show a forged signature


on the hash value H(m̄) with |n| = 4096 and e = 17 in Table 8. Note that this
case succeeds a forgery for e = 17 while the original attack cannot. However, the
adversary cannot obtain the message m̄ in practice.

Table 9. A forged signature by the extended forgery attack (4096-bit, e = 17)

H(m̄) 7fa66ee7 e5cc4a9f bd6e13a8 11d298c2 6b9b3302


s̄ 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00010aa7 58ccbbf7 7d970c35 9e1c3dc0 f20d32ad 2cf9e18a 463ea7c6 346e7f90
s̄e 0001ffff ffffffff ffff0030 21300906 052b0e03 021a0500 04147fa6 6ee7e5cc
4a9fbd6e 13a811d2 98c26b9b 3302 448a 78e5e262 89a4190f 7d18916a 7aaaf897
feeb1e94 5866a030 208c1f48 2c906901 5f70eb66 97253c87 49790ff7 c175fc06
bddf8bb4 d2ba1cdd c626336a dda2165c dc3f425a 12cc59bc be11883e bbccc73a
0d130b94 83ac2a29 19850778 f066ff4f 374e7a96 f4fb3343 fd397d9c f7a1b8ce
16340da6 f9876f1f cca76cb4 7bfb368b a95a5842 e99c0bfb a2de62cf dbf2c635
c2c268f3 2dc228f7 2f0ebfe2 776dae35 3b82b9d9 474777ed c85eed79 e147fa2b
7500f1d4 23189a7b 9b08abb6 0df908f0 7c1c0fbb 528b3e22 df358b24 8bef05b8
f2449d0b f3fb6dc6 31a809ed 31000210 3df7ae2e 80f3f822 ae5a9f69 2948a2b5
a4529bf0 2b30fc99 1874a25f 28b5de4d 4f9c76cc 419a6848 4536e2fe 2771af8b
989e5fef 1a3aaeea f1694ebf 36e8685c 7f65eff8 b99d956b 676b5a5d f68c4519
330b4b7b 82037bd7 502d7823 4e952ba7 b9662cc2 e4389d00 76e16a47 e3dad8af
e7f86e37 f164aa90 b377dfbc 9d5cc1a4 e1a966fe 3902fea5 2526240b 99ecf6b3
ced8e16e 2d085131 e5ca1676 25459ca0 0821ff8e 03cde17d 3509de96 cbe40f6f
97d5dd5b b7c977fa be2be4f5 79abcbf7 7093ad52 c346371b 5b2708fc 8b831412
9a023cfc 6b2ff020 105db3ac ef80a605 e3c1ea94 d0af9790 00000000 00000000
Efficient Computation of the Best Quadratic
Approximations of Cubic Boolean Functions

Nicholas Kolokotronis1,2, Konstantinos Limniotis1 ,


and Nicholas Kalouptsidis1
1
Department of Informatics and Telecommunications
National and Kapodistrian University of Athens
TYPA Buildings, University Campus, 15784 Athens, Greece
{nkolok, klimn, kalou}@[Link]
2
Department of Computer Science and Technology
University of Peloponnese
End of Karaiskaki Street, 22100 Tripolis, Greece
nkolok@[Link]

Abstract. The problem of computing best quadratic approximations of


a subset of cubic functions with arbitrary number of variables is treated
in this paper. We provide methods for their efficient calculation by means
of best affine approximations of quadratic functions, for which formulas
for their direct computation, without using Walsh-Hadamard transform,
are proved. The notion of second-order nonlinearity is introduced as the
minimum distance from all quadratic functions. Cubic functions, in the
above subset, with maximum second-order nonlinearity are determined,
leading to a new lower bound for the covering radius of the second order
Reed-Muller code R(2, n). Moreover, a preliminary study of the second-
order nonlinearity of known bent functions constructions is also given.

Keywords: Boolean functions; bent functions; covering radius; second


order nonlinearity; low-order approximations; Reed-Muller codes.

1 Introduction
Boolean functions play a prominent role in the security of cryptosystems. Their
most important cryptographic applications are in the analysis and design of s-
boxes for block ciphers, as well as, filter/combining functions for stream ciphers
[23]. In general, resistance of cryptosystems to various cryptanalytic attacks
depends on the properties of the Boolean functions used. The nonlinearity of
Boolean functions is one of the most significant cryptographic properties; it is
defined as the minimum distance from all affine functions, and indicates the
degree to which attacks based on linear cryptanalysis [21], and best affine ap-
proximations [9], are prevented. For an even number of variables the maximum
possible nonlinearity can only be attained by the so-called bent functions [25].

This work was partially supported by a Greece–Britain Joint Research & Technology
Programme, co-funded by the Greek General Secretariat for Research & Technology
(GSRT) and the British Council, under contract GSRT 132 – C.

S.D. Galbraith (Eds.): Cryptography and Coding 2007, LNCS 4887, pp. 73–91, 2007.

c Springer-Verlag Berlin Heidelberg 2007
74 N. Kolokotronis, K. Limniotis, and N. Kalouptsidis

The appearance of recent attacks, like algebraic [7] and low order approximation
attacks [12,16,18], necessitates the design of Boolean functions that cannot be
approximated efficiently by low degree functions. This problem has also been
studied in [24], where an algorithm to determine good low order approximations
by repetitive Hamming sphere sampling was presented. The computation of the
best rth order approximations and rth order nonlinearity are known to be diffi-
cult tasks, even for the case r = 2 [4]; in particular, the second order nonlinearity
is unknown, with the exception of some special cases, or when the number of
variables is adequately small.
Motivated by the aforementioned need, in this paper we focus on the efficient
computation of best quadratic approximations of a certain class of cubic Boolean
functions. This led to generalizing notions that are familiar from best affine ap-
proximations, e.g. nonlinearity is extended to 2nd order nonlinearity, defined as
the minimum distance from quadratic functions. Explicit formulas are proved
that compute all best affine (resp. quadratic) approximations of the quadratic
(resp. cubic) functions, without use of the Walsh-Hadamard transform, by ap-
plying the Shannon expansion formula. The results obtained are general, and
most importantly, they hold for an arbitrary number of variables; they focus on
the theoretical aspects of the subject rather than the algorithmic ones. Cubic
functions with maximum (within the subset considered in this paper) second or-
der nonlinearity are determined, yielding a lower bound on the covering radius
of R(2, n). It is shown that their structure is similar to that of quadratic bent
functions. Finally, a preliminary analysis of the second-order nonlinearity for
known constructions of bent functions is given, indicating potential weaknesses
when parameters are not properly chosen. Since several constructions of crypto-
graphic primitives, based on quadratic and cubic functions, have been proposed
in the literature (see e.g. [1] and [11] respectively) due to their efficient imple-
mentation, our results are of high cryptographic value when such functions need
to be applied.
The paper is organized as follows: Section 2 provides the background and in-
troduces the notation. The properties of best affine approximations of quadratic
functions are treated in Section 3, whereas Section 4 studies best quadratic ap-
proximations of cubic Boolean functions and determines efficient ways for their
computation. Constructions of bent functions are analyzed in Section 5, and
Section 6 summarizes our conclusions.

2 Preliminaries
Let f : Fn2 → F2 be a Boolean function, where F2 = {0, 1} is the binary field.
The set of Boolean functions on n variables is denoted by Bn ; they are commonly
expressed in their algebraic normal form (ANF) as

f (x1 , . . . xn ) = aj xj11 · · · xjnn , aj ∈ F 2 (1)
j ∈ Fn
2

where j = (j1 , . . . , jn ) and the sum is performed modulo 2. The algebraic degree
of function f is defined as deg(f ) = max{wt(j) : aj = 1}, where wt(j) is the
Efficient Computation of the Best Quadratic Approximations 75

Hamming weight of vector j. The terms of degree k ≤ deg(f ) that appear in


(1) are called the kth degree part of f ∈ Bn ; its truth table is the binary vector
f = (f (0, 0, . . . , 0), f (1, 0, . . . , 0), . . . , f (1, 1, . . . , 1)) of length 2n , also denoted by
f for simplicity. If deg(f ) ≤ r, it is well-known that f is codeword of the rth
order binary Reed-Muller code R(r, n) [20]. The Boolean function f ∈ Bn is
said to be balanced if wt(f ) = 2n−1 , and its Hamming distance from g ∈ Bn is
defined as wt(f + g). Any Boolean function admits the following decomposition.
Definition 1. Let j1 , . . . , jk be integers such that 1 ≤ j1 < · · · < jk ≤ n and
k < n. Further, let r = r1 + 2r2 + · · · + 2k−1 rk be the binary expansion of the
integer 0 ≤ r < 2k . The expression
2
k  k
−1 

f (x1 , . . . , xn ) = (xji + r̄i ) fr (2)
r=0 i=1

where r̄i = ri + 1 denotes the complement of ri , and each fr ∈ Bn−k does not
depend on xj1 , . . . , xjk , is called the kth order Shannon’s expansion formula of
f with respect to the variables xj1 , . . . , xjk [17].
The sub-functions f0 , . . . , f2k −1 in (2) are uniquely determined from f by setting
xji = ri , 1 ≤ i ≤ k. Subsequently, we write f = f0 J · · · J f2k −1 instead of (2),
where J = {j1 , . . . , jk }, and f = f0 j f1 when J = {j}. If J = {n − k + 1, . . . , n},
the truth table of f (x1 , . . . , xn ) is constructed by concatenating the truth tables
of the sub-functions fr (x1 , . . . , xn−k ); this case will be denoted by f = f0  · · · 
f2k −1 and simply referred to as the kth order Shannon’s expansion formula of f .
The Walsh-Hadamard transform χ f (a) of the Boolean function f ∈ Bn at
a ∈ Fn2 is the real-valued function given by [20]

f (a) =
χ χf (x)(−1)a,x = 2n − 2 wt(f + φa ) (3)
x ∈ Fn
2

where χf (x) = (−1)f (x) and φa (x) = a, x = a1 x1 + · · · + an xn . Clearly f is


f (0) = 0. The nonlinearity of f is its minimum distance
balanced if and only if χ
from all affine functions, and is determined by
  1
NLf = min wt(f + v) = 2n−1 − maxn | χf (a)| . (4)
v ∈ R(1,n) 2 a ∈ F2

Any affine function v such that wt(f + v) = NLf is called a best affine ap-
proximation of f and is denoted by λf , whereas the set comprising of the best
affine approximations of f is denoted by Af ⊆ R(1, n). The definition of the
nonlinearity leads directly to the following well-known result [4].
Lemma 1. Let f ∈ Bn and v ∈ R(1, n). Then, λ ∈ Af if and only if λ + v ∈
Af +v . Further, |Af +v | = |Af |, i.e. both sets have the same cardinality.
An equivalent statement of Lemma 1, which is subsequently used, is that λf +v =
λf + v for any linear function v. Likewise, the minimum distance between f and
76 N. Kolokotronis, K. Limniotis, and N. Kalouptsidis

all quadratic functions is called second-order nonlinearity of f and is denoted by


NQf ; it is given by  
NQf = min wt(f + u) . (5)
u ∈ R(2,n)

From (4), (5) we clearly obtain that NQf ≤ NLf . Likewise, any quadratic func-
tion u with the property wt(f + u) = NQf is said to be a best quadratic approx-
imation of f and is denoted by ξf , whereas the set comprising of best quadratic
approximations of f is denoted by Qf ⊆ R(2, n).

3 Best Affine Approximations of Quadratic Functions


Let f ∈ Bn be a quadratic function and x = (x1 , . . . , xn ); then, f can be written
as f = xQxT + LxT +  for some upper triangular binary matrix Q, binary
vector L, and a constant  ∈ F2 , where xQxT is the quadratic part of f . It is
well-known that (see e.g. [20, pp. 434–442]) the rank of the symplectic matrix
Q + QT equals 2h, for some 1 ≤ h ≤ n/2
. Then, by Dickson’s theorem there
exists a nonsingular matrix R = (ri,j )ni,j=1 such that under the transformation
of variables g = xR, function f becomes


h
f = g0 + g2i−1 g2i , deg(g0 ) ≤ 1 and deg(gj ) = 1 (6)
i=1

where g0 = g̃ + L(R−1 )T g T + , for some linear function g̃ obtained from the


quadratic part of f , and {g1 , . . . , g2h } are linearly independent linear functions
n
(actually, we have gj = i=1 ri,j xi ). Since h only depends on the quadratic part
of function f , it is denoted by hf ; by convention hf = 0 if f ∈ R(1, n). As seen
below, the Walsh spectra of f are fully determined by the value of hf , as well
as its nonlinearity.

Theorem 1 ([20]). Let Bf = {f +v : v ∈ R(1, n)} for a fixed quadratic Boolean


function f ∈ R(2, n). Then wt(f + v) is three-valued
 
wt(f + v) ∈ 2n−1 , 2n−1 ± 2n−hf −1

where each value 2n−1 ±2n−hf −1 occurs 22hf times, and 2n−1 occurs the remain-
ing 2n+1 − 22hf +1 times.

According to (4), the nonlinearity of any quadratic function f ∈ Bn equals


2n−1 − 2n−hf −1 . The following statement allows the direct computation of all
best affine approximations of a quadratic function.

Theorem 2. Let the Boolean function f ∈ R(2, n) be given by (6). Then, for
b = (b1 , . . . , b2h ), its best affine approximations are given by

 
2h 
h

Af = λbf ∈ R(1, n) : λbf = g0 + bi gi + b2i−1 b2i , b ∈ F2h
2 .
i=1 i=1
Efficient Computation of the Best Quadratic Approximations 77

Proof. First note that the affine Boolean functions λbf are pairwise distinct since
from hypothesis we have that {g1 , . . . , g2h } are linearly independent; thus |Af | =
22h . Furthermore, for all b ∈ F2h
2 , we have that


h


f + λbf = g2i−1 g2i + b2i−1 g2i−1 + b2i g2i + b2i−1 b2i
i=1

h



= g2i−1 + b2i g2i + b2i−1 . (7)
i=1

Since {g1 + b2 , . . . , g2h + b2h−1 } are also linearly independent, we get that for any
choice of b ∈ F2h b
2 it holds wt(f + λf ) = 2
n−1
− 2n−1−h [20], which by Theorem
1 is the minimum distance between f and all affine functions. The fact that the
number of best affine approximations of f is 22h (equal to the number of different
λbf constructed here) concludes our proof.

Example 1. Let f ∈ B5 be the quadratic function given by f (x1 , . . . , x5 ) =


(x1 + x3 )(x2 + x5 ) + x2 + x4 . Its best affine approximations are

λ0f = x2 + x4 λ1f = x1 + x2 + x3 + x4
λ2f = x4 + x5 λ3f = x1 + x3 + x4 + x5 + 1

by Theorem 2. Note that one of the solutions is the linear part of f .


The complexity of determining Af by Theorem 2 is O(n3 ) and rests with finding


(6) [19,20]; thus, the above method is more efficient than the application of the
fast Walsh transform, whose complexity is O(n2n ), for computing all the best
affine approximations.

Proposition
 1. Let the Boolean  functions f1 , . . . , fs ∈ R(2, n) be given by (6)
and si=1 gi,1 , . . . , gi,2hfi be linearly independent, s ≥ 2. Further, let bi =
(bi,1 , . . . , bi,2hfi ) be binary vectors of length 2hfi , 1 ≤ i ≤ s. Then
2hf1 +···+2hfs
λbf1 +···+fs = λbf11 + · · · + λbfss , ∀ b = (b1 , . . . , bs ) ∈ F2 . (8)

Proof. Clearly, (8) holds for s = 1; we will only prove its validity for s = 2,
since the general case is then established by induction on s. Let us write f =
f1 + f2 , and denote hf , hfi as h, hi respectively. By hypothesis we get fi =
hi 2
gi,0 + j=1 gi,2j−1 gi,2j ; since i=1 {gi,1 , . . . , gi,2hi } are linearly independent we
obtain h = h1 + h2 , and from Theorem 2 the best affine approximations of f ,
for all b ∈ F2h
2 , are given by


2h1 
2h2 
h
λbf = g1,0 + g2,0 + bj g1,j + b2h1 +j g2,j + b2i−1 b2i = λcf11 + λcf22
j=1 j=1 i=1

where c1 = (b1 , . . . , b2h1 ) ∈ F2h


2 , c2 = (b2h1 +1 , . . . , b2h1 +2h2 ) ∈ F2 .
1 2h2


78 N. Kolokotronis, K. Limniotis, and N. Kalouptsidis

4 Best Quadratic Approximations of Cubic Functions

In this section we develop a detailed theoretical framework for efficiently com-


puting the best quadratic approximation of cubic Boolean functions. First, we
need to introduce the following classification.

Definition 2. The Boolean function f ∈ R(3, n) is said to be a class-m function


if there exists a set J = {j1 , . . . , jm } with 1 ≤ j1 < · · · < jm ≤ n such that each
cubic term of f (Ax + b) involves at least one variable with index in J, and
m > 0 is the smallest integer that is obtained by a suitable invertible affine
transformation of f .

The above definition implies that a cubic function belongs to exactly one class,
due to the minimality of m (however J is not always unique). An important sub-
set of class-m functions are the separable class-m functions whose cubic terms
involve exactly one variable with index in J; all others are referred to as insep-
arable. The equivalence classes for cubic functions of certain weight found in
[13,14] are separable. An important property of class-m cubic Boolean functions
is given below.

Proposition 2. All class-m cubic Boolean functions f ∈ R(3, n), with J =


{j1 , . . . , jm }, admit the following properties

1. Let J ⊂ J with cardinality k, 1 ≤ k ≤ m − 1. From the decomposition


f = f0 J · · · J f2k −1 we have that all fi ∈ Bn−k are class-(m − k) cubic
Boolean functions with the same cubic part;
2. Moreover, m is the least integer such that f = f0 J · · · J f2m −1 with
deg(fi ) < deg(f ) = 3 for all 0 ≤ i < 2m .

Proof. The proof is provided in Appendix A.


Proposition 2 leads to an alternative definition of class-m cubic functions; that


is, m > 0 is the least integer such that a proper choice of m variables leads to a
decrease in the degree of sub-functions, if we apply mth order Shannon expansion
with respect to these variables. Next we focus on the separable cubic functions,
and prove a series of results to determine their best quadratic approximations.

Lemma
m 2. Let f ∈ R(3, n) be a separable class-m function, with cubic part c =
i=1 xji qi , where qi ∈ Bn−m is quadratic function not depending on variables
with index in J = {j1 , . . . , jm }. Then, from the decomposition f = f0 J · · · J
f2m −1 we get that

fr = q + r, p + lr , 0 ≤ r < 2m (9)

for a quadratic q ∈ Bn−m and affine lr ∈ Bn−m Boolean functions, where r =


(r1 , . . . , rm ) is the binary representation of r, and p = (q1 , . . . , qm ).
Efficient Computation of the Best Quadratic Approximations 79

Proof. The Boolean function is written as f = c + q + l, where c, q, and l is


its cubic, quadratic, and linear part respectively. By hypothesis, f is a class-
m cubic Boolean function, and therefore according to Definition 2 we neces-
sarily have that q1 , . . . , qm = 0 and linearly independent. Indeed, let us as-
sume that there exist a1 , . . . , am ∈ F2 , not all of them being zero, such that
a1 q1 + · · · + am qm = 0; without loss of generality let am = 1. Therefore, we
have c = (xj1 + a1 xjm )q1 + · · · + (xjm−1 + am−1 xjm )qm−1 , and there exists an
invertible linear transformation (mapping xji + ai xjm to xji , for 1 ≤ i < m,
and all the remaining variables to themselves) such that f become class-(m − 1)
cubic Boolean function—contradiction. The quadratic and linear parts of f can
be similarly written as


m−1 
m 
m 
m
q= xji xjk i,k + xji li + q  and l= xji i + l (10)
i=1 k=i+1 i=1 i=1

for some quadratic q  ∈ Bn−m and linear functions l , li ∈ Bn−m that do not
depend on xj1 , . . . , xjm . Let us next introduce the auxiliary functions
 s   s−1 s   s 
   
s 
s   
g = xji qi + xji xjk i,k + xji li + q + xji i + l
i=1 i=1 k=i+1 i=1 i=1

where the parentheses are used to indicate its cubic, quadratic, and linear parts
respectively (note that g m = f whereas g 0 = q  + l ), and


s 
m
hsi = xjk k,i + rk i,k , 0≤s<i≤m
k=1 k=i+1

where rk ∈ F2 . By applying Shannon’s expansion formula recursively, we obtain



at the first step f = f0 jm f1 , where frm = g m−1 + rm (qm + lm + m + hm−1
m ) for
rm = 0, 1. Further expansion of these sub-functions gives f = (f0 jm−1 f1 ) jm
(f2 jm−1 f3 ), where


m


fr = g m−2 + ri qi + li + i + hm−2
i , 0≤r<4
i=m−1

and r = rm−1 + 2rm is the binary expansion of r. If we continue this way, we


get the decomposition f = f0 J · · · J f2m −1 after m − 2 steps, which for all
0 ≤ r < 2m leads to
 

m m

 m
 
fr = q + ri qi + l + ri li + i + k=i+1 rk i,k (11)
i=1 i=1

and r = r1 + 2r2 + · · · + 2m−1 rm is the binary expansion of r. The claim is proved



by noting that the expression inside
m the parentheses corresponds to lr in (9), q
corresponds to q, and r, p = i=1 ri qi .

80 N. Kolokotronis, K. Limniotis, and N. Kalouptsidis

We next introduce a commonly used partial ordering of elements of the vector


space Fn2 . For all a, b ∈ Fn2 we write a  b if and only if ai ≤ bi for all 1 ≤ i ≤ n.

Lemma 3. Let f = f0  · · ·  f2m −1 be a Boolean function in Bn where m > 0


and all sub-functions fr ∈ R(2, n − m) have the same quadratic
part fr = q + lr ,
for 0 ≤ r < 2m . Then, deg(f ) = 2 if and only if we have rc lr = c for some
c ∈ F2 if wt(c) = 2 and zero if wt(c) ≥ 3, c ∈ Fm
2 .

Proof. The proof is provided in Appendix B; however, it is seen that it also


holds for the case f = f0 J · · · J f2m −1 with J = {j1 , . . . , jm }. It should be
noted that the family of affine functions lr ∈ Bn−m introduced in (11) satisfies
the conditions implied by this Lemma.

Lemma 4. For all s ≥ 1 and vectors a = (a1 , . . . , as ) ∈ Zs we have that

 
s


2r,a = 1 + 2ai . (12)
r ∈ Fs2 i=1

Proof. Note that (12) holds for s = 1; suppose that it holds for s = k ≥ 1 and
all a = (a1 , . . . , ak ) ∈ Zk . Then, for s = k + 1 we have from (12) that


 r,a

k


2(r,t),(a,b) = 1 + 2b 2 = 1 + 2b 1 + 2ai
(r,t) ∈ Fk+1
2
r ∈ Fk
2
i=1

by the induction hypothesis, for all (a, b) = (a1 , . . . , ak , b) ∈ Zk+1 .


Subsequently, we present the main result of our paper.


Theorem 3. With the notation of Lemma 2, assume f ∈ Bn is a class-m
cubicfunction, and let
m  qi ∈ Bn−m be given by (6). If all linear functions in
i=1 g i,1 , . . . , g i,2hqi are linearly independent, then the best quadratic approx-
imations of f have one of the following forms

ξfs = ξf,0
s
J · · · J ξf,2
s
m −1 , s ∈ Fm
2 (13)
s
where ξf,r = q + s, p + lr + λr+s,p for all r ∈ Fm
2 .

Proof. For fixed s ∈ Fm 2 the sub-functions in (13) have the same quadratic part
q + s, p and yield a quadratic function ξfs by Lemma 3. Indeed, by Proposition
m
1 we have λr+s,p = i=1 (ri + si )λqi , whereas (11) leads to

 m
 



s
ξf,r = 2wt(c) q + s, p + l + 2wt(c)−1 ci li + i + ai λqi
rc i=1


m−1 
m
u,v , if wt(c) = 2 (cu = cv = 1)
+ 2wt(c)−2 ci cj i,j =
i=1 j=i+1
0, if wt(c) > 2
Efficient Computation of the Best Quadratic Approximations 81


since rc (ri + si )λqi equals 2wt(c) si λqi if ci = 0 (in which case ai = 2si ), and
2wt(c)−1 λqi otherwise (where ai = 1). Thus ξfs are quadratic functions for all
s ∈ Fm2 ; their distance from f is given by
 
wt(f + ξfs ) = wt(r + s, p + λr+s,p ) = wt(r, p + λr,p )
r ∈ Fm
2 \{s} r ∈ Fm
2 \{0}

which is independent of s, and therefore all ξfs , s ∈ Fm


2 , are equidistant from f .
By Theorem 1, the definition of nonlinearity, and NL0 = h0 = 0 by convention,
we have 1 ≤ hr,p ≤ (n − m)/2
for r = 0 and

 m 
 
1 1
wt(f + ξfs ) =2 n−1
− 2 n−m−1−hr,p
=2 n−1
−2 n−1
+
r ∈ Fm i=1
2 2hqi +1
2

m  
since i=1 gi,1 , . . . , gi,2hqi are linearly independent by hypothesis, and thus
hr,p = hr1 q1 +···+rm qm = r1 hq1 + · · · + rm hqm from Proposition 1 (whereas
the last equality is derived from Lemma 4). Next, we prove that the distance
of any other quadratic function from f is greater than wt(f + ξfs ). Assume
there exists u ∈ R(2, n), which does not coincide with ξfs for any s ∈ Fm 2 , and
u = (q  + l0 ) J · · · J (q  + l2 m −1 ) where lr satisfy the condition of Lemma 3.
Consequently q̃ = q  + q, q1 , . . . , qm are linearly independent, otherwise u would
be identical to one of ξfs ; indeed, if q  = q + s, p for some s ∈ Fm 2 then we would
have

wt(f + u) = wt(r + s, p + lr + lr )
r ∈ Fm
2 \{s}

≥ wt(r + s, p + λr+s,p ) = wt(f + ξfs )
r ∈ Fm
2 \{s}

where equality holds if and only if we set lr = lr + λr+s,p , for all r ∈ Fm
2 by
Lemma 1. Thus, in order to minimize wt(f + u) we get u = ξfs . Hence we only
consider q  = q + s, p for all s ∈ Fm
2 . Then, we likewise find that
 
wt(f + u) = wt(q̃ + r, p + l̃r ) ≥ 2n−1 − 2n−m−1−hq̃+r,p
r ∈ Fm
2 r ∈ Fm
2

where ˜ lr = lr + lr and equality holds if and only if ˜lr = λq̃+r,p , for all r ∈ Fm
2 by
Lemma 1. The weight of f +u is minimized if q̃ is chosen such that all hq̃+r,p take
their minimum possible value. The fact q̃ + r, p = 0 implies that hq̃+r,p ≥ 1
for all r ∈ Fm 2 ; still, not
all hq̃+r,p can be made simultaneously equal to 1.
We write q̃ as q̃ = g̃0 + 1≤j≤hq̃ g̃2j−1 g̃2j , where the linear part g̃0 is obtained
by applying Dickson’s theorem on q̃. Let us define di as the number of g̃j that
are not linearly independent from gi,j of qi , and ei as the number of products
g̃2j−1 g̃2j shared by q̃, qi , 1 ≤ i ≤ m. It is then seen that 0 ≤ di ≤ 2 min{hq̃ , hqi }
and 0 ≤ ei ≤ di /2
. Further, it can be proved that hq̃+qi ≥ hq̃ + hqi − di and
82 N. Kolokotronis, K. Limniotis, and N. Kalouptsidis

the lower bound is always attained1 if di is even and ei = di /2. Thus wt(f + u)
can be minimized mbyletting q̃ have common
 products g̃2j−1 g̃2j with q1 , . . . , qm .
By hypothesis i=1 gi,1 , . . . , gi,2hqi are linearly independent, leading to (for
all r ∈ Fm 2 )

    
m

m m
hq̃+r,p = hq̃ + ri hqi − 2ei and 0 ≤ ri ei ≤ min hq̃ , ri hqi .
i=1 i=1 i=1
m
hqi −2ei +1

Hence, we see that wt(f + u) ≥ 2 −2 n−1 n−1−hq̃
i=1 1/2 + 1/2 and
we need to examine if there exists a particular choice of parameters e1 , . . . , em
such that wt(f + u) < wt(f + ξfs ), which is equivalent to


m
1 + 2hqi
wt(f + u) < wt(f + ξfs ) ⇔ 2hq̃ −(e1 +···+em ) <1. (14)
i=1
2ei + 2hqi −ei

Since it holds 0 ≤ ei ≤ min{hq̃ , hqi }, all terms (2hqi + 1)(2hqi −ei + 2ei )−1 in
(14) are greater than or equal to 1, where equality is attained if either ei = 0
or ei = hqi = min{hq̃ , hqi }; the latter case is valid for all 1 ≤ i ≤ m only if
hq̃ ≥ max{hq1 , . . . , hqm }. Moreover, ei = 0 implies that q̃ does not have common
products with qi , whereas ei = hqi that q̃ is written as the sum of qi and another
quadratic function. Since by hypothesis q̃ = r, p, we either have 0 < ei <
min{hq̃ , hqi } for some m1 ≤ i ≤ m, or that q̃ has a product whose functions
do not depend on i=1 gi,1 , . . . , gi,2hqi , hence 2hq̃ −(e1 +···+em ) > 1, due to
0 < e1 + · · · + em < min{hq̃ , hq1 + · · · + hqm }. Therefore, in any case we get
wt(f + u) > wt(f + ξfs ).

m  
By assuming that all i=1 gi,1 , . . . , gi,2hqi are linearly independent (see Re-
mark 1 below how such a constraint can be relaxed) and hqi ≥ 1, for 1 ≤ i ≤ m,
the fact hq1 +· · ·+hqm = hq1 +···+qm ≤ (n−m)/2
leads to hqi ≤ (n−3m)/2
+1.
Thus, we have the following result.
Corollary 1. With the notation of Theorem 3, for any separable class-m cubic
function f ∈ R(3, n) we have m ≤ n/3
and
m  
1 1
NQf = 2 n−1
−2 n−1
+ (15)
i=1
2 2hqi +1

for some 1 ≤ hqi ≤ (n − 3m)/2


+ 1.
Remark 1. The applicability of the results in Theorem 3 and Corollary 1 can be
more general than currently stated. In particular, let us consider an m × m non-
  
singular matrix P = (pi,j )m
i,j=1 , Q = (q1 , . . . , qm ), and vector Q = (q1 , . . . , qm )
1
From the above, q̃ + qi has hq̃ + hqi − 2ei products (excluding common ones) of the
 
form g2j−1 g2j , where di − 2ei of the terms gj are not linearly independent (l.i.); they
can be chosen (from the di − 2ei l.i. “parity check” equations) so that they reside in
different products, forming u . Thus hq̃+qi = hq̃ + hqi − 2ei − (di − 2ei ) + hu , where
0 ≤ hu ≤ di /2 − ei , leading to hq̃ + hqi − di ≤ hq̃+qi ≤ hq̃ + hqi − ei − di /2.
Efficient Computation of the Best Quadratic Approximations 83

m  
such that Q = QP . Then, from the linear independence of gi,1 , . . . , gi,2hqi
m i=1
we get hqj = i=1 pi,j hqi , and
 m 

m  
m
hr1 q1 +···+rm qm
 = rj pi,j hqi = ri hqi = hr1 q1 +···+rm
 q
m
i=1 j=1 i=1

m
where ri = j=1 rj pi,j mod 2, for 1 ≤ i ≤ m. Hence, the results obtained in
Theorem 3 and Corollary 1 would still hold in this case, if we replace hqi with
the ith element of vector Q P −1 .

In the simple case of the class-1 Boolean functions, Corollary 1 gives that NQf =
2n−2 − 2n−2−hq1 with 1 ≤ hq1 ≤ (n − 1)/2
. Next, we prove that the maximum
second-order nonlinearity attained by a separable class-m cubic function grows
with m ≤ n/3
(see also Table 1 in Appendix C). Hence, class-1 functions
constitute the most cryptographically weak class with respect to second-order
nonlinearity. The security offered by class-m cubic functions, for large values of
m, is also attributed to the difficulty of finding a set J such that all the 2m
sub-functions in f = f0 J · · · J f2m −1 are quadratic (in order to find the best
quadratic approximations via the Theorem 3); the complexity of this step grows
exponentially with m, for fixed number of variables n.

Theorem 4. With the notation of Theorem 3, the maximum 2nd-order nonlin-


earity of a separable class-m cubic function f ∈ R(3, n) grows with m. Further,
the maximum 2nd-order nonlinearity of separable class- n/3
cubic functions is
given by 2n−1 − 12 6n/3 .

Proof. In order to establish the first part we need to determine the class of sepa-
rable cubic functions with the highest second-order nonlinearity. m
From Corollary
1 we see that this is maximized if and only if the product i=1 1/2 + 1/2hqi +1
depending on m, hq1 , . . . , hqm is minimized. Since each product term is an inte-
ger less than 1, the number m of terms must be sufficiently large. However, the
constraints on the values taken by each hqi need also be considered. Let H be
the set of distinct integers from hq1 + 1, . . . , hqm + 1, and suppose a − r, a + r ∈ H
for some a > r + 1 > 1. Then, we have the property
       2
1 1 1 1 1 1 1 1 1 1
+ + > ··· > + + > +
2 2a−r 2 2a+r 2 2a−1 2 2a+1 2 2a

from which we derive that maxa∈H {a} − mina∈H {a}, and the cardinality of
H, should be relatively small. Moreover, by noting that the sequence {(1/2 +
1/2a )i }i≥0 is purely decreasing for any a ∈ H (since then a ≥ 2), we conclude
that the highest possible second-order nonlinearity achieved by separable class-m
cubic Boolean functions, by Corollary 1, is given by
 bm  m−bm
  n−1 1 1 1 1
maxclass-m NQ = 2 n−1
−2 + +
2 2am +2 2 2am +1
84 N. Kolokotronis, K. Limniotis, and N. Kalouptsidis

 bm  m
2am +1 + 1 1 1
= 2n−1 − 2n−1 + (16)
2am +1 + 2 2 2am +1
where am = (n − m)/2m
and bm = (n − m)/2
mod m, as a result of letting
bm functions qi have hqi = am + 1, and the remaining m − bm have hqi = am .
It is clear from (16) that for small values of m, the integer am is large and
therefore the contribution of (2am +1 + 1)(2am +1 + 2)−1 ≈ 1 is negligible (bm is
also small). So, the maximum second-order nonlinearity attained by separable
class-m cubic functions grows with m ≤ n/3
. If m = n/3
, then we have
am = 1, bm = (n mod 3)/2
, and (16) becomes

(n mod 3)/2 n/3
NQf = 2n−1 − 2
(n mod 3)/2 −1 53 6 = 2n−1 − bn 12 6n/3

where the term bn equals 1 if n ≡ 0 (mod 3), (4/3)1/3 if n ≡ 1 (mod 3), and
(250/243)1/3 if n ≡ 2 (mod 3). Thus, bn ≈ 1 in all cases.

The above result also leads to a lower bound on the covering radius of the 2nd
order binary Reed-Muller code R(2, n), that is ρ(2, n) ≥ ρ3 (2, n) ≥ 2n−1 − 12 6n/3
(since we only considered cubic Boolean functions satisfying the conditions of The-
orem 3).√The lower bound given behaves well with respect to the upper bound
2n−1 − 15 2n/2−1 that has been proved in [5] (see Fig. 1 in Appendix C, and the
analysis therein). The cubic part of any separable class- n/3
Boolean function is
equivalent (under some transformation of variables y = xR) to the following
n
3 −1 

y3i−2 y3i−1 y3i + y3 n3 −2 y3 n3 −1 y3 n3 + a y3 n3 +1 y3 n3 +2
i=1

where a = 1 if n ≡ 2 (mod 3) and zero otherwise; they can be considered as


a natural extension of bent functions (they have similar representation and the
maximum possible distance from all functions of degree one less). Next we prove
a formula for directly computing ξfs from f ; comparison of (7), (17) illustrates
similarities on the way that best affine and quadratic approximations are ob-
tained in terms of the Boolean function f .
Proposition 3. With the notation of Theorem 3, the best quadratic approxima-
tions ξfs of the separable class-m cubic function f are given by

m
ξfs = f + (xji + si )(qi + λqi ) , s ∈ Fm
2 . (17)
i=1

Proof. From the proof of Theorem 3 and Definition 1, we have that for all s ∈ Fm 2
the best quadratic approximation ξfs of the class-m cubic function f , with cubic
m
part c = i=1 xji qi , is such that

s
s
ξfs + f = ξf,0 + f0 J · · · J ξf,2 m −1 + f2m −1

 
m
= (xj1 + r̄1 ) · · · (xjm + r̄m ) (ri + si )(qi + λqi )
r ∈ Fm
2 i=1
Efficient Computation of the Best Quadratic Approximations 85

 
due to the linear independence of the functions in m i=1 gi,1 , . . . , gi,2hqi and
the fact that we may write λr1 q1 +···+rm qm = r1 λq1 + · · · + rm λqm , for all r ∈ Fm 2 .
By writing the above expression as the sum of those terms for which rm = 0 and
those for rm = 1, then simple calculations give

 
m−1
ξfs + f = (xj1 + r̄1 ) · · · (xjm−1 + r̄m−1 ) (ri + si )(qi + λqi )
r ∈ Fm−1 i=1

2

+ (xjm + sm )(qm + λqm ) (xj1 + r̄1 ) · · · (xjm−1 + r̄m−1 )


r ∈ Fm−1
2

 
m−1
= (xj1 + r̄1 ) · · · (xjm−1 + r̄m−1 ) (ri + si )(qi + λqi )
r ∈ Fm−1 i=1
2

+ (xjm + sm )(qm + λqm )



since r (xj1 + r̄1 ) · · · (xjm−1 + r̄m−1 ) = 1 corresponds to the constant all-one
Boolean function. Repeated application of the above steps will lead to (17).
m
Remark 2. Given class-m cubic Boolean function f = i=1 xji qi + q + l, where
q, l are its quadratic and linear parts respectively, and q1 , . . . , qm satisfying con-
ditions of Theorem 3, its best quadratic approximations are directly computed
by means of Proposition 3 as follows
   

m


m
ξfs = q+ si qi + xji λqi + l+ si λqi , s ∈ Fm
2
i=1 i=1

where the parentheses indicate its quadratic and linear part respectively.

Example 2. Let f ∈ B8 be the


class-2 Boolean function
f (x1 , . . . , x8 ) = (x1 +
x3 )(x2 +x7 )(x3 +x5 )+(x4 +x7 ) x5 (x6 +x8 )+(x7 +x8 )x8 . It is seen that it satisfies
the conditions of Theorem 3, and therefore its best quadratic approximations
(from Proposition 3) are given by

ξf = s1 q1 + s2 q2 + (x1 + x3 + s1 )λq1 + (x4 + x7 + s2 )λq2 , si ∈ F 2

where q1 = (x2 + x7 )(x3 + x5 ) and q2 = x5 (x6 + x8 ) + (x7 + x8 )x8 . From Section


3 we know that the best affine approximations of q1 , q2 are

λq1 = a1 (x2 + x7 ) + a2 (x3 + x5 ) + a1 a2 , ai ∈ F 2 ,


λq2 = b1 x5 + b2 (x6 + x8 ) + b3 (x7 + x8 ) + b4 x8 + b1 b2 + b3 b4 , b i ∈ F2 .

Then, its second-order nonlinearity is equal to NQf = 27 − 22 · 3 · 5 = 68. Note


that NQf depends only on the choice of the cubic terms, which is a well-known
result [4].

86 N. Kolokotronis, K. Limniotis, and N. Kalouptsidis

5 Implications on Bent Functions Constructions


Many constructions of bent functions [25], having maximum nonlinearity 2n−1 −
2n/2−1 for even n, are based on the concatenation of sub-functions with low de-
gree. Next, we analyze the second-order nonlinearity of known bent constructions
and demonstrate the applicability of our results.

5.1 Maiorana-McFarland Construction


Let n be a positive integer, φ : Fn2 → Fn2 be a permutation of the elements of
the vector space Fn2 , and g : Fn2 → F2 be an arbitrary Boolean function. Then
f : Fn2 × Fn2  F2n
2 → F2 , which is given by
 
f (x, y) = x, φ(y) + g(y) , x, y ∈ Fn2 (18)

is a Maiorana-McFarland function [8,22]; it is known that all functions of this


form are bent (of highest possible degree n if g is properly chosen). Furthermore,
f can be considered as the concatenation of 2n affine sub-functions of Bn [3]. As
shown next, the second-order nonlinearity of cubic Boolean functions obtained
by (18) is very low in certain cases.

Proposition 4. Let f ∈ B2n be a cubic function given by (18) and let φ be a


linear invertible mapping. If g is separable satisfying the condition of Theorem
3, then NQf ≤ maxclass- n/3 {NQ}.
 
Proof. Since φ is linear, the expression x, φ(y) contains only quadratic terms,
leading to deg(g) = 3. By the fact g ∈ Bn we get that the maximum second-order
nonlinearity that f ∈ B2n can attain is maxclass- n/3 {NQ}, which is far below
maxclass- 2n/3 {NQ}.

The trade-off between nonlinearity and second-order nonlinearity is easily seen


even for small values of n; e.g. if n = 3 then g becomes a class-1 cubic Boolean
function, while the nonlinearity and second-order nonlinearity of f equals 28 and
8 respectively, according to Corollary 1.

5.2 Charpin-Pasalic-Tavernier Construction


The construction of cubic bent functions proposed in [6] is based on the con-
catenation of two quadratic semi-bent functions fb , fc ∈ Bn for odd n, vectors
b, c satisfying wt(b) ≡ wt(c) (mod 2), and each function given by